bfev orchestrates a four-stage greenhouse gas (GHG) pipeline through five Claude Code agents:
one orchestrator and four stage sub-agents. Agents communicate only by
reading and writing files in a known layout; there is no in-memory hand-off.
The pipeline used to run inside a single agent loading every skill. That context bloated quickly: stage 4 alone carries ~10 feedback rules (palette, anti-leak, no-knowbox, no-unsourced-entities, …) and those rules had to be loaded into stages 1–3 for no reason.
Splitting into sub-agents means each stage loads only:
The cross-stage discipline that used to depend on a shared context is now enforced mechanically by the orchestrator's audit step (see Contracts → orchestrator).
| # | Agent | Reads | Writes |
|---|---|---|---|
| 1 | agent:collection | client.yaml, project PDF | collection/schema.xlsx, collection/request.pdf |
| 2 | agent:simulate | collection/schema.xlsx, source PDF, client.yaml | activity-data-<s>/filled.xlsx, provenance.yaml |
| 3 | agent:calculate | filled.xlsx, client.yaml, categories.json | calculations-<s>/aggregates.json, results.json, crosscheck.xlsx, meta.yaml, audit.log |
| 4 | agent:report | calculations-<s>/ (read-only), client.yaml | deliverables-<s>/{executive,scientific,official}.pdf, self-audit.json |
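The Writes column can be read as a per-stage post-condition on the file layout. The following is a minimal sketch, assuming a post-condition means "every declared output exists and is non-empty". `STAGE_OUTPUTS` and `missing_outputs` are hypothetical names, and placing provenance.yaml and self-audit.json inside their stage directory is an assumption the table does not confirm.

```python
from pathlib import Path

# Declared outputs per stage, transcribed from the table.
# "{s}" stands for the scenario id. Nesting every file under the stage
# directory is an assumption, not something the source states.
STAGE_OUTPUTS = {
    "collection": ["collection/schema.xlsx", "collection/request.pdf"],
    "simulate": [
        "activity-data-{s}/filled.xlsx",
        "activity-data-{s}/provenance.yaml",
    ],
    "calculate": [
        "calculations-{s}/aggregates.json",
        "calculations-{s}/results.json",
        "calculations-{s}/crosscheck.xlsx",
        "calculations-{s}/meta.yaml",
        "calculations-{s}/audit.log",
    ],
    "report": [
        "deliverables-{s}/executive.pdf",
        "deliverables-{s}/scientific.pdf",
        "deliverables-{s}/official.pdf",
        "deliverables-{s}/self-audit.json",
    ],
}


def missing_outputs(root: Path, agent: str, scenario: str) -> list[str]:
    """Return the declared outputs that are absent or empty under root."""
    missing = []
    for template in STAGE_OUTPUTS[agent]:
        rel = template.format(s=scenario)
        path = root / rel
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing
```

An empty return value would mean the stage satisfied this (assumed) post-condition; anything else lists the files to investigate.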
Two files anchor everything:

- client.yaml: written once by the user, read by every stage. Names, period, lineaire, allowed actors. Immutable per scenario.
- calculations-<s>/aggregates.json: written by stage 3, read by stage 4 and the orchestrator. Every KPI in any delivered PDF must be grep-findable here.

The orchestrator never reads filled.xlsx or stage internals. It only reads
client.yaml, aggregates.json, and the rendered PDFs, then audits.
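The "grep-findable" rule could be checked roughly as follows. This is a sketch under assumptions: it treats a KPI as findable when its literal text equals some scalar value in aggregates.json, which is stricter than a plain substring grep. `leaf_literals` and `unfindable_kpis` are hypothetical helpers, the JSON keys in the usage note are invented for illustration, and extracting KPI strings from the rendered PDFs is out of scope here.

```python
import json


def leaf_literals(node):
    """Yield every scalar leaf of a parsed JSON tree as text."""
    if isinstance(node, dict):
        for value in node.values():
            yield from leaf_literals(value)
    elif isinstance(node, list):
        for value in node:
            yield from leaf_literals(value)
    elif node is not None:
        yield str(node)


def unfindable_kpis(pdf_kpis, aggregates_text):
    """Return the KPI strings that match no scalar value in aggregates.json."""
    # parse_float=str / parse_int=str keep numbers exactly as written in the
    # file, so "1234.50" is not silently normalised to 1234.5.
    tree = json.loads(aggregates_text, parse_float=str, parse_int=str)
    known = set(leaf_literals(tree))
    return [kpi for kpi in pdf_kpis if kpi not in known]
```

For example, with the hypothetical content `{"scope1": {"total_tco2e": 1234.50}}`, the call `unfindable_kpis(["1234.50", "99.9"], text)` returns `["99.9"]`, flagging a figure that appears in a PDF without a source in the aggregates.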
Stage 3 emits crosscheck.xlsx next to aggregates.json. The workbook has
one row per activity line with columns for activity data, EF, GWP, uncertainty
and live cell formulas (=AD*EF*GWP) that recompute emissions. A
human auditor can open the file, change any input, and watch totals update.
The orchestrator's audit step recomputes the workbook with openpyxl and
asserts every scope total in aggregates.json equals the recomputed sum
within uncertainty_tolerance_pct (default 0.5 %). If the agent ever cheats
and writes a literal where a formula is expected, the contract post-condition
catches it before the audit even runs.
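A minimal sketch of such an audit follows, under several assumptions. openpyxl reads formulas as text and does not evaluate them, so this version re-does the AD × EF × GWP arithmetic in Python from the input columns. The column order (scope, AD, EF, GWP, emissions), the helper names, and the reading of uncertainty_tolerance_pct as a relative tolerance are all hypothetical.

```python
from collections import defaultdict


def recompute_scope_totals(rows):
    """Sum AD * EF * GWP per scope from (scope, ad, ef, gwp) tuples."""
    totals = defaultdict(float)
    for scope, ad, ef, gwp in rows:
        totals[scope] += ad * ef * gwp
    return dict(totals)


def within_tolerance(reported, recomputed, tolerance_pct=0.5):
    """Relative comparison in percent (assumed meaning of the tolerance)."""
    if recomputed == 0:
        return reported == 0
    return abs(reported - recomputed) / abs(recomputed) * 100 <= tolerance_pct


def audit_scope_totals(rows, reported, tolerance_pct=0.5):
    """Return the scopes whose reported total drifts beyond the tolerance."""
    recomputed = recompute_scope_totals(rows)
    return sorted(
        scope
        for scope, value in recomputed.items()
        if not within_tolerance(reported.get(scope, 0.0), value, tolerance_pct)
    )


def load_rows(path):
    """Read input columns from the workbook; reject literal emission cells.

    Assumes a header row and columns A-E = scope, AD, EF, GWP, emissions.
    """
    from openpyxl import load_workbook  # third-party, imported lazily

    sheet = load_workbook(path).active  # formulas stay as text
    rows = []
    for scope, ad, ef, gwp, emissions in sheet.iter_rows(min_row=2, max_col=5):
        if emissions.data_type != "f":  # "f" marks a formula cell
            raise ValueError(
                f"{emissions.coordinate}: literal where a formula is expected"
            )
        rows.append((scope.value, ad.value, ef.value, gwp.value))
    return rows
```

Keeping the arithmetic in pure functions and isolating the workbook access in `load_rows` lets the tolerance logic be exercised without a spreadsheet on disk.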
Every stage emits one JSONL line on completion:
```json
{
  "ts": "2026-05-11T14:32:08Z",
  "run_id": "r-20260511T143208-a3f9",
  "agent": "calculate",
  "scenario": "v0",
  "model": "claude-opus-4-7",
  "inputs": [{"path": "...filled.xlsx", "sha256": "..."}],
  "outputs": [{"path": "...aggregates.json", "sha256": "..."}],
  "tool_calls": 47,
  "duration_s": 184,
  "contract_check": "pass"
}
```
Two destinations:

- clients/<slug>/runs.jsonl: per-client timeline, append-only.
- $BFEV_HOME/.bfev/logs/<run-id>/stage-<n>-<agent>.jsonl: per-run trace.

The sha256 of every artifact makes drift between runs detectable even when file names are stable.
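Emitting such a record could look like the sketch below, which hashes each artifact and appends one compact JSON line to a log file. `sha256_of` and `append_record` are hypothetical helper names; how bfev actually assembles the record is not shown in the source.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def append_record(log_path: Path, record: dict, inputs, outputs) -> None:
    """Append one JSONL line, adding path + sha256 for every artifact."""
    entry = dict(record)  # leave the caller's dict untouched
    entry["inputs"] = [{"path": str(p), "sha256": sha256_of(p)} for p in inputs]
    entry["outputs"] = [{"path": str(p), "sha256": sha256_of(p)} for p in outputs]
    log_path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" never rewrites earlier lines, matching the append-only timeline.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry, separators=(",", ":")) + "\n")
```

Calling `append_record` once for each destination with the same `record` would keep the per-client timeline and the per-run trace in agreement.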
clients/<slug>/lockfile.yaml pins:
Without a lockfile, re-running the same client.yaml six months later would
silently use updated factors. With one, the orchestrator refuses to start
until the user either matches the lockfile or runs bfev upgrade <slug>
explicitly.
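The refusal logic can be sketched as a comparison between pinned and current values. Everything named here is hypothetical: the helper names, the exception type, and the `factor_set` key in the usage note, since the actual lockfile entries are not listed in this section. Parsing lockfile.yaml itself is left out, so the functions take plain dictionaries.

```python
class LockfileMismatch(RuntimeError):
    """Raised when the environment no longer matches the pinned values."""


def lockfile_mismatches(pinned: dict, current: dict) -> dict:
    """Map each drifted key to a (pinned, current) pair."""
    return {
        key: (value, current.get(key))
        for key, value in pinned.items()
        if current.get(key) != value
    }


def ensure_locked(pinned: dict, current: dict, slug: str) -> None:
    """Refuse to start when any pinned value has drifted."""
    drift = lockfile_mismatches(pinned, current)
    if drift:
        details = ", ".join(
            f"{key}: pinned {old!r}, found {new!r}"
            for key, (old, new) in sorted(drift.items())
        )
        raise LockfileMismatch(
            f"{details}. Match the lockfile or run `bfev upgrade {slug}`."
        )
```

For instance, `ensure_locked({"factor_set": "2025.1"}, {"factor_set": "2026.0"}, "acme")` raises `LockfileMismatch`, while identical dictionaries pass silently.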