Architecture

bfev orchestrates a four-stage GHG pipeline through five Claude Code agents: one orchestrator and four stage sub-agents. Agents communicate only by reading and writing files in a known layout. There is no in-memory hand-off.

Pipeline overview — four stage agents, the orchestrator audits artefacts from above

Why sub-agents

The pipeline used to run inside a single agent loading every skill. That context bloated quickly: stage 4 alone carries ~10 feedback rules (palette, anti-leak, no-knowbox, no-unsourced-entities, …) and those rules had to be loaded into stages 1–3 for no reason.

Splitting into sub-agents means each stage loads only:

its own skill / SKILL.md
the contract (input schema + output schema)
the rules that apply to its output

The cross-stage discipline that used to depend on a shared context is now enforced mechanically by the orchestrator's audit step (see Contracts → orchestrator).

Stages

#	Agent	Reads	Writes
1	`agent:collection`	`client.yaml`, project PDF	`collection/schema.xlsx`, `collection/request.pdf`
2	`agent:simulate`	`collection/schema.xlsx`, source PDF, `client.yaml`	`activity-data-<s>/filled.xlsx`, `provenance.yaml`
3	`agent:calculate`	`activity-data-<s>/filled.xlsx`, `client.yaml`, `categories.json`	`calculations-<s>/aggregates.json`, `results.json`, `crosscheck.xlsx`, `meta.yaml`, `audit.log`
4	`agent:report`	`calculations-<s>/` (read-only), `client.yaml`	`deliverables-<s>/{executive,scientific,official}.pdf`, `self-audit.json`
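
The table above can be read as a per-stage file contract. The following is a minimal sketch, not bfev's actual code, of how an orchestrator might check that a stage's declared inputs exist before launching it. The names `STAGE_READS` and `missing_inputs` are hypothetical, the unnamed PDF inputs are left out, and the location of each path relative to the client directory (including `filled.xlsx` under `activity-data-<s>/`) is an assumption inferred from the table.

```python
from pathlib import Path

# Inputs per stage, transcribed from the table. The unnamed PDFs are omitted.
# Where each path sits relative to the client directory is an assumption.
STAGE_READS = {
    "collection": ["client.yaml"],
    "simulate": ["collection/schema.xlsx", "client.yaml"],
    "calculate": ["activity-data-{s}/filled.xlsx", "client.yaml", "categories.json"],
    "report": ["calculations-{s}", "client.yaml"],
}


def missing_inputs(client_dir: Path, agent: str, scenario: str) -> list[str]:
    """Return the declared inputs of `agent` that do not exist on disk."""
    wanted = [p.format(s=scenario) for p in STAGE_READS[agent]]
    return [p for p in wanted if not (client_dir / p).exists()]
```

An empty list means the stage can start. Anything else names the files an earlier stage (or the user) still has to produce.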

Single source of truth

Sources of truth — client.yaml and aggregates.json, read by every stage

Two files anchor everything:

client.yaml — written once by the user, read by every stage. Names, period, lineaire, allowed actors. Immutable per scenario.
calculations-<s>/aggregates.json — written by stage 3, read by stage 4 and the orchestrator. Every KPI in any delivered PDF must be grep-findable here.

The orchestrator never reads filled.xlsx or stage internals. It reads only client.yaml, aggregates.json, crosscheck.xlsx, and the rendered PDFs, then audits them.

The crosscheck workbook

Crosscheck workbook — live formulas reconciled against aggregates.json

Stage 3 emits crosscheck.xlsx next to aggregates.json. The workbook has one row per activity line, with columns for activity data (AD), emission factor (EF), global warming potential (GWP), and uncertainty, plus a live cell formula (=AD*EF*GWP) that recomputes the emissions for that row. A human auditor can open the file, change any input, and watch the totals update.

The orchestrator's audit step recomputes the workbook with openpyxl and asserts that every scope total in aggregates.json equals the recomputed sum within uncertainty_tolerance_pct (default 0.5%). If the stage 3 agent ever cheats and writes a literal where a formula is expected, the contract post-condition catches it before the audit even runs.
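
The reconciliation can be sketched in plain Python. openpyxl returns formula text (or cached values) and does not evaluate formulas itself, so this sketch assumes the input columns have already been read into dictionaries and multiplies them directly. The row keys (`scope`, `ad`, `ef`, `gwp`), the shape of `reported` (scope name to total), and the reading of the tolerance as a percentage of the recomputed total are all assumptions for illustration, not bfev's actual schema.

```python
def is_formula(value) -> bool:
    """True if a cell value is formula text such as '=B2*C2*D2'."""
    return isinstance(value, str) and value.startswith("=")


def recompute_scope_totals(rows):
    """Sum AD * EF * GWP per scope from the input columns."""
    totals = {}
    for row in rows:
        emission = row["ad"] * row["ef"] * row["gwp"]
        totals[row["scope"]] = totals.get(row["scope"], 0.0) + emission
    return totals


def audit_totals(rows, reported, tolerance_pct=0.5):
    """Return the scopes whose reported total fails the reconciliation."""
    recomputed = recompute_scope_totals(rows)
    failures = []
    for scope in sorted(set(recomputed) | set(reported)):
        # A scope present on only one side can never reconcile.
        if scope not in recomputed or scope not in reported:
            failures.append(scope)
            continue
        limit = abs(recomputed[scope]) * tolerance_pct / 100.0
        if abs(reported[scope] - recomputed[scope]) > limit:
            failures.append(scope)
    return failures
```

`is_formula` illustrates the kind of post-condition that would reject a literal in the emissions column. An empty list from `audit_totals` means every scope total reconciles.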

Run logging

Every stage emits one JSONL record on completion (pretty-printed here for readability; on disk it is a single line):

{
  "ts": "2026-05-11T14:32:08Z",
  "run_id": "r-20260511T143208-a3f9",
  "agent": "calculate",
  "scenario": "v0",
  "model": "claude-opus-4-7",
  "inputs":  [{"path": "...filled.xlsx", "sha256": "..."}],
  "outputs": [{"path": "...aggregates.json", "sha256": "..."}],
  "tool_calls": 47,
  "duration_s": 184,
  "contract_check": "pass"
}

Two destinations:

clients/<slug>/runs.jsonl — per-client timeline, append-only.
$BFEV_HOME/.bfev/logs/<run-id>/stage-<n>-<agent>.jsonl — per-run trace.

Recording the sha256 of every artefact makes drift between runs detectable even when file names are unchanged.
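
A minimal sketch of how such a record could be hashed and appended, using only the standard library. The helper names `sha256_of` and `append_record` are hypothetical. The caller is assumed to supply the remaining fields (`ts`, `run_id`, `agent`, and so on) in `record`.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex digest of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def append_record(log_path: Path, record: dict, inputs, outputs) -> None:
    """Append one compact JSON object per line (JSONL), never rewriting."""
    line = dict(record)
    line["inputs"] = [{"path": str(p), "sha256": sha256_of(p)} for p in inputs]
    line["outputs"] = [{"path": str(p), "sha256": sha256_of(p)} for p in outputs]
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(line) + "\n")
```

Opening the log in append mode keeps the per-client timeline append-only. Comparing the `sha256` fields of two records for the same path shows whether the file's content changed between runs.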

Lockfile

clients/<slug>/lockfile.yaml pins:

bfev version
EF table content hashes
IPCC volume content hashes
skill versions

Without a lockfile, re-running the same client.yaml six months later would silently pick up updated factors. With one, the orchestrator refuses to start until the installed versions and content hashes match the lockfile or the user explicitly runs bfev upgrade <slug>.
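
The refusal check can be sketched as a comparison of pinned values against current ones. This assumes the lockfile has already been parsed into a nested dictionary; YAML parsing is omitted because it needs a third-party library. The function name `lockfile_mismatches` and the key names in the usage example are hypothetical, since the actual lockfile schema is not shown here.

```python
def lockfile_mismatches(pinned: dict, current: dict) -> list[str]:
    """Return dotted keys whose current value differs from the pinned one."""
    mismatches = []
    for key, want in pinned.items():
        have = current.get(key)
        if isinstance(want, dict):
            # Recurse into groups such as per-file content hashes.
            sub = lockfile_mismatches(want, have if isinstance(have, dict) else {})
            mismatches.extend(f"{key}.{name}" for name in sub)
        elif have != want:
            mismatches.append(key)
    return mismatches


# Placeholder values for illustration only.
pinned = {"bfev": "1.0", "ef_tables": {"a": "h1", "b": "h2"}}
current = {"bfev": "1.0", "ef_tables": {"a": "h1", "b": "changed"}}
drift = lockfile_mismatches(pinned, current)  # ["ef_tables.b"]
```

An orchestrator following this approach would start only when the list is empty and otherwise report the drifting keys.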