Architecture

bfev runs a four-stage GHG pipeline with five Claude Code agents: one orchestrator and four stage sub-agents. Agents communicate only by reading and writing files in a known layout; there is no in-memory hand-off.

[Figure: Pipeline overview. Four stage agents; the orchestrator audits artefacts from above.]

Why sub-agents

The pipeline used to run inside a single agent that loaded every skill. Its context bloated quickly: stage 4 alone carries about 10 feedback rules (palette, anti-leak, no-knowbox, no-unsourced-entities, …), and those rules were also loaded during stages 1–3, which never use them.

Splitting into sub-agents means each stage loads only:

The cross-stage discipline that used to depend on a shared context is now enforced mechanically by the orchestrator's audit step (see Contracts → orchestrator).

Stages

| # | Agent | Reads | Writes |
|---|-------|-------|--------|
| 1 | agent:collection | client.yaml, project PDF | collection/schema.xlsx, collection/request.pdf |
| 2 | agent:simulate | collection/schema.xlsx, source PDF, client.yaml | activity-data-<s>/filled.xlsx, provenance.yaml |
| 3 | agent:calculate | filled.xlsx, client.yaml, categories.json | calculations-<s>/aggregates.json, results.json, crosscheck.xlsx, meta.yaml, audit.log |
| 4 | agent:report | calculations-<s>/ (read-only), client.yaml | deliverables-<s>/{executive,scientific,official}.pdf, self-audit.json |
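
The "Writes" column of the table above can be read as a per-stage file contract. The following is a minimal sketch of checking it, not bfev's actual code: the `STAGE_WRITES` mapping and the `missing_outputs` helper are hypothetical, and it assumes that files listed after a directory prefix live in that same directory and that all paths resolve under one client root.

```python
from pathlib import Path

# Hypothetical encoding of the "Writes" column; {s} stands for the scenario.
# Assumption: files listed after a directory prefix share that directory.
STAGE_WRITES = {
    "collection": ["collection/schema.xlsx", "collection/request.pdf"],
    "simulate": [
        "activity-data-{s}/filled.xlsx",
        "activity-data-{s}/provenance.yaml",
    ],
    "calculate": [
        "calculations-{s}/aggregates.json",
        "calculations-{s}/results.json",
        "calculations-{s}/crosscheck.xlsx",
        "calculations-{s}/meta.yaml",
        "calculations-{s}/audit.log",
    ],
    "report": [
        "deliverables-{s}/executive.pdf",
        "deliverables-{s}/scientific.pdf",
        "deliverables-{s}/official.pdf",
        "deliverables-{s}/self-audit.json",
    ],
}


def missing_outputs(root: Path, stage: str, scenario: str) -> list[str]:
    """Return the declared outputs of `stage` that do not exist under `root`."""
    expected = [p.format(s=scenario) for p in STAGE_WRITES[stage]]
    return [p for p in expected if not (root / p).is_file()]
```

An empty list from `missing_outputs` would mean the stage produced every file it declares; anything else names the files a post-condition could report.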

Single source of truth

[Figure: Sources of truth. client.yaml and aggregates.json, read by every stage.]

Two files anchor everything:

The orchestrator never reads filled.xlsx or stage internals. It reads only client.yaml, aggregates.json, and the rendered PDFs (plus crosscheck.xlsx for the recomputation described below), then audits.

The crosscheck workbook

[Figure: Crosscheck workbook. Live formulas reconciled against aggregates.json.]

Stage 3 emits crosscheck.xlsx next to aggregates.json. The workbook has one row per activity line, with columns for activity data (AD), emission factor (EF), GWP and uncertainty, plus a live cell formula (=AD*EF*GWP) that recomputes emissions. A human auditor can open the file, change any input, and watch the totals update.
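
A minimal sketch of such a workbook, built with openpyxl. The column order and the helper names `build_crosscheck` and `non_formula_cells` are assumptions for illustration; the real layout of crosscheck.xlsx is not specified here. Uncertainty is carried as a column but stays out of the product, matching =AD*EF*GWP.

```python
from openpyxl import Workbook

# Hypothetical column layout: A=line, B=AD, C=EF, D=GWP, E=uncertainty, F=emissions.
HEADER = ["line", "AD", "EF", "GWP", "uncertainty_pct", "emissions"]


def build_crosscheck(rows):
    """Build a workbook with one row per activity line and a live emissions formula."""
    wb = Workbook()
    ws = wb.active
    ws.append(HEADER)
    for i, (name, ad, ef, gwp, unc) in enumerate(rows, start=2):
        # Column F holds a formula string; a spreadsheet app evaluates it on open.
        ws.append([name, ad, ef, gwp, unc, f"=B{i}*C{i}*D{i}"])
    return wb


def non_formula_cells(ws, column="F"):
    """Return coordinates in `column` (below the header) holding a literal, not a formula."""
    return [cell.coordinate for cell in ws[column][1:] if cell.data_type != "f"]
```

Calling `wb.save("crosscheck.xlsx")` on the result writes the file. A non-empty list from `non_formula_cells` would flag cells where a literal has replaced the expected formula.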

The orchestrator's audit step recomputes the workbook with openpyxl and asserts every scope total in aggregates.json equals the recomputed sum within uncertainty_tolerance_pct (default 0.5 %). If the agent ever cheats and writes a literal where a formula is expected, the contract post-condition catches it before the audit even runs.
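
Note that openpyxl reads formulas back as strings and does not evaluate them, so a recomputation has to multiply the inputs itself. Below is a minimal sketch of the tolerance check under stated assumptions: rows have already been extracted from the workbook as (scope, AD, EF, GWP) tuples, per-scope totals from aggregates.json are passed in as a plain mapping, and the tolerance is interpreted as a percentage of the reported total. The function name `audit_scope_totals` is hypothetical.

```python
from collections import defaultdict


def audit_scope_totals(rows, scope_totals, uncertainty_tolerance_pct=0.5):
    """Recompute AD*EF*GWP per scope and return the scopes that exceed tolerance.

    rows: iterable of (scope, AD, EF, GWP) tuples taken from the workbook.
    scope_totals: mapping of scope -> total reported in aggregates.json.
    Returns {scope: (reported, recomputed)} for every failing scope.
    """
    recomputed = defaultdict(float)
    for scope, ad, ef, gwp in rows:
        recomputed[scope] += ad * ef * gwp

    failures = {}
    for scope, reported in scope_totals.items():
        # Assumption: tolerance is relative to the reported total.
        allowed = abs(reported) * uncertainty_tolerance_pct / 100.0
        if abs(recomputed[scope] - reported) > allowed:
            failures[scope] = (reported, recomputed[scope])
    return failures
```

An empty result means every reported scope total reconciles with the recomputed sum; a non-empty one gives the audit the exact pairs to report.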

Run logging

Every stage emits one JSONL record on completion (pretty-printed here for readability; in the log it is a single line):

{
  "ts": "2026-05-11T14:32:08Z",
  "run_id": "r-20260511T143208-a3f9",
  "agent": "calculate",
  "scenario": "v0",
  "model": "claude-opus-4-7",
  "inputs":  [{"path": "...filled.xlsx", "sha256": "..."}],
  "outputs": [{"path": "...aggregates.json", "sha256": "..."}],
  "tool_calls": 47,
  "duration_s": 184,
  "contract_check": "pass"
}
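
A minimal sketch of how a stage could build and append such a record using only the standard library. The helpers `sha256_of` and `append_run_record` and the log path passed in are hypothetical; the field names follow the example above.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def append_run_record(log_path, *, run_id, agent, scenario, model,
                      inputs, outputs, tool_calls, duration_s, contract_check):
    """Append one run record to a JSONL log as a single line and return it."""
    record = {
        "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "run_id": run_id,
        "agent": agent,
        "scenario": scenario,
        "model": model,
        "inputs": [{"path": str(p), "sha256": sha256_of(Path(p))} for p in inputs],
        "outputs": [{"path": str(p), "sha256": sha256_of(Path(p))} for p in outputs],
        "tool_calls": tool_calls,
        "duration_s": duration_s,
        "contract_check": contract_check,
    }
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")  # one record per line
    return record
```

Opening the log in append mode keeps earlier records intact, so the file grows by exactly one line per completed stage.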

Two destinations:

Recording the sha256 of every artefact makes drift between runs detectable even when file names are stable.
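
Drift detection then reduces to comparing digests for the same path across two run records. A minimal sketch, assuming records shaped like the JSON example above; the function name `drifted_outputs` is hypothetical.

```python
def drifted_outputs(previous: dict, current: dict) -> list[str]:
    """Return output paths present in both records whose sha256 differs."""
    before = {o["path"]: o["sha256"] for o in previous["outputs"]}
    return sorted(
        o["path"]
        for o in current["outputs"]
        if o["path"] in before and before[o["path"]] != o["sha256"]
    )
```

Paths that appear in only one of the two records are ignored here; whether a new or missing output should also count as drift is a design choice this sketch leaves open.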

Lockfile

clients/<slug>/lockfile.yaml pins:

Without a lockfile, re-running the same client.yaml six months later would silently pick up updated factors. With one, the orchestrator refuses to start until the current environment matches the pinned values or the user explicitly runs bfev upgrade <slug>.
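
The refusal logic can be sketched as a comparison between pinned and current values. Everything named here is hypothetical: the `LockfileMismatch` exception, the `check_lockfile` function, and any keys passed in, since the actual lockfile fields are not reproduced in this section. Parsing lockfile.yaml is left out, so both sides are plain dicts.

```python
class LockfileMismatch(RuntimeError):
    """Raised when the current environment differs from the pinned values."""


def check_lockfile(pinned: dict, current: dict, slug: str) -> None:
    """Raise LockfileMismatch unless every pinned key has the same current value."""
    diffs = {
        key: (want, current.get(key))
        for key, want in pinned.items()
        if current.get(key) != want
    }
    if diffs:
        details = ", ".join(
            f"{key}: pinned {want!r}, found {got!r}"
            for key, (want, got) in sorted(diffs.items())
        )
        raise LockfileMismatch(
            f"{details}. Match the lockfile or run: bfev upgrade {slug}"
        )
```

Raising before any stage starts keeps the failure loud: a run either uses exactly the pinned values or does not happen at all.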