Upgrade the log-auditor from a backward-looking recorder into a recorder + standing auditor that closes the improvement loop

0036-log-auditor-recorder-to-standing-auditor

decision read as Explain confidence asserted status active 2026-06-16 owner principal-architect
Reversibility
two-way door

DEC-0036 — The log-auditor becomes a recorder + standing auditor

Reversibility: two-way door — the agent prompt is a markdown file in .claude/agents/; revert is a git checkout. The durable intent — a finding with no durable home is a finding lost; route every finding to a record, a task, or an explicit flag — is what holds, and is mechanism-independent.

Context

The log-auditor is the curation half of Dossier's learning loop (Establish the learning-loop & audit architecture): the PostToolUse hook captures raw traces, the log-auditor distills them into atomic decision records and the running Dossier — Decision & Audit Log. Its instructions, however, predated the agentic board (Agentic "sprint board" architecture — a git-resident OKF task board worked by bounded, hook-governed Agent SDK loops / Agentic board v1 — build the git-resident OKF task board (deterministic offline core, SDK reserved), resolving DEC-0024's four open questions and dogfooding Dossier's own repo first). It had no awareness of knowledge/tasks/ (Dossier — Work Items (the agentic board)) as the place to file durable, owned follow-ups — so any improvement it surfaced had only one outlet: prose in its return message, which evaporates when the turn ends.

That gap bit in the immediately-prior turn. Recording Decouple the agentic board from DOCS_ENABLED — ship /board publicly in production while the /knowledge reading surface stays gated dark, the auditor correctly spotted that the codebase cited DEC-0021 for the landing-only DOCS_ENABLED gate when the real record is DEC-0022 (Ship the landing publicly behind a docs-gate flag, capture demand through two honest doors, into a list we own; DEC-0021 is Web ingestion — a keyless HttpConnector by default, Firecrawl wired as the premium path, and a first-class CLI web-ingest mode, web ingestion). A genuinely good catch — but its only recourse was to "route it to the Astro Starlight Engineer" as a sentence in its reply. That sentence had no durable home; the main agent had to catch and fix the stale references by hand. A correct finding with nowhere to live.

User feedback, verbatim: "idk if the log auditor needs more intelligence or support surfacing improvements but it needs an upgrade." Since every agent already inherits the session's Opus model (model: inherit), "more intelligence" was not a model lever — it was a method lever.

Options considered

1. What kind of upgrade — model vs. method.

  • (a) Raise the model / add tooling. Rejected: the agent already runs model: inherit (the session Opus), so there is no higher model to give it. The deficiency was not raw capability; it was the absence of a defined audit method and a durable outlet for findings.
  • (b) Sharpen the method (chosen). Give it an explicit forward audit pass, a verify-before-write rule, a board outlet for follow-ups, and a structured return contract. This addresses "more intelligence" as sharper procedure, not a bigger brain.

2. Where does a surfaced improvement go.

  • (a) Prose in the return message (status quo). Rejected — this is the exact failure that motivated the change: the finding evaporates.
  • (b) Fix everything in place. Rejected — many findings need another owner's judgment or non-trivial work; a recorder silently rewriting other layers' files is out of mandate and risks unverified edits.
  • (c) Three-way routing — fix-in-place hygiene vs. file-a-board-task vs. flag-only (chosen). True hygiene the auditor owns (a wrong [[id]], a missing log line) it fixes directly. Real follow-up needing another owner becomes an OKF task atom on the board (Dossier — Work Items (the agentic board)) with full provenance. Genuinely-human judgment calls are flagged in the return, severity-ranked. A "don't duplicate an open task; don't clutter the board" bar keeps the board signal-dense.

3. Scope of the rewrite.

  • (a) Bolt the board awareness onto the existing prompt. Considered, but the prompt also lacked an explicit forward audit pass, a verify-before-write rule, and a return contract — all of which the prior-turn failure exposed.
  • (b) Full replacement of .claude/agents/log-auditor.md (chosen). Rewrite to four jobs (decision records · running log · integrity/hygiene · surface improvements), with the audit pass, verify-before-write rule, three-way routing, and the Recorded/Found/Filed/Deferred return contract as first-class sections.

Decision

Fully rewrite .claude/agents/log-auditor.md so the agent is a recorder and a standing auditor. Four capabilities are added or made first-class:

  1. Surface improvements as board tasks. When a pass reveals follow-up work it should not fix in place, the auditor authors an OKF task atom at knowledge/tasks/task-<slug>.md conforming to the task type in Dossier — The Knowledge Model (v0)status: backlog, priority, owner/assignee = the responsible layer agent, acceptance_criteria, relates_to the atoms involved, and non-negotiable provenance (source: log-auditor — <how it surfaced>, confidence: inferred, timestamp today). It then logs the task in Task Board — Audit Log and references the task id from the record and its return — instead of narrating the finding into the void.
  2. Forward audit pass every run. Before calling a logging job done, scan the git diff/git log, conversation, and journal/traces/ for: unrecorded decisions, contradiction/duplication, link rot (and wrong cross-references), provenance/confidence gaps, and schema/conformance drift.
  3. Verify-before-write rule. Confirm every decision number and every cross-reference against the actual files before committing it (glob knowledge/decisions/, grep for the id). Never propagate a wrong or unverified link; a misattributed id is itself a finding to fix or file. The source citing an id is not trusted — it is checked.
  4. Structured return contract. End every run with Recorded / Found / Filed / Deferred, each citing file:line, so the caller can act rather than re-derive.

A three-way routing governs each finding (fix-in-place hygiene · file-a-task · flag-only), gated by "don't duplicate an open task; don't clutter the board."

This is an instruction/operating-model change only.claude/agents/log-auditor.md is the sole file changed; no code or runtime change. It extends the mandate of a team member set in Establish the expert/principal agent team and first skills, deepens the audit half of the loop in Establish the learning-loop & audit architecture, and makes Dossier — Work Items (the agentic board) / Agentic "sprint board" architecture — a git-resident OKF task board worked by bounded, hook-governed Agent SDK loops an output surface of the log-auditor (the board is now where the auditor's follow-ups durably live).

Rationale

  • A finding you only mention is a finding you've lost. The prior-turn DEC-0021/0022 catch proved the loop was open: the auditor could see drift but had no durable channel to act on it. Routing every finding to a record, a task, or an explicit flag closes that loop — the same "close the loop, don't narrate it" discipline the rest of Dossier runs on.
  • "More intelligence" was correctly read as method, not model. With model: inherit, raw capability was already maxed; the leverage was a defined audit pass, a verify-before-write guardrail, and a return contract — procedure that makes the inherited intelligence land.
  • It dogfoods the board on the auditor's own work. The board (Agentic "sprint board" architecture — a git-resident OKF task board worked by bounded, hook-governed Agent SDK loops) was built to be the durable home for owned follow-ups; wiring the auditor to file into it is Dossier using its own institutional-memory machinery — and it keeps provenance honest (source: log-auditor, confidence: inferred) so agent-surfaced work is never confused with human-curated work.
  • Verify-before-write protects the graph. A wrong cross-reference poisons every downstream reader (GraphRAG traverses these edges). Making "check the id against the files, don't trust the source" a hard rule is the single highest-leverage integrity guard for a recorder.
  • asserted, not verified. This is an operating-model change, exercised only in this very run. Its correctness is a matter of whether the new method actually produces clean records, files the right tasks, and skips the wrong ones in practice — which only repeated real runs will show. See Review.

Consequences

  • The board gains a new author. Dossier — Work Items (the agentic board) will now receive inferred, log-auditor-sourced backlog items. The provenance convention (source: log-auditor — …) keeps them distinguishable from human-curated tasks, and the "don't duplicate / don't clutter" bar guards board signal density. Brand-new task files pass the board claim-guard, so creating backlog is safe.
  • The auditor's return shape is now a contract. Callers can rely on Recorded/Found/Filed/Deferred and act on cited file:lines rather than re-deriving the state of the knowledge base.
  • The DEC-0035-routed source cleanup is already resolved — confirmed this run, NOT re-filed. DEC-0035 routed "correct the stale DEC-0021 references in board.astro to DEC-0022" to the starlight-engineer as prose. Verified against ground truth this run: packages/site/src/pages/board.astro now correctly cites DEC-0022 for the DOCS_ENABLED landing-only gate (line ~58), with DEC-0024/DEC-0026/DEC-0001 properly attributed throughout — the main agent already fixed it by hand. No board task was filed for it (verify-before-write + don't-clutter-the-board in action). DEC-0035's Consequences/Review still narrate it as an open routed cleanup; that prose is now stale — flagged, not silently rewritten, since amending a recorded decision's text is an owner judgment.
  • The candidate pattern extension is filed once. The same "surface-improvements → file a board task" pattern and structured findings return arguably belong on the other report-only agents (Adversarial QA & Knowledge-Integrity Reviewer, whose spec today says "reports findings, does not silently fix" with no durable home; possibly Principal Platform Architect). One scoping task is filed for that (Decide whether the report-only agents (qa-reviewer, principal-architect) should close the finding loop like the log-auditor now does, owner Product Owner) rather than editing those agents here.
  • Two-way door. The agent prompt is a markdown file; reverting is a git checkout. The durable intent (route every finding; never let a catch evaporate) is mechanism-independent and survives any later refinement of the exact routing.

Review

Promote toward verified after the first real audit pass that files a board task and returns the structured Recorded/Found/Filed/Deferred report cleanly — i.e. the new method demonstrably (a) catches drift, (b) routes each finding correctly (fixes the small thing, files the real follow-up, flags the human call), (c) does not file noise or duplicate an open task, and (d) never propagates an unverified link. This very run is the first exercise of the method; a second clean run on unrelated material would close the gate. Re-examine the "don't clutter the board" bar if log-auditor-sourced inferred tasks start to crowd human-curated backlog.