De-risk spike (GO/NO-GO) — mine the git "why" through the existing faithfulness judge, report two numbers

task-codebase-why-mining-spike

task · confidence: inferred · status: backlog · 2026-06-16 · owner: extraction-engineer
source: log-auditor, filed from DEC-0040 as the immediate go/no-go gate that the ratified decision is explicitly blocked on

De-risk spike — mine the git "why", report two numbers (GO / NO-GO)

The decision record (Codebase ingestion as the 4th connector — a three-layer deterministic code-graph substrate + git-mined "why", gated on a de-risk spike and dogfooded on this repo first) ratifies codebase ingestion as the fourth connector, but explicitly gates the build on this spike: no substrate infrastructure (tree-sitter symbol graph, git-history graph, GraphStore) is built until the value thesis is measured. The headline "why"-mining claim is testable independently of the tree-sitter substrate, so the spike de-risks the whole bet cheaply, using machinery that already exists: the `@dossier/extraction` pipeline and the LLM-as-judge faithfulness judge.

The two numbers (the deliverable)

  1. Decision-recall vs. gold. Mine the git "why" from this repo (D:\github\dossier) — which has ~38 hand-authored decision records, a built-in gold eval set — and score how much of that captured rationale the mining recovers.
  2. Faithfulness-pass-rate + raw yield on a messy repo. Run the same mining over one messy real client repo and report the faithfulness-pass-rate (through the existing judge) and the raw decision-yield.

The second number IS the value thesis. It answers whether real client git/PR hygiene clears the bar that makes mined decision atoms shippable.
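As a rough illustration of how the two numbers could be computed once mining has run, here is a minimal TypeScript sketch. Everything in it is an assumption made for illustration: the `MinedAtom` shape, the `matchedGoldId` field (how a mined atom gets matched to a gold decision record is not specified here), and the choice to count only judge-passed atoms toward recall. None of these names come from `@dossier/extraction`.

```typescript
// Hypothetical shapes for illustration; not the @dossier/extraction API.
interface MinedAtom {
  provenanceSha: string;   // commit / PR SHA the atom was mined from
  judgePassed: boolean;    // verdict from the existing faithfulness judge
  matchedGoldId?: string;  // gold decision record this atom recovers, if any
}

// Number 1: share of gold decision records recovered by at least one atom
// that also passed the judge (assumption: failed atoms do not count).
function decisionRecall(goldIds: string[], atoms: MinedAtom[]): number {
  const gold = new Set<string>(goldIds);
  if (gold.size === 0) return 0;
  const recovered = new Set<string>();
  for (const atom of atoms) {
    const id = atom.matchedGoldId;
    if (atom.judgePassed && id !== undefined && gold.has(id)) {
      recovered.add(id);
    }
  }
  return recovered.size / gold.size;
}

// Number 2: pass rate through the judge, plus the raw pre-gate yield.
function faithfulnessReport(atoms: MinedAtom[]): { passRate: number; rawYield: number } {
  const rawYield = atoms.length;
  const passed = atoms.filter((a) => a.judgePassed).length;
  return { passRate: rawYield === 0 ? 0 : passed / rawYield, rawYield };
}

// Toy data with made-up ids, only to show the arithmetic.
const exampleGold = ["gold-a", "gold-b", "gold-c", "gold-d"];
const exampleAtoms: MinedAtom[] = [
  { provenanceSha: "sha-1", judgePassed: true, matchedGoldId: "gold-a" },
  { provenanceSha: "sha-2", judgePassed: true, matchedGoldId: "gold-a" }, // duplicate hit
  { provenanceSha: "sha-3", judgePassed: true, matchedGoldId: "gold-c" },
  { provenanceSha: "sha-4", judgePassed: false, matchedGoldId: "gold-d" },
  { provenanceSha: "sha-5", judgePassed: false },
];

console.log(decisionRecall(exampleGold, exampleAtoms)); // 0.5 (2 of 4 gold records)
console.log(faithfulnessReport(exampleAtoms));          // passRate 0.6, rawYield 5
```

Duplicate hits on the same gold record count once, so a chatty history cannot inflate recall. Reporting raw yield next to the pass rate keeps a high pass rate on a handful of atoms from reading as a GO.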

The pipeline to exercise (no new infra)

git log -L → filter trivial commits → linked PRs/issues → LLM-explained rationale → OKF decision atoms (confidence: inferred; provenance = the commit / PR SHA).

Every atom must clear the faithfulness floor of the live extraction eval harness (Live extraction eval harness — what we measure is what extraction optimizes for) or be DROPPED: never ship fabricated rationale (Dossier — The Knowledge Model (v0), principle 7). Reuse the existing pipeline + judge only; build no tree-sitter substrate and no GraphStore for the spike.
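The flow above can be sketched as a small filter-explain-gate loop. This is a minimal sketch under stated assumptions, not the `@dossier/extraction` pipeline: the `Commit` and `DecisionAtom` shapes, the `explain` and `judge` callbacks (stand-ins for the LLM rationale step and the existing faithfulness judge), the trivial-commit regex, and the numeric `floor` are all hypothetical. A real run would feed in commits from `git log -L` and would most likely be asynchronous.

```typescript
// Hypothetical types and names for illustration; not the @dossier/extraction API.
interface Commit {
  sha: string;
  subject: string;
  body: string; // commit message body plus any linked PR/issue text
}

interface DecisionAtom {
  rationale: string;
  confidence: "inferred";
  provenance: string; // commit / PR SHA
}

// Stand-ins for the LLM explanation step and the existing judge.
type Explain = (commit: Commit) => string | null;
type Judge = (atom: DecisionAtom, commit: Commit) => number; // score in [0, 1]

// Assumed heuristic: skip merges, bumps and similar housekeeping commits.
const TRIVIAL_SUBJECT = /^(merge|bump|chore|typo|wip)\b/i;

function isTrivial(commit: Commit): boolean {
  return TRIVIAL_SUBJECT.test(commit.subject.trim());
}

function mineDecisionAtoms(commits: Commit[], explain: Explain, judge: Judge, floor: number) {
  const kept: DecisionAtom[] = [];
  let rawYield = 0;
  for (const commit of commits) {
    if (isTrivial(commit)) continue;
    const rationale = explain(commit);
    if (rationale === null) continue; // no rationale to mine from this commit
    rawYield += 1;
    const atom: DecisionAtom = { rationale, confidence: "inferred", provenance: commit.sha };
    // Gate: an atom below the faithfulness floor is dropped, never shipped.
    if (judge(atom, commit) >= floor) kept.push(atom);
  }
  return { kept, rawYield, dropped: rawYield - kept.length };
}

// Toy run with made-up commits and deterministic stand-in callbacks.
const toyCommits: Commit[] = [
  { sha: "a1", subject: "Merge branch 'main'", body: "" },
  { sha: "b2", subject: "Switch cache to LRU", body: "FIFO evicted hot keys under load" },
  { sha: "c3", subject: "Rename helper", body: "No stated reason" },
];
const toyExplain: Explain = (c) => (c.body.trim() === "" ? null : c.body);
const toyJudge: Judge = (atom) => (atom.rationale.includes("evicted") ? 0.9 : 0.2);

const toyResult = mineDecisionAtoms(toyCommits, toyExplain, toyJudge, 0.8);
console.log(toyResult.rawYield, toyResult.kept.length, toyResult.dropped); // 2 1 1
```

Passing the judge in as a callback keeps the gate logic independent of how the existing judge is invoked, and the `floor` is a parameter because the threshold belongs to the eval harness rather than to this sketch.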

Why a task (and why p1, not p0)

It is the immediate next step for the code-ingestion thesis and the gate the whole build hangs on, but it sits behind the agentic board v1 review gate in the roadmap order, which is why it is p1 rather than p0. The sequencing in the codebase-ingestion decision (Codebase ingestion as the 4th connector — a three-layer deterministic code-graph substrate + git-mined "why", gated on a de-risk spike and dogfooded on this repo first) is: finish the agentic board (Agentic board v1 — build the git-resident OKF task board (deterministic offline core, SDK reserved), resolving DEC-0024's four open questions and dogfooding Dossier's own repo first) → run this spike → build substrate v1.

Owned by the Knowledge-Extraction & GraphRAG Engineer because it lives entirely in the extraction + eval layer. Provenance: filed by the log-auditor directly from the ratified DEC-0040 as the go/no-go gate the decision names; confidence: inferred (agent-filed from the decision, not human-curated). A no-go is a valid outcome; report the numbers either way.