First full-loop SERVE on a real external client — reconcile divergent extraction runs to one canonical KB on a quality rubric; lexical retrieval sufficient (VectorRetriever seam not yet needed)

0034-first-external-serve-and-reconciliation-method

Type: decision
Read as: Explain
Confidence: verified
Status: active (2026-06-16)
Owner: forward-deployed-engineer

Reversibility: two-way door

DEC-0034 — First external SERVE; reconciliation method + lexical-sufficiency confirmation

Context

First time the Dossier loop ran end-to-end through the SERVE half on a real external client (RBA Consulting). Ingest + extract were done in prior sessions; three divergent OKF extraction runs already existed (48 / 19 / 16 atoms). The client data, the served KB, and the client-facing artifact are sovereign and gitignored under clients/ (they live in the client's own loop, not Dossier's knowledge/). This record captures only the build-side method that is reusable across clients — never the client specifics.

This is recorded as a light decision rather than a bare log line because two build-side learnings will be asked about again: how do we collapse divergent extraction runs? and why have we not built vector retrieval yet?

Options considered

Reconciling the three divergent extraction runs:

Merge all three (union of atoms) — maximizes volume, inherits every run's defects and duplications.
Pick the largest run by atom count — volume as a proxy for completeness.
(chosen) Pick on a QUALITY rubric, then repair the chosen run — select the variant with the strongest graph, not the most atoms.
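
The chosen option can be sketched as a lexicographic comparison in which graph quality is checked first and atom count only breaks ties. This is a minimal illustration, not Dossier's code: the `ExtractionRun` shape, the field names, the ordering of criteria, and the sample values are all assumptions made for the example.

```typescript
// Hypothetical summary of one extraction run; these names are not from Dossier.
interface ExtractionRun {
  name: string;
  atomCount: number;
  hasClientAnchor: boolean;    // a client graph anchor atom exists
  hasWorkflow: boolean;        // at least one workflow atom exists
  typedEdgeCount: number;      // edges carrying a real type such as "stages" or "uses"
  atomsWithProvenance: number; // atoms that cite their source
}

// Graph-quality criteria come first; volume is only the final tie-breaker.
function rubricKey(run: ExtractionRun): number[] {
  const provenanceRatio =
    run.atomCount === 0 ? 0 : run.atomsWithProvenance / run.atomCount;
  return [
    run.hasClientAnchor ? 1 : 0,
    run.hasWorkflow ? 1 : 0,
    run.typedEdgeCount > 0 ? 1 : 0,
    provenanceRatio,
    run.atomCount,
  ];
}

// Lexicographic comparison: the first differing criterion decides.
function compareKeys(a: number[], b: number[]): number {
  for (let i = 0; i < a.length; i++) {
    const diff = (a[i] ?? 0) - (b[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

function pickCanonical(runs: ExtractionRun[]): ExtractionRun {
  if (runs.length === 0) throw new Error("no runs to reconcile");
  return runs.reduce((best, run) =>
    compareKeys(rubricKey(run), rubricKey(best)) > 0 ? run : best
  );
}

// Illustrative values only; they do not describe any real client run.
const exampleRuns: ExtractionRun[] = [
  { name: "run-a", atomCount: 40, hasClientAnchor: false, hasWorkflow: false, typedEdgeCount: 0, atomsWithProvenance: 20 },
  { name: "run-b", atomCount: 15, hasClientAnchor: true, hasWorkflow: true, typedEdgeCount: 30, atomsWithProvenance: 15 },
  { name: "run-c", atomCount: 12, hasClientAnchor: true, hasWorkflow: false, typedEdgeCount: 10, atomsWithProvenance: 12 },
];
console.log(pickCanonical(exampleRuns).name);
```

Because the key is compared criterion by criterion, a large run with no anchor cannot outrank a smaller anchored one, which is the property the rubric is meant to guarantee.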

Retrieval backing for SERVE:

Use the reserved injectable Embedder / VectorIndex seam now (build vector retrieval).
(chosen) Use deterministic lexical (BM25-lite) retrieval and measure whether it suffices before building the vector backend.
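
For readers unfamiliar with lexical ranking, the sketch below shows a plain BM25 scorer over in-memory documents. It is an assumption about what "BM25-lite" roughly means, not the retriever shipped in Dossier: the `Doc` shape, the tokenizer, and the conventional defaults `k1 = 1.2` and `b = 0.75` are illustrative choices.

```typescript
// Hypothetical document shape; a real KB atom would carry more fields.
interface Doc { id: string; text: string; }
interface Ranked { id: string; score: number; }

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

function bm25Rank(docs: Doc[], query: string, k1 = 1.2, b = 0.75): Ranked[] {
  const tokenized = docs.map((d) => tokenize(d.text));
  const totalLen = tokenized.reduce((sum, toks) => sum + toks.length, 0);
  const avgLen = docs.length === 0 ? 0 : totalLen / docs.length;

  // Document frequency: in how many documents each term appears.
  const df = new Map<string, number>();
  for (const toks of tokenized) {
    new Set(toks).forEach((term) => df.set(term, (df.get(term) ?? 0) + 1));
  }

  const queryTerms = Array.from(new Set(tokenize(query)));
  const ranked: Ranked[] = [];
  docs.forEach((doc, i) => {
    const toks = tokenized[i] ?? [];
    let score = 0;
    for (const term of queryTerms) {
      const tf = toks.filter((t) => t === term).length;
      if (tf === 0) continue;
      const n = df.get(term) ?? 0;
      const idf = Math.log(1 + (docs.length - n + 0.5) / (n + 0.5));
      const lengthNorm = 1 - b + (b * toks.length) / avgLen;
      score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
    }
    // Documents sharing no term with the query are dropped, not ranked last.
    if (score > 0) ranked.push({ id: doc.id, score });
  });

  // Sort by score, then by id, so equal scores always come back in the same order.
  return ranked.sort(
    (x, y) => y.score - x.score || (x.id < y.id ? -1 : x.id > y.id ? 1 : 0)
  );
}
```

The scorer has no randomness and no model dependency, so the same KB and query always give the same ranking. A query whose terms appear in no document returns an empty list, which is how a content gap shows up at this layer.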

Decision

Reconcile divergent extraction runs to ONE canonical KB on a quality rubric — the chosen variant was the only one with a client graph anchor + a workflow, real typed edges (stages / uses), and full provenance. A bounded defect (workflow stage content left inline) was fixed by atomizing it into 5 proper process atoms, yielding a clean canonical KB (53 atoms, 143 typed edges, 0 load errors, 0 dangling edges, provenance complete). It was then served via the real @dossier/mcp server over JSON-RPC stdio (MCP agentic foundation — tenant-scoped GraphRAG over the OKF KB); tenant confinement was proven empirically (an id present only in another silo of the same client returns not_found). Across 8 real stakeholder questions, retrieval was strong on 6–7 of 8 using lexical (BM25-lite) retrieval alone — so the reserved VectorRetriever seam is NOT yet needed for a KB of this size.
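
The health figures quoted above (atoms, typed edges, dangling edges, provenance coverage) are the kind of numbers a small audit pass can produce. The sketch below is one way to compute them, assuming a hypothetical `Atom` shape with outgoing edges stored on each atom; the real OKF loader and its field names may differ.

```typescript
// Hypothetical atom shape for illustration; not the OKF schema.
interface Edge { type: string; to: string; }
interface Atom { id: string; provenance?: string; edges: Edge[]; }

interface KbReport {
  atoms: number;
  typedEdges: number;
  danglingEdges: number;   // edges whose target id is not in the KB
  withProvenance: number;  // atoms with a non-empty provenance field
}

function auditKb(atoms: Atom[]): KbReport {
  const ids = new Set(atoms.map((a) => a.id));
  let typedEdges = 0;
  let danglingEdges = 0;
  let withProvenance = 0;

  for (const atom of atoms) {
    if (atom.provenance !== undefined && atom.provenance.trim() !== "") {
      withProvenance++;
    }
    for (const edge of atom.edges) {
      if (edge.type.trim() !== "") typedEdges++;
      if (!ids.has(edge.to)) danglingEdges++;
    }
  }
  return { atoms: atoms.length, typedEdges, danglingEdges, withProvenance };
}

// A KB is "clean" in the sense used here when nothing dangles and every atom is sourced.
function isClean(report: KbReport): boolean {
  return report.danglingEdges === 0 && report.withProvenance === report.atoms;
}
```

Running such a check before serving makes the "0 dangling edges, provenance complete" claim something that can be re-verified on every future KB rather than asserted once.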

Rationale

Quality over volume is the reusable rubric. The strongest KB is the one with the best graph, not the most atoms: a real client anchor, real typed edges, and full provenance beat a larger but anchorless, edge-poor extraction. GraphRAG's value (see "MCP agentic foundation — tenant-scoped GraphRAG over the OKF KB") is in the edges, so the reconciliation rubric must reward the graph, not the row count. This collapse-to-canonical-on-a-rubric method is the build-side IP worth keeping. It is distinct from "The compounding merge — the per-tenant learning loop accumulates by id + confidence instead of overwriting (okf reconcile() + opt-in reconcile in extraction/runtime)", which merges new extractions into an existing canonical KB over time; this method is the one-time act of choosing the canonical baseline from divergent first-pass runs.
Defects route to the right layer. Inline stage content was atomized into process atoms (an extraction-quality fix, not a serving fix). The two honest misses were a content gap (no pricing atoms — nothing was extracted to find) and KB redundancy (near-duplicate capability atoms) — both routed to the extraction layer for reconcile rather than papered over at serve time. Honest about what the loop did and did not capture; no fabricated completeness.
Evidence-based deferral, not a guess. The deferral of the live vector backend (the Embedder / VectorIndex seam was reserved in "MCP agentic foundation — tenant-scoped GraphRAG over the OKF KB") is now confirmed by evidence: lexical retrieval answered every question that had matching content. The seam stays reserved; building it is justified only when a KB's recall actually degrades on lexical alone. This converts a design-time deferral into a measured one.
Sovereignty held, with zero shipped-source changes for the serve. No shipped package source changed for this exercise; all client data and the client-facing brief stay gitignored under clients/ (sovereignty per "Adopt OKF as Dossier's canonical knowledge format"; one-client-one-repo isolation per "Fix git-per-tenant isolation when a tenant root is nested inside another repo"). The client-facing artifact was generated programmatically from the served KB with every claim provenance-cited.
Confidence: verified. The reconciliation outcome (53 atoms / 143 edges / 0 errors / 0 dangling / full provenance), the live MCP serve over stdio, the empirical tenant-confinement check, and the lexical-sufficiency result are all observed facts from this session. Build green: tsc -b clean, vitest run 351 passed / 1 skipped.
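
The atomizing repair described under "Defects route to the right layer" can be pictured as lifting inline stage text out of a workflow atom into separate process atoms linked by `stages` edges. The sketch below is only an illustration under assumed shapes: the `Atom` fields, the `InlineStage` type, and the `<workflow id>.stage-N` id scheme are hypothetical and not taken from the OKF schema.

```typescript
// Hypothetical shapes for illustration; not the OKF schema.
interface Edge { type: string; to: string; }
interface Atom {
  id: string;
  kind: string;
  title: string;
  body: string;
  provenance: string;
  edges: Edge[];
}
interface InlineStage { title: string; body: string; }

// Returns the repaired workflow atom followed by one process atom per stage.
function atomizeStages(workflow: Atom, stages: InlineStage[]): Atom[] {
  const processAtoms: Atom[] = stages.map((stage, i) => ({
    id: `${workflow.id}.stage-${i + 1}`, // assumed id scheme
    kind: "process",
    title: stage.title,
    body: stage.body,
    provenance: workflow.provenance,     // each stage keeps the source citation
    edges: [],
  }));

  const stageEdges: Edge[] = processAtoms.map((p) => ({ type: "stages", to: p.id }));
  const repaired: Atom = {
    ...workflow,
    edges: [...workflow.edges, ...stageEdges],
  };
  return [repaired, ...processAtoms];
}
```

Carrying the workflow's provenance onto each new process atom is what keeps provenance complete after the repair, and the input atom is copied rather than mutated so the original run stays available for comparison.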

Consequences

A reusable reconciliation method exists: when first-pass extraction yields divergent runs, choose the canonical baseline on the graph-quality rubric (anchor + typed edges + provenance), repair bounded defects by atomizing, and route content gaps / redundancy back to the extraction layer.
The VectorRetriever / Embedder seam stays reserved as an evidence-gated upgrade — built only when a real KB's lexical recall is shown insufficient, not preemptively.
The SERVE half of the loop is proven on a real external client, with tenant confinement empirically demonstrated and a provenance-cited client artifact produced — the first external close of the loop's serving end.
Known follow-ups routed (not silently dropped): the content gap (no pricing atoms) and the capability-atom redundancy go to extraction reconcile (see "The compounding merge — the per-tenant learning loop accumulates by id + confidence instead of overwriting (okf reconcile() + opt-in reconcile in extraction/runtime)"); the narrowed-expansion gap for vertical edge labels is tracked in "OKF edge vocabulary is registry-driven — a vertical declares its own traversable edges".
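
The tenant-confinement property can be illustrated with a toy store in which every lookup is keyed by silo, so an id that exists only in another silo is indistinguishable from an id that does not exist. This is a sketch of the behaviour, not the @dossier/mcp implementation; `TenantScopedStore`, `AtomRecord`, and `LookupResult` are hypothetical names.

```typescript
// Hypothetical types for illustration; not the @dossier/mcp API.
interface AtomRecord { id: string; title: string; }
type LookupResult =
  | { status: "ok"; atom: AtomRecord }
  | { status: "not_found" };

class TenantScopedStore {
  private silos = new Map<string, Map<string, AtomRecord>>();

  put(silo: string, atom: AtomRecord): void {
    const bucket = this.silos.get(silo) ?? new Map<string, AtomRecord>();
    bucket.set(atom.id, atom);
    this.silos.set(silo, bucket);
  }

  // Only the caller's silo is consulted; other silos are never searched.
  get(silo: string, id: string): LookupResult {
    const atom = this.silos.get(silo)?.get(id);
    return atom ? { status: "ok", atom } : { status: "not_found" };
  }
}

const store = new TenantScopedStore();
store.put("silo-a", { id: "x1", title: "Example atom" });
console.log(store.get("silo-a", "x1").status); // found in its own silo
console.log(store.get("silo-b", "x1").status); // same id, other silo
```

Returning the same `not_found` for an out-of-scope id and a nonexistent one means a caller cannot learn from the response that the id exists elsewhere, which is one reasonable reading of why the check was phrased that way.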

Review

Revisit the lexical-sufficiency call per-KB: a larger client KB, or one where stakeholders ask semantically-phrased questions that lexical misses, is the trigger to build the reserved vector backend. Revisit the reconciliation rubric if a future client's strongest-graph variant is also the lowest-volume by a margin that loses material content — at which point the merge path (The compounding merge — the per-tenant learning loop accumulates by id + confidence instead of overwriting (okf reconcile() + opt-in reconcile in extraction/runtime)) may need to run across variants rather than choosing one.
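
The first review trigger can be made mechanical by scoring lexical retrieval only on questions the KB could have answered. The sketch below assumes a hypothetical `QuestionOutcome` record filled in by a reviewer and an arbitrary `minRecall` threshold of 0.9; neither comes from this record.

```typescript
// Hypothetical per-question review record; field names are assumptions.
interface QuestionOutcome {
  question: string;
  hasMatchingContent: boolean; // the KB contains atoms that answer it
  strongOnLexical: boolean;    // lexical retrieval surfaced them well
}

interface GateResult {
  answerable: number;
  hits: number;
  recall: number;
  buildVectorBackend: boolean;
}

function lexicalGate(outcomes: QuestionOutcome[], minRecall = 0.9): GateResult {
  // Content gaps are excluded: no retriever can find what was never extracted.
  const answerable = outcomes.filter((o) => o.hasMatchingContent);
  const hits = answerable.filter((o) => o.strongOnLexical).length;
  const recall = answerable.length === 0 ? 1 : hits / answerable.length;
  return {
    answerable: answerable.length,
    hits,
    recall,
    buildVectorBackend: recall < minRecall,
  };
}
```

Separating content gaps from retrieval misses keeps the two follow-up routes apart: a gap goes back to extraction, while only a miss on content that exists counts as evidence for the vector backend.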

Provenance

Build-side method observed during a live session serving a real external client (RBA Consulting); client data, served KB, and client-facing artifact are gitignored under clients/ (sovereign — not in this repo). Outcome facts (53 atoms / 143 typed edges / 0 load errors / 0 dangling / provenance 53/53; live @dossier/mcp JSON-RPC stdio serve; empirical tenant not_found confinement; 6–7/8 strong on lexical retrieval) verified in-session. Build green: tsc -b clean, vitest run 351 passed / 1 skipped. No shipped package source changed for the serve.