Do not scan/mirror Anthropic's docs into the KB — distill curated references instead
0050-do-not-mirror-anthropic-docs-distill-references
- Reversibility
- two-way door
DEC-0050 — Do not mirror Anthropic's docs into the KB; distill curated references instead
Reversibility: two-way door — a curation/sourcing policy, not built code; nothing is mirrored, so reversing means adding a scan if a real need ever appears (e.g. an offline/air-gapped build with no live doc access). The durable commitment is the principle — reference, don't absorb, for external docs we do not own and that churn fast — which generalizes the OKF upstream relationship — complement at the format layer, competitor at the serving layer stance to the Anthropic-docs case.
Context
On 2026-06-18 the Principal Forward Deployed Engineer verified that 0% of the Claude/Anthropic documentation is scanned or cached anywhere in this repo — what exists are ~8 hand-cited doc-page URLs inside ADRs and research atoms, never a captured corpus. The question this forces: should we ingest Anthropic's docs into the KB so agents stop re-fetching the same stable facts? This is the same shape as the OKF-upstream call (OKF upstream relationship — complement at the format layer, competitor at the serving layer): an external body of knowledge Dossier builds on but does not own.
The trigger was real recurring friction — agents kept re-reading the docs in-session to recall the same stable facts (which file a subagent lives in, the hook event names, the current Opus model ID). We needed a durable answer that lowers that recall cost without absorbing someone else's fast-moving corpus.
Options considered
- Scan/mirror the docs into the KB (crawl
code.claude.com/docs+platform.claude.com/docsinto OKF atoms or a cached corpus). Rejected: (a) provenance/sovereignty — the docs are Anthropic's, not the org's own institutional memory; the KB is for the org's compounding knowledge, not a mirror of a vendor's manual (the Dossier — The Knowledge Model (v0) "capture judgment, not just facts" + Dossier — Mission & North Star sovereignty stance). (b) Staleness — the Claude model/primitive surface moves fast (Fable 5, Opus 4.8 are recent); a frozen mirror rots and quietly misleads. (c) Redundant — we already read docs transiently in-session via live primitives, so a mirror is bloat + rot risk with no owned-knowledge gain. - Capture nothing — keep re-fetching in-session every time. Rejected: leaves the real recurring friction unaddressed; the stable core (the six primitives + their config locations, the current model family) genuinely is re-derived over and over and is worth capturing once.
- Distill one curated
referenceatom of the stable core + adopt "reference, don't absorb" as the standing rule (chosen). Capture only the high-churn-resistant facts a build-with-primitives shop keeps re-deriving — grounded in primary docs at capture time, dated, provenance-bearing, recency-caveated — and route everything live (a new model, a changed param, an edge case) to the live primitives. Lowers recall cost without absorbing the corpus, and matches the OKF upstream relationship — complement at the format layer, competitor at the serving layer evidence→reference, judgment→decisionprecedent.
Decision
Do NOT scan, crawl, or mirror Anthropic's Claude / Claude Code docs into the KB. Adopt the standing rule — reference, don't absorb — for high-churn external documentation Dossier builds on but does not own:
- Read the docs transiently in-session via the live primitives — the
claude-apiskill, theclaude-code-guideagent,WebFetchagainst the docs. These are always-current; the docs are their source of truth, not the KB. - Synthesize the judgment into an ADR when a doc fact forces a real choice (the existing pattern — DEC-0004 et al.).
- Capture only the stable, re-derived core as a dated, provenance-bearing
referenceatom, recency-caveated, explicitly framed as "implication, NOT a decision." The first such atom is Claude Primitives & Model Family — June 2026 Snapshot (the six primitives + config locations + the current Claude model family/pricing).
Rationale
- Sovereignty/provenance is the line. The KB is the org's owned, compounding memory. Anthropic's docs are Anthropic's — mirroring them in confuses whose knowledge this is and what the org can claim to own (exactly the distinction OKF upstream relationship — complement at the format layer, competitor at the serving layer draws: complement at the format/source layer, don't absorb).
- A frozen snapshot of a fast-moving surface rots. The model/primitive surface churns; a mirror would silently age. A small, dated, caveated distillation of only the stable core is honest about its own shelf life and points to the live source for anything that moves.
- The right pattern already exists. Transient read → synthesize ADR → distill the stable core is what we already do; this decision names it as policy rather than inventing machinery.
asserted, notverified. This is a curation/sourcing judgment (which knowledge the KB should and should not hold), not a mechanism-tested claim. The underlying facts that 0% is cached today and that the live primitives exist areverified; the policy on top isasserted.
Consequences
- No doc-scan/mirror tooling is built or wired; the KB stays free of a vendor doc corpus.
- Agents have a known baseline (Claude Primitives & Model Family — June 2026 Snapshot) for the stable core and a clear instruction to use the live primitives for anything current — lowering recall cost without a rotting mirror.
- The
referencelibrary gains a fourth domain (Claude primitives/models) under the same evidence→reference, judgment→decisiondiscipline. - Generalizable: the same "reference, don't absorb" rule applies to any future external doc set Dossier builds on but does not own.
Review
Promote asserted → verified only if the policy is stress-tested by a real counter-need and holds — e.g. an offline/air-gapped build with no live doc access where re-deriving from the live primitives is impossible, forcing a deliberate, scoped cache. If that case ever lands, capture it as its own decision (it would amend, not silently override, this stance). Absent such a case, this stays an asserted curation policy.