KB-agnostic @dossier/site (renders any tenant's OKF KB) + runtime-driven site rendering + the Node-26 Windows build fix

0026-kb-agnostic-site-and-runtime-rendering

decision · read as: Explain · confidence: asserted · status: active · 2026-06-16 · owner: starlight-engineer · reversibility: two-way door

DEC-0026 — KB-agnostic site + runtime site rendering + the Node-26 build fix

Reversibility: two-way door — the env-pointed KB source, the sidebar generator, and the external build invocation are all swappable/removable without touching the OKF repo or the runtime core; the durable parts are KB-agnostic-by-construction rendering, the runtime→site external-command seam, and sovereignty/zero-copy.

Context

DEC-0015 ("Astro Starlight as the docs-site generator + the product-owner, starlight-engineer, and documentation-engineer functions") chose Astro Starlight as the docs-site generator and stated its core mandate plainly: "the docs-site surface that renders a CLIENT's OKF knowledge base." Its Review update (2026-06-14) recorded that the surface had been dogfooded on Dossier's own knowledge/ repo, but the client-rendering half stayed pending: the surface hardwired Dossier's own ../../knowledge in three places and shipped a hardcoded Dossier sidebar (Mission / Decisions / References). It could render exactly one KB: ours.

Meanwhile the platform had grown a real way to produce a tenant's OKF KB end-to-end on the subscription: a freshly-provisioned tenant from the orchestration runtime ("Runtime orchestration & per-tenant control plane — the learning loop becomes a runnable system"), fed by keyless web ingestion ("Web ingestion — a keyless HttpConnector by default, Firecrawl wired as the premium path, and a first-class CLI web-ingest mode") and subscription extraction ("Subscription-backed extraction is a first-class transport — ClaudeCodeClient (no API keys)"). The RBA tenant KB (48 atoms) is one such result. But nothing connected a produced tenant KB to the docs surface, and a local Windows build hit a hard crash on Node 26. This decision is the FDE's cross-layer work to close all three gaps at once: the Dossier-only docs surface, the missing runtime-to-site connection, and the Node-26 crash. Verified and reproduced this session; committed in e5330fd on main. That commit was integrated alongside a concurrent landing/positioning workstream ("Ship the landing publicly behind a docs-gate flag, capture demand through two honest doors, into a list we own" through "OKF upstream relationship — complement at the format layer, competitor at the serving layer"), which this record does not cover.

Options considered

1. The docs surface — keep it Dossier-only vs. make it KB-agnostic.

  • (a) Keep @dossier/site hardwired to ../../knowledge with the Dossier sidebar. Rejected: it leaves the core mandate of "Astro Starlight as the docs-site generator + the product-owner, starlight-engineer, and documentation-engineer functions" (render a client's KB) permanently unmet; the surface could only ever render our own KB.
  • (b) Make the surface KB-agnostic via a single env-selected source (chosen). A single DOSSIER_KB env selects the OKF repo, resolved once by knowledgeDir() in src/lib/okf-routes.mjs and shared by both the content-collection glob base (content.config.ts, via a file:// URL so an absolute path anywhere on disk loads) and the id→route map. The sidebar is generated from the KB's actual top-level subdirectories (generateSidebar() in astro.config.mjs) — one group per dir, KB-agnostic by construction rather than by a per-tenant config. Default (unset) stays byte-for-byte Dossier's own render, so the dogfood is unchanged.
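
The env-selected source in option (b) can be sketched as follows. This is a minimal illustration, not the contents of src/lib/okf-routes.mjs: the record names knowledgeDir() and the DOSSIER_KB variable, but the parameters, the DEFAULT_KB constant, and the knowledgeBaseUrl helper are assumptions made for the example.

```typescript
import path from "node:path";
import { pathToFileURL } from "node:url";

// Assumed default: the record only says the unset case renders ../../knowledge.
const DEFAULT_KB = "../../knowledge";

// Resolve the KB directory once. DOSSIER_KB wins when set and non-blank;
// otherwise fall back to the default, resolved against the site root.
// (Both parameters are hypothetical; they exist to make the function testable.)
export function knowledgeDir(
  env: Record<string, string | undefined> = process.env,
  siteRoot: string = process.cwd(),
): string {
  const selected = env.DOSSIER_KB?.trim();
  return path.resolve(siteRoot, selected ? selected : DEFAULT_KB);
}

// Hypothetical helper: express the directory as a file:// URL with a trailing
// slash, so an absolute path anywhere on disk can serve as a glob base.
export function knowledgeBaseUrl(dir: string): URL {
  return pathToFileURL(dir + path.sep);
}
```

Because both the glob base and the id→route map would call the same resolver, pointing DOSSIER_KB at a different repo changes the rendered KB in one place, and leaving it unset keeps the default path.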

2. How the runtime renders a tenant — hard-depend on Astro vs. invoke the site as an external command.

  • (a) Add @dossier/site / Astro as a runtime dependency and import the build. Rejected: it pulls the whole Astro/Vite toolchain into the runtime's dependency tree and couples the orchestration core to one renderer, against the surface-is-replaceable stance of "Adopt OKF as Dossier's canonical knowledge format" and "Astro Starlight as the docs-site generator + the product-owner, starlight-engineer, and documentation-engineer functions".
  • (b) Invoke the @dossier/site build as an EXTERNAL command (chosen). siteForTenant / buildTenantSite (src/site.ts) resolve a provisioned tenant's OKF repo as the site's DOSSIER_KB and spawn the @dossier/site build as a separate process — the runtime never hard-depends on Astro, the same seam discipline as the MCP serve path (serve.js). siteForTenant is offline-unit-tested; buildTenantSite (which spawns the build) is exercised by real runs like the CLI.
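
The external-command seam in option (b) can be sketched as a pure planning step plus a thin spawning step. The function names come from the record, but everything else is an assumption for illustration: the SitePlan shape, the parameters, and the use of `pnpm build` as the command are not confirmed by the source.

```typescript
import { spawnSync } from "node:child_process";
import path from "node:path";

// Hypothetical shape: the record does not show what siteForTenant returns.
export interface SitePlan {
  command: string;
  args: string[];
  cwd: string;
  env: Record<string, string | undefined>;
}

// Pure step: point the site's DOSSIER_KB at the tenant's OKF repo.
// No process is started here, so this part can be unit-tested offline.
export function siteForTenant(
  tenantKbDir: string,
  siteDir: string,
  baseEnv: Record<string, string | undefined> = process.env,
): SitePlan {
  return {
    command: "pnpm", // assumption: the site build is started through pnpm
    args: ["build"],
    cwd: path.resolve(siteDir),
    env: { ...baseEnv, DOSSIER_KB: path.resolve(tenantKbDir) },
  };
}

// Effectful step: run the plan as a separate process and return its exit code.
// The runtime never imports Astro; it only sees a command and an exit status.
export function buildTenantSite(plan: SitePlan): number {
  const result = spawnSync(plan.command, plan.args, {
    cwd: plan.cwd,
    env: plan.env,
    stdio: "inherit",
    // pnpm is installed as a .cmd shim on Windows, which needs a shell.
    shell: process.platform === "win32",
  });
  if (result.error) throw result.error;
  return result.status ?? 1;
}
```

Splitting the two steps mirrors the testing split the record describes: the plan can be asserted on without a toolchain installed, while the spawn is only exercised by real runs.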

3. The Node-26 Windows crash — downgrade/patch Astro vs. pre-clean the output dir.

  • (a) Pin/patch Astro or Node. Rejected as heavier than the defect: Astro 6.4.6's emptyDir itself is fine (fs.rmSync); only its Windows-only EPERM fallback fixWinEPERMSync is broken — it calls fs.rmdirSync(path, { recursive: true }), an option Node 26 removed — so a transiently-locked file in an existing dist/.vercel reaches the broken fallback and crashes. The Vercel/Linux deploy never hits it (isWindows is false there).
  • (b) Pre-remove the output dir so the broken fallback is never reached (chosen). packages/site/scripts/clean-out.mjs (wired into the site build script) removes the output dir before Astro runs, so emptyDir early-returns and never reaches fixWinEPERMSync — plus a clear "stop your dev/watch server" message if the dir is held. Build-side only.
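
The pre-clean in option (b) could look roughly like the sketch below. It is not the contents of packages/site/scripts/clean-out.mjs; the function name, its return value, and the set of error codes treated as "held by another process" are assumptions. The only API relied on is fs.rmSync with the recursive and force options.

```typescript
import fs from "node:fs";

// Remove the output directory before the build runs, so the builder starts
// from a missing directory. Returns true when the directory is gone and
// false when another process still holds it.
export function cleanOut(outDir: string): boolean {
  try {
    // With force: true this is a no-op when the directory does not exist.
    fs.rmSync(outDir, { recursive: true, force: true });
    return true;
  } catch (err) {
    const code = (err as { code?: string }).code;
    // Assumption: EPERM and EBUSY are the codes a locked file produces here.
    if (code === "EPERM" || code === "EBUSY") {
      console.error(
        `Cannot remove ${outDir} (${code}): stop your dev/watch server and retry.`,
      );
      return false;
    }
    throw err;
  }
}
```

A build script would call this before starting Astro and exit non-zero when it returns false, so a held directory fails fast with a readable message instead of reaching the Windows-only fallback.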

Decision

Make @dossier/site KB-agnostic via a single DOSSIER_KB env, let @dossier/runtime render a provisioned tenant's KB by invoking the site build as an external command, and pre-clean the site output dir to dodge Astro's Node-26-broken Windows EPERM fallback.

  • KB-agnostic @dossier/site. DOSSIER_KB selects the OKF repo, resolved once by knowledgeDir() (src/lib/okf-routes.mjs) and shared by the glob base (content.config.ts, via a file:// URL so an absolute path anywhere on disk loads) and the id→route map. The sidebar comes from the KB's actual top-level subdirectories via generateSidebar() (astro.config.mjs): one group per dir, KB-agnostic by construction. Leaving DOSSIER_KB unset gives byte-for-byte Dossier's own render. This realizes the core mandate of "Astro Starlight as the docs-site generator + the product-owner, starlight-engineer, and documentation-engineer functions" ("render a CLIENT's OKF knowledge base"), which was pending until now. Sovereignty stays intact ("Adopt OKF as Dossier's canonical knowledge format"): the site is a derived, read-only view; no KB file is copied or mutated.
  • Runtime site rendering (@dossier/runtime). siteForTenant / buildTenantSite (src/site.ts) + CLI dossier-runtime site [--build] and run --site. They resolve the tenant's OKF repo as the site's DOSSIER_KB and invoke the @dossier/site build as an external command — the runtime never hard-depends on Astro (same seam as the MCP serve path, serve.js). siteForTenant is offline-unit-tested; buildTenantSite (spawns the build) is exercised by real runs. So a freshly-extracted tenant can be rendered to a static docs site in one command.
  • Node-26 Windows build fix. packages/site/scripts/clean-out.mjs (wired into the site build script) pre-removes the output dir so Astro 6.4.6's emptyDir early-returns and never reaches the Windows-only fixWinEPERMSync fallback that calls the Node-26-removed fs.rmdirSync(path, { recursive: true }). Build-side only; the Vercel/Linux deploy never hit this.
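
A sidebar derived from the KB's top-level subdirectories, as described in the first bullet above, might be generated along these lines. The group shape is modeled on Starlight's autogenerate sidebar groups; whether the real generateSidebar() emits this shape, skips hidden directories, sorts alphabetically, or capitalizes labels is not stated in the record, so treat all four as assumptions.

```typescript
import fs from "node:fs";

// Assumed group shape, modeled on Starlight's autogenerate sidebar entries.
export interface SidebarGroup {
  label: string;
  autogenerate: { directory: string };
}

// One sidebar group per top-level subdirectory of the KB.
export function generateSidebar(kbDir: string): SidebarGroup[] {
  return fs
    .readdirSync(kbDir, { withFileTypes: true })
    // Assumption: plain files and hidden directories (e.g. .git) are skipped.
    .filter((entry) => entry.isDirectory() && !entry.name.startsWith("."))
    .map((entry) => entry.name)
    // Assumption: alphabetical order, for a stable result across platforms.
    .sort((a, b) => a.localeCompare(b))
    .map((name) => ({
      // Assumption: the label is the directory name with a capital first letter.
      label: name.charAt(0).toUpperCase() + name.slice(1),
      autogenerate: { directory: name },
    }));
}
```

Since the groups come from the directory listing, a tenant KB with different top-level folders gets a different sidebar with no per-tenant configuration.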

Rationale

  • It meets DEC-0015's actual mandate, finally. Rendering a client's KB — not just ours — was the whole point of the docs surface; an env-selected source resolved once and a sidebar derived from the KB's own structure make it KB-agnostic by construction, not by hand-tuned per-tenant config. Keeping the unset default byte-for-byte identical preserves the dogfood (Astro Starlight as the docs-site generator + the product-owner, starlight-engineer, and documentation-engineer functions).
  • The external-command seam keeps the runtime renderer-agnostic. Invoking the site build as a separate process (never importing Astro) keeps the orchestration core free of the Astro/Vite toolchain and the surface swappable — the same discipline as the MCP serve path, and consistent with the surface-is-a-replaceable-view stance (Adopt OKF as Dossier's canonical knowledge format).
  • Sovereignty and provenance survive the multi-tenant render. Validated on the RBA tenant KB: concept-type badges, confidence, the live rbaconsulting.com source provenance, typed-edge "Related concepts" nav, and light+dark theming all held (screenshots under clients/rba/site-shots/). The URL provenance captured by "Web ingestion — a keyless HttpConnector by default, Firecrawl wired as the premium path, and a first-class CLI web-ingest mode" flows all the way through to the rendered page, read-only.
  • The build fix is minimal and targets the real defect. The bug is precisely Astro's Windows-only EPERM fallback hitting a removed Node 26 option; pre-cleaning the output dir so the fallback is never reached is strictly smaller than pinning Astro or Node, and the Linux/Vercel path is unaffected.
  • Confidence is asserted, not verified. The build was checked green this session (see Verification under Consequences) and one real client KB was rendered, but the surface is not yet validated across multiple tenants or in the market, and the local Windows build can still be blocked by an environment lock (open item (a)). This is design-level conviction backed by one real render, not corpus- or market-tested.

Consequences

  • The docs surface now renders any tenant's OKF KB. @dossier/site is KB-agnostic; a provisioned tenant (Runtime orchestration & per-tenant control plane — the learning loop becomes a runnable system) can be rendered to a static docs site in one runtime command. Validated end-to-end on the RBA tenant: 48 atoms → 50 pages, with badges / confidence / live-URL provenance / typed-edge nav / light+dark.
  • The runtime gains a site stage without an Astro dependency. dossier-runtime site [--build] and run --site exist; the runtime core stays renderer-agnostic via the external-command seam (same as the MCP serve path).
  • Windows local builds clear the Node-26 crash via clean-out.mjs. Build-side only — nothing added to the client-facing plugin subset.
  • Verification (reproduced this session). Repo-wide pnpm lint 0 errors; pnpm test 310 passed / 1 skipped; pnpm plugin:check in sync; the site builds clean into a fresh outDir (exit 0, ~35 pages — proving the Node-26 fix). Committed in e5330fd on main.
  • Honest open items.
    • (a) A local Windows pnpm build of @dossier/site can still be blocked by an editor/AV handle holding packages/site/dist — an environment lock, not code (CI/Linux + fresh-dir builds are clean; clean-out.mjs emits the "stop your dev/watch server" message).
    • (b) A KB whose atoms reference ids that weren't emitted (e.g. the RBA workflow's dangling stages edges) renders those as "unresolved" spans — correct surface behavior, an upstream extraction-data gap, not a surface bug.
    • (c) astro dev + pnpm build cannot run concurrently (the watch holds the output dir).
  • Two-way vs. durable. The env-pointed KB source, the sidebar generator, and the external build invocation are all swappable/removable without touching the OKF repo or the runtime core (two-way door). The durable commitments are KB-agnostic-by-construction rendering, the runtime→site external-command seam, and sovereignty / zero-copy.
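
Open item (b) follows from how an id→route map behaves when an edge points at an id that was never emitted. The sketch below illustrates that behavior only; the function names, the "/<id>/" route format, and the exact markup are placeholders, since the record says only that such references render as "unresolved" spans.

```typescript
// Escape text for safe inclusion in HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the id -> route map from the atoms that were actually emitted.
// The "/<id>/" route format is a placeholder for illustration.
export function buildRouteMap(emittedIds: string[]): Map<string, string> {
  const routes = new Map<string, string>();
  for (const id of emittedIds) routes.set(id, `/${id}/`);
  return routes;
}

// Render a reference: a link when the id resolves, an "unresolved" span
// when the target atom was never emitted by extraction.
export function renderRef(id: string, routes: Map<string, string>): string {
  const route = routes.get(id);
  const label = escapeHtml(id);
  return route === undefined
    ? `<span class="unresolved">${label}</span>`
    : `<a href="${escapeHtml(route)}">${label}</a>`;
}
```

The surface never invents a target, so a dangling edge stays visible on the page and the gap can be traced back to the extraction data.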

Review

Promote toward verified once the surface has rendered multiple distinct tenant KBs (confirm the env-selected source + derived sidebar hold across differently-shaped OKF repos), and once buildTenantSite is exercised on a tenant produced fully by the runtime loop end-to-end (provision → ingest → extract → render). Resolve open item (a) with a Windows handle-release/retry strategy if local builds keep getting blocked, and revisit whether unresolved-id spans (b) should surface a richer "extraction gap" affordance once a real client reviews their rendered KB.