Resolve 4 orphan-artifact graph errors from the RBA Firecrawl run (link a producing process or prune)

task-orphan-artifact-graph-errors-rba

task | confidence: inferred | status: done | 2026-06-19 | owner: extraction-engineer
source: log-auditor, surfaced while recording DEC-0055 (the 4 graphIssues from the first live Firecrawl RBA crawl); closed from the reference-tenant QA pass (tenant commit `8229530`)

Resolve the 4 orphan-artifact graph errors from the RBA Firecrawl run

The first live FirecrawlConnector run against rbaconsulting.com (see "First live FirecrawlConnector run against a real client source — field evidence for the reserved web seam") completed the full loop with 494 atoms, 654 edges, and 0 rejected, but reported 4 graphIssues, all of one structural class: orphan artifacts. "The produces edge is canonical on the producing process only" pins this as a severity: error QA failure. Every artifact must have exactly one producing process, so an "orphan" artifact is a QA failure, not an accepted state.
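The "exactly one producing process" invariant can be sketched as a small check. This is a minimal illustration in Python, not the actual validateGraph implementation: the `Atom` and `Edge` shapes, their field names, and `find_artifact_producer_violations` are assumptions made for the example. Only the atom ids are taken from this task.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    id: str
    type: str  # assumed type labels: "artifact", "process", "engagement"


@dataclass(frozen=True)
class Edge:
    source: str  # id of the atom the edge is declared on
    kind: str    # e.g. "produces"
    target: str  # id of the atom the edge points at


def find_artifact_producer_violations(atoms, edges):
    """Return {artifact_id: producer_count} for artifacts whose count is not 1."""
    types = {a.id: a.type for a in atoms}
    producers = defaultdict(set)
    for e in edges:
        # Only count produces edges that are declared on a process.
        if e.kind == "produces" and types.get(e.source) == "process":
            producers[e.target].add(e.source)
    return {
        a.id: len(producers[a.id])
        for a in atoms
        if a.type == "artifact" and len(producers[a.id]) != 1
    }


atoms = [
    Atom("user-research-persona-development", "process"),
    Atom("employee-persona", "artifact"),
    Atom("caleres-personalization-roadmap", "artifact"),
]
edges = [Edge("user-research-persona-development", "produces", "employee-persona")]

violations = find_artifact_producer_violations(atoms, edges)
print(violations)  # {'caleres-personalization-roadmap': 0}
```

A count of 0 is an orphan. Under this sketch, re-typing an atom away from `artifact` also removes it from the check, because the invariant applies to artifacts only.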

The four — and the QA root-cause nuance (added 2026-06-19)

The multi-surface FDE pass refined why each is an orphan — it is not one cause:

  • agribusiness-grain-transport-automation-case-study: MIS-TYPED. It is a delivery outcome, so it belongs as an engagement / case-study record, not as an artifact (a produces target). Re-type it; don't invent a producing process.
  • caleres-personalization-roadmap: MIS-TYPED, same as above (delivery outcome → engagement/case-study).
  • employee-persona / employee-personas: a singular/plural DUPLICATE that lost its producer. This overlaps the systemic dedup root cause in "Make the learning loop dedup/reconcile at scale (collapse same-type duplicate clusters; default-on compounding)". Reconcile the pair to one canonical atom first, then resolve its producer (or prune it if it is not a real deliverable).
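The singular/plural duplicate can be surfaced mechanically before anyone decides how to reconcile it. The sketch below is a deliberately naive illustration, not the dedup logic the learning loop uses: `singular_key` and `duplicate_clusters` are hypothetical helpers, the trailing-"s" rule is an assumption that will miss irregular plurals, and `example-deliverable` is a made-up id.

```python
from collections import defaultdict


def singular_key(slug: str) -> str:
    """Naive normalization: drop one trailing 's' from the last slug segment."""
    head, sep, last = slug.rpartition("-")
    if last.endswith("s") and not last.endswith("ss") and len(last) > 3:
        last = last[:-1]
    return f"{head}{sep}{last}"


def duplicate_clusters(atoms):
    """Group (id, type) pairs that share both a type and a normalized slug."""
    groups = defaultdict(list)
    for atom_id, atom_type in atoms:
        groups[(atom_type, singular_key(atom_id))].append(atom_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


atoms = [
    ("employee-persona", "artifact"),
    ("employee-personas", "artifact"),
    ("example-deliverable", "artifact"),  # hypothetical id, no duplicate
]
clusters = duplicate_clusters(atoms)
print(clusters)  # [['employee-persona', 'employee-personas']]
```

The sketch only proposes candidate clusters. Choosing the canonical atom, and deciding whether it is a real deliverable, remains a judgment call.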

Why a task, not a fix-in-place

These are structural gaps, not hallucinations: provenance is verified (e.g. the agribusiness artifact is grounded in /operations-and-business-leaders/). The fix is now three-way and needs a judgment per atom: re-type the two case studies to engagement, reconcile and then link the persona duplicate, and for any genuine orphan either link a producing process or prune it. That framing (which are real deliverables, which are mis-typed outcomes, which to merge, which to drop) is the extraction/knowledge owner's call, hence a task. It couples to "Fix extraction type-discipline — `system` used as a catch-all + non-slug ids (RBA run)" (the same type-discipline gap) and to "Make the learning loop dedup/reconcile at scale (collapse same-type duplicate clusters; default-on compounding)" (the persona dup). It is scoped to the RBA tenant OKF (clients/rba/tenants-firecrawl/rba-consulting, a gitignored local sandbox per "Fix git-per-tenant isolation when a tenant root is nested inside another repo"). Filed by the log-auditor from the run; updated from the QA pass; confidence: inferred.

Resolution (2026-06-19, tenant commit 8229530)

DONE: validateGraph orphan-artifact errors went 3 → 0. The count entering this surgery was 3 because the persona dedup had already resolved one orphan at the 509c38d dedup pass (DEC-0056 recorded orphan-artifacts 4 → 3). Status moved backlog → done; the fixes were applied per the three-way framing, all grounded:

  • 2 case studies re-typed artifact → engagement (delivery outcomes, not produces targets): agribusiness-grain-transport-automation-engagement, caleres-personalization-engagement.
  • employee-persona kept as an artifact, with a grounded produces edge from user-research-persona-development (a source-matched producer, not an invented one).
  • 2 re-typed deliverables (design-concepts, sitemap) given grounded producers.
  • No fabricated producers: every producing process is source-grounded (provenance preserved, per "Adopt OKF as Dossier's canonical knowledge format" / knowledge-model principle 8). okf tests: 170/170 green.
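One way to picture the "no fabricated producers" rule is a check that every produces edge carries a provenance pointer. This is a hedged sketch only: the `ProducesEdge` shape, the `source_ref` field, the `/example-source-page/` path, and the `hypothetical-*` ids are all assumptions for illustration and do not describe the OKF schema or the tenant's data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProducesEdge:
    process_id: str
    artifact_id: str
    source_ref: Optional[str]  # assumed provenance pointer, e.g. a crawled page path


def ungrounded(edges):
    """Return produces edges that carry no usable provenance reference."""
    return [e for e in edges if not (e.source_ref and e.source_ref.strip())]


edges = [
    # Real ids from this task; the source path is a placeholder.
    ProducesEdge("user-research-persona-development", "employee-persona",
                 "/example-source-page/"),
    # Hypothetical edge with no provenance, which the check should flag.
    ProducesEdge("hypothetical-process", "hypothetical-artifact", None),
]

missing = ungrounded(edges)
print([e.artifact_id for e in missing])  # ['hypothetical-artifact']
```

A check like this only confirms that a reference exists. Whether the referenced source actually supports the producer is still a review step.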