Add a retry/repair path for extraction segments that fail on malformed model JSON

task-extraction-segment-retry-repair

task · confidence: inferred · status: backlog · 2026-06-18 · owner: extraction-engineer

source log-auditor — surfaced recording DEC-0055 (segment


The first live FirecrawlConnector RBA run (see "First live FirecrawlConnector run against a real client source — field evidence for the reserved web seam") made 203 extract calls, of which 1 failed. Segment #82 returned malformed model JSON, so it came back ok=false / atoms=0 and its atoms were silently lost. There was no retry and no repair, and the only signal was the aggregate failure count.

Why it matters beyond one segment

One lost segment in a 75-page crawl is small, but the failure mode is systemic: a single transient or malformed model response drops a whole segment of a client's knowledge with no recovery and no loud signal. As crawls scale, the number of silently dropped segments grows with the number of extract calls.

Shape

Add a bounded retry/repair path in the extraction layer, or in the subscription-client layer (see "Subscription-backed extraction is a first-class transport — ClaudeCodeClient (no API keys)"). On a parse or validation failure, retry the segment, optionally with a JSON-repair or re-prompt step, before giving up. Surface an unrecoverable segment loudly, not only as a count.

Two constraints apply. The path must preserve the faithfulness floor of the eval harness (see "Live extraction eval harness — what we measure is what extraction optimizes for"): a repaired segment passes the same validation/judge as any other segment. It must also stay offline-testable behind the injected-client seam (see "Extraction runtime architecture — the moat"): a unit test reproduces the malformed-JSON case.

Filed by the log-auditor from the run; confidence: inferred.
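A minimal sketch of the shape described above, in Python. Every name here is hypothetical (`SegmentResult`, `parse_atoms`, `repair_json`, `extract_segment`), as is the `{"atoms": [...]}` response schema; the real extraction and ClaudeCodeClient interfaces are not shown in this task. The client is injected as a plain callable so the malformed-JSON case can be reproduced offline, and a repaired response goes through the same `parse_atoms` validation as an unrepaired one.

```python
import json
import logging
from dataclasses import dataclass, field
from typing import Callable, List, Optional

logger = logging.getLogger("extraction.retry")  # hypothetical logger name


@dataclass
class SegmentResult:
    ok: bool
    atoms: List[dict] = field(default_factory=list)
    attempts: int = 0
    repaired: bool = False
    error: Optional[str] = None


def parse_atoms(raw: str) -> List[dict]:
    """Parse and validate a response. Assumed schema: {"atoms": [{...}, ...]}."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or not isinstance(data.get("atoms"), list):
        raise ValueError("expected an object with an 'atoms' list")
    if not all(isinstance(atom, dict) for atom in data["atoms"]):
        raise ValueError("every atom must be an object")
    return data["atoms"]


def repair_json(raw: str) -> str:
    """Cheap repair: keep the outermost {...} span, dropping fences or prose around it."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in response")
    return raw[start : end + 1]


def extract_segment(
    client: Callable[[str], str],  # injected seam: a fake in tests, a real client in production
    segment_id: int,
    text: str,
    max_attempts: int = 3,
) -> SegmentResult:
    """Call the client at most max_attempts times; try a repair before each retry."""
    error = "no attempts made"
    for attempt in range(1, max_attempts + 1):
        raw = client(text)
        for repaired in (False, True):
            try:
                candidate = repair_json(raw) if repaired else raw
                # Raw and repaired responses pass through the same validation.
                return SegmentResult(True, parse_atoms(candidate), attempt, repaired)
            except ValueError as err:
                error = str(err)
        logger.warning("segment %s: attempt %d/%d failed: %s",
                       segment_id, attempt, max_attempts, error)
    # Loud, per-segment signal instead of only an aggregate failure count.
    logger.error("segment %s UNRECOVERABLE after %d attempts: %s",
                 segment_id, max_attempts, error)
    return SegmentResult(False, [], max_attempts, False, error)
```

In a unit test, a fake client that first returns a truncated object such as `{"atoms": [{"id": 1}` and then a valid one exercises the retry path without any network call. Returning `ok=False` together with an ERROR-level log naming the segment is one option; raising an exception for unrecoverable segments would be an equally reasonable choice if the crawl should stop.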