Tighten reconcile diffs against timestamp churn on live re-crawls

task-reconcile-timestamp-churn

task confidence asserted status review 2026-06-16 owner extraction-engineer
source board-curator (knowledge-architect) — from 0028-compounding-reconcile-merge open follow-ups (routed to extraction-engineer)

Tighten reconcile diffs against timestamp churn

The compounding merge (okf reconcile() plus opt-in reconcile in extraction/runtime) made the per-tenant learning loop accumulate by id + confidence instead of overwriting. It also recorded an honest open follow-up, routed to the Knowledge-Extraction & GraphRAG Engineer: an atom's timestamp derives from provenance.retrievedAt (in validate.ts). The reconcile fixtures use a fixed retrievedAt, so re-runs read as unchanged. A connector that stamps a fresh fetch time on every crawl (like the HttpConnector), however, would bump every atom's timestamp on re-crawl, producing a noisy wall of updated diffs even when nothing actually changed.
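To make the failure mode concrete, here is a minimal sketch of how a naive whole-atom comparison reacts to a fresh fetch time. The Atom shape, toAtom, and naiveChanged are hypothetical stand-ins, not the actual okf or validate.ts code; the only detail taken from this task is that timestamp is derived from provenance.retrievedAt.

```typescript
// Hypothetical atom shape; only the retrievedAt -> timestamp derivation comes from this task.
interface Atom {
  id: string;
  body: string;
  timestamp: string;
  provenance: { source: string; retrievedAt: string };
}

// Assumed stand-in for the derivation done in validate.ts.
function toAtom(id: string, body: string, source: string, retrievedAt: string): Atom {
  return { id, body, timestamp: retrievedAt, provenance: { source, retrievedAt } };
}

// Naive change check: any difference in the serialized atom counts as an update.
function naiveChanged(prior: Atom, next: Atom): boolean {
  return JSON.stringify(prior) !== JSON.stringify(next);
}

// Fixture-style re-run: retrievedAt is fixed, so nothing appears to change.
const fixedA = toAtom("a1", "same text", "fixture", "2026-01-01T00:00:00Z");
const fixedB = toAtom("a1", "same text", "fixture", "2026-01-01T00:00:00Z");

// Live re-crawl: same content, but a fresh fetch time on each crawl.
const crawl1 = toAtom("a1", "same text", "http", "2026-01-01T00:00:00Z");
const crawl2 = toAtom("a1", "same text", "http", "2026-01-02T00:00:00Z");

const fixtureChurn = naiveChanged(fixedA, fixedB); // false
const liveChurn = naiveChanged(crawl1, crawl2); // true, although the content is identical
```

Under these assumptions the fixtures cannot surface the problem: only a connector that moves retrievedAt between crawls does.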

This is not a curation gap (the curation guard from DEC-0028 holds) — it is a tight-diff refinement: the compounding loop should produce a git diff that reflects what actually changed, not the clock.

In review (handoff gate)

This atom sits at review — the work is proposed and awaiting the approval gate. The merge policy half (confidence precedence, whether orphaned is a first-class lifecycle signal) is the Principal Knowledge-Format Architect's call and is tracked separately; this task is the mechanism half. Approve → done.

Options

  • Compare candidates modulo volatile provenance (exclude retrievedAt-derived timestamp from the change check), or
  • Carry the prior timestamp forward when content is byte-identical after canonical serialize.

Either option keeps real changes detectable while killing the clock-only churn.
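The two options can be sketched together as follows. This is an illustration under stated assumptions, not the reconcile() implementation: CandidateAtom, canonicalize, stableView, contentChanged, and carryTimestamp are hypothetical names, and "content" is assumed here to mean the atom minus timestamp and provenance.retrievedAt. Whether retrievedAt itself should also be carried forward is left open.

```typescript
// Hypothetical atom shape; only timestamp and provenance.retrievedAt come from this task.
interface CandidateAtom {
  id: string;
  body: string;
  confidence: number;
  timestamp: string;
  provenance: { source: string; retrievedAt: string };
}

// Canonical serialization: object keys are sorted recursively so key order cannot cause a diff.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map((v) => canonicalize(v)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const parts = Object.keys(obj)
      .sort()
      .map((k) => JSON.stringify(k) + ":" + canonicalize(obj[k]));
    return "{" + parts.join(",") + "}";
  }
  return JSON.stringify(value);
}

// Option 1: compare modulo volatile provenance (drop timestamp and retrievedAt).
function stableView(atom: CandidateAtom): unknown {
  const { timestamp, provenance, ...rest } = atom;
  const { retrievedAt, ...stableProvenance } = provenance;
  return { ...rest, provenance: stableProvenance };
}

function contentChanged(prior: CandidateAtom, next: CandidateAtom): boolean {
  return canonicalize(stableView(prior)) !== canonicalize(stableView(next));
}

// Option 2: carry the prior timestamp forward when the content is byte-identical.
function carryTimestamp(prior: CandidateAtom | undefined, next: CandidateAtom): CandidateAtom {
  if (prior !== undefined && !contentChanged(prior, next)) {
    return { ...next, timestamp: prior.timestamp };
  }
  return next;
}

const stored: CandidateAtom = {
  id: "a1",
  body: "same text",
  confidence: 0.8,
  timestamp: "2026-01-01T00:00:00Z",
  provenance: { source: "http", retrievedAt: "2026-01-01T00:00:00Z" },
};
// Same content, fresh fetch time.
const recrawled: CandidateAtom = {
  ...stored,
  timestamp: "2026-01-02T00:00:00Z",
  provenance: { source: "http", retrievedAt: "2026-01-02T00:00:00Z" },
};
// Fresh fetch time and a real content change.
const edited: CandidateAtom = { ...recrawled, body: "new text" };

const clockOnly = contentChanged(stored, recrawled); // false: clock-only churn is ignored
const realEdit = contentChanged(stored, edited); // true: a real change is still detected
const carried = carryTimestamp(stored, recrawled).timestamp; // prior timestamp kept
const bumped = carryTimestamp(stored, edited).timestamp; // fresh timestamp kept
```

In this sketch both options share one comparison and differ only in where it is applied: option 1 in the change check, option 2 when the merged atom is written.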