Fix git-per-tenant isolation when a tenant root is nested inside another repo

0020-git-isolation-nested-tenant-repo

decision · read as: Explain · confidence: verified · status: active · 2026-06-15 · owner: platform-engineer · reversibility: two-way door

DEC-0020 — Git isolation fix: nested-tenant repo init guard

Reversibility: two-way door — the guard predicate is an internal implementation detail; the durable commitment is the one-client-one-repo isolation invariant it protects.

Context

The decision "Runtime orchestration & per-tenant control plane — the learning loop becomes a runnable system" established git-per-tenant: after each extract, @dossier/runtime runs git init followed by git add -A && git commit, so every loop iteration is a diff in the client's own git history. This is the concrete realization of "Adopt OKF as Dossier's canonical knowledge format" (the client owns the repo; it is the system of record) and of the sovereignty guarantee in "Dossier — Mission & North Star".

This bug was found via ground-truth verification, not a failing test: the first full external website run (see the related milestone) reported commit:ok but the 48 atoms were not committed to the tenant repo — instead a stray commit landed on Dossier's own main. Root cause: git.ts ensureRepo guarded on isGitRepo (git rev-parse --is-inside-work-tree), which is true for any path nested inside an enclosing repo. The test harness placed the tenant subtree inside the Dossier working tree, so ensureRepo saw "already inside a work tree," skipped git init, and commitAll then operated on the parent repo. This silently breaks one-client-one-repo isolation — and the sovereignty guarantee — whenever a tenant root sits inside another git repo (e.g. an agency's own working tree, the exact real-world deployment shape).
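
The failure mode can be sketched roughly as follows. This is a hypothetical reconstruction, not the actual contents of git.ts: the names isGitRepo and ensureRepo come from this record, while the run helper, the ensureRepoPreFix name, and the error handling are assumptions made for illustration.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

// Assumed helper: promise-based wrapper around execFile.
const run = promisify(execFile);

// True when `dir` is anywhere inside a git work tree,
// including a plain subdirectory of an enclosing repo.
export async function isGitRepo(dir: string): Promise<boolean> {
  try {
    const { stdout } = await run(
      "git",
      ["rev-parse", "--is-inside-work-tree"],
      { cwd: dir },
    );
    return stdout.trim() === "true";
  } catch {
    // Not inside a work tree, git unavailable, or `dir` does not exist.
    return false;
  }
}

// Assumed pre-fix shape of the guard (name is illustrative).
export async function ensureRepoPreFix(dir: string): Promise<void> {
  // Bug: this is also true for a tenant dir nested in another repo,
  // so git init is skipped and the tenant never gets its own repo.
  if (await isGitRepo(dir)) return;
  await run("git", ["init"], { cwd: dir });
}
```

When the guard returns early for a nested tenant dir, a later git add -A && git commit run from that dir resolves to the enclosing repo, which matches the stray commit observed on main.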

Options considered

  1. Document "provision tenants outside any repo" — make it an operator constraint. Rejected: it pushes a silent-data-loss footgun onto the operator, and the realistic deployment (an agency working tree containing client subtrees) is precisely the broken case. Sovereignty can't depend on the operator never nesting.
  2. Always git init the tenant dir unconditionally — drop the guard. Rejected: not idempotent; re-init churn and edge cases on an already-correct tenant repo.
  3. Guard on isRepoRoot instead of isGitRepo (chosen). Distinguish "inside a work tree" from "is the root of its own work tree." ensureRepo should git init unless dir is already its own repo root — so a nested tenant gets its own isolated repo, and an existing tenant repo is left alone (idempotent).

Decision

Add isRepoRoot(dir) and guard ensureRepo on it instead of isGitRepo.

  • isRepoRoot(dir) runs git rev-parse --show-toplevel and returns true iff the repo's toplevel equals dir (compared via resolve + relative so the check is slash-normalized cross-platform). This is the isolation-critical distinction: isGitRepo (--is-inside-work-tree) is true for any nested path; isRepoRoot is true only at the actual repo root.
  • ensureRepo guards on isRepoRoot (if (await isRepoRoot(dir)) return;), so a tenant nested inside another repo gets its own git init, while an already-correct tenant repo is a no-op (idempotent).
  • isRepoRoot is exported from the runtime barrel (packages/runtime/src/index.ts) so callers/tests can assert isolation directly.
  • The regression test (packages/runtime/test/git-isolation.test.ts) initializes a real parent repo, nests a tenant dir inside it, asserts the pre-fix observation (isGitRepo(nested) === true but isRepoRoot(nested) === false), calls ensureRepo on the tenant, commits an atom, and asserts that the parent has no commits and tracks none of the tenant's files.
  • The accidental commit on Dossier's main was reverted (main back to b74b9b6); the capstone run then proved provision → commit lands in the isolated tenant repo with main unchanged.
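
A minimal sketch of the guard described above, assuming a promise-based execFile wrapper. isRepoRoot and ensureRepo are named in this record; the run helper and the isSamePath helper are hypothetical, and the real implementation in the runtime package may differ in structure and error handling.

```typescript
import { execFile } from "node:child_process";
import { relative, resolve } from "node:path";
import { promisify } from "node:util";

// Assumed helper: promise-based wrapper around execFile.
const run = promisify(execFile);

// Hypothetical pure helper: true iff both paths resolve to the same
// location. resolve() normalizes separators and trailing slashes, and
// relative() returns "" when the two resolved paths are identical.
export function isSamePath(a: string, b: string): boolean {
  return relative(resolve(a), resolve(b)) === "";
}

// True only when `dir` is the top level of its own work tree.
export async function isRepoRoot(dir: string): Promise<boolean> {
  try {
    const { stdout } = await run(
      "git",
      ["rev-parse", "--show-toplevel"],
      { cwd: dir },
    );
    return isSamePath(stdout.trim(), dir);
  } catch {
    // Not inside a work tree, git unavailable, or `dir` does not exist.
    return false;
  }
}

// Idempotent: a no-op at an existing repo root, git init everywhere else,
// including a tenant dir nested inside an enclosing repo.
export async function ensureRepo(dir: string): Promise<void> {
  if (await isRepoRoot(dir)) return;
  await run("git", ["init"], { cwd: dir });
}
```

One caveat of this sketch: resolve does not follow symlinks, while git reports the physical path of the work tree, so a dir reached through a symlink may compare as different from its own toplevel. Whether the real check needs a realpath step is left open here.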

Rationale

  • It protects the sovereignty guarantee at its weakest real-world point. "Adopt OKF as Dossier's canonical knowledge format" and "Runtime orchestration & per-tenant control plane — the learning loop becomes a runnable system" together promise one client = one repo = their own history. The realistic deployment is an agency working tree that contains client subtrees, which is exactly the case the old guard broke. isRepoRoot makes the guarantee hold there.
  • Ground truth over a green claim. The run reported commit:ok; reality was a misdirected commit. Verifying where the commit landed (not just that one happened) is what surfaced this — a reported-green that didn't hold.
  • Minimal, idempotent, correct. Swapping the predicate is the smallest change that fixes the class of bug without unconditional re-init churn; the new check is purely about repo-root identity.
  • Verified, not asserted. This is a reproduced fix: the failure was observed in a real run, the root cause was confirmed in code, a regression test now reproduces the broken precondition and proves the parent stays untouched, and the capstone run demonstrated correct isolation end-to-end (commit into the tenant repo, main unchanged). Evidence exists, so confidence is verified. That rating applies to the mechanism, not to the platform's market fit.

Consequences

  • Nested tenants are now correctly isolated. A tenant subtree inside any enclosing repo gets its own git repo; per-tenant commits land in the tenant's history, never the parent's.
  • Provenance/audit integrity restored at the VCS layer. The "every loop iteration is a diff in the client's own history" property (Runtime orchestration & per-tenant control plane — the learning loop becomes a runnable system) now holds even under nesting — the audit trail can't silently leak into an enclosing repo.
  • A standing regression guard. git-isolation.test.ts will fail if the guard ever regresses to isGitRepo; isRepoRoot is part of the runtime's public surface.
  • Two-way vs. durable. The guard predicate is an internal implementation detail (two-way door). The durable commitment is the one-client-one-repo isolation invariant it protects.
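
The shape of the regression guard can be sketched as a standalone check. This is not the contents of git-isolation.test.ts: the git helper, the IsolationReport type, checkNestedIsolation, the tenants/acme layout, and the atom.md file are all illustrative assumptions, and the predicates are re-declared here only to keep the example self-contained.

```typescript
import { execFile } from "node:child_process";
import { mkdirSync, mkdtempSync, realpathSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join, relative, resolve } from "node:path";
import { promisify } from "node:util";

const run = promisify(execFile);

// Run git in `cwd`; resolve to trimmed stdout, or null if the command fails.
async function git(cwd: string, ...args: string[]): Promise<string | null> {
  try {
    const { stdout } = await run("git", args, { cwd });
    return stdout.trim();
  } catch {
    return null;
  }
}

export async function isGitRepo(dir: string): Promise<boolean> {
  return (await git(dir, "rev-parse", "--is-inside-work-tree")) === "true";
}

export async function isRepoRoot(dir: string): Promise<boolean> {
  const top = await git(dir, "rev-parse", "--show-toplevel");
  return top !== null && relative(resolve(top), resolve(dir)) === "";
}

export async function ensureRepo(dir: string): Promise<void> {
  if (await isRepoRoot(dir)) return;
  await run("git", ["init"], { cwd: dir });
}

export interface IsolationReport {
  nestedIsInsideWorkTree: boolean;
  nestedWasRepoRootBefore: boolean;
  nestedIsRepoRootAfter: boolean;
  tenantHasCommit: boolean;
  parentHasCommit: boolean;
  parentTrackedFiles: string;
}

export async function checkNestedIsolation(): Promise<IsolationReport> {
  // realpathSync: git reports physical paths, and the temp dir may be a symlink.
  const parent = realpathSync(mkdtempSync(join(tmpdir(), "isolation-")));
  try {
    const tenant = join(parent, "tenants", "acme");
    mkdirSync(tenant, { recursive: true });
    await run("git", ["init"], { cwd: parent });

    // Pre-fix observation: nested dir is inside a work tree but is not a root.
    const nestedIsInsideWorkTree = await isGitRepo(tenant);
    const nestedWasRepoRootBefore = await isRepoRoot(tenant);

    await ensureRepo(tenant);
    writeFileSync(join(tenant, "atom.md"), "example atom\n");
    await run("git", ["add", "-A"], { cwd: tenant });
    await run(
      "git",
      ["-c", "user.name=Test", "-c", "user.email=test@example.invalid",
        "-c", "commit.gpgsign=false", "commit", "-m", "add atom"],
      { cwd: tenant },
    );

    return {
      nestedIsInsideWorkTree,
      nestedWasRepoRootBefore,
      nestedIsRepoRootAfter: await isRepoRoot(tenant),
      tenantHasCommit: (await git(tenant, "rev-parse", "--verify", "HEAD")) !== null,
      parentHasCommit: (await git(parent, "rev-parse", "--verify", "HEAD")) !== null,
      parentTrackedFiles: (await git(parent, "ls-files")) ?? "",
    };
  } finally {
    rmSync(parent, { recursive: true, force: true });
  }
}
```

A test built on this would assert that nestedIsInsideWorkTree is true while nestedWasRepoRootBefore is false, that the tenant ends up with a commit, and that parentHasCommit is false with parentTrackedFiles empty. If ensureRepo reverted to guarding on isGitRepo, the commit would land in the parent and the last two assertions would fail.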

Review

No scheduled revisit — this is a closed fix with a regression test. Re-examine only if the runtime's VCS layer changes shape (e.g. supporting submodules, worktrees, or a non-git VCS), where "is this dir its own repo root" may need a richer definition.