Egress containment substrate — @anthropic-ai/sandbox-runtime, proxy-first; gVisor/Firecracker as an escalation tier, not the default

0071-egress-containment-substrate

decision · confidence: asserted · status: active · 2026-06-21 · owner: principal-architect

Reversibility: two-way door

DEC-0071 — Egress containment substrate

Reversibility: two-way door. This picks a mechanism for an enforcement layer DEC-0059 explicitly ratified as swappable. The durable commitment is the ContainmentSubstrate seam (the OS-layer enforcer sits beside the model-layer EgressPolicy, and callers depend on the interface rather than an implementation) and the proxy-first posture; the substrate implementation behind the seam, the proxy vendor, and the kernel-isolation tier are each independently swappable.

Context

DEC-0059 ratified the architecture-over-detection stance, and its §Review build note records that the deny-by-default egress sandbox was built at the model layer (packages/runtime/src/egress-sandbox.ts, wired into the agentSdkTurnRunner transport in live-session.ts, proven by an offline red→green containment test). That note was explicit that the model-layer posture supports but is not the real guarantee: the OS-level enforcement (--network none + an allowlisting proxy, plus kernel isolation against sandbox escape) remained unbuilt, and DEC-0059 floated gVisor/Firecracker while ratifying every mechanism behind it as a two-way door. This decision picks that mechanism. It is the OS-layer counterpart to the already-built model layer: work this ADR authorizes but that has not yet been performed.

DEC-0059 filed two build tasks: "Deny-by-default egress sandbox around the extraction agent — break the lethal trifecta so a hijacked agent structurally cannot exfiltrate (the single load-bearing build finding)" (p0, the model-layer scope in review) and "Per-tenant runtime isolation — make the tenant a process/network/key boundary (not a directory), with a per-tenant vector namespace + server-side tenant binding, so a poisoned/sensitive atom is contained to ONE tenant" (p1). Both named "OS sandbox / --network none / domain-allowlist proxy / gVisor-or-Firecracker" as a deploy-time control whose mechanism was left undecided. This ADR specifies that mechanism for them.

Options considered

Bespoke gVisor/Firecracker build (rejected as the default). Hand-roll the OS sandbox directly on a kernel-isolation runtime (gVisor runsc or a Firecracker microVM) with our own allowlist-proxy plumbing. Rejected as the default substrate because hand-rolling what Anthropic ships is the exact anti-pattern DEC-0004 exists to prevent — Anthropic shipped @anthropic-ai/sandbox-runtime (June 2026) covering deny-by-default network, namespace removal, and OS-primitive fs isolation. Kernel isolation is retained — but as an escalation tier reachable through the same substrate (a container-runtime swap), because it defends a different threat (kernel-exploit sandbox escape), not the day-one egress control.
Cross-platform self-built containment boundary (rejected). Build a containment layer that runs natively everywhere including native Windows, to keep dev and prod on one code path. Rejected: it re-invents the primitive (anti-DEC-0004), and the OS isolation primitives differ per platform (bubblewrap on Linux, sandbox-exec on macOS, no native win32 equivalent) — a uniform self-built boundary would either be weakest-common-denominator or a large bespoke surface. We instead seam over the platform gap (a Noop on win32 that honestly reports unenforced) rather than paper over it.
@anthropic-ai/sandbox-runtime, proxy-first, kernel isolation as an escalation tier (chosen). Adopt the Anthropic primitive as the substrate; make the deny-by-default allowlist proxy the load-bearing egress control; reserve gVisor/Firecracker as a per-tenant escalation, not the default; seam over the win32-dev gap with an honest Noop.

Decision

Adopt @anthropic-ai/sandbox-runtime as the OS-layer containment substrate, proxy-first, with gVisor/Firecracker as an escalation tier — behind one new ContainmentSubstrate seam beside the existing model-layer EgressPolicy.

Substrate = @anthropic-ai/sandbox-runtime (Anthropic-shipped, June 2026): the srt CLI + a programmatic SandboxManager for Node/TS — deny-by-default network via a host-side allowlist proxy over a Unix socket, --network none-equivalent namespace removal, and OS-primitive filesystem isolation (bubblewrap on Linux, sandbox-exec on macOS). Choosing the Claude primitive over a bespoke build is required by DEC-0004.
Proxy-first posture. The deny-by-default allowlist proxy is the load-bearing egress control — it is what a hijacked agent hits when it tries to exfiltrate. --network none + the namespace are what force all traffic through that proxy (no proxy bypass). Kernel isolation (gVisor/Firecracker) is an escalation tier, not the default — it protects against a different threat (kernel-exploit sandbox escape, since OS sandboxes share the host kernel), reachable as a container-runtime swap (--runtime=runsc) for a tenant whose threat model demands it.
TLS caveat. sandbox-runtime's built-in proxy does not TLS-terminate (a domain-fronting bypass risk). The regulated tier swaps in a TLS-terminating proxy (Envoy / mitmproxy / equivalent) over the same socket — a config swap on the same substrate, not a new substrate. The proxy vendor is explicitly NOT chosen here (left to a spike — see Risk 1).
Windows-dev reality. sandbox-runtime supports macOS + Linux, not native Windows (WSL2 counts as Linux). So the substrate runs natively in CI/prod (Linux) and is simulated behind the seam on the win32 dev box via a Noop implementation that MUST report enforced: false and name the gap — no fabricated containment, non-negotiable (the same honesty rule the model-layer Noop already follows). Offline-by-construction + Windows-runnable tests survive because the default substrate in test/CI is Noop.
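
The deny-by-default posture described above comes down to one rule: a destination is refused unless an allowlist entry explicitly permits it. The following is a minimal sketch of that decision in isolation. It is not sandbox-runtime's proxy code or API; the names `EgressDecision`, `normalizeHost`, and `decideEgress`, and the `*.` wildcard convention, are all assumptions made for illustration.

```typescript
// Hypothetical sketch of a deny-by-default allowlist decision.
// None of these names come from @anthropic-ai/sandbox-runtime.

type EgressDecision =
  | { allowed: true; host: string }
  | { allowed: false; host: string; reason: string };

/** Normalize a hostname for comparison: trim, lowercase, drop one trailing dot. */
function normalizeHost(host: string): string {
  const lower = host.trim().toLowerCase();
  return lower.endsWith(".") ? lower.slice(0, -1) : lower;
}

/**
 * A host is allowed only if it exactly matches an allowlist entry, or is a
 * strict subdomain of a "*.example.com" style entry. Everything else,
 * including an empty allowlist, is denied.
 */
function decideEgress(host: string, allowlist: readonly string[]): EgressDecision {
  const target = normalizeHost(host);
  for (const entry of allowlist) {
    const rule = normalizeHost(entry);
    if (rule.startsWith("*.")) {
      const suffix = rule.slice(1); // "*.example.com" -> ".example.com"
      if (target.length > suffix.length && target.endsWith(suffix)) {
        return { allowed: true, host: target };
      }
    } else if (target === rule) {
      return { allowed: true, host: target };
    }
  }
  return { allowed: false, host: target, reason: "not on allowlist (deny by default)" };
}
```

A decision like this only sees the hostname the client presents to the proxy. That is why the TLS caveat matters: without TLS termination, the proxy cannot confirm that the request inside the tunnel is addressed to the same host.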

The seam shape (the architecture this records)

One new seam, ContainmentSubstrate (the OS-layer enforcer), sitting BESIDE the existing model-layer EgressPolicy in packages/runtime/src/egress-sandbox.ts — not on top of it. The two layers compose: the model layer constrains tools/output/permissions; the substrate constrains the process, network, and filesystem at the OS.

Implementation	Where it runs	Reports
`NoopSubstrate`	win32 dev box + offline/CI tests (the default in test/CI)	`enforced: false` + names the gap — no fabricated containment
`SandboxRuntimeSubstrate`	CI/prod (Linux / macOS) — the real substrate	`enforced: true`; proxy-first allowlist + `--network none` + bubblewrap/`sandbox-exec` fs isolation
`ContainerSubstrate` (RESERVED)	the escalation tier — hardened container / `runsc` / Firecracker	reserved; container-runtime swap for a tenant whose threat model needs kernel isolation
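
One way to read the table above is as a single interface with an honest status report. The sketch below is an assumption about shape, not the built seam: only the `enforced` flag and the implementation names come from this ADR, while `ContainmentStatus`, `status`, `wrapCommand`, `gap`, and `describeContainment` are hypothetical names. The union type is chosen so that an unenforced report without a named gap does not type-check.

```typescript
// Hypothetical sketch of the ContainmentSubstrate seam. Only `enforced` and
// the implementation names are from the ADR; the rest is illustrative.

type ContainmentStatus =
  | { enforced: true; substrate: string }
  | { enforced: false; substrate: string; gap: string }; // gap is mandatory when unenforced

interface ContainmentSubstrate {
  /** Report whether the OS layer is actually on. */
  status(): ContainmentStatus;
  /** Return the argv to spawn so that the process runs under this substrate. */
  wrapCommand(argv: readonly string[]): string[];
}

/** The win32-dev and offline-test substrate: wraps nothing and says so. */
class NoopSubstrate implements ContainmentSubstrate {
  private readonly platform: string;

  constructor(platform: string) {
    this.platform = platform;
  }

  status(): ContainmentStatus {
    return {
      enforced: false,
      substrate: "NoopSubstrate",
      gap: `no OS-level containment on ${this.platform}; only the model-layer EgressPolicy applies`,
    };
  }

  wrapCommand(argv: readonly string[]): string[] {
    return [...argv]; // unchanged: there is no sandbox to wrap with
  }
}

/** Render a status line suitable for logs or a health endpoint. */
function describeContainment(substrate: ContainmentSubstrate): string {
  const s = substrate.status();
  return s.enforced
    ? `${s.substrate}: enforced`
    : `${s.substrate}: NOT enforced (${s.gap})`;
}
```

A real `SandboxRuntimeSubstrate` would implement the same interface and return `enforced: true` only after the sandbox has been initialized successfully.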

Plug points (all existing seams — no new vocabulary):

The substrate wraps the LiveTurnRunner process in live-session.ts (agentSdkTurnRunner) — the OS half of the model-layer posture.
Its filesystem allow-list is derived from isolation.ts's confineToTenant — no second confinement vocabulary; confineToTenant graduates from a path-string check to the source of the OS sandbox's fs boundary.
Substrate selection is an Orchestrator concern (orchestrator.ts AgentSdkOrchestrator), read from the tenant manifest (provision.ts, like TenantBudget) — not ambient.
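
The selection and filesystem plug points can be sketched as a pure function of the tenant manifest. Everything here is hypothetical: the manifest fields (`tenantRoot`, `containment`), the `SubstratePlan` shape, and `planSubstrate` are illustrative names, and the real `confineToTenant`, `provision.ts`, and `AgentSdkOrchestrator` signatures are not shown in this ADR. The platform is passed in explicitly so that nothing is read ambiently.

```typescript
// Hypothetical sketch: choose a substrate from the tenant manifest and derive
// the fs boundary from the same tenant root that confinement already uses.

type SubstrateKind = "noop" | "sandbox-runtime" | "container";

interface TenantManifest {
  tenantId: string;
  tenantRoot: string;         // assumed: the root that confineToTenant checks against
  containment: SubstrateKind; // assumed manifest field, declared like TenantBudget
}

interface SubstratePlan {
  kind: SubstrateKind;
  fsAllowWrite: string[]; // single source of truth: the tenant root
  note: string;
}

function planSubstrate(manifest: TenantManifest, platform: string): SubstratePlan {
  const fsAllowWrite = [manifest.tenantRoot];

  if (manifest.containment === "container") {
    // The escalation tier is reserved by this ADR, so refuse rather than pretend.
    throw new Error(`tenant ${manifest.tenantId}: ContainerSubstrate is reserved, not built`);
  }
  if (manifest.containment === "sandbox-runtime" && platform === "win32") {
    // Native Windows cannot enforce; fall back to the Noop and name the gap.
    return {
      kind: "noop",
      fsAllowWrite,
      note: "sandbox-runtime is unsupported on native Windows; running unenforced",
    };
  }
  return {
    kind: manifest.containment,
    fsAllowWrite,
    note: `selected from manifest for tenant ${manifest.tenantId}`,
  };
}
```

Whether a win32 fallback should downgrade to the Noop or fail outright is a design choice this sketch makes for illustration only.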

Rationale

DEC-0004 makes the substrate choice forced, not free. The day Anthropic ships @anthropic-ai/sandbox-runtime, building a bespoke gVisor/Firecracker egress sandbox to do the same job is the precise anti-pattern DEC-0004 exists to prevent. So the default substrate is the primitive; kernel isolation survives as an escalation, not as a parallel hand-built stack.
Proxy-first because the proxy is what the attack hits. The threat DEC-0059 designs against is a hijacked agent attempting exfiltration. The control that meets that attack is the deny-by-default allowlist proxy; --network none + the namespace exist to make the proxy unbypassable. Kernel isolation meets a different attacker (one exploiting the shared host kernel to escape the sandbox) — real, but second-order, so it is the escalation tier, not the day-one default. Ranking the proxy first keeps the build focused on the load-bearing control.
Seam beside, not on top, keeps the two layers honest. The model layer (EgressPolicy) and the OS layer (ContainmentSubstrate) constrain different things (tools/permissions/output vs process/network/fs). Composing them as peers behind one seam means neither silently substitutes for the other — and the enforced: flag makes "is the OS layer actually on?" a first-class, queryable fact rather than an assumption.
An honest Noop is the only acceptable win32 story. The substrate cannot enforce on native Windows. Fabricating containment there (a Noop that claimed enforced: true) would be exactly the "model-trusted not OS-guaranteed" lie DEC-0059 forbids. So the win32 Noop reports enforced: false and names the gap — the same discipline as the model-layer Noop and the enableWeakerNestedSandbox honesty rule (Risk 5).
asserted, not verified — this authorizes work, it does not report it. The OS-layer enforcement is not yet built; only the model layer (egress-sandbox.ts) is. This ADR is the direction (accepted/active) for the OS layer; the substrate, the real proxy, and the kernel tier are unbuilt. Marking anything verified here would be fabricated status. The promotion gate is named in Review.

Consequences

It sharpens DEC-0059's two open OS-layer build tasks. Both are re-pointed at this ADR as the thing that specifies their mechanism (status unchanged): "Deny-by-default egress sandbox around the extraction agent — break the lethal trifecta so a hijacked agent structurally cannot exfiltrate (the single load-bearing build finding)" (the OS-level --network none + TLS-terminating allowlist proxy + kernel-isolation residual it carries is now mechanized here) and "Per-tenant runtime isolation — make the tenant a process/network/key boundary (not a directory), with a per-tenant vector namespace + server-side tenant binding, so a poisoned/sensitive atom is contained to ONE tenant" (its process/network boundary is realized by the substrate wrapping the per-tenant LiveTurnRunner).
It picks a mechanism without re-deciding DEC-0059's stance. DEC-0059's untrusted-by-default principle stands; this fills the one slot it deferred. The model-layer build (egress-sandbox.ts) is untouched — the substrate composes beside it.
It surfaces the prod-topology question (unsettled; see Risk 2). A long-lived bubblewrap-sandboxed process + a host-side proxy socket cannot run on Vercel serverless; the contained runtime needs a Linux container host, which is topology-adjacent to the per-tenant fleet of "Agentic-agency runtime topology — compile personas from the OKF graph and activate the reserved BoardWorker over the deterministic spine". This ADR names the dependency; it does not settle the host.
It must not silently relax Inv 4. Running a per-tenant process under the substrate must NOT activate the deferred intra-tenant worktree parallelism door — DEC-0062's serialization invariant (Inv 4) holds; the substrate isolates a tenant process, it does not authorize concurrent per-task processes (Risk 3).
The vector index stays out of scope. DEC-0059 (a) names per-tenant vector-namespace isolation, but this ADR contains the process, not the index. Index isolation remains in the scope of the "Per-tenant runtime isolation" task (Risk 4).

Open risks (explicitly NOT decided here)

1. TLS-terminating proxy vendor undecided. The substrate takes a vendor-agnostic proxy endpoint; do not hard-wire Envoy, mitmproxy, or any other vendor. Left to a spike.
2. Prod host for the contained runtime unsettled. Vercel serverless cannot host a long-lived bubblewrap-sandboxed process + host-side proxy socket; this needs a Linux container host (topology-adjacent to "Agentic-agency runtime topology — compile personas from the OKF graph and activate the reserved BoardWorker over the deterministic spine").
3. Per-tenant process must NOT activate intra-tenant parallelism. DEC-0062's Inv 4 serialization must hold: the substrate isolates one tenant process, not concurrent per-task processes.
4. Vector-namespace isolation is named by DEC-0059 but NOT mechanized here. The substrate contains the process, not the index.
5. enableWeakerNestedSandbox footgun. If the runtime runs inside a container in prod, the substrate may have to weaken; it must report when running weakened, under the same honesty rule as Noop (never claim enforced: true while degraded).
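
The honesty rule in the last risk can be stated as a small function: any degradation forces `enforced: false`. This is a sketch under assumptions. `RuntimeFacts`, `EnforcementReport`, and `reportEnforcement` are hypothetical names, and how the real substrate would detect that it is running weakened is not specified by this ADR.

```typescript
// Hypothetical sketch: never claim enforcement while degraded.

interface RuntimeFacts {
  platform: string;               // e.g. "linux", "darwin", "win32"
  weakenedNestedSandbox: boolean; // true if the sandbox had to be weakened, e.g. nested in a container
}

type EnforcementReport =
  | { enforced: true }
  | { enforced: false; gap: string };

function reportEnforcement(facts: RuntimeFacts): EnforcementReport {
  // Unsupported platform: there is no OS sandbox to enforce with.
  if (facts.platform !== "linux" && facts.platform !== "darwin") {
    return { enforced: false, gap: `platform ${facts.platform} has no supported OS sandbox` };
  }
  // Supported platform but weakened: still not a full guarantee, so say so.
  if (facts.weakenedNestedSandbox) {
    return { enforced: false, gap: "running with a weakened nested sandbox" };
  }
  return { enforced: true };
}
```

Treating a weakened sandbox as unenforced, rather than adding a third "partially enforced" state, is one possible reading of the rule; a finer-grained report would also satisfy it as long as `enforced: true` is never returned while degraded.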

Review

status: active (the direction is accepted) · confidence: asserted (the OS-layer enforcement is not yet built — this ADR authorizes it; only the model layer exists).

Promotion gate (asserted → verified): the SandboxRuntimeSubstrate is built behind the ContainmentSubstrate seam and demonstrated on Linux to (1) force all egress through the deny-by-default allowlist proxy with --network none (a simulated injection's exfil attempt blocked at the OS, not model-trusted), (2) derive its fs allow-list from confineToTenant (a read/write outside the tenant boundary denied by the OS sandbox, not by a path-string check), and (3) report enforced: true on Linux/macOS and enforced: false on the win32 Noop — with the offline/CI suite green on the Noop default. The escalation tier (ContainerSubstrate / --runtime=runsc), the TLS-terminating proxy vendor, and the prod container host remain reserved past that gate (Risks 1–2). Until then this is the authorized mechanism, not a verified control.