Egress containment substrate — @anthropic-ai/sandbox-runtime, proxy-first; gVisor/Firecracker as an escalation tier, not the default

0071-egress-containment-substrate

decision · read as: Explain · confidence: asserted · status: active · 2026-06-21 · owner: principal-architect · reversibility: two-way door

DEC-0071 — Egress containment substrate

Reversibility: two-way door — this picks a mechanism for an enforcement layer DEC-0059 explicitly ratified as swappable. The durable commitment is the ContainmentSubstrate seam (the OS-layer enforcer sits beside the model-layer EgressPolicy, depend on the interface) and the proxy-first posture; the substrate implementation behind the seam, the proxy vendor, and the kernel-isolation tier are each independently swappable.

Context

DEC-0059 ratified the architecture-over-detection stance and, in its §Review build note, the deny-by-default egress sandbox was built at the model layer (packages/runtime/src/egress-sandbox.ts, wired into the agentSdkTurnRunner transport in live-session.ts, proven by an offline red→green containment test). That note was explicit that the model-layer posture backs but is not the real guarantee: the OS-level enforcement (--network none + an allowlisting proxy, plus kernel isolation against sandbox escape) remained unbuilt, and DEC-0059 floated gVisor/Firecracker while ratifying every mechanism behind it as a two-way door. This decision picks that mechanism. It is the OS-layer counterpart to the already-built model layer — the work this ADR authorizes, not yet performed.

DEC-0059 filed two build tasks, and both named "OS sandbox / --network none / domain-allowlist proxy / gVisor-or-Firecracker" as a deploy-time control whose mechanism was left undecided. This ADR specifies that mechanism for them.

  • Deny-by-default egress sandbox around the extraction agent: break the lethal trifecta so a hijacked agent structurally cannot exfiltrate (the single load-bearing build finding) (p0, the model-layer scope in review).
  • Per-tenant runtime isolation: make the tenant a process/network/key boundary (not a directory), with a per-tenant vector namespace + server-side tenant binding, so a poisoned/sensitive atom is contained to ONE tenant (p1).

Options considered

  1. Bespoke gVisor/Firecracker build (rejected as the default). Hand-roll the OS sandbox directly on a kernel-isolation runtime (gVisor runsc or a Firecracker microVM) with our own allowlist-proxy plumbing. Rejected as the default substrate because hand-rolling what Anthropic ships is the exact anti-pattern DEC-0004 exists to prevent — Anthropic shipped @anthropic-ai/sandbox-runtime (June 2026) covering deny-by-default network, namespace removal, and OS-primitive fs isolation. Kernel isolation is retained — but as an escalation tier reachable through the same substrate (a container-runtime swap), because it defends a different threat (kernel-exploit sandbox escape), not the day-one egress control.
  2. Cross-platform self-built containment boundary (rejected). Build a containment layer that runs natively everywhere including native Windows, to keep dev and prod on one code path. Rejected: it re-invents the primitive (anti-DEC-0004), and the OS isolation primitives differ per platform (bubblewrap on Linux, sandbox-exec on macOS, no native win32 equivalent) — a uniform self-built boundary would either be weakest-common-denominator or a large bespoke surface. We instead seam over the platform gap (a Noop on win32 that honestly reports unenforced) rather than paper over it.
  3. @anthropic-ai/sandbox-runtime, proxy-first, kernel isolation as an escalation tier (chosen). Adopt the Anthropic primitive as the substrate; make the deny-by-default allowlist proxy the load-bearing egress control; reserve gVisor/Firecracker as a per-tenant escalation, not the default; seam over the win32-dev gap with an honest Noop.

Decision

Adopt @anthropic-ai/sandbox-runtime as the OS-layer containment substrate, proxy-first, with gVisor/Firecracker as an escalation tier — behind one new ContainmentSubstrate seam beside the existing model-layer EgressPolicy.

  • Substrate = @anthropic-ai/sandbox-runtime (Anthropic-shipped, June 2026): the srt CLI + a programmatic SandboxManager for Node/TS — deny-by-default network via a host-side allowlist proxy over a Unix socket, --network none-equivalent namespace removal, and OS-primitive filesystem isolation (bubblewrap on Linux, sandbox-exec on macOS). Choosing the Claude primitive over a bespoke build is required by DEC-0004.
  • Proxy-first posture. The deny-by-default allowlist proxy is the load-bearing egress control — it is what a hijacked agent hits when it tries to exfiltrate. --network none + the namespace are what force all traffic through that proxy (no proxy bypass). Kernel isolation (gVisor/Firecracker) is an escalation tier, not the default — it protects against a different threat (kernel-exploit sandbox escape, since OS sandboxes share the host kernel), reachable as a container-runtime swap (--runtime=runsc) for a tenant whose threat model demands it.
  • TLS caveat. sandbox-runtime's built-in proxy does not TLS-terminate (a domain-fronting bypass risk). The regulated tier swaps in a TLS-terminating proxy (Envoy / mitmproxy / equivalent) over the same socket — a config swap on the same substrate, not a new substrate. The proxy vendor is explicitly NOT chosen here (left to a spike — see Risk 1).
  • Windows-dev reality. sandbox-runtime supports macOS + Linux, not native Windows (WSL2 counts as Linux). So the substrate runs natively in CI/prod (Linux) and is simulated behind the seam on the win32 dev box via a Noop implementation that MUST report enforced: false and name the gap — no fabricated containment, non-negotiable (the same honesty rule the model-layer Noop already follows). Offline-by-construction + Windows-runnable tests survive because the default substrate in test/CI is Noop.
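The deny-by-default rule that the proxy-first posture relies on can be illustrated with a minimal sketch. The helper name isEgressAllowed and the "*." wildcard convention are assumptions made for illustration; they are not taken from @anthropic-ai/sandbox-runtime, whose actual matching rules should be checked against its documentation. Note that a hostname-level check like this is exactly what a non-TLS-terminating proxy sees, which is why the TLS caveat above matters.

```typescript
// Hypothetical helper: illustrates deny-by-default hostname allowlisting.
// It is NOT the sandbox-runtime API; names and wildcard syntax are assumptions.
function isEgressAllowed(host: string, allowlist: readonly string[]): boolean {
  // Normalise case and strip a trailing root dot ("example.com." -> "example.com").
  const h = host.trim().toLowerCase().replace(/\.$/, "");
  if (h === "") return false; // nothing to match: deny

  return allowlist.some((entry) => {
    const e = entry.toLowerCase();
    if (e.startsWith("*.")) {
      // "*.example.com" matches subdomains only, never the apex and never
      // a look-alike such as "evilexample.com".
      const suffix = e.slice(1); // ".example.com"
      return h.endsWith(suffix) && h.length > suffix.length;
    }
    return h === e; // exact match otherwise
  });
}

// An empty allowlist denies everything: that is the default.
console.log(isEgressAllowed("attacker.test", []));
console.log(isEgressAllowed("api.example.com", ["*.example.com"]));
```

The design point is that the function has no "allow" fallthrough: a host is reachable only if some entry positively matches it.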

The seam shape (the architecture this records)

One new seam, ContainmentSubstrate (the OS-layer enforcer), sits BESIDE the existing model-layer EgressPolicy in packages/runtime/src/egress-sandbox.ts, not on top of it. The two layers compose: the model layer constrains tools/output/permissions; the substrate constrains the process, network, and filesystem at the OS.

| Implementation | Where it runs | Reports |
|---|---|---|
| NoopSubstrate | win32 dev box + offline/CI tests (the default in test/CI) | enforced: false + names the gap; no fabricated containment |
| SandboxRuntimeSubstrate | CI/prod (Linux / macOS); the real substrate | enforced: true; proxy-first allowlist + --network none + bubblewrap/sandbox-exec fs isolation |
| ContainerSubstrate (RESERVED) | the escalation tier: hardened container / runsc / Firecracker | reserved; container-runtime swap for a tenant whose threat model needs kernel isolation |
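As a rough sketch of what the seam could look like in TypeScript: the ADR fixes the seam name, the implementation names, and the enforced flag, but the method name status(), the gap and mechanism fields, and the helper describeSubstrate are assumptions introduced here for illustration only.

```typescript
// Hypothetical seam shape. Only ContainmentSubstrate, NoopSubstrate and the
// `enforced` flag come from the ADR; every other name is an assumption.
type SubstrateStatus =
  | { enforced: true; mechanism: string }
  | { enforced: false; gap: string };

interface ContainmentSubstrate {
  readonly name: string;
  status(): SubstrateStatus;
}

class NoopSubstrate implements ContainmentSubstrate {
  readonly name = "noop";
  private readonly platform: string;

  constructor(platform: string) {
    this.platform = platform;
  }

  // The Noop never claims containment; it names the gap instead.
  status(): SubstrateStatus {
    return {
      enforced: false,
      gap: `no OS-layer containment on ${this.platform}; model-layer EgressPolicy only`,
    };
  }
}

// Callers query the flag rather than assume the OS layer is on.
function describeSubstrate(substrate: ContainmentSubstrate): string {
  const s = substrate.status();
  return s.enforced
    ? `${substrate.name}: enforced via ${s.mechanism}`
    : `${substrate.name}: NOT enforced (${s.gap})`;
}

console.log(describeSubstrate(new NoopSubstrate("win32")));
```

Modelling the status as a discriminated union means an unenforced substrate cannot be constructed without a stated gap, which is one way to make the honesty rule a type-level requirement rather than a convention.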

Plug points (all existing seams — no new vocabulary):

  1. The substrate wraps the LiveTurnRunner process in live-session.ts (agentSdkTurnRunner) — the OS half of the model-layer posture.
  2. Its filesystem allow-list is derived from isolation.ts's confineToTenant — no second confinement vocabulary; confineToTenant graduates from a path-string check to the source of the OS sandbox's fs boundary.
  3. Substrate selection is an Orchestrator concern (orchestrator.ts AgentSdkOrchestrator), read from the tenant manifest (provision.ts, like TenantBudget) — not ambient.
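The three plug points above can be sketched together as a single planning step. Everything named here is hypothetical: the manifest field containment, the function planSubstrate, and the stand-in tenantRoot (which substitutes for isolation.ts's confineToTenant, whose real signature this ADR does not show). The sketch only illustrates the intent: selection is read from the manifest rather than from ambient state, and the filesystem allow-list comes from the one tenant-confinement source.

```typescript
// All names below are illustrative assumptions, not the repository's API.
type SubstrateKind = "noop" | "sandbox-runtime" | "container";

interface TenantManifest {
  tenantId: string;
  containment: SubstrateKind; // hypothetical manifest field
}

interface SubstratePlan {
  kind: SubstrateKind;
  enforced: boolean;
  fsAllow: string[];
  note: string;
}

// Stand-in for confineToTenant: one source of truth for the tenant boundary.
function tenantRoot(tenantId: string): string {
  return `/var/tenants/${tenantId}`;
}

function planSubstrate(manifest: TenantManifest, platform: string): SubstratePlan {
  const fsAllow = [tenantRoot(manifest.tenantId)];

  if (manifest.containment === "container") {
    // The escalation tier is reserved, so refuse rather than pretend.
    throw new Error("ContainerSubstrate is reserved and not yet available");
  }

  const supported = platform === "linux" || platform === "darwin";
  if (manifest.containment === "sandbox-runtime" && supported) {
    return { kind: "sandbox-runtime", enforced: true, fsAllow, note: "OS-layer containment planned" };
  }

  // Either Noop was requested, or the platform cannot enforce: report honestly.
  const note =
    manifest.containment === "noop"
      ? "Noop requested by manifest"
      : `sandbox-runtime unsupported on ${platform}; falling back to Noop`;
  return { kind: "noop", enforced: false, fsAllow, note };
}

console.log(planSubstrate({ tenantId: "acme", containment: "sandbox-runtime" }, "win32"));
```

Passing the platform in as an argument, instead of reading it inside the function, keeps the selection logic testable offline on any host.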

Rationale

  • DEC-0004 makes the substrate choice forced, not free. The day Anthropic ships @anthropic-ai/sandbox-runtime, building a bespoke gVisor/Firecracker egress sandbox to do the same job is the precise anti-pattern DEC-0004 exists to prevent. So the default substrate is the primitive; kernel isolation survives as an escalation, not as a parallel hand-built stack.
  • Proxy-first because the proxy is what the attack hits. The threat DEC-0059 designs against is a hijacked agent attempting exfiltration. The control that meets that attack is the deny-by-default allowlist proxy; --network none + the namespace exist to make the proxy unbypassable. Kernel isolation meets a different attacker (one exploiting the shared host kernel to escape the sandbox) — real, but second-order, so it is the escalation tier, not the day-one default. Ranking the proxy first keeps the build focused on the load-bearing control.
  • Seam beside, not on top, keeps the two layers honest. The model layer (EgressPolicy) and the OS layer (ContainmentSubstrate) constrain different things (tools/permissions/output vs process/network/fs). Composing them as peers behind one seam means neither silently substitutes for the other — and the enforced: flag makes "is the OS layer actually on?" a first-class, queryable fact rather than an assumption.
  • An honest Noop is the only acceptable win32 story. The substrate cannot enforce on native Windows. Fabricating containment there (a Noop that claimed enforced: true) would be exactly the "model-trusted not OS-guaranteed" lie DEC-0059 forbids. So the win32 Noop reports enforced: false and names the gap — the same discipline as the model-layer Noop and the enableWeakerNestedSandbox honesty rule (Risk 5).
  • asserted, not verified — this authorizes work, it does not report it. The OS-layer enforcement is not yet built; only the model layer (egress-sandbox.ts) is. This ADR is the direction (accepted/active) for the OS layer; the substrate, the real proxy, and the kernel tier are unbuilt. Marking anything verified here would be fabricated status. The promotion gate is named in Review.

Consequences

Open risks (explicitly NOT decided here)

  1. TLS-terminating proxy vendor unmade. The substrate takes a vendor-agnostic proxy endpoint; do not hard-wire Envoy/mitmproxy/etc. — left to a spike.
  2. Prod host for the contained runtime unsettled. Vercel serverless cannot host a long-lived bubblewrap-sandboxed process + host-side proxy socket; this needs a Linux container host (topology-adjacent to Agentic-agency runtime topology — compile personas from the OKF graph and activate the reserved BoardWorker over the deterministic spine).
  3. Per-tenant process must NOT activate intra-tenant parallelism. DEC-0062 Inv-4 serialization must hold — the substrate isolates one tenant process, not concurrent per-task processes.
  4. Vector-namespace isolation is named by DEC-0059 but NOT mechanized here. The substrate contains the process, not the index.
  5. enableWeakerNestedSandbox footgun. If the runtime runs inside a container in prod, the substrate may have to weaken; it must report when running weakened — the same honesty rule as Noop (never claim enforced: true while degraded).
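Risk 5's reporting rule (never claim enforced: true while degraded) can be made concrete with a small sketch. The field names in ObservedControls and the function reportEnforcement are assumptions for illustration; how the substrate would actually detect each condition is not decided by this ADR.

```typescript
// Hypothetical shapes: the ADR fixes the rule, not these names.
interface ObservedControls {
  proxyAllowlistActive: boolean;
  networkNamespaceRemoved: boolean;
  fsIsolationActive: boolean;
  weakenedNestedSandbox: boolean; // e.g. running nested inside a container
}

interface EnforcementReport {
  enforced: boolean;
  degradedBy: string[];
}

// enforced is true only when every control is on AND nothing is weakened.
function reportEnforcement(o: ObservedControls): EnforcementReport {
  const degradedBy: string[] = [];
  if (!o.proxyAllowlistActive) degradedBy.push("allowlist proxy not active");
  if (!o.networkNamespaceRemoved) degradedBy.push("network namespace not removed");
  if (!o.fsIsolationActive) degradedBy.push("filesystem isolation not active");
  if (o.weakenedNestedSandbox) degradedBy.push("running with weakened nested sandbox");
  return { enforced: degradedBy.length === 0, degradedBy };
}

console.log(
  reportEnforcement({
    proxyAllowlistActive: true,
    networkNamespaceRemoved: true,
    fsIsolationActive: true,
    weakenedNestedSandbox: true,
  }),
);
```

Because the flag is derived from the list of degradations rather than set independently, the two cannot disagree.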

Review

status: active (the direction is accepted) · confidence: asserted (the OS-layer enforcement is not yet built — this ADR authorizes it; only the model layer exists).

Promotion gate (asserted → verified): the SandboxRuntimeSubstrate is built behind the ContainmentSubstrate seam and demonstrated on Linux to (1) force all egress through the deny-by-default allowlist proxy with --network none (a simulated injection's exfil attempt blocked at the OS, not model-trusted), (2) derive its fs allow-list from confineToTenant (a read/write outside the tenant boundary denied by the OS sandbox, not by a path-string check), and (3) report enforced: true on Linux/macOS and enforced: false on the win32 Noop, with the offline/CI suite green on the Noop default. The escalation tier (ContainerSubstrate / --runtime=runsc), the TLS-terminating proxy vendor, and the prod container host remain reserved past that gate (Risks 1–2). Until then this is the authorized mechanism, not a verified control.
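The gate reads naturally as a conjunction of demonstrated facts. As a minimal sketch, with the GateEvidence field names and the function evaluateGate being assumptions made for illustration (the ADR does not prescribe how the evidence is recorded):

```typescript
// Hypothetical record of gate evidence; field names are assumptions.
type Confidence = "asserted" | "verified";

interface GateEvidence {
  exfilBlockedAtOs: boolean;         // gate item (1)
  outOfTenantFsDeniedByOs: boolean;  // gate item (2)
  enforcedTrueOnLinuxMacos: boolean; // gate item (3), real substrate
  enforcedFalseOnWin32Noop: boolean; // gate item (3), Noop
  offlineCiGreenOnNoop: boolean;     // offline/CI suite on the Noop default
}

interface GateResult {
  confidence: Confidence;
  unmet: string[];
}

// Confidence is promoted only when every item has been demonstrated.
function evaluateGate(e: GateEvidence): GateResult {
  const unmet: string[] = [];
  if (!e.exfilBlockedAtOs) unmet.push("exfil attempt blocked at the OS");
  if (!e.outOfTenantFsDeniedByOs) unmet.push("out-of-tenant fs access denied by the OS sandbox");
  if (!e.enforcedTrueOnLinuxMacos) unmet.push("enforced: true on Linux/macOS");
  if (!e.enforcedFalseOnWin32Noop) unmet.push("enforced: false on the win32 Noop");
  if (!e.offlineCiGreenOnNoop) unmet.push("offline/CI suite green on the Noop default");
  return { confidence: unmet.length === 0 ? "verified" : "asserted", unmet };
}

console.log(
  evaluateGate({
    exfilBlockedAtOs: false,
    outOfTenantFsDeniedByOs: false,
    enforcedTrueOnLinuxMacos: false,
    enforcedFalseOnWin32Noop: true,
    offlineCiGreenOnNoop: true,
  }),
);
```

Returning the unmet items alongside the confidence level keeps the status from being promoted by omission.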