Egress containment substrate — @anthropic-ai/sandbox-runtime, proxy-first; gVisor/Firecracker as an escalation tier, not the default
0071-egress-containment-substrate
- Reversibility
- two-way door
DEC-0071 — Egress containment substrate
Reversibility: two-way door. This picks a mechanism for an enforcement layer that DEC-0059 explicitly ratified as swappable. The durable commitments are the ContainmentSubstrate seam (the OS-layer enforcer sits beside the model-layer EgressPolicy, and callers depend on the interface, not the implementation) and the proxy-first posture; the substrate implementation behind the seam, the proxy vendor, and the kernel-isolation tier are each independently swappable.
Context
DEC-0059 ratified the architecture-over-detection stance and, in its §Review build note, the deny-by-default egress sandbox was built at the model layer (packages/runtime/src/egress-sandbox.ts, wired into the agentSdkTurnRunner transport in live-session.ts, proven by an offline red→green containment test). That note was explicit that the model-layer posture backs but is not the real guarantee: the OS-level enforcement (--network none + an allowlisting proxy, plus kernel isolation against sandbox escape) remained unbuilt, and DEC-0059 floated gVisor/Firecracker while ratifying every mechanism behind it as a two-way door. This decision picks that mechanism. It is the OS-layer counterpart to the already-built model layer — the work this ADR authorizes, not yet performed.
DEC-0059 filed two build tasks: "Deny-by-default egress sandbox around the extraction agent — break the lethal trifecta so a hijacked agent structurally cannot exfiltrate (the single load-bearing build finding)" (p0, the model-layer scope in review) and "Per-tenant runtime isolation — make the tenant a process/network/key boundary (not a directory), with a per-tenant vector namespace + server-side tenant binding, so a poisoned/sensitive atom is contained to ONE tenant" (p1). Both named "OS sandbox / --network none / domain-allowlist proxy / gVisor-or-Firecracker" as a deploy-time control whose mechanism was left undecided. This ADR specifies that mechanism for them.
Options considered
- Bespoke gVisor/Firecracker build (rejected as the default). Hand-roll the OS sandbox directly on a kernel-isolation runtime (gVisor runsc or a Firecracker microVM) with our own allowlist-proxy plumbing. Rejected as the default substrate because hand-rolling what Anthropic ships is the exact anti-pattern DEC-0004 exists to prevent: Anthropic shipped @anthropic-ai/sandbox-runtime (June 2026), covering deny-by-default network, namespace removal, and OS-primitive fs isolation. Kernel isolation is retained, but as an escalation tier reachable through the same substrate (a container-runtime swap), because it defends against a different threat (kernel-exploit sandbox escape), not the day-one egress control.
- Cross-platform self-built containment boundary (rejected). Build a containment layer that runs natively everywhere including native Windows, to keep dev and prod on one code path. Rejected: it re-invents the primitive (anti-DEC-0004), and the OS isolation primitives differ per platform (bubblewrap on Linux, sandbox-exec on macOS, no native win32 equivalent), so a uniform self-built boundary would be either weakest-common-denominator or a large bespoke surface. We instead seam over the platform gap (a Noop on win32 that honestly reports unenforced) rather than paper over it.
- @anthropic-ai/sandbox-runtime, proxy-first, kernel isolation as an escalation tier (chosen). Adopt the Anthropic primitive as the substrate; make the deny-by-default allowlist proxy the load-bearing egress control; reserve gVisor/Firecracker as a per-tenant escalation, not the default; seam over the win32-dev gap with an honest Noop.
Decision
Adopt @anthropic-ai/sandbox-runtime as the OS-layer containment substrate, proxy-first, with gVisor/Firecracker as an escalation tier — behind one new ContainmentSubstrate seam beside the existing model-layer EgressPolicy.
- Substrate = @anthropic-ai/sandbox-runtime (Anthropic-shipped, June 2026): the srt CLI + a programmatic SandboxManager for Node/TS, providing deny-by-default network via a host-side allowlist proxy over a Unix socket, --network none-equivalent namespace removal, and OS-primitive filesystem isolation (bubblewrap on Linux, sandbox-exec on macOS). Choosing the Claude primitive over a bespoke build is required by DEC-0004.
- Proxy-first posture. The deny-by-default allowlist proxy is the load-bearing egress control: it is what a hijacked agent hits when it tries to exfiltrate. --network none + the namespace are what force all traffic through that proxy (no proxy bypass). Kernel isolation (gVisor/Firecracker) is an escalation tier, not the default. It protects against a different threat (kernel-exploit sandbox escape, since OS sandboxes share the host kernel) and is reachable as a container-runtime swap (--runtime=runsc) for a tenant whose threat model demands it.
- TLS caveat. sandbox-runtime's built-in proxy does not TLS-terminate (a domain-fronting bypass risk). The regulated tier swaps in a TLS-terminating proxy (Envoy / mitmproxy / equivalent) over the same socket: a config swap on the same substrate, not a new substrate. The proxy vendor is explicitly NOT chosen here (left to a spike; see Risk 1).
- Windows-dev reality. sandbox-runtime supports macOS + Linux, not native Windows (WSL2 counts as Linux). So the substrate runs natively in CI/prod (Linux) and is simulated behind the seam on the win32 dev box via a Noop implementation that MUST report enforced: false and name the gap. No fabricated containment; this is non-negotiable (the same honesty rule the model-layer Noop already follows). Offline-by-construction + Windows-runnable tests survive because the default substrate in test/CI is Noop.
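The deny-by-default rule behind the proxy-first posture can be illustrated with a small host check. This is a minimal sketch under stated assumptions, not the sandbox-runtime proxy: the names decideEgress and EgressDecision, the "*.domain" wildcard convention, and the example hosts are all hypothetical. As the TLS caveat notes, a hostname-level check like this does not by itself defend against domain fronting.

```typescript
// Hypothetical sketch: names and matching rules are illustrative,
// not the sandbox-runtime API.
type EgressDecision = { allowed: boolean; reason: string };

// Normalize a hostname for comparison: lowercase, strip one trailing dot.
function normalizeHost(host: string): string {
  const lower = host.trim().toLowerCase();
  return lower.endsWith(".") ? lower.slice(0, -1) : lower;
}

// Deny by default: a host is allowed only if it equals an allowlist entry,
// or is a strict subdomain of a "*.example.com"-style entry.
function decideEgress(host: string, allowlist: readonly string[]): EgressDecision {
  const target = normalizeHost(host);
  if (target.length === 0) {
    return { allowed: false, reason: "empty host" };
  }
  for (const entry of allowlist) {
    const rule = normalizeHost(entry);
    if (rule.startsWith("*.")) {
      const suffix = rule.slice(1); // "*.example.com" -> ".example.com"
      if (target.endsWith(suffix) && target.length > suffix.length) {
        return { allowed: true, reason: `matched ${rule}` };
      }
    } else if (target === rule) {
      return { allowed: true, reason: `matched ${rule}` };
    }
  }
  // Nothing matched: the default is denial, never allowance.
  return { allowed: false, reason: "not on allowlist" };
}
```

An empty allowlist denies everything, which is the property the posture relies on: allowance is always an explicit entry, never a fallthrough.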
The seam shape (the architecture this records)
One new seam, ContainmentSubstrate (the OS-layer enforcer), sitting BESIDE the existing model-layer EgressPolicy in packages/runtime/src/egress-sandbox.ts — not on top of it. The two layers compose: the model layer constrains tools/output/permissions; the substrate constrains the process, network, and filesystem at the OS.
| Implementation | Where it runs | Reports |
|---|---|---|
| NoopSubstrate | win32 dev box + offline/CI tests (the default in test/CI) | enforced: false + names the gap; no fabricated containment |
| SandboxRuntimeSubstrate | CI/prod (Linux / macOS), the real substrate | enforced: true; proxy-first allowlist + --network none + bubblewrap/sandbox-exec fs isolation |
| ContainerSubstrate (RESERVED) | the escalation tier: hardened container / runsc / Firecracker | reserved; container-runtime swap for a tenant whose threat model needs kernel isolation |
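One possible shape for the seam, sketched in TypeScript. The ADR names ContainmentSubstrate, NoopSubstrate, and the enforced flag; everything else here (SubstrateStatus, the gap field, the status and run methods) is an assumption made for illustration and is not taken from the codebase.

```typescript
// Hypothetical shapes: the ADR names the seam and the enforced flag,
// but not these exact fields or method names.
interface SubstrateStatus {
  readonly substrate: string;
  readonly enforced: boolean;
  // Human-readable description of what is NOT contained.
  // Expected to be present whenever enforced is false.
  readonly gap?: string;
}

interface ContainmentSubstrate {
  // Queryable fact: is the OS layer actually on?
  status(): SubstrateStatus;
  // Runs the wrapped work under whatever containment the substrate provides.
  run<T>(work: () => Promise<T>): Promise<T>;
}

// The win32 / test default: does nothing and says so.
class NoopSubstrate implements ContainmentSubstrate {
  status(): SubstrateStatus {
    return {
      substrate: "noop",
      enforced: false,
      gap: "no OS-level containment: network and filesystem are unrestricted",
    };
  }

  // Runs the work unchanged; nothing is contained.
  run<T>(work: () => Promise<T>): Promise<T> {
    return work();
  }
}
```

Keeping status() on the interface means a caller can refuse to start a regulated tenant on a substrate that reports enforced: false, rather than assuming containment.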
Plug points (all existing seams — no new vocabulary):
- The substrate wraps the LiveTurnRunner process in live-session.ts (agentSdkTurnRunner): the OS half of the model-layer posture.
- Its filesystem allow-list is derived from isolation.ts's confineToTenant, so there is no second confinement vocabulary; confineToTenant graduates from a path-string check to the source of the OS sandbox's fs boundary.
- Substrate selection is an Orchestrator concern (orchestrator.ts AgentSdkOrchestrator), read from the tenant manifest (provision.ts, like TenantBudget), not ambient.
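Manifest-driven selection might look like the following. This is a sketch under assumptions: the containment manifest field, the SubstrateKind values, and the selectSubstrate function are hypothetical, and the platform is passed in explicitly so the example stays self-contained. The fallback to Noop on win32 is one reading of the Windows-dev rule, not a confirmed behavior.

```typescript
// Hypothetical names throughout; the ADR only states that selection is read
// from the tenant manifest and is never ambient.
type SubstrateKind = "noop" | "sandbox-runtime" | "container";

interface TenantManifest {
  readonly tenantId: string;
  readonly containment?: SubstrateKind; // absent means the Noop test default
}

interface Selection {
  readonly kind: SubstrateKind;
  readonly note: string;
}

function selectSubstrate(manifest: TenantManifest, platform: string): Selection {
  const requested = manifest.containment ?? "noop";
  if (requested === "noop") {
    return { kind: "noop", note: "Noop default: no OS containment" };
  }
  if (platform === "win32") {
    // No native Windows support: fall back to Noop and say why.
    return { kind: "noop", note: `${requested} requested but unavailable on win32` };
  }
  if (requested === "container") {
    // The escalation tier is reserved, so fail loudly instead of downgrading.
    throw new Error("ContainerSubstrate is reserved and not built");
  }
  return { kind: "sandbox-runtime", note: "proxy-first allowlist substrate" };
}
```

Failing loudly for the reserved tier avoids silently running a tenant that asked for kernel isolation on a weaker substrate.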
Rationale
- DEC-0004 makes the substrate choice forced, not free. The day Anthropic ships @anthropic-ai/sandbox-runtime, building a bespoke gVisor/Firecracker egress sandbox to do the same job is the precise anti-pattern DEC-0004 exists to prevent. So the default substrate is the primitive; kernel isolation survives as an escalation, not as a parallel hand-built stack.
- Proxy-first because the proxy is what the attack hits. The threat DEC-0059 designs against is a hijacked agent attempting exfiltration. The control that meets that attack is the deny-by-default allowlist proxy; --network none + the namespace exist to make the proxy unbypassable. Kernel isolation meets a different attacker (one exploiting the shared host kernel to escape the sandbox). That threat is real but second-order, so it is the escalation tier, not the day-one default. Ranking the proxy first keeps the build focused on the load-bearing control.
- Seam beside, not on top, keeps the two layers honest. The model layer (EgressPolicy) and the OS layer (ContainmentSubstrate) constrain different things (tools/permissions/output vs process/network/fs). Composing them as peers behind one seam means neither silently substitutes for the other, and the enforced: flag makes "is the OS layer actually on?" a first-class, queryable fact rather than an assumption.
- An honest Noop is the only acceptable win32 story. The substrate cannot enforce on native Windows. Fabricating containment there (a Noop that claimed enforced: true) would be exactly the "model-trusted not OS-guaranteed" lie DEC-0059 forbids. So the win32 Noop reports enforced: false and names the gap, the same discipline as the model-layer Noop and the enableWeakerNestedSandbox honesty rule (Risk 5).
- asserted, not verified: this authorizes work, it does not report it. The OS-layer enforcement is not yet built; only the model layer (egress-sandbox.ts) is. This ADR is the direction (accepted/active) for the OS layer; the substrate, the real proxy, and the kernel tier are unbuilt. Marking anything verified here would be fabricated status. The promotion gate is named in Review.
Consequences
- It sharpens DEC-0059's two open OS-layer build tasks. Both are re-pointed at this ADR as the thing that specifies their mechanism (status unchanged): "Deny-by-default egress sandbox around the extraction agent — break the lethal trifecta so a hijacked agent structurally cannot exfiltrate (the single load-bearing build finding)" (the OS-level --network none + TLS-terminating allowlist proxy + kernel-isolation residual it carries is now mechanized here) and "Per-tenant runtime isolation — make the tenant a process/network/key boundary (not a directory), with a per-tenant vector namespace + server-side tenant binding, so a poisoned/sensitive atom is contained to ONE tenant" (its process/network boundary is realized by the substrate wrapping the per-tenant LiveTurnRunner).
- It picks a mechanism without re-deciding DEC-0059's stance. DEC-0059's untrusted-by-default principle stands; this fills the one slot it deferred. The model-layer build (egress-sandbox.ts) is untouched; the substrate composes beside it.
- It draws the prod-topology question to the surface (unsettled; see Risk 2). A long-lived bubblewrap-sandboxed process + a host-side proxy socket cannot run on Vercel serverless; the contained runtime needs a Linux container host, which is topology-adjacent to "Agentic-agency runtime topology — compile personas from the OKF graph and activate the reserved BoardWorker over the deterministic spine's per-tenant fleet". This ADR names the dependency; it does not settle the host.
- It must not silently relax Inv 4. Running a per-tenant process under the substrate must NOT activate the deferred intra-tenant worktree parallelism door — DEC-0062's serialization invariant (Inv 4) holds; the substrate isolates a tenant process, it does not authorize concurrent per-task processes (Risk 3).
- The vector index stays out of scope. DEC-0059 (a) names per-tenant vector-namespace isolation, but this ADR contains the process, not the index; index isolation remains in the scope of the "Per-tenant runtime isolation — make the tenant a process/network/key boundary (not a directory), with a per-tenant vector namespace + server-side tenant binding, so a poisoned/sensitive atom is contained to ONE tenant" task (Risk 4).
Open risks (explicitly NOT decided here)
- TLS-terminating proxy vendor unmade. The substrate takes a vendor-agnostic proxy endpoint; do not hard-wire Envoy/mitmproxy/etc. — left to a spike.
- Prod host for the contained runtime unsettled. Vercel serverless cannot host a long-lived bubblewrap-sandboxed process + host-side proxy socket; this needs a Linux container host (topology-adjacent to Agentic-agency runtime topology — compile personas from the OKF graph and activate the reserved BoardWorker over the deterministic spine).
- Per-tenant process must NOT activate intra-tenant parallelism. DEC-0062 Inv-4 serialization must hold — the substrate isolates one tenant process, not concurrent per-task processes.
- Vector-namespace isolation is named by DEC-0059 but NOT mechanized here. The substrate contains the process, not the index.
- enableWeakerNestedSandbox footgun. If the runtime runs inside a container in prod, the substrate may have to weaken; it must report when running weakened, under the same honesty rule as Noop (never claim enforced: true while degraded).
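The honesty rule in the last risk can be made mechanical by deriving enforced from observed facts instead of letting an implementation assert it. The sketch below is hypothetical: RuntimeFacts, ContainmentReport, and buildReport are illustrative names, and the three facts are assumptions about what a real substrate could observe.

```typescript
// Hypothetical sketch of the honesty rule: enforced is derived, never asserted.
interface RuntimeFacts {
  readonly networkIsolated: boolean;       // all egress is forced through the proxy
  readonly filesystemIsolated: boolean;    // the OS sandbox fs boundary is active
  readonly weakenedNestedSandbox: boolean; // running with enableWeakerNestedSandbox
}

interface ContainmentReport {
  readonly enforced: boolean;
  readonly degradations: readonly string[];
}

function buildReport(facts: RuntimeFacts): ContainmentReport {
  const degradations: string[] = [];
  if (!facts.networkIsolated) degradations.push("network not isolated");
  if (!facts.filesystemIsolated) degradations.push("filesystem not isolated");
  if (facts.weakenedNestedSandbox) degradations.push("running with weakened nested sandbox");
  // Any degradation at all forces enforced to false.
  return { enforced: degradations.length === 0, degradations };
}
```

With this shape, a weakened substrate cannot report enforced: true by accident, because the flag is computed from the same list that names the degradations.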
Review
status: active (the direction is accepted) · confidence: asserted (the OS-layer enforcement is not yet built — this ADR authorizes it; only the model layer exists).
Promotion gate (asserted → verified): the SandboxRuntimeSubstrate is built behind the ContainmentSubstrate seam and demonstrated on Linux to (1) force all egress through the deny-by-default allowlist proxy with --network none (a simulated injection's exfil attempt blocked at the OS, not model-trusted), (2) derive its fs allow-list from confineToTenant (a read/write outside the tenant boundary denied by the OS sandbox, not by a path-string check), and (3) report enforced: true on Linux/macOS and enforced: false on the win32 Noop — with the offline/CI suite green on the Noop default. The escalation tier (ContainerSubstrate / --runtime=runsc), the TLS-terminating proxy vendor, and the prod container host remain reserved past that gate (Risks 1–2). Until then this is the authorized mechanism, not a verified control.