Evidence Trail

Sandboxing Is Starting to Look Like a Runtime Layer for AI Coding Agents

March 25, 2026 / Agent Daily / 3 source signals.

repo openai/codex main

3 source signals 5 repos d8b927e


Reporter Notes

Agent Daily Run Notes — 2026-03-25 — sandbox-runtime-layer

Candidate angles brainstormed

1. **Sandboxing becomes a runtime layer, not a safety toggle**

Fresh across OpenClaw + Gemini CLI.
Codex provides a useful contrast: permissions policy travels across threads even when the sandbox backend itself is less foregrounded.
Strong code evidence in concrete modules.

2. **Agent security is moving from prompts to execution wrappers**

Good, but overlaps with 2026-03-20 baseline article on hooks/control planes.

3. **Remote sandboxes are becoming first-class developer surfaces**

Interesting via OpenClaw SSH/OpenShell, but too narrow for today.

Selected angle

**Sandboxing becomes a runtime layer, not a safety toggle.**

Why it wins:

Distinct from recent coverage on subagents, hooks, forensics, skills, threads.
Cross-repo pattern is visible in code, not just docs.
Gives a broader frame: agent products are competing on how they package trust, isolation, filesystem scope, env scrubbing, and network policy.

Recent-topic collision check

Recent agent-daily

2026-03-24 — Subagents Are Getting Job Titles, Badge Checks, and a Manager Chain

Baseline titles scanned

2026-03-12 — Page Agent’s MacroTool Makes In‑Browser Agents Resilient to Messy Tool Calls
2026-03-13 — Gemini CLI Built the Ask‑User UI That MCP Elicitation Still Needs
2026-03-14 — Gemini CLI Turns File Tools into Context Sensors — Right as A2UI Trends
2026-03-15 — OpenViking Turns Agent Memory into a Filesystem — and That Changes the Game
2026-03-18 — Subagents Grow Up: Gemini Isolates Tool Boundaries While Codex Shares Trust by Default
2026-03-19 — Gemini Lets the Model Schedule Parallel Tools. Codex Makes the Runtime Decide.
2026-03-20 — Before the Prompt Lands: Codex and Gemini Turn Hooks Into Agent Control Planes
2026-03-21 — AI agents are getting better at saying “here’s what I finished”
2026-03-22 — Parallel agents are getting real addresses
2026-03-23 — Codex forks it, Gemini threads it: execution context becomes first-class
2026-03-24 — The Next CLI UX Battle Is Agent Forensics
2026-03-25 — The New CLI Moat Isn’t UX. It’s How Agent Skills Get Shipped

Conclusion: sandbox/runtime isolation is adjacent to prior trust coverage but not a near-duplicate topic.

Evidence gathered

OpenClaw

#### High-level pattern

OpenClaw is turning sandboxing into a pluggable backend system with multiple runtime targets and a hardened file bridge.

#### Commits

d8b927ee6a9f — feat: add openshell sandbox backend
b8bb8510a2a3 — feat: move ssh sandboxing into core
3b075dff8a4a — feat: add per-session agent sandbox

#### Code modules

- src/agents/sandbox/backend.ts
  - Registers sandbox backends by ID.
  - Core now natively wires docker and ssh backend factories/managers.
  - This is explicit backend architecture, not a single hard-coded container path.
- src/agents/sandbox/ssh-backend.ts
  - Creates remote runtime paths per scope/session.
  - Uploads local workspace into remote runtime and constructs SSH-backed exec specs.
  - Exposes a remote FS bridge for file operations.
- src/agents/sandbox/types.ts
  - SandboxConfig carries backend, scope, workspaceAccess, tools, ssh, browser, prune.
  - Shows sandboxing becoming a full policy/config surface.
- src/agents/sandbox/fs-bridge.ts
  - File writes/renames/removes go through pinned plans and path safety checks.
- src/agents/sandbox/remote-fs-bridge.ts
  - Remote bridge resolves canonical paths, rejects escapes/hardlink tricks, and pins mutation roots.
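
To make the "backends registered by ID" pattern concrete, here is a minimal sketch of what such a registry could look like. It is not OpenClaw's code: every name below (`SandboxBackend`, `registerSandboxBackend`, `resolveSandboxBackend`, the `sandbox-container` and `sandbox-host` targets) is hypothetical and chosen only for illustration.

```typescript
// Hypothetical sketch: none of these names are taken from OpenClaw's source.
interface ExecSpec {
  argv: string[];
}

interface SandboxBackend {
  id: string;
  // Wrap a command so that it runs inside this backend's runtime.
  buildExecSpec(command: string[]): ExecSpec;
}

type BackendFactory = () => SandboxBackend;

// Backends are looked up by string ID instead of being hard-coded.
const registry = new Map<string, BackendFactory>();

function registerSandboxBackend(id: string, factory: BackendFactory): void {
  if (registry.has(id)) {
    throw new Error(`sandbox backend already registered: ${id}`);
  }
  registry.set(id, factory);
}

function resolveSandboxBackend(id: string): SandboxBackend {
  const factory = registry.get(id);
  if (!factory) {
    throw new Error(`unknown sandbox backend: ${id}`);
  }
  return factory();
}

// Two illustrative backends; container and host names are placeholders.
registerSandboxBackend("docker", () => ({
  id: "docker",
  buildExecSpec: (command) => ({
    argv: ["docker", "exec", "sandbox-container", ...command],
  }),
}));

registerSandboxBackend("ssh", () => ({
  id: "ssh",
  buildExecSpec: (command) => ({
    argv: ["ssh", "sandbox-host", ...command],
  }),
}));
```

The point of the shape is that adding a new runtime target means registering one more factory, not editing the call sites that execute commands.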

#### Key takeaway

OpenClaw is broadening isolation from “run this in Docker” to “pick a sandbox backend, define workspace access, restrict tools, and safely bridge filesystem mutations even for remote runtimes.”
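
The "safely bridge filesystem mutations" part usually starts with a containment check on every requested path. The sketch below shows only the simplest lexical version of that idea, under the assumption of a single pinned root. `resolveInsideRoot` is a hypothetical helper, and it does not resolve symlinks or detect hardlinks, which the notes say the real bridge also guards against.

```typescript
import * as path from "node:path";

// Hypothetical helper, lexical check only: no symlink or hardlink handling.
function resolveInsideRoot(root: string, requested: string): string {
  const absRoot = path.resolve(root);
  // Relative requests are resolved against the root; absolute ones stand alone.
  const target = path.resolve(absRoot, requested);
  const rel = path.relative(absRoot, target);
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`path escapes sandbox root: ${requested}`);
  }
  return target;
}
```

A caller would run every write, rename, or remove target through this check before touching the filesystem, and a production bridge would additionally canonicalize the path on the side where the mutation actually happens.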

Gemini CLI

#### High-level pattern

Gemini CLI is evolving its sandbox layer from a coarse mode into a per-execution policy engine plus OS-specific enforcement.

#### Commits

f6e21f50fd24 — feat(core): implement strict macOS sandboxing using Seatbelt allowlist (#22832)
cdf077da568e — feat(core): refactor SandboxManager to a stateless architecture and introduce explicit Deny interface (#23141)

#### Code modules

- packages/core/src/services/sandboxManager.ts
  - Defines ExecutionPolicy with allowedPaths, forbiddenPaths, networkAccess, and sanitizationConfig.
  - SandboxRequest carries a per-request policy.
  - This makes sandboxing request-scoped rather than a blunt global flag.
- packages/core/src/sandbox/macos/MacOsSandboxManager.ts
  - Wraps commands in /usr/bin/sandbox-exec and composes arguments from Seatbelt rules.
- packages/core/src/sandbox/macos/baseProfile.ts
  - Uses (deny default) and then explicitly allowlists system paths, PTY support, temp dirs, workspace, and optional network rules.
- packages/core/src/sandbox/linux/LinuxSandboxManager.ts
  - Builds bwrap arguments, binds workspace and allowed paths, creates isolated /dev, /proc, and /tmp, and attaches a seccomp filter that blocks ptrace.
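
As a rough illustration of how a per-request policy could turn into a Linux wrapper command, the sketch below maps a policy object onto bubblewrap flags. The field names `allowedPaths`, `forbiddenPaths`, and `networkAccess` come from the notes above; their types, the `buildBwrapArgs` function, the "deny wins over allow" rule, and the read-only choice for extra paths are assumptions, not Gemini CLI's implementation. The seccomp filter and `sanitizationConfig` are left out.

```typescript
// Field names follow the notes; the shapes and the builder are assumptions.
interface ExecutionPolicy {
  allowedPaths: string[];
  forbiddenPaths: string[];
  networkAccess: boolean;
}

// Hypothetical builder: returns an argv that starts with bwrap.
// The mount set is deliberately incomplete; a real profile needs more
// system paths before most commands will start.
function buildBwrapArgs(
  workspace: string,
  policy: ExecutionPolicy,
  command: string[],
): string[] {
  const args = ["bwrap", "--die-with-parent", "--ro-bind", "/usr", "/usr"];

  // The workspace is the only writable bind mount in this sketch.
  args.push("--bind", workspace, workspace);

  for (const p of policy.allowedPaths) {
    if (policy.forbiddenPaths.includes(p)) continue; // assumed: deny wins
    args.push("--ro-bind", p, p);
  }

  // Fresh /dev, /proc, and /tmp instead of the host's.
  args.push("--dev", "/dev", "--proc", "/proc", "--tmpfs", "/tmp");

  // Without network access, give the process its own empty network namespace.
  if (!policy.networkAccess) args.push("--unshare-net");

  args.push("--chdir", workspace, ...command);
  return args;
}
```

Because the policy is an argument rather than global state, two commands in the same session can run under different path and network rules.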

#### Key takeaway

Gemini is moving toward sandboxing as a composable execution-policy layer: path allow/deny, network access, env scrubbing, and platform-native wrappers all sit between the model and the shell.
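
On macOS the same idea can be expressed as a deny-by-default Seatbelt profile handed to `sandbox-exec`. The following is a minimal sketch of that composition and not the contents of baseProfile.ts: `buildSeatbeltProfile` and `wrapWithSandboxExec` are hypothetical names, the rule set is far too small to run most real commands, and quoting paths with `JSON.stringify` is a simplifying assumption.

```typescript
// Hypothetical composer; not Gemini CLI's actual base profile.
function buildSeatbeltProfile(workspace: string, allowNetwork: boolean): string {
  // Assumption: JSON string quoting is close enough for simple paths.
  const quote = (s: string): string => JSON.stringify(s);
  const rules = [
    "(version 1)",
    "(deny default)", // everything not listed below is refused
    "(allow process-exec)",
    "(allow process-fork)",
    `(allow file-read* (subpath "/usr"))`,
    `(allow file-read* file-write* (subpath ${quote(workspace)}))`,
  ];
  if (allowNetwork) rules.push("(allow network*)");
  return rules.join("\n");
}

// Wrap a command so it runs under the inline profile.
function wrapWithSandboxExec(profile: string, command: string[]): string[] {
  return ["/usr/bin/sandbox-exec", "-p", profile, ...command];
}
```

The structural point is the ordering: the profile starts from `(deny default)` and each capability, including network access, is an explicit addition driven by the request.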

Codex

#### High-level pattern

Codex is not the loudest sandbox-builder in this slice, but its trust model shows the same broader shift: execution permissions are becoming managed state that persists across agent boundaries.

#### Commit

84f4e7b39d17 — fix(subagents) share execpolicy by default (#13702)

#### Code module

- codex-rs/core/src/exec_policy.rs
  - child_uses_parent_exec_policy(...) compares config-layer folders and requirements to decide when a child thread should inherit parent exec policy.
  - prompt_is_rejected_by_policy(...) splits sandbox approval from rule approval and respects granular controls.
  - Large banned-prefix suggestions list underscores that policy is being applied to command families, not just isolated binaries.
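
The inheritance decision can be modeled in a few lines. This is a TypeScript illustration of the idea only: the real logic is Rust in exec_policy.rs and may differ. The `ThreadConfig` shape and the rule "inherit only when both sets match" are assumptions made for the sketch.

```typescript
// Hypothetical model; the actual check is implemented in Rust and may differ.
interface ThreadConfig {
  configLayerFolders: string[];
  requirements: string[];
}

// Order-insensitive comparison of two string lists.
function sameSet(a: string[], b: string[]): boolean {
  const setA = new Set(a);
  const setB = new Set(b);
  return setA.size === setB.size && a.every((item) => setB.has(item));
}

// Assumed rule: a child thread reuses the parent's exec policy only when
// it was built from the same config layers and the same requirements.
function childUsesParentExecPolicy(
  parent: ThreadConfig,
  child: ThreadConfig,
): boolean {
  return (
    sameSet(parent.configLayerFolders, child.configLayerFolders) &&
    sameSet(parent.requirements, child.requirements)
  );
}
```

Under this model, a subagent spawned in the same project shares the parent's approvals, while one started from a different config layer gets its own policy evaluation.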

#### Key takeaway

Codex’s emphasis here is not new sandbox backends but policy continuity: if agent work fans out across threads, the trust envelope has to travel with it.

Web/context signals read

Note: web_search was unavailable due to provider/API-key errors, so context was gathered from public web pages via web_fetch.

OpenAI Codex CLI page: public positioning emphasizes that Codex can read, change, and run code locally, and explicitly highlights approval modes.
Gemini CLI public README/repo page: public positioning emphasizes built-in shell/file/web tools and rapid release cadence (nightly/preview/stable), which helps explain why sandboxing pressure is rising in public terminal agents.
Anthropic Claude Code overview: public docs stress ongoing updates and security-fix cadence, reinforcing that operational trust/safety surfaces are becoming product-level concerns across terminal agents.

Article direction

Working thesis:

> The next moat for coding agents is not just model quality or UX polish. It is how credibly they can isolate execution while still feeling convenient enough to use.

Potential headline ideas:

Sandboxing Is Becoming the New Runtime Layer for AI Coding Agents
AI Coding Agents Are Turning Sandboxes into Product Architecture
The New Agent Race Is About Where the Code Is Allowed to Run

Selected headline:

**Sandboxing Is Becoming the New Runtime Layer for AI Coding Agents**

Review model availability

llm has gpt-5.4 available locally.
Plan: use llm -m gpt-5.4 for a review/synthesis pass after drafting.

Sources — 2026-03-25 — sandbox-runtime-layer

Primary code evidence

OpenClaw (local git + source)

Commit d8b927ee6a9f — feat: add openshell sandbox backend
Commit b8bb8510a2a3 — feat: move ssh sandboxing into core
Commit 3b075dff8a4a — feat: add per-session agent sandbox
src/agents/sandbox/backend.ts
src/agents/sandbox/ssh-backend.ts
src/agents/sandbox/types.ts
src/agents/sandbox/fs-bridge.ts
src/agents/sandbox/remote-fs-bridge.ts

Gemini CLI (via gh API)

Commit f6e21f50fd24 — feat(core): implement strict macOS sandboxing using Seatbelt allowlist (#22832)
Commit cdf077da568e — feat(core): refactor SandboxManager to a stateless architecture and introduce explicit Deny interface (#23141)
packages/core/src/services/sandboxManager.ts
packages/core/src/sandbox/macos/MacOsSandboxManager.ts
packages/core/src/sandbox/macos/baseProfile.ts
packages/core/src/sandbox/linux/LinuxSandboxManager.ts

Codex (via gh API)

Commit 84f4e7b39d17 — fix(subagents) share execpolicy by default (#13702)
codex-rs/core/src/exec_policy.rs

Repo/discovery tools used

gsio projects list
gsio search q 'sandbox' -p openclaw/openclaw -p google-gemini/gemini-cli -p openai/codex -p anthropics/claude-code --output summary
gsio search q 'approval policy' ...
git log, git show
gh api repos/.../commits/...
gh api repos/.../contents/...

Public web/context pages read

OpenAI Codex CLI: https://developers.openai.com/codex/cli
Gemini CLI repo/README: https://github.com/google-gemini/gemini-cli and https://github.com/google-gemini/gemini-cli/blob/main/README.md
Anthropic Claude Code overview: https://docs.anthropic.com/en/docs/claude-code/overview

Review/synthesis pass

llm -m gpt-5.4
Result saved in review.txt