Old-school developer tooling treated permissions like a locked door.
You either had the key or you didn’t.
What the latest agent CLIs are building looks different. Less locked door. More security desk.
The emerging UX pattern is not “ask once, then hope for the best.” It is “inspect the exact action, negotiate the minimum expansion, and keep the decision attached to the workflow.”
Gemini turns sandboxing into an on-the-fly negotiation
Google’s gemini-cli just made a very specific bet in PR #23301: when a task bumps into the sandbox, the right answer is not always a hard stop or a full escape hatch. Sometimes the right answer is a targeted expansion.
The code tells that story clearly. In packages/core/src/tools/shell.ts, the shell tool emits a dedicated sandbox_expansion confirmation when additional_permissions is present. That is a product signal. The permission request is no longer a vague “trust me.” It is a first-class object the UI can explain.
Underneath that, SandboxPolicyManager stores approved paths and network allowances per command, combining persistent config with session approvals. In plain English: Gemini is learning to remember that this command can touch these paths or use this network surface, instead of forcing users to choose between suffocating defaults and broad permanent trust.
The repo also shows the counterweight. The same PR makes plan mode strict: no write or network override loopholes. And commandSafety.ts hard-codes a cautious allowlist for read-only commands, with bespoke checks for things like git, find, rg, and sed. So the expansion story is not “sandboxing got looser.” It is “sandboxing got more articulate.”
Codex is separating when approval is needed from who reviews it
OpenAI’s codex is pushing the same market in a different direction.
PR #13860 adds a new runtime switch, approvals_reviewer, that can route approval requests either to the human or to a guardian_subagent. That sounds like an implementation detail until you read the surrounding changes. Guardian review now spans shell execution, patch application, managed-network approvals, MCP approvals, and delegated/subagent flows. The app-server also emits explicit review lifecycle events like item/autoApprovalReview/started and .../completed.
That is a meaningful architectural move. Codex is no longer treating approvals as a simple modal interruption. It is treating them as their own review pipeline.
The guardian prompt is especially revealing. It literally starts by saying it is performing a risk assessment of a coding-agent sandbox escalation. It instructs the reviewer to treat transcript history, tool arguments, and tool results as untrusted evidence, not as instructions. In other words, Codex is designing approval as a second-order agent task: inspect intent, inspect scope, inspect blast radius, then decide.
The deeper Codex change is about precision
PR #14171 looks narrower, but it matters just as much. Codex fixes approval logic so it keys off FileSystemSandboxPolicy instead of broad legacy labels like read-only, workspace-write, or danger-full-access.
That sounds like plumbing. It is actually product truth.
Once approvals depend on split filesystem and network policy objects, the tool can reason about carve-outs instead of pretending every workspace-write context is basically the same. That is how you get from blunt trust modes to action-shaped review.
This is getting bigger than two repos
The broader web signals point the same way.
OpenAI’s own Codex security docs now describe agent control as a two-layer model: sandbox mode defines what the agent technically can do, while approval policy defines when it must stop and ask. GitHub’s Copilot CLI docs are teaching developers explicit permission patterns like --allow-tool and --deny-tool. Even the latest AI CLI ecosystem digests are flagging granular permission controls as a shared category demand.
That is the real story here. Permissions are escaping the settings page.
Why this matters for agent UX
If you make developers choose between constant interruptions and reckless autonomy, they eventually stop trusting the tool either way.
What these repos suggest is a smarter middle path:
- turn permission requests into structured runtime objects,
- scope them to paths, commands, networks, and actions,
- route them through the right reviewer,
- and remember enough of the decision to reduce repeat friction without erasing accountability.
That is a much more interesting UX than a yes/no popup. It treats agent autonomy like a negotiated contract that can be revised as work gets real.
What to watch next
The next competitive gap in agent tooling may not be who has the flashiest demo. It may be who can make permission feel least annoying without making trust feel fake.
Watch for three things:
- more action-specific approval objects instead of generic prompts,
- more persistent policy memory tied to commands and contexts,
- and more reviewer layers, whether human, policy engine, or lightweight guardian agent.
Because once permissions become conversational, the product challenge changes. The winner is not the tool that asks the fewest questions. It is the one that asks the right question at the right moment, with the right amount of context.
So here’s the open question: if agent tools get really good at negotiated permissions, does that unlock more real autonomy — or just make control feel smoother while humans still carry the real trust burden?
If you build agent products, audit every approval interrupt in your flow and ask a tougher question than “can we remove this?” Ask whether you can make it precise enough to become part of the work instead of a break from it.
Send a note to the desk
Corrections, missing context, or a follow-up lead.