Persistence vs Freshness: Codex and Gemini CLI Tighten Two Different Boundaries for Agent Workflows

On March 4, 2026, two quiet merges landed on opposite sides of the agentic tooling ecosystem—both aimed at making “hands-off” workflows less surprising. OpenAI’s Codex and Google’s Gemini CLI each fixed a boundary problem that only shows up once you trust an agent to run for a while: one boundary is about what is allowed to persist, the other about what is allowed to become stale.

In openai/codex PR #13467 (merged 2026-03-04), Codex adjusted how its “memories” directory behaves under sandboxing. The headline change is simple: even when Codex is running in a workspace-write sandbox mode (a mode that constrains what paths are writable), the “memories root” is now explicitly made writable as part of the workspace-write writable roots configuration. In the same PR, Codex also tightened the rules for clearing memories: if the memory root is symlinked, the “clear memory” operation refuses to proceed.

It reads like housekeeping until you imagine the scene it’s meant to prevent. An agent is grinding through a multi-step refactor, taking notes, recording context, and building up a thread of what it learned—exactly what a “memory” feature is for. But sandbox settings don’t care about intent; they care about paths. If the memory store lives outside the workspace’s allowed write roots, the agent can suddenly become an amnesiac in the middle of a long run: it can read what it wrote earlier but can’t update it, or it can’t create new memories at all. The PR addresses that by granting the memories root a clear, sanctioned spot in the writable set for workspace-write mode.

The second change in Codex is less about convenience and more about safety rails. “Clear memory” is a destructive operation; if a memory root were symlinked, a path that looks like “safe internal storage” could actually point somewhere else—potentially somewhere unintended. By refusing to clear memories when the memory root is symlinked, Codex is drawing a bright line: persistent state must be anchored to a real directory, not a redirect that could make deletion behave like a trapdoor. That protects the persistence boundary: what counts as “the agent’s durable state,” and what operations are allowed to mutate or destroy it.

On the same day, google-gemini/gemini-cli PR #21050 (merged 2026-03-04) solved a different kind of boundary failure in agent tooling: the freshness of tool discovery when connecting to MCP servers.

Gemini CLI’s PR is titled “Notifications/tools/list_changed support not working,” and its fixes target two failure modes that are classic in distributed, event-driven systems. First, Gemini CLI now registers list_changed handlers even when the server’s declared capabilities don’t advertise listChanged. In other words: even if a server forgets (or declines) to tell the client it can emit “the tool list changed” notifications, Gemini CLI listens anyway. The client is choosing robustness over strict adherence to a declaration contract, because in the real world MCP servers may be inconsistent—and agent workflows depend on tools appearing when they’re available.

Second, Gemini CLI adds retry logic for discovery to avoid race conditions: if no changes are found, it retries after 500ms. This is the kind of timing edge that humans rarely see, but agents hit constantly. Agents start servers, trigger tool registration, and immediately ask “what tools do you have?” A server might register tools a moment later than the client’s first check, leading to a misleading “no tools” snapshot. The 500ms retry is a pragmatic hedge against that: it doesn’t slow everything down dramatically, but it makes initial discovery far less fragile.

Where Codex is defending persistence boundaries, Gemini CLI is defending freshness boundaries. That philosophical split matters because agent workflows fail in two distinct and expensive ways.

Failure mode one is persistence drift: an agent thinks it’s accumulating durable context (memories), but sandbox rules or unsafe path handling cause writes to silently fail, or destructive actions to hit the wrong location. The cost is compounding: the longer the run, the more the agent depends on the continuity of its memory. Codex’s change is a statement that memory is not an incidental cache; it’s a first-class workflow artifact that deserves explicit write authorization and careful deletion semantics. Making the memories root writable under workspace-write mode reduces “it worked yesterday” breakage when moving between environments. Refusing to clear symlinked memory roots reduces the risk that a cleanup command becomes a catastrophic delete outside its intended scope.

Failure mode two is freshness collapse: the agent is technically connected to a tool ecosystem, but it’s operating on a stale or incomplete list of capabilities. Here, even a perfect memory can’t save you, because the agent’s plan is built on incorrect assumptions about what actions are possible right now. Gemini CLI’s changes treat “tool list accuracy over time” as the thing to protect. By registering notifications even without explicit capability declaration, it reduces the chances that updates are missed due to imperfect server metadata. By retrying discovery after 500ms when no changes are found, it acknowledges that the world is asynchronous—and an agent that moves at machine speed will outrun naive single-shot initialization.

In practice, these boundaries reinforce different forms of trust. Codex is saying: trust that what you store as memory will remain in the right place and that destructive operations won’t be tricked by filesystem indirection. Gemini CLI is saying: trust that the world you’re acting in—your available tools—will be kept up to date even when servers are sloppy or timing is tight.

Put together, the day’s merges form a neat lesson for anyone building agentic workflows: reliability is not just about preventing crashes. It’s about aligning the agent’s internal continuity (persistence) with its external reality (freshness). When either boundary is leaky, the agent becomes unpredictable in exactly the ways that make automation feel risky: it forgets what it learned, or it acts as if tools don’t exist. Codex and Gemini CLI fixed different cracks, but they’re both shoring up the same promise: that an agent can run longer, faster, and more autonomously without drifting into confusion.

Persistence vs Freshness: Codex and Gemini CLI Tighten Two Different Boundaries for Agent Workflows

Sources

Receipts below the story

Source Trail

Evidence Limits

Send a note to the desk

Persistence vs Freshness: Codex and Gemini CLI Tighten Two Different Boundaries for Agent Workflows

Sources

Receipts below the story

Source Trail

Evidence Limits

Atlas Context

The Human Control Loop

Send a note to the desk