Evidence Trail

The Next Agent UX Moat Isn’t Speed. It’s Backpressure.

March 28, 2026 / Daily Edition / 8 source signals.

repo openai/codex main
8 source signals 2 repos source trail

Reporter Notes

Daily article notes — 2026-03-28

Selected angle

**The next CLI moat is backpressure control.**

Core thesis: the interesting shift in Codex and Gemini CLI is not just more tools, more subagents, or more speed. It is that both are starting to treat user interruption, queueing, compression, resume, approvals, and mid-task steering as runtime design problems. The terminal agent is becoming a backpressure system.

Why this angle is new versus prior articles

  • Different from yesterday’s containment story: that piece was about safer delegated workers and stricter boundaries.
  • Different from the March 24 forensics piece: this is about live flow control while work is in progress, not post-hoc inspection.
  • Different from the March 23 execution-context piece: this is about what happens when the human and the runtime both want to act at once.
  • Different from the March 21 “agents stop failing silently” piece: this is specifically about queueing, compression chokepoints, resume semantics, and safe-point intervention.

Candidate angles considered

1. **Agent backpressure becomes the next CLI UX moat.** Chosen.

2. Agent runtimes are turning user interruptions into first-class events.

3. Memory is becoming scoped and routable. Rejected as too close to earlier memory articles.

Fresh signals scanned

Codex

  • PR #16062: stabilize zsh-fork approvals and resume --last
  • Issue #16068: custom model_context_window breaks auto-compaction after overflow
  • Issue #16060: proposed SIGUSR1 inbox for safe-point mid-task instruction injection
  • docs/tui-chat-composer.md: queueing behavior in steer mode when task already running
  • core/src/compact.rs: mid-turn compaction behavior, trimming, and warning on repeated compactions
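
The steer-mode note above (Tab asks to queue while a task is running, Enter submits immediately) can be pictured with a small model. This is a sketch, not Codex's implementation: `Composer`, `on_key`, and `on_task_finished` are hypothetical names, and both the idle-Tab behavior and the FIFO release of queued messages are assumptions of the sketch rather than anything the docs state.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Composer:
    """Hypothetical model of a chat composer with steer mode enabled."""

    task_running: bool = False
    queued: deque = field(default_factory=deque)
    submitted: list = field(default_factory=list)

    def on_key(self, key: str, text: str) -> str:
        # Tab while a task is running: hold the message instead of sending it.
        if key == "tab" and self.task_running:
            self.queued.append(text)
            return "queued"
        # Enter always sends now. Tab while idle also sends (assumption).
        self.submitted.append(text)
        return "submitted"

    def on_task_finished(self) -> None:
        # Assumption: queued messages are released in FIFO order once idle.
        self.task_running = False
        while self.queued:
            self.submitted.append(self.queued.popleft())
```

The point of the model is that the same keystroke has two outcomes depending on runtime state, which is what makes this a flow-control decision rather than a plain input box.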

Gemini CLI

  • Issue #24071: request to queue a message while compression is running
  • Issue #24064: user complaint about dramatic slowness
  • packages/a2a-server/src/agent/executor.ts: secondary execution loop can process a user message while task already executing
  • packages/a2a-server/src/agent/task.ts: pre-register pending tool calls, wait for pending tools, reflect background work state
  • docs/reference/keyboard-shortcuts.md: plan mode skipped when agent is busy
  • docs/reference/commands.md + docs/reference/configuration.md: /compress, model.compressionThreshold, and hooks.PreCompress
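
The compression controls listed above (a threshold, a pre-compression hook, and a summary that replaces the chat context) fit together roughly as follows. This is a sketch under stated assumptions, not Gemini CLI code: `maybe_compress` and its parameters are hypothetical, the threshold is assumed to be a fraction of the context limit, and token counting and summarization are supplied by the caller.

```python
def maybe_compress(history, count_tokens, context_limit, threshold,
                   summarize, pre_compress=()):
    """Return (history, compressed) after applying a threshold check.

    history       : list of messages
    count_tokens  : callable giving a token count for one message
    threshold     : assumed to be a fraction of context_limit (0..1)
    summarize     : callable turning the full history into one summary message
    pre_compress  : callables run just before compression (hook stand-ins)
    """
    used = sum(count_tokens(message) for message in history)
    if used < threshold * context_limit:
        # Below the trigger point: leave the history untouched.
        return history, False
    for hook in pre_compress:
        # Hooks see a copy of the history before it is replaced.
        hook(list(history))
    # The summary replaces the prior context entirely.
    return [summarize(history)], True
```

While this function runs, nothing else can safely append to `history`, which is the window in which Issue #24071 asks for message queueing.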

Supporting insight from LLM review

The strongest non-redundant framing is that both products are converging on the same operational question: when the user tries to change course while the machine is already busy, what gets queued, what gets rejected, what gets summarized, and what resumes cleanly?

Article structure

1. Hook: agent demos used to sell raw horsepower; now the harder problem is keeping the lane clear when humans interrupt.

2. Codex evidence: queued steer mode, resume narrowing, approval timing hardening, compaction edge cases, explicit request for inbox-style intervention.

3. Gemini evidence: secondary execution loop, pending-tool scheduler, busy-state UI restrictions, operator-visible compression controls, active user pain around waiting during compression.

4. Synthesis: CLIs are becoming flow-control systems for human+agent concurrency.

5. Close: open question + CTA on whether better backpressure design makes agents trustworthy enough for longer-lived work.

Sources — 2026-03-28

Prior state and archive review

  • daily edition corpus/_state.json
  • Reviewed article titles from daily edition corpus/*/index.html

gsio

  • gsio projects scope --project openai/codex --project google-gemini/gemini-cli
  • Coverage: 100% for both repos in local index
  • Freshness: synced on 2026-03-28
  • gsio search q 'compression queue resume approvals subagent running task' -p openai/codex -p google-gemini/gemini-cli --output summary
  • Surfaced related indexed commits around thread-scoped approvals, subagent execution inheritance, safe concurrent commands, event-driven scheduling, and reactive subagent status

Codex repo truth

Repo: https://github.com/openai/codex

Local clone: repo source/codex

Code/docs

  • docs/tui-chat-composer.md:100-103
  • Evidence: when steer mode is enabled, Tab requests queuing if a task is already running; Enter submits immediately.
  • codex-rs/core/src/compact.rs:37-43
  • Evidence: mid-turn compaction must inject initial context before the last real user message.
  • codex-rs/core/src/compact.rs:96-114
  • Evidence: compaction is a distinct turn item and reuses one client session to survive retries within the compact turn.
  • codex-rs/core/src/compact.rs:140-159
  • Evidence: older thread items are trimmed to fit context window; overflow during compaction removes oldest history items and retries.
  • codex-rs/core/src/compact.rs:227-229
  • Evidence: repeated long-thread compactions can reduce accuracy; user is warned to start a new thread.
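
The trim-and-retry behavior described for compact.rs can be illustrated in a few lines. This is a minimal sketch, not the Rust code in the repo: `ContextOverflow`, `compact_with_retry`, and `limited_summarizer` are hypothetical names, and the summarizer is a stand-in that overflows on item count instead of tokens.

```python
class ContextOverflow(Exception):
    """Raised by the stand-in summarizer when its input is too large."""


def compact_with_retry(history, summarize):
    """Summarize history, dropping the oldest item after each overflow.

    Returns (summary, number_of_items_dropped).
    """
    items = list(history)
    dropped = 0
    while items:
        try:
            return summarize(items), dropped
        except ContextOverflow:
            # Overflow during compaction: remove the oldest item and retry.
            items.pop(0)
            dropped += 1
    raise ContextOverflow("nothing left to summarize")


def limited_summarizer(max_items):
    """Build a fake summarizer that overflows above max_items (illustration only)."""
    def summarize(items):
        if len(items) > max_items:
            raise ContextOverflow()
        return "+".join(items)
    return summarize
```

The dropped count matters for the warning noted above: each retry discards real history, so a caller that sees repeated compactions has grounds to suggest a new thread.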

Fresh GitHub activity

  • PR #16062: stabilize zsh-fork approvals and resume --last
  • URL: https://github.com/openai/codex/pull/16062
  • Evidence: keeps approval flow stable on macOS and narrows resume --last to top-level resumable thread sources instead of internal sub-agent rollouts.
  • Issue #16068: Setting model_context_window in config.toml breaks auto-compaction (fill_to_context_window resets token counter)
  • URL: https://github.com/openai/codex/issues/16068
  • Evidence: custom context window can poison token accounting after overflow, preventing further auto-compaction.
  • Issue #16060: feature: SIGUSR1 handler for mid-task instruction injection via inbox file
  • URL: https://github.com/openai/codex/issues/16060
  • Evidence: explicit request for orchestrator-to-agent communication at safe points instead of hard interruption.
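
The safe-point idea in Issue #16060 can be sketched as a signal handler that only sets a flag, plus a drain step the agent loop calls between units of work. This illustrates the general pattern, not the proposal's actual design: `Inbox`, `install`, and `drain_at_safe_point` are hypothetical names, and the one-instruction-per-line file format is an assumption.

```python
import signal
from pathlib import Path


class Inbox:
    """Hypothetical inbox: a signal marks it dirty, a safe point drains it."""

    def __init__(self, path):
        self.path = Path(path)
        self.pending = False

    def on_signal(self, signum, frame):
        # Keep the handler minimal: record the request and return.
        self.pending = True

    def install(self):
        # Unix only. Must be called from the main thread.
        signal.signal(signal.SIGUSR1, self.on_signal)

    def drain_at_safe_point(self):
        """Return queued instructions, or [] if no signal has arrived."""
        if not self.pending:
            return []
        self.pending = False
        if not self.path.exists():
            return []
        text = self.path.read_text()
        lines = [line for line in text.splitlines() if line.strip()]
        # Truncate so the same instructions are not replayed.
        self.path.write_text("")
        return lines
```

An orchestrator would append to the inbox file and then send SIGUSR1; the agent picks up the text only when it next reaches `drain_at_safe_point()`, so no tool call is cut off mid-flight. The read-then-truncate step is not atomic, so a real design would need locking or a rename-based handoff.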

Gemini CLI repo truth

Repo: https://github.com/google-gemini/gemini-cli

Local clone: repo source/gemini-cli

Code/docs

  • packages/a2a-server/src/agent/executor.ts:450-465
  • Evidence: if a task already has a pending execution, the executor processes the user message in a secondary execution loop and yields back to the original execution.
  • packages/a2a-server/src/agent/executor.ts:521-533
  • Evidence: tool call requests are batched, scheduled, then the executor explicitly waits for pending tools before continuing.
  • packages/a2a-server/src/agent/task.ts:189-219
  • Evidence: pending tool calls are registered, resolved, and awaited with explicit counts.
  • packages/a2a-server/src/agent/task.ts:755-764
  • Evidence: tool calls are pre-registered before async scheduling so pending work is visible immediately.
  • packages/a2a-server/src/agent/task.ts:1141-1158
  • Evidence: pure confirmation turns can keep the task in a working state while background activity continues.
  • docs/reference/keyboard-shortcuts.md:106 and 220-221
  • Evidence: plan mode is skipped when the agent is busy.
  • docs/reference/commands.md:98-103
  • Evidence: /compress is a first-class command that replaces chat context with a summary.
  • docs/reference/configuration.md:410-415
  • Evidence: model.compressionThreshold controls when compression triggers.
  • docs/reference/configuration.md:1154-1156
  • Evidence: hooks.PreCompress runs before chat history compression.
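
The pending-tool bookkeeping described for task.ts and executor.ts (pre-register, resolve, await with explicit counts) follows a pattern that can be sketched with asyncio. This is an illustration in Python, not a port of the TypeScript: `PendingTools` and `run_batch` are hypothetical names, and error handling is omitted for brevity.

```python
import asyncio


class PendingTools:
    """Hypothetical registry of in-flight tool calls, keyed by call id."""

    def __init__(self):
        self._pending = {}

    @property
    def count(self):
        return len(self._pending)

    def register(self, call_id):
        # Must be called from inside a running event loop.
        self._pending[call_id] = asyncio.get_running_loop().create_future()

    def resolve(self, call_id, result):
        self._pending.pop(call_id).set_result(result)

    async def wait_for_pending(self):
        # Loop so calls registered while waiting are also awaited.
        while self._pending:
            await asyncio.wait(list(self._pending.values()))


async def run_batch(calls):
    """calls: list of (call_id, zero-argument coroutine function)."""
    pending = PendingTools()
    results = {}
    # Pre-register before scheduling so pending work is visible immediately.
    for call_id, _ in calls:
        pending.register(call_id)
    visible_before_start = pending.count

    async def run_one(call_id, fn):
        results[call_id] = await fn()
        pending.resolve(call_id, results[call_id])

    tasks = [asyncio.create_task(run_one(cid, fn)) for cid, fn in calls]
    await pending.wait_for_pending()
    await asyncio.gather(*tasks)
    return visible_before_start, pending.count, results
```

Registering before scheduling is the detail that matters for backpressure: a status check made between "tool requested" and "tool started" already sees a non-zero count. A tool that raises would leave its entry unresolved in this sketch, so a fuller version would resolve or reject in a `finally` block.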

Fresh GitHub activity

  • Issue #24071: request to queue a message while compression is running
  • Issue #24064: user complaint about dramatic slowness

Editorial synthesis

  • llm prompt -m gpt-5.4
  • Prompted with yesterday’s topic, archive titles, and fresh repo/GitHub evidence.
  • Outcome: strongest non-redundant angle was backpressure as the next CLI UX moat.