Evidence Trail

The Next Agent UX Moat Isn’t Speed. It’s Backpressure.

March 28, 2026 / Daily Edition / 8 source signals.





Reporter Notes

Daily article notes — 2026-03-28

Selected angle

**The next CLI moat is backpressure control.**

Core thesis: the interesting shift in Codex and Gemini CLI is not just more tools, more subagents, or more speed. It is that both are starting to treat user interruption, queueing, compression, resume, approvals, and mid-task steering as runtime design problems. The terminal agent is becoming a backpressure system.

Why this angle is new versus prior articles

Different from yesterday’s containment story: that piece was about safer delegated workers and stricter boundaries.
Different from the March 24 forensics piece: this is about live flow control while work is in progress, not post-hoc inspection.
Different from the March 23 execution-context piece: this is about what happens when the human and the runtime both want to act at once.
Different from the March 21 “agents stop failing silently” piece: this is specifically about queueing, compression chokepoints, resume semantics, and safe-point intervention.

Candidate angles considered

1. **Agent backpressure becomes the next CLI UX moat.** Chosen.

2. Agent runtimes are turning user interruptions into first-class events.

3. Memory is becoming scoped and routable. Rejected as too close to earlier memory articles.

Fresh signals scanned

Codex

PR #16062: stabilize zsh-fork approvals and resume --last
Issue #16068: custom model_context_window breaks auto-compaction after overflow
Issue #16060: proposed SIGUSR1 inbox for safe-point mid-task instruction injection
docs/tui-chat-composer.md: queueing behavior in steer mode when task already running
core/src/compact.rs: mid-turn compaction behavior, trimming, and warning on repeated compactions

Gemini CLI

Issue #24071: request to queue a message while compression is running
Issue #24064: user complaint about dramatic slowness
packages/a2a-server/src/agent/executor.ts: secondary execution loop can process a user message while task already executing
packages/a2a-server/src/agent/task.ts: pre-register pending tool calls, wait for pending tools, reflect background work state
docs/reference/keyboard-shortcuts.md: plan mode skipped when agent is busy
docs/reference/commands.md + docs/reference/configuration.md: /compress, model.compressionThreshold, and hooks.PreCompress

Supporting insight from LLM review

The strongest non-redundant framing is that both products are converging on the same operational question: when the user tries to change course while the machine is already busy, what gets queued, what gets rejected, what gets summarized, and what resumes cleanly?

Article structure

1. Hook: agent demos used to sell raw horsepower; now the harder problem is keeping the lane clear when humans interrupt.

2. Codex evidence: queued steer mode, resume narrowing, approval timing hardening, compaction edge cases, explicit request for inbox-style intervention.

3. Gemini evidence: secondary execution loop, pending-tool scheduler, busy-state UI restrictions, operator-visible compression controls, active user pain around waiting during compression.

4. Synthesis: CLIs are becoming flow-control systems for human+agent concurrency.

5. Close: open question + CTA on whether better backpressure design makes agents trustworthy enough for longer-lived work.

Sources — 2026-03-28

Prior state and archive review

daily edition corpus/_state.json
Reviewed article titles from daily edition corpus/*/index.html

gsio

gsio projects scope --project openai/codex --project google-gemini/gemini-cli
Coverage: 100% for both repos in local index
Freshness: synced on 2026-03-28
gsio search q 'compression queue resume approvals subagent running task' -p openai/codex -p google-gemini/gemini-cli --output summary
Surfaced related indexed commits around thread-scoped approvals, subagent execution inheritance, safe concurrent commands, event-driven scheduling, and reactive subagent status

Codex repo truth

Repo: https://github.com/openai/codex

Local clone: repo source/codex

Code/docs

docs/tui-chat-composer.md:100-103
Evidence: when steer mode is enabled, Tab requests queuing if a task is already running; Enter submits immediately.
codex-rs/core/src/compact.rs:37-43
Evidence: mid-turn compaction must inject initial context before the last real user message.
codex-rs/core/src/compact.rs:96-114
Evidence: compaction is a distinct turn item and reuses one client session to survive retries within the compact turn.
codex-rs/core/src/compact.rs:140-159
Evidence: older thread items are trimmed to fit context window; overflow during compaction removes oldest history items and retries.
codex-rs/core/src/compact.rs:227-229
Evidence: repeated long-thread compactions can reduce accuracy; user is warned to start a new thread.
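
The trim-and-retry behavior noted for compact.rs can be sketched roughly as follows. This is an illustrative Python model, not the Rust implementation: `Item`, `summarize`, `compact`, `ContextOverflow`, and the `warn_after` threshold are all hypothetical names chosen for the sketch, and the token counts are made up.

```python
from dataclasses import dataclass


class ContextOverflow(Exception):
    """Raised by the (hypothetical) summarizer when the input is too large."""


@dataclass
class Item:
    text: str
    tokens: int


def summarize(items, window):
    """Stand-in for a model call: fails if the history exceeds the window."""
    total = sum(item.tokens for item in items)
    if total > window:
        raise ContextOverflow(total)
    return Item(text=f"summary of {len(items)} items", tokens=min(total, 50))


def compact(history, window, compactions_so_far=0, warn_after=2):
    """Drop the oldest items until summarization fits, then return the summary.

    Returns (summary, dropped_count, warning_or_None).
    """
    items = list(history)
    dropped = 0
    while True:
        try:
            summary = summarize(items, window)
            break
        except ContextOverflow:
            if len(items) <= 1:
                raise  # nothing left to trim; surface the failure
            items.pop(0)  # remove the oldest history item and retry
            dropped += 1
    warning = None
    if compactions_so_far + 1 >= warn_after:  # assumed threshold, not from the source
        warning = "Repeated compactions can reduce accuracy; consider starting a new thread."
    return summary, dropped, warning
```

The point of the sketch is the shape of the loop: overflow during compaction is treated as a retryable condition that costs the oldest history first, and repetition is surfaced to the user rather than hidden.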

Fresh GitHub activity

PR #16062 — stabilize zsh-fork approvals and resume --last
URL: https://github.com/openai/codex/pull/16062
Evidence: keeps approval flow stable on macOS and narrows resume --last to top-level resumable thread sources instead of internal sub-agent rollouts.
Issue #16068 — Setting model_context_window in config.toml breaks auto-compaction (fill_to_context_window resets token counter)
URL: https://github.com/openai/codex/issues/16068
Evidence: custom context window can poison token accounting after overflow, preventing further auto-compaction.
Issue #16060 — feature: SIGUSR1 handler for mid-task instruction injection via inbox file
URL: https://github.com/openai/codex/issues/16060
Evidence: explicit request for orchestrator-to-agent communication at safe points instead of hard interruption.
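
One way to picture the safe-point idea behind Issue #16060 is a signal handler that only sets a flag, with the inbox file read between steps. The sketch below is a minimal Python illustration under that assumption; `Inbox`, `run_steps`, and the one-instruction-per-line file format are hypothetical and do not describe what the issue proposes in detail or how Codex would implement it.

```python
import os
import signal


class Inbox:
    """Collects instructions from a file, but only when the agent asks for them."""

    def __init__(self, path):
        self.path = path
        self.pending = False  # set by the signal handler, read at safe points

    def on_signal(self, signum=None, frame=None):
        # Keep the handler trivial: just record that mail is waiting.
        self.pending = True

    def install(self):
        # SIGUSR1 is only available on Unix-like systems.
        if hasattr(signal, "SIGUSR1"):
            signal.signal(signal.SIGUSR1, self.on_signal)

    def drain(self):
        """Call at a safe point. Returns new instructions, or an empty list."""
        if not self.pending:
            return []
        self.pending = False
        if not os.path.exists(self.path):
            return []
        with open(self.path, "r", encoding="utf-8") as fh:
            lines = [line.strip() for line in fh if line.strip()]
        os.remove(self.path)  # consume the inbox so instructions apply once
        return lines


def run_steps(steps, inbox):
    """Run steps in order, checking the inbox only between steps."""
    log = []
    for step in steps:
        log.append(f"ran {step}")
        for instruction in inbox.drain():  # safe point: no step is mid-flight
            log.append(f"received {instruction}")
    return log
```

An orchestrator would write to the inbox file and then send the signal; the agent never stops mid-step, which is the contrast with hard interruption that the issue draws.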

Gemini CLI repo truth

Repo: https://github.com/google-gemini/gemini-cli

Local clone: repo source/gemini-cli

Code/docs

packages/a2a-server/src/agent/executor.ts:450-465
Evidence: if a task already has a pending execution, the executor processes the user message in a secondary execution loop and yields back to the original execution.
packages/a2a-server/src/agent/executor.ts:521-533
Evidence: tool call requests are batched, scheduled, then the executor explicitly waits for pending tools before continuing.
packages/a2a-server/src/agent/task.ts:189-219
Evidence: pending tool calls are registered, resolved, and awaited with explicit counts.
packages/a2a-server/src/agent/task.ts:755-764
Evidence: tool calls are pre-registered before async scheduling so pending work is visible immediately.
packages/a2a-server/src/agent/task.ts:1141-1158
Evidence: pure confirmation turns can keep the task in a working state while background activity continues.
docs/reference/keyboard-shortcuts.md:106 and 220-221
Evidence: plan mode is skipped when the agent is busy.
docs/reference/commands.md:98-103
Evidence: /compress is a first-class command that replaces chat context with a summary.
docs/reference/configuration.md:410-415
Evidence: model.compressionThreshold controls when compression triggers.
docs/reference/configuration.md:1154-1156
Evidence: hooks.PreCompress runs before chat history compression.
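
The pre-registration pattern noted for task.ts can be modeled in a few lines. The sketch below uses Python's asyncio rather than TypeScript, and `PendingTools`, `run_tool`, and `run_batch` are hypothetical names; it only illustrates the idea of registering calls synchronously so the pending count is accurate before any tool starts, then waiting explicitly before continuing.

```python
import asyncio


class PendingTools:
    """Tracks in-flight tool calls with an explicit count."""

    def __init__(self):
        self._pending = set()
        self._idle = asyncio.Event()
        self._idle.set()  # nothing pending at start

    @property
    def count(self):
        return len(self._pending)

    def register(self, call_id):
        # Called synchronously, before any async scheduling happens.
        self._pending.add(call_id)
        self._idle.clear()

    def resolve(self, call_id):
        self._pending.discard(call_id)
        if not self._pending:
            self._idle.set()

    async def wait_all(self):
        await self._idle.wait()


async def run_tool(tracker, call_id, delay, results):
    try:
        await asyncio.sleep(delay)  # stand-in for real tool work
        results.append(call_id)
    finally:
        tracker.resolve(call_id)  # always resolve, even on failure


async def run_batch(call_ids):
    tracker = PendingTools()
    results = []
    for call_id in call_ids:
        tracker.register(call_id)  # pre-register: visible immediately
    visible_before_start = tracker.count
    tasks = [asyncio.create_task(run_tool(tracker, c, 0.01, results)) for c in call_ids]
    await tracker.wait_all()  # explicit wait before the turn continues
    await asyncio.gather(*tasks)
    return visible_before_start, tracker.count, sorted(results)
```

For example, `asyncio.run(run_batch(["t1", "t2"]))` reports two pending calls before either task has run and zero afterwards. Registering inside the task instead would leave a window where the runtime looks idle while work is already scheduled.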

Fresh GitHub activity

Issue #24071 — Let users queue a message while compression is running
URL: https://github.com/google-gemini/gemini-cli/issues/24071
Evidence: users explicitly want queued input during compression wait time.
Issue #24064
URL: https://github.com/google-gemini/gemini-cli/issues/24064
Evidence: user reports severe slowness and degraded responsiveness.
PR #24080 — feat(cli): add gemini update command
URL: https://github.com/google-gemini/gemini-cli/pull/24080
Evidence: less central to thesis, but part of the broader operator-facing maturation of the CLI.
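
The behavior requested in Issue #24071 amounts to holding input instead of blocking it while compression runs. The following is a hedged Python sketch of that policy only; `Session` and its methods are hypothetical and are not Gemini CLI APIs.

```python
from collections import deque


class Session:
    """Accepts input at any time, but defers it while compression is running."""

    def __init__(self):
        self.compressing = False
        self.queue = deque()
        self.history = []

    def submit(self, message):
        if self.compressing:
            self.queue.append(message)  # hold the message instead of rejecting it
            return "queued"
        self.history.append(message)
        return "sent"

    def start_compression(self):
        self.compressing = True

    def finish_compression(self, summary):
        # Replace the old context with the summary, then flush queued input in order.
        self.history = [summary]
        self.compressing = False
        while self.queue:
            self.history.append(self.queue.popleft())
```

Queued messages land after the summary, so the user's new instruction is interpreted against the compressed context rather than being lost or reordered.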

Editorial synthesis

llm prompt -m gpt-5.4
Prompted with yesterday’s topic, archive titles, and fresh repo/GitHub evidence.
Outcome: strongest non-redundant angle was backpressure as the next CLI UX moat.