Evidence Trail

The Next Agent Battle Isn’t Subagents. It’s Delegation Quality.

March 30, 2026 / Agent Daily / 4 source signals.

repo openai/codex main
4 source signals 3 repos d9d2ce3

Reporter Notes

Agent Daily notes — 2026-03-30 — delegation-ops-discipline

Candidate angles considered

1. **Compaction, round two: compaction as retrieval/compression tradeoff**

  • Strong evidence still exists across repos.
  • Rejected for today because yesterday's published article was already explicitly about compaction policy ("AI Coding Agents Are Turning Compaction Into a Policy Engine").
  • Could return later if there is a distinctly new angle around retrieval compression, checkpoint memory, or vector-backed replay rather than prompt/history compaction itself.

2. **Delegation becomes an ops discipline** ✅ selected

  • Fresh cross-repo evidence in Gemini CLI, Codex, and OpenClaw.
  • Distinct from prior pieces on subagent job titles, containment, addresses, and routing.
  • The new signal is not just that subagents exist, but that runtimes are now evaluating whether they delegate correctly, separating delivery modes, tracking child lifecycles, and exposing operator controls for steering and failure recovery.

3. **Multi-agent messaging protocols harden into product surface**

  • Real signal in Codex and OpenClaw.
  • Folded into the selected angle as supporting evidence rather than standalone theme.

Why delegation won

  • Gemini CLI now has explicit behavioral evaluations for delegation quality, including using the right specialist, avoiding over-delegation, and picking correctly from a pool of 10 agents.
  • Codex multi-agent v2 separates assign_task from send_message, which is a meaningful orchestration distinction: trigger a turn vs queue a message.
  • OpenClaw treats subagent handling as an operational plane with list/kill/steer controls, delivery path accounting (queued / steered / direct), retry/backoff, frozen completion text, descendant-settle wakeups, and run records.
  • This is a stronger fresh pattern than another compaction story, and it avoids near-duplication with both the recent agent-daily lane and the baseline lane.

Overlap check against recent published pieces

agent-daily recent topics

  • 2026-03-29 — compaction policy
  • 2026-03-28 — session continuity / keeping work intact
  • 2026-03-27 — approval settings as operating modes
  • 2026-03-26 — web fetch as security boundary
  • 2026-03-25 — sandboxing as runtime layer
  • 2026-03-24 — subagents getting job titles / manager chain

baseline lane topics to avoid shadowing

  • 2026-03-30 — CLI as agent workbench
  • 2026-03-29 — narrative layer
  • 2026-03-28 — backpressure
  • 2026-03-27 — containment
  • 2026-03-26 — CLI as router
  • 2026-03-23 — execution context becomes first-class
  • 2026-03-22 — parallel agents getting real addresses

Distinction

This article is not about having subagents, naming them, containing them, or routing to them. It is about the next layer: **how the runtime judges delegation quality and operates delegated work after the handoff**.

Code-grounded evidence

google-gemini/gemini-cli

  • Commit signal from gsio:
    • d9d2ce36f2a7 test(evals): add comprehensive subagent delegation evaluations
    • 57a66f5f0db1 — behavioral evaluations for subagent routing
  • File: evals/subagents.eval.ts
    • lines 38-45: test checks whether outer agent reliably uses an expert subagent even when the prompt only indirectly implies the need.
    • lines 67-70: test checks that trivial work is **not** over-delegated.
    • lines 191-225: test checks selecting the correct subagent from a pool of 10 different agents.
  • Signal: delegation is now explicitly benchmarked as behavior.
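
To make the three checks above concrete, here is a minimal sketch of how a delegation decision could be graded. It is written in Python for brevity and is not the Gemini CLI eval harness: `DelegationCase`, `grade`, `summarize`, and the agent names are all hypothetical, chosen only to illustrate "use the specialist", "do not over-delegate", and "pick the right agent from a pool".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DelegationCase:
    """One hypothetical eval case: a prompt plus the routing we expect."""
    prompt: str
    expected_agent: Optional[str]  # None means "handle it inline, do not delegate"


def grade(case: DelegationCase, chosen_agent: Optional[str]) -> str:
    """Classify a single routing decision made by the outer agent."""
    if case.expected_agent is None:
        # Trivial work: any delegation at all counts as over-delegation.
        return "ok" if chosen_agent is None else "over_delegated"
    if chosen_agent is None:
        return "missed_delegation"
    return "ok" if chosen_agent == case.expected_agent else "wrong_agent"


def summarize(cases: list[DelegationCase], decisions: list[Optional[str]]) -> dict[str, int]:
    """Count verdicts across a batch of cases and the agent chosen for each."""
    counts: dict[str, int] = {}
    for case, chosen in zip(cases, decisions):
        verdict = grade(case, chosen)
        counts[verdict] = counts.get(verdict, 0) + 1
    return counts


# Illustrative data only; agent names are made up for this sketch.
cases = [
    DelegationCase("Why does login fail only behind the proxy?", "network-debugger"),
    DelegationCase("Rename variable x to count in utils.py", None),
    DelegationCase("Tighten the SQL migration for the orders table", "db-migrator"),
]
decisions = ["network-debugger", "refactorer", "db-migrator"]
report = summarize(cases, decisions)
```

In this toy batch the rename task is handed to a subagent even though the case expects inline handling, so it is counted as `over_delegated`. A real harness would obtain `decisions` by running the agent rather than from a hard-coded list.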

openai/codex

  • Commit signal from gsio:
    • 773fbf56a43a — communication pattern v2
    • 38c088ba8d03 / 6a0c4709ca21 surfaced in gsio around list_agents and task_name requirements
  • Files:
    • codex-rs/core/src/tools/handlers/multi_agents_v2.rs
      • exports AssignTaskHandler, SendMessageHandler, ListAgentsHandler
    • codex-rs/core/src/tools/handlers/multi_agents_v2/assign_task.rs
      • line 21 routes through MessageDeliveryMode::TriggerTurn
    • codex-rs/core/src/tools/handlers/multi_agents_v2/send_message.rs
      • line 21 routes through MessageDeliveryMode::QueueOnly
    • codex-rs/core/src/tools/spec.rs
      • registers send_message, assign_task, list_agents
  • Signal: Codex is formalizing the distinction between waking a worker and just queueing information, with an explicit live-agent inventory.
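
The difference between waking a worker and only queueing information can be sketched in a few lines. This is a Python illustration, not the Codex Rust code. The two mode names mirror `MessageDeliveryMode::TriggerTurn` and `MessageDeliveryMode::QueueOnly` from the notes above, but the `Worker` class and its behavior, including the assumption that queued messages are consumed on the next triggered turn, are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum


class DeliveryMode(Enum):
    TRIGGER_TURN = "trigger_turn"  # deliver and wake the worker now
    QUEUE_ONLY = "queue_only"      # deliver, but leave the worker idle


@dataclass
class Worker:
    """Hypothetical child agent with an inbox and a turn counter."""
    name: str
    inbox: deque = field(default_factory=deque)
    turns_run: int = 0
    last_batch: list = field(default_factory=list)

    def deliver(self, text: str, mode: DeliveryMode) -> None:
        self.inbox.append(text)
        if mode is DeliveryMode.TRIGGER_TURN:
            self.run_turn()

    def run_turn(self) -> None:
        # Assumption: one turn drains everything queued so far, in order.
        self.last_batch = list(self.inbox)
        self.inbox.clear()
        self.turns_run += 1


def send_message(worker: Worker, text: str) -> None:
    """Queue information without starting a turn."""
    worker.deliver(text, DeliveryMode.QUEUE_ONLY)


def assign_task(worker: Worker, text: str) -> None:
    """Queue the task and start a turn immediately."""
    worker.deliver(text, DeliveryMode.TRIGGER_TURN)


worker = Worker("reviewer")
send_message(worker, "FYI: the API schema changed")
turns_after_message = worker.turns_run   # still idle
assign_task(worker, "Review the handler for schema drift")
turns_after_task = worker.turns_run      # one turn has now run
```

Under these assumptions the earlier note rides along with the task: the worker sees both items in the single turn that `assign_task` triggers, which is the practical reason to keep the two tools separate.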

openclaw/openclaw

  • gsio signal:
    • b75be0914491 — subagent announce delivery extracted into helper module
    • 96c77025263d — new agent routing CLI commands for listing/creating/removing bindings
  • Local code:
    • src/agents/tools/subagents-tool.ts
      • exposes list, kill, steer
    • src/agents/subagent-announce-dispatch.ts
      • tracks delivery path as queued, steered, direct, or none
      • distinguishes queue-primary, direct-primary, and queue-fallback
    • src/agents/subagent-registry.types.ts
      • persists run metadata including controller/requester linkage, runtime, retry counts, completion state, descendant settle wakeups, and frozen completion text
    • src/agents/subagent-announce-queue.ts
      • queue state includes debounce, caps, drop policy, summary lines, and exponential backoff
  • Signal: delegated work is being treated like a managed system with control loops, delivery retries, and operator interventions.
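
Delivery-path accounting with retries can be illustrated with a small sketch. This is Python, not the OpenClaw TypeScript, and it makes assumptions throughout: the path labels come from the notes above, but `dispatch`, `dispatch_with_retry`, `backoff_delays`, the ordering of attempts, and the retry parameters are hypothetical.

```python
from typing import Callable

Attempt = Callable[[], bool]  # returns True if the delivery succeeded


def backoff_delays(base_s: float, factor: float, cap_s: float, retries: int) -> list[float]:
    """Exponential backoff schedule: base, base*factor, ... capped at cap_s."""
    return [min(cap_s, base_s * factor ** i) for i in range(retries)]


def dispatch(attempts: list[tuple[str, Attempt]]) -> str:
    """Try each path in order and report which one delivered, or 'none'."""
    for path, attempt in attempts:
        if attempt():
            return path
    return "none"


def dispatch_with_retry(
    attempts: list[tuple[str, Attempt]],
    delays: list[float],
    sleep: Callable[[float], None] = lambda seconds: None,  # pass time.sleep in real use
) -> tuple[str, int]:
    """Re-run the whole ordering after each delay; return (path, retries_used)."""
    path = dispatch(attempts)
    retries_used = 0
    for delay in delays:
        if path != "none":
            break
        sleep(delay)
        retries_used += 1
        path = dispatch(attempts)
    return path, retries_used


# Illustrative run: the queue is unavailable and direct delivery fails twice.
calls = {"direct": 0}


def flaky_direct() -> bool:
    calls["direct"] += 1
    return calls["direct"] >= 3


delays = backoff_delays(base_s=1.0, factor=2.0, cap_s=10.0, retries=5)
path, retries_used = dispatch_with_retry(
    [("queued", lambda: False), ("direct", flaky_direct)],
    delays,
)
```

Returning the path label alongside the retry count is the point of the sketch: it is the kind of record that lets an operator tell a clean queued delivery from one that only landed through a fallback after several attempts.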

Web / trend/context signals

  • OpenAI Codex docs now describe subagent orchestration as a first-class workflow: spawning, routing follow-up instructions, waiting for results, closing threads, surfacing approvals from inactive threads.
  • Gemini CLI docs frame subagents as a way to avoid cluttering the main context and let the main agent hire specialists.
  • Simon Willison noted mid-March that subagents are now widely supported across agent products, which sharpens the story: the frontier is shifting from *availability* to *quality and governance*.

Draft thesis

Subagents are no longer the novelty. The new race is over whether the runtime can **delegate well** and then **operate the handoff reliably**. Gemini is writing evals for delegation judgment. Codex is splitting message delivery modes and agent inventory into explicit tools. OpenClaw is building the operational spine: queueing, steering, retries, and lifecycle bookkeeping.

Headline candidates

  • AI Coding Agents Are Turning Delegation Into an Ops Discipline
  • AI Coding Agents Aren’t Just Delegating Work. They’re Auditing the Handoff
  • The Next Agent Race Isn’t More Subagents. It’s Better Delegation Operations

llm review plan

Use llm -m gpt-5.4 for headline/structure critique. If it is unavailable, fall back to the best configured GPT-5.x model and note the substitution.

No standalone sources file is available for this article. The article body remains the primary evidence-bearing artifact.