
Gemini Lets the Model Schedule Parallel Tools. Codex Makes the Runtime Decide.

Terminal agents are learning the same trick — run more tools at once — but Gemini CLI and OpenAI Codex disagree on who should be in charge of that decision. One pushes dependency control up into the prompt and tool schema. The other keeps it down in runtime metadata and locks.

TheGitReporter · Daily code story

Published 2026-03-19 · Grounded in Gemini CLI PR #21933, plus current source in scheduler.ts, tools.ts, snippets.ts, and Codex runtime code in parallel.rs, router.rs, and registry.rs.

The short version: Gemini now tells the model that tools run in parallel by default and gives it an explicit wait_for_previous flag to create dependency barriers inside a turn. Codex, by contrast, tags tools with supports_parallel_tool_calls and enforces that policy with a turn-scoped RwLock at execution time.

Same goal. Different control plane. Gemini is betting on planning-time intelligence. Codex is betting on runtime governance.

Gemini’s move: make parallelism something the model can ask for

The sharpest evidence lives in Gemini CLI’s March 12 PR, titled “model-driven parallel tool scheduler.” The diff does not just speed things up. It changes where the decision lives.

In packages/core/src/tools/tools.ts, Gemini injects a new wait_for_previous parameter into tool schemas. The description is unusually direct: set it to true when a tool depends on previous work in the same turn; otherwise omit it or set it to false to run in parallel.
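To make the schema change concrete, here is a minimal Python sketch of what injecting such a flag could look like. It is not Gemini’s TypeScript code: the helper name with_wait_for_previous, the sample read_file schema, and the description wording are assumptions made for illustration, with only the parameter name wait_for_previous taken from the source.

```python
import copy

# Hypothetical description text, paraphrasing the article's summary of the flag.
WAIT_FOR_PREVIOUS_DESCRIPTION = (
    "Set to true if this call depends on earlier tool calls in the same turn. "
    "Omit or set to false to run in parallel."
)


def with_wait_for_previous(schema: dict) -> dict:
    """Return a copy of a JSON-Schema-style tool schema with the flag added."""
    patched = copy.deepcopy(schema)  # never mutate the tool author's schema
    properties = patched.setdefault("properties", {})
    properties["wait_for_previous"] = {
        "type": "boolean",
        "description": WAIT_FOR_PREVIOUS_DESCRIPTION,
    }
    # The flag stays optional: it is deliberately not added to "required".
    return patched


# Hypothetical tool schema used only to exercise the helper.
read_file_schema = {
    "type": "object",
    "properties": {"path": {"type": "string"}},
    "required": ["path"],
}
patched_schema = with_wait_for_previous(read_file_schema)
```

Keeping the flag optional is what makes "parallel by default" possible: a call that says nothing about dependencies carries no barrier.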

Then in packages/core/src/scheduler/scheduler.ts, the scheduler checks exactly that field. If the flag is absent, the call is treated as parallelizable by default. If the first queued tool is parallelizable, Gemini batches contiguous parallelizable calls into the same execution wave.
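The batching rule described above can be sketched in a few lines of Python. This is an illustration of the idea, not the scheduler.ts implementation: ToolCall, is_parallelizable, and plan_waves are hypothetical names, and the sketch assumes that a call marked wait_for_previous runs as a wave of its own, a detail the source does not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    name: str
    args: dict = field(default_factory=dict)


def is_parallelizable(call: ToolCall) -> bool:
    # An absent flag counts as "parallel", matching the default described above.
    return not call.args.get("wait_for_previous", False)


def plan_waves(queue: list[ToolCall]) -> list[list[ToolCall]]:
    """Group queued calls into execution waves, preserving queue order."""
    waves = []
    i = 0
    while i < len(queue):
        wave = [queue[i]]
        i += 1
        if is_parallelizable(wave[0]):
            # Extend the wave with the contiguous run of parallelizable calls.
            while i < len(queue) and is_parallelizable(queue[i]):
                wave.append(queue[i])
                i += 1
        # Assumption: a barrier call (wait_for_previous=True) runs alone.
        waves.append(wave)
    return waves


calls = [
    ToolCall("grep", {"pattern": "TODO"}),
    ToolCall("read_file", {"path": "a.ts"}),
    ToolCall("write_file", {"path": "a.ts", "wait_for_previous": True}),
    ToolCall("read_file", {"path": "b.ts"}),
    ToolCall("read_file", {"path": "c.ts"}),
]
wave_names = [[c.name for c in wave] for wave in plan_waves(calls)]
# wave_names == [["grep", "read_file"], ["write_file"], ["read_file", "read_file"]]
```

The single flagged call splits the turn into three waves: everything before the barrier, the barrier itself, and everything after it.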

This is the important twist: Gemini is no longer just allowing parallelism. It is exposing a scheduling lever to the model itself and teaching the model when to pull it.

The prompt layer confirms that reading. In packages/core/src/prompts/snippets.ts, Gemini tells the model that tools execute in parallel by default, encourages independent searches, reads, shell commands, and edits to different files to happen together, and explicitly says dependent calls must set wait_for_previous=true.

The tests show the intended behavior is not theoretical. In scheduler_parallel.test.ts, Gemini verifies wave patterns such as a write running first, followed by multiple reads in parallel, and even checks that non-read-only tools can still run concurrently if they are not marked as dependent.
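A wave pattern like "write first, then reads together" is easy to observe in a toy executor. The Python sketch below is not taken from scheduler_parallel.test.ts; run_waves and the tool names are hypothetical, and asyncio stands in for whatever concurrency mechanism the real scheduler uses.

```python
import asyncio


async def run_waves(waves, log):
    """Run each wave to completion before starting the next one."""

    async def run_tool(name):
        log.append(f"start {name}")
        await asyncio.sleep(0)  # yield so other calls in the wave can start
        log.append(f"end {name}")

    for wave in waves:
        # Calls inside one wave are started together and awaited as a group.
        await asyncio.gather(*(run_tool(name) for name in wave))


log = []
asyncio.run(run_waves([["write"], ["read_a", "read_b"]], log))
```

In the resulting log the write both starts and ends before any read begins, while the two reads overlap: read_b starts before read_a has finished.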

Codex’s move: make parallelism something the runtime permits

Codex reaches for the same performance win, but from the other direction. Its source code says, in effect: the model may request tools, but the runtime decides which ones may truly overlap.

In codex-rs/core/src/tools/registry.rs, each configured tool spec carries a boolean: supports_parallel_tool_calls. In router.rs, the runtime checks that metadata through tool_supports_parallel(). That means concurrency permission is declared as part of the tool definition, not as a per-call hint from the model.
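As a rough picture of that arrangement, consider the Python sketch below. It is not a port of registry.rs or router.rs: the ToolSpec and ToolRouter classes and the two tool names are hypothetical, and treating unknown tools as exclusive is an assumption of this sketch. Only the names supports_parallel_tool_calls and tool_supports_parallel come from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    name: str
    # Declared once by the tool author, not chosen per call by the model.
    supports_parallel_tool_calls: bool


class ToolRouter:
    def __init__(self, specs):
        self._specs = {spec.name: spec for spec in specs}

    def tool_supports_parallel(self, tool_name: str) -> bool:
        spec = self._specs.get(tool_name)
        # Assumption in this sketch: unknown tools are treated as exclusive.
        return spec is not None and spec.supports_parallel_tool_calls


# Hypothetical tools, used only to exercise the lookup.
router = ToolRouter([
    ToolSpec("search_docs", supports_parallel_tool_calls=True),
    ToolSpec("edit_workspace", supports_parallel_tool_calls=False),
])
```

Nothing in a tool call's arguments can change the answer here, which is the point: the permission belongs to the tool, not to the request.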

The real tell is codex-rs/core/src/tools/parallel.rs. Codex’s ToolCallRuntime owns a turn-scoped parallel_execution: Arc<RwLock<()>>. If a tool supports parallel execution, it takes a read lock. If it does not, it takes a write lock. That is a simple but powerful runtime contract: safe tools may overlap with one another, while an exclusive tool waits for in-flight calls to finish and holds back every other call until it completes.
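The read-lock and write-lock contract can be imitated in Python to see its effect. The sketch below is not Codex’s Rust code: Python’s standard library has no reader-writer lock, so TurnLock is a minimal hand-rolled one with no fairness guarantees, and run_turn and the tool names are hypothetical.

```python
import asyncio
from contextlib import asynccontextmanager


class TurnLock:
    """Minimal async reader-writer lock (illustrative, no fairness guarantees)."""

    def __init__(self):
        self._cond = asyncio.Condition()
        self._readers = 0
        self._writer = False

    @asynccontextmanager
    async def read(self):
        async with self._cond:
            await self._cond.wait_for(lambda: not self._writer)
            self._readers += 1
        try:
            yield
        finally:
            async with self._cond:
                self._readers -= 1
                self._cond.notify_all()

    @asynccontextmanager
    async def write(self):
        async with self._cond:
            await self._cond.wait_for(
                lambda: not self._writer and self._readers == 0
            )
            self._writer = True
        try:
            yield
        finally:
            async with self._cond:
                self._writer = False
                self._cond.notify_all()


async def run_turn(calls):
    """Run (name, supports_parallel) calls under one turn-scoped lock."""
    lock = TurnLock()
    active = 0
    seen = {}  # highest number of concurrently running tools each call observed

    async def run(name, supports_parallel):
        nonlocal active
        guard = lock.read() if supports_parallel else lock.write()
        async with guard:
            active += 1
            at_entry = active
            await asyncio.sleep(0.01)  # stand-in for real tool work
            seen[name] = max(at_entry, active)
            active -= 1

    await asyncio.gather(*(run(n, p) for n, p in calls))
    return seen


seen = asyncio.run(
    run_turn([("read_a", True), ("read_b", True), ("patch", False)])
)
```

Here the two parallel-capable calls each observe the other running, while the exclusive call only ever observes itself, regardless of what the caller requested.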

Codex has been reinforcing that runtime-centered design. A March 12 commit reused the same turn-scoped ToolCallRuntime inside the code mode worker path. In other words, the platform is centralizing concurrency control in shared execution infrastructure rather than pushing it outward into prompt-time decisions.

Two control planes, two bets about the future of agents

That split matters more than it first appears.

  • Gemini’s bet: the model can plan dependency structure well enough that exposing an explicit barrier flag will produce faster, more efficient multi-tool turns.
  • Codex’s bet: concurrency is too important to leave mostly to generation-time judgment, so the runtime should gate overlap according to declared tool capabilities.

Gemini’s approach is more expressive: it gives the model a scheduling vocabulary. Codex’s is more opinionated: it gives the platform a stronger hand on the wheel.

Why this is bigger than a speed optimization

It is tempting to read all of this as latency tuning. That would undersell it. Where concurrency policy lives affects failure modes, extensibility, and trust in the system.

If the model can label dependencies directly, you get a flexible planner that can adapt tool waves to the task at hand. But you also rely more on prompt quality, tool schema quality, and the model’s judgment about side effects.

If the runtime enforces concurrency from tool metadata and locks, you get a sturdier default posture. But you also risk a coarser system — one that may leave performance on the table unless tool authors do a good job declaring what is truly safe to overlap.

That is the real story: not faster agents, but where the responsibility for safe speed sits.

Crisp takeaway: Gemini treats parallel tool use as part of model planning. Codex treats it as a runtime safety property. They are converging on the same user-facing behavior, but they are hardening different layers of the stack.

As terminal agents grow more autonomous, which control plane will age better: a model that can mark dependency barriers on the fly, or a runtime that enforces concurrency rules whether the model asked for them or not?

If you’re building agent tooling, that’s the design fork worth watching — and probably worth arguing about in public.
