Agent Runtimes Are Making Their Limits Explicit

The most important agent feature this week was not a new personality or a larger context window. It was a quieter shift in the plumbing: agent runtimes are getting more explicit about what they are allowed to do, and about why they cannot continue.

That matters because coding agents now sit between human intent and real execution. They run shell commands, call tools, stream work through clients, summarize old context, and retry provider calls. If those boundaries are implicit, the user only sees surprise: a command ran, a session hung, a prompt failed, or a model limit was guessed from stale tables. If those boundaries become runtime facts, the system can ask, deny, retry, summarize, or explain.

This week's evidence comes from two different projects solving different parts of that same problem. Crush tightened the boundary around shell execution. LangChain tightened the boundary around context overflow and model capacity.

Crush makes shell composition visible

In Charmbracelet's Crush, commit 96728b15 changes the bash tool's safety test. A command can still qualify for the safe read-only shortcut when it is on the known-safe list, but the new logic first checks for shell chaining and substitution characters. If the command contains a composition boundary such as a pipe, semicolon, double ampersand, command substitution, or backticks, it no longer gets that shortcut and must go through a permission request.

The file-level evidence is concrete. internal/agent/tools/safe.go adds a containsCommandChaining helper. internal/agent/tools/bash.go only skips a permission request when a command is both safe-listed and not chained. The tests exercise the line users actually care about: git log stays safe, while git log | head requires a prompt.
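The rule described above can be sketched in a few lines. This is a hypothetical Python illustration, not Crush's code, which is written in Go. The allow-list, the token list, and every function name here are assumptions made for the example; only the behavior (git log stays safe, git log | head needs a prompt) comes from the commit's tests as reported.

```python
# Hypothetical sketch of the "safe-listed AND not chained" rule.
# Crush's real implementation is Go and may differ in detail.

# Illustrative allow-list, not Crush's actual list.
SAFE_READ_ONLY_PREFIXES = ("git log", "git status", "ls", "pwd")

# Assumed spellings of the composition boundaries named in the article:
# pipe, semicolon, double ampersand, command substitution, backticks.
CHAINING_TOKENS = ("|", ";", "&&", "$(", "`")


def contains_command_chaining(command: str) -> bool:
    """Return True if the command contains any composition token."""
    return any(token in command for token in CHAINING_TOKENS)


def is_safe_read_only(command: str) -> bool:
    """Grant the no-prompt shortcut only to unchained, safe-listed commands."""
    stripped = command.strip()
    if contains_command_chaining(stripped):
        return False
    return any(
        stripped == prefix or stripped.startswith(prefix + " ")
        for prefix in SAFE_READ_ONLY_PREFIXES
    )


def needs_permission_prompt(command: str) -> bool:
    """Anything that misses the shortcut goes through a permission request."""
    return not is_safe_read_only(command)
```

A plain substring check like this one is deliberately conservative: a pipe character inside a quoted argument would also trigger a prompt. For a permission boundary, erring toward asking is usually the cheaper mistake.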

That is a useful distinction. A bare read command and a composed shell expression can look similar in a chat transcript, but they are not the same runtime act. Once a pipe or substitution enters the command, the agent has moved from a simple read into shell composition. Crush now treats that as something worth asking about.

The permission service gets hardened too

The same theme shows up again in Crush commit 6b312bee, titled as a fix for a potential data race on permissionService. The changed files are small, but pointed: internal/permission/permission.go moves skip-mode state through an atomic boolean and guards auto-approve session state with a mutex; internal/permission/permission_test.go adds a concurrent skip-mode test.

That is not a flashy user-facing feature. It is the reliability layer underneath a trust prompt. Permission systems are only useful if they remain coherent while the agent, UI, and background work are all moving. A prompt that races with skip mode or session approval is not just a bug; it weakens the promise that "allowed" and "denied" mean the same thing across the runtime.
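To make the hardening concrete, here is a minimal Python analog of the pattern the commit describes. It is a sketch under stated assumptions: the class and method names are invented for illustration, and because Python's standard library has no atomic boolean, a single lock guards both the skip flag and the session set, whereas the Go change reportedly uses an atomic boolean for one and a mutex for the other.

```python
import threading


class PermissionService:
    """Hypothetical stand-in; names and structure are illustrative only."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._skip = False            # skip-mode flag
        self._auto_approved = set()   # session ids approved for the session

    def set_skip_requests(self, skip: bool) -> None:
        with self._lock:
            self._skip = skip

    def skip_requests(self) -> bool:
        with self._lock:
            return self._skip

    def auto_approve_session(self, session_id: str) -> None:
        with self._lock:
            self._auto_approved.add(session_id)

    def is_allowed_without_prompt(self, session_id: str) -> bool:
        # Both pieces of state are read under one lock, so the answer
        # reflects a single consistent snapshot.
        with self._lock:
            return self._skip or session_id in self._auto_approved


def exercise_concurrently(service: PermissionService, workers: int = 8) -> None:
    """Toggle and read permission state from several threads at once."""

    def toggle_and_read(index: int) -> None:
        for _ in range(500):
            service.set_skip_requests(index % 2 == 0)
            service.skip_requests()
            service.auto_approve_session(f"session-{index}")

    threads = [
        threading.Thread(target=toggle_and_read, args=(i,)) for i in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

The point of the concurrent exercise is less the final value than the absence of torn or inconsistent reads while the UI, the agent, and background work all touch the same state.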

LangChain makes context failure catchable

LangChain is tightening a different boundary: the point where an agent or application runs out of model context. Commit 40c515c7b updates the Fireworks chat integration so provider errors that say the prompt is too long are promoted to a FireworksContextOverflowError, which is also a ContextOverflowError. The accompanying tests check both sides of the contract: prompt-too-long errors are catchable as context overflow, while unrelated invalid requests are not promoted.
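The contract those tests check can be illustrated without importing LangChain at all. The sketch below defines local stand-in exception classes that borrow the names from the commit; ProviderBadRequestError, promote_context_overflow, and the exact matching rule are assumptions for illustration, not the library's API.

```python
class ContextOverflowError(Exception):
    """Local stand-in for a shared context-overflow base class."""


class ProviderBadRequestError(Exception):
    """Hypothetical stand-in for a provider SDK's invalid-request error."""


class FireworksContextOverflowError(ProviderBadRequestError, ContextOverflowError):
    """Catchable as the provider error or as the shared overflow error."""


# Assumed matching rule: a case-insensitive substring check.
OVERFLOW_MARKERS = ("prompt is too long",)


def promote_context_overflow(error: ProviderBadRequestError) -> Exception:
    """Return an overflow error for prompt-too-long messages, else the original."""
    message = str(error)
    if any(marker in message.lower() for marker in OVERFLOW_MARKERS):
        return FireworksContextOverflowError(message)
    return error
```

Inheriting from both classes is what lets existing handlers for the provider error keep working while new code catches the overflow case specifically.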

A neighboring OpenAI integration commit, 2259d292, broadens the OpenAI context-overflow detector to include additional provider phrasing, including "prompt is too long." Another, 992c613b, changes BaseOpenAI.modelname_to_contextsize so it prefers max_input_tokens from model profiles and deprecates the old helper path in favor of reading profile data directly.
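The "prefer the profile, fall back to the table" idea might look like the following sketch. Everything except the max_input_tokens key is hypothetical: the function name, the fallback table, and the token counts are placeholders, not real model limits or LangChain code.

```python
from typing import Mapping, Optional

# Illustrative numbers only, not real model limits.
LEGACY_CONTEXT_SIZES = {"example-model": 4096}


def context_size(
    model_name: str, profile: Optional[Mapping[str, int]] = None
) -> int:
    """Prefer the profile's max_input_tokens; fall back to the static table."""
    if profile is not None and profile.get("max_input_tokens"):
        return profile["max_input_tokens"]
    if model_name in LEGACY_CONTEXT_SIZES:
        return LEGACY_CONTEXT_SIZES[model_name]
    raise ValueError(f"no known context size for {model_name!r}")
```

Reading the limit from profile data first means a stale hard-coded table is only consulted when nothing better is available.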

Put those together and the practical consequence is clear: an application can respond to context exhaustion as a known condition, not as a miscellaneous provider failure. That is the difference between "the model errored" and "the runtime should summarize, trim, retry, or tell the user what limit was hit."
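One way an application could act on that known condition is a trim-and-retry loop. This is a minimal sketch, not a LangChain recipe: ContextOverflowError here is a local stand-in, and call_with_trimming, fake_send, and the halving strategy are all assumptions chosen to keep the example short.

```python
from typing import Callable, List


class ContextOverflowError(Exception):
    """Local stand-in for a catchable context-overflow error."""


def call_with_trimming(
    send: Callable[[List[str]], str],
    messages: List[str],
    max_attempts: int = 3,
) -> str:
    """Call send, dropping the oldest half of the history on each overflow."""
    history = list(messages)
    for _ in range(max_attempts):
        try:
            return send(history)
        except ContextOverflowError:
            if len(history) <= 1:
                raise  # nothing left to trim; surface the limit to the caller
            history = history[len(history) // 2:]
    raise ContextOverflowError("still over the limit after trimming")


def fake_send(history: List[str]) -> str:
    """Demo provider that accepts at most two messages."""
    if len(history) > 2:
        raise ContextOverflowError("prompt is too long")
    return "ok:%d" % len(history)
```

Because only the overflow error is caught, unrelated provider failures still propagate unchanged, which is the distinction the typed error makes possible.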

The pattern is the contract

Crush and LangChain are not implementing the same subsystem. One is a terminal coding agent with permission prompts around tools. The other is a framework layer that normalizes provider behavior for builders. But their changes rhyme: each turns a hidden boundary into an explicit runtime contract.

For agent builders, that is the real story. The next generation of useful agent software will not be defined only by how much it can attempt. It will be defined by how clearly it marks the edges of action: when a shell command becomes a composed command, when a permission decision is stable, when a model cannot accept more input, and where the runtime should read the limit from.

Watch next for these contracts to move higher in the product: permission histories, context-overflow recovery paths, model-profile inspectors, and UI language that tells users not just what the agent did, but which boundary shaped the decision.
