Tool Calls: Contract, Authorization, Execution

Case file 03 / A pipe changes the decision

Why git log | head is not merely git log

Crush commit 96728b15 adds containsCommandChaining to inspect shell metacharacters before granting the bash tool's safe-read-only shortcut. A plain git log may qualify; git log piped into head does not, because the pipe moves the request into shell composition.
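The distinction can be illustrated with a minimal Python sketch. This is not Crush's implementation (Crush is written in Go, and its actual check is not reproduced here); the allowlist, the function bodies, and the quoting rules below are assumptions chosen only to show why a pipe changes the decision.

```python
# Hypothetical allowlist; the prefixes Crush actually treats as read-only are not shown here.
SAFE_READ_ONLY_PREFIXES = ("git log", "git status", "ls")


def contains_command_chaining(command: str) -> bool:
    """Return True if the command uses shell composition or looks malformed."""
    in_single = False
    in_double = False
    i = 0
    while i < len(command):
        ch = command[i]
        if ch == "\\" and not in_single:
            i += 2  # a backslash escapes the next character
            continue
        if ch == "'" and not in_double:
            in_single = not in_single
        elif ch == '"' and not in_single:
            in_double = not in_double
        elif not in_single:
            # Command substitution stays live inside double quotes.
            if ch == "`" or command.startswith("$(", i):
                return True
            if not in_double and ch in "|;&<>\n":
                return True
        i += 1
    # Unbalanced quotes are treated as unsafe rather than guessed at.
    return in_single or in_double


def qualifies_for_shortcut(command: str) -> bool:
    """Grant the shortcut only to an allowlisted command with no composition."""
    stripped = command.strip()
    allowlisted = any(
        stripped == prefix or stripped.startswith(prefix + " ")
        for prefix in SAFE_READ_ONLY_PREFIXES
    )
    return allowlisted and not contains_command_chaining(stripped)
```

Under these assumptions, `qualifies_for_shortcut("git log")` is true and `qualifies_for_shortcut("git log | head")` is false. A command that fails the shortcut is not rejected; it simply falls through to the normal permission path.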

The bash contract can accept both command strings. Authorization is decided later: the runtime checks the concrete command, session, working path, and permission state before execution. A schema says what can be proposed. It does not say what may happen.

Latest newsroom receipts

What changed since the first Atlas draft

anomalyco/opencode: OpenCode Turns OpenAPI Into Agent Tool Contracts

OpenAPI specs become a model-visible tool surface with credentials and unsupported operations kept outside reach.

anomalyco/opencode: Shoubhit Dash Made OpenCode Read MCP Resources

MCP resources add another concrete surface where tool discovery and reviewability matter.

anomalyco/opencode, openclaw/openclaw, NousResearch/hermes-agent, openai/codex: Agent Tools Are Getting Credential Boundaries

Credential handling keeps reminding readers that tool access is not just schema availability.

Mechanism trace

What actually happens

Propose call: name plus arguments
Validate: schema and registration
Check policy: grants and limits
Operator decision: only when required
Authorize: policy + any consent
Bound executor: selected principal
Environment effect: commit or reject
Record receipt: result + evidence
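The trace above can be sketched as a single function that stops at the first stage that refuses the call. Everything here is illustrative: `run_call`, `Receipt`, and the `"allow"` / `"deny"` / `"ask"` verdicts are hypothetical names, and the sketch folds the authorize and environment-effect stages into the policy, operator, and executor steps for brevity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Receipt:
    tool: str
    outcome: str                      # "completed", "blocked", or "failed"
    stopped_at: Optional[str] = None  # stage that ended the call early
    evidence: dict = field(default_factory=dict)


def run_call(name, args, *, registry, policy, ask_operator, executor):
    """Run one proposed call through validation, policy, consent, and execution."""
    # Validate: the tool must be registered and its required arguments present.
    required = registry.get(name)
    if required is None or not required <= set(args):
        return Receipt(name, "blocked", "validate")

    # Check policy: a hypothetical verdict of "allow", "deny", or "ask".
    verdict = policy(name, args)
    if verdict == "deny":
        return Receipt(name, "blocked", "policy")

    # Operator decision: consulted only when policy asks for it.
    if verdict == "ask" and not ask_operator(name, args):
        return Receipt(name, "blocked", "operator")

    # Bound executor: the side effect happens here, under the executor's identity.
    try:
        result = executor(name, args)
    except Exception as exc:
        return Receipt(name, "failed", "execute", {"error": str(exc)})

    # Record receipt: the result plus whatever evidence the executor returned.
    return Receipt(name, "completed", None, {"result": result})
```

Note that the same schema-valid arguments can end at any of four places. The receipt records where the call stopped, which is what makes the layers distinguishable after the fact.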

One call crosses separate control planes

The tool contract contains the model-visible name, description, input schema, and expected result shape. It constrains how a proposal is expressed, but it does not establish whether credentials exist, whether the current principal may act, or whether the requested arguments are safe.

Registration, authorization, and execution answer different questions. Registration asks whether the tool enters the plan. Authorization asks whether this invocation may proceed. Execution determines which process, server, operating-system user, network identity, or external credential performs the side effect.

Availability is negotiated at runtime

Codex commit e0435afb registers RequestUserInputHandler only when resolved configuration permits it. Tests verify that disabling the setting removes the tool from both visible and registered sets.
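A rough Python analogue of configuration-gated registration follows. Codex is written in Rust and its real types are not reproduced here; `ResolvedConfig`, `allow_request_user_input`, and both handlers are hypothetical. The point of the sketch is that the visible set is derived from the registered set, so the two cannot drift apart.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class ResolvedConfig:
    allow_request_user_input: bool  # hypothetical setting name


def read_file_handler(args: dict) -> dict:
    # Placeholder handler; a real one would read from disk.
    return {"path": args["path"], "content": ""}


def request_user_input_handler(args: dict) -> dict:
    # Placeholder handler; a real one would prompt the operator.
    return {"question": args["question"], "answer": None}


def build_tools(config: ResolvedConfig) -> Tuple[List[str], Dict[str, Callable]]:
    """Return (names visible to the model, handlers registered for dispatch)."""
    handlers: Dict[str, Callable] = {"read_file": read_file_handler}
    if config.allow_request_user_input:
        handlers["request_user_input"] = request_user_input_handler
    # Visible names come from the registered handlers, never from a separate list.
    visible = sorted(handlers)
    return visible, handlers
```

A test in the spirit of the one described above would build the tools with the setting off and assert that the name is absent from both return values.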

Availability still does not mean readiness or authority. Hermes moves MCP discovery into a background helper with a bounded first-snapshot wait, and it keeps read-only scans from triggering a synchronous token refresh. A tool menu is a live runtime contract, not a static feature list.
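A bounded first-snapshot wait can be sketched with a background thread and an event. This is an illustration of the pattern only, not Hermes code; `DiscoveryHelper`, `first_snapshot`, and the injected `discover` callable are hypothetical.

```python
import threading
from typing import Callable, List, Tuple


class DiscoveryHelper:
    """Runs tool discovery in the background and exposes a bounded wait."""

    def __init__(self, discover: Callable[[], List[str]]) -> None:
        self._discover = discover
        self._ready = threading.Event()
        self._lock = threading.Lock()
        self._tools: List[str] = []

    def start(self) -> None:
        # A daemon thread keeps slow discovery from blocking shutdown.
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self) -> None:
        tools = self._discover()
        with self._lock:
            self._tools = list(tools)
        self._ready.set()

    def first_snapshot(self, timeout: float) -> Tuple[List[str], bool]:
        """Wait at most `timeout` seconds; return (tools, whether discovery finished)."""
        ready = self._ready.wait(timeout)
        with self._lock:
            return list(self._tools), ready
```

The caller gets an answer within the timeout either way. An empty list with `ready` false means "not known yet", which a planner should treat differently from "no tools exist".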

Policy must run before side effects

Codex commit 5c20513a makes ordinary local function tools participate in default PreToolUse and PostToolUse hooks. PreToolUse can block a call or rewrite its input before the handler runs. Runtime-control operations explicitly opt out of selected generic behavior.
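The block-or-rewrite behaviour can be sketched in Python as follows. The hook names echo the ones above, but the types and the two example hooks are hypothetical and do not mirror the Codex hook API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class HookDecision:
    block_reason: Optional[str] = None      # set to stop the call
    rewritten_input: Optional[dict] = None  # set to replace the input


def block_outside_workspace(tool_input: dict) -> HookDecision:
    # Naive prefix check for illustration; it does not resolve ".." segments.
    if not tool_input.get("path", "").startswith("/workspace/"):
        return HookDecision(block_reason="path outside workspace")
    return HookDecision()


def add_default_timeout(tool_input: dict) -> HookDecision:
    if "timeout_s" in tool_input:
        return HookDecision()
    return HookDecision(rewritten_input={**tool_input, "timeout_s": 30})


def call_with_hooks(
    tool_input: dict,
    handler: Callable[[dict], object],
    pre_hooks: List[Callable[[dict], HookDecision]],
    post_hooks: List[Callable[[dict, object], None]],
) -> dict:
    """Run pre hooks, then the handler, then post hooks."""
    current = dict(tool_input)
    for hook in pre_hooks:
        decision = hook(current)
        if decision.block_reason is not None:
            # Blocked before the handler runs, so no side effect has occurred.
            return {"status": "blocked", "reason": decision.block_reason}
        if decision.rewritten_input is not None:
            current = decision.rewritten_input
    output = handler(current)
    for hook in post_hooks:
        hook(current, output)
    return {"status": "completed", "output": output}
```

The handler only ever sees the input that survived every pre hook, which is the property that lets policy run before side effects rather than after them.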

Policy state must remain coherent under concurrency. Crush commit 6b312bee moves skip state to an atomic boolean and protects session auto-approval with a mutex. A correct schema cannot compensate for a racing permission service.
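As a loose Python analogue: Python has no atomic boolean type, so a single lock stands in for both the atomic flag and the mutex described above. `PermissionState` and its method names are hypothetical and are not taken from Crush.

```python
import threading
from typing import Set


class PermissionState:
    """Permission flags that stay coherent when many tool calls run at once."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._skip_requests = False
        self._auto_approved: Set[str] = set()

    def set_skip_requests(self, value: bool) -> None:
        with self._lock:
            self._skip_requests = value

    def auto_approve_session(self, session_id: str) -> None:
        with self._lock:
            self._auto_approved.add(session_id)

    def needs_prompt(self, session_id: str) -> bool:
        # Both fields are read under one lock, so the answer reflects one moment in time.
        with self._lock:
            return not (self._skip_requests or session_id in self._auto_approved)
```

Reading both fields inside the same critical section is the design choice that matters: a caller never observes a half-applied change to the permission state.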

Receipts observe outcomes without owning tools

Codex commit c69cde35 adds ToolLifecycleContributor. It observes accepted starts and typed finishes with turn ID, call ID, tool name, invocation source, and outcomes including completed, blocked, failed, and aborted.

Those receipts help attribute activity, but they do not prove that authorization was appropriate, returned data was correct, or an external effect can be reversed. Observation, policy, and execution are separate responsibilities.
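A minimal observer along these lines might look like the sketch below. The field names follow the list above, but `LifecycleObserver`, `ToolStart`, and `ToolFinish` are hypothetical Python types, not the Rust `ToolLifecycleContributor` interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Outcome(Enum):
    COMPLETED = "completed"
    BLOCKED = "blocked"
    FAILED = "failed"
    ABORTED = "aborted"


@dataclass(frozen=True)
class ToolStart:
    turn_id: str
    call_id: str
    tool_name: str
    source: str  # invocation source, e.g. "model" or "user"


@dataclass(frozen=True)
class ToolFinish:
    call_id: str
    outcome: Outcome


class LifecycleObserver:
    """Records starts and finishes; it has no way to run or block a tool."""

    def __init__(self) -> None:
        self.starts: Dict[str, ToolStart] = {}
        self.finishes: Dict[str, ToolFinish] = {}

    def on_start(self, event: ToolStart) -> None:
        self.starts[event.call_id] = event

    def on_finish(self, event: ToolFinish) -> None:
        self.finishes[event.call_id] = event

    def unfinished(self) -> List[str]:
        """Call IDs that started but never reported a finish."""
        return sorted(cid for cid in self.starts if cid not in self.finishes)

    def count(self, outcome: Outcome) -> int:
        return sum(1 for f in self.finishes.values() if f.outcome is outcome)
```

The observer deliberately exposes no execute or authorize method. It can say that call `c2` never finished; it cannot say whether `c1` should have been allowed.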

Failure modes

Where the contract breaks

Contract-authorization collapse

A registered schema is treated as permission to perform every valid invocation.

Failure signal: The tool executes without a distinct policy or authorization decision.

Stale availability

Discovery advertises a tool whose credentials, server, or provider are not ready.

Failure signal: The planned tool blocks startup or fails immediately.

Receipt without effect integrity

A normal handler return is recorded as success despite partial or incorrect effects.

Failure signal: Completed lifecycle events disagree with external state.
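One way to surface this last signal is to read the external state back and compare it with what the completed receipts claim. The sketch below assumes receipts for file writes keyed by path; `find_effect_mismatches` and the injected `read_back` callable are hypothetical.

```python
from typing import Callable, Dict, List, Optional


def find_effect_mismatches(
    completed_writes: Dict[str, str],
    read_back: Callable[[str], Optional[str]],
) -> List[str]:
    """Return paths whose observed content differs from what the receipt claims."""
    mismatches = []
    for path, expected in completed_writes.items():
        # read_back returns None when the path does not exist at all.
        if read_back(path) != expected:
            mismatches.append(path)
    return sorted(mismatches)
```

A non-empty result means at least one call was recorded as completed while its effect was partial, wrong, or missing.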

Release tests

What the product must prove

Identify the specification source, registration gate, policy hook, approval path, executor principal, result sanitizer, and audit sink for one tool.
Run identical arguments under different permission and sandbox states; record which layer changes the decision.
Compare lifecycle outcomes with the external side effect and available compensation mechanism.
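The second test above can be sketched as a small decision matrix. The layered `decide` function is a stand-in for a real product's runtime, and the permission and sandbox labels are hypothetical; only the shape of the test is the point.

```python
from itertools import product
from typing import List, Set, Tuple

PERMISSIONS = ("allow", "ask", "deny")
SANDBOXES = ("read-only", "workspace-write")


def decide(command: str, permission: str, sandbox: str) -> Tuple[str, str]:
    """Hypothetical layered decision; returns (decision, deciding layer)."""
    if sandbox == "read-only" and command.startswith("rm "):
        return "deny", "sandbox"
    if permission == "deny":
        return "deny", "policy"
    if permission == "ask":
        return "ask", "approval"
    return "allow", "policy"


def decision_matrix(command: str) -> List[Tuple[str, str, str, str]]:
    """Run identical arguments under every permission and sandbox combination."""
    rows = []
    for permission, sandbox in product(PERMISSIONS, SANDBOXES):
        decision, layer = decide(command, permission, sandbox)
        rows.append((permission, sandbox, decision, layer))
    return rows


def deciding_layers(rows: List[Tuple[str, str, str, str]]) -> Set[str]:
    """Collect the layers that determined at least one outcome."""
    return {layer for _, _, _, layer in rows}
```

If `deciding_layers` returns a single layer for a command with side effects, that is worth investigating: either the other layers are unreachable for that tool or they are not being consulted.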