Agent Tool Menus Are Becoming Runtime Infrastructure

The next trust problem in agent software may look like a menu.

A user sees a plugin suggestion, a tool toggle, a model asking for human input, a browser tool marked available, or an MCP server discovered at startup. Each one looks like interface chrome. But for builders, operators, and teams evaluating agent tools, the practical question is sharper: does the visible tool list match a real, relevant, non-blocking capability in the runtime?

Yesterday's edition followed control settings becoming session state. Today's evidence moves one layer outward. It is not only the settings that need to survive the work. The tool menu itself has to become truthful.

Codex narrows what the model can suggest

OpenAI Codex commit 8e5f5616 changes the plugin install suggestion path. The commit explains that list_available_plugins_to_install controls which plugins the model can trigger through request_plugin_install.

The important movement is from a broad allowlist toward a relevance filter. Codex keeps a starter fallback set for users with no installed plugins, allows candidates from trusted marketplaces, and then requires most marketplace candidates to share app connector IDs with plugins already installed by the user. It also keeps explicit configured discoverables as an override while omitting installed, disabled, and unavailable plugins.

That is a tool menu acting like infrastructure. A model-facing install suggestion is not just a helpful recommendation. It is a path that can ask a user to install something. Codex is adding code so that path is bounded by marketplace trust, installed app context, and availability state.
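The filter described above can be sketched in a few lines. This is an illustration, not Codex's code: the `Plugin` record, its field names, and the `suggestible_plugins` function are hypothetical, and the order of the checks is an assumption. In particular, the sketch requires connector overlap for every marketplace candidate, while the commit says only that most candidates face that requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plugin:
    # Hypothetical record; the field names are illustrative, not Codex's types.
    name: str
    marketplace: str
    connector_ids: frozenset = frozenset()
    installed: bool = False
    disabled: bool = False
    available: bool = True


def suggestible_plugins(catalog, trusted_marketplaces, starter_set, configured_discoverables):
    """Return names of plugins the model may suggest installing."""
    installed = [p for p in catalog if p.installed]
    installed_connectors = set()
    for p in installed:
        installed_connectors |= p.connector_ids

    result = []
    for p in catalog:
        # Never suggest something already installed, disabled, or unavailable.
        if p.installed or p.disabled or not p.available:
            continue
        # Explicitly configured discoverables act as an override.
        if p.name in configured_discoverables:
            result.append(p.name)
            continue
        # Everything else must come from a trusted marketplace (assumed order).
        if p.marketplace not in trusted_marketplaces:
            continue
        # With nothing installed, fall back to the starter set only.
        if not installed:
            if p.name in starter_set:
                result.append(p.name)
            continue
        # Otherwise require a shared app connector with an installed plugin.
        if p.connector_ids & installed_connectors:
            result.append(p.name)
    return result
```

The design point is that the candidate list is computed from the user's current state on every call, so the model never sees a static catalog.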

Human input becomes a registered capability

A second Codex commit, e0435afb, adds a config switch for the experimental request_user_input tool. The patch adds schema/config support for tools.experimental_request_user_input.enabled and changes tool planning so RequestUserInputHandler is registered only when that config allows it.

That is a small change with a useful signal. Asking the user a question is not only a conversational behavior. In an agent runtime, it is a tool with modes, clients, expectations, and product consequences. If a project can turn that capability on or off before the tool is registered, the menu is no longer a static list of everything the codebase knows how to do. It is a negotiated surface for this run.
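A minimal sketch of config-gated registration follows. Only the config key tools.experimental_request_user_input.enabled comes from the commit; the `plan_tools` function, the always-on tool names, and the default of "off" when the key is absent are assumptions made for illustration.

```python
def plan_tools(config):
    """Return the tool names to register for this run."""
    # Hypothetical always-on tools, included only to give the list some shape.
    tools = ["shell", "apply_patch"]

    # Assumption: a missing key means the experimental tool stays off.
    enabled = (
        config.get("tools", {})
        .get("experimental_request_user_input", {})
        .get("enabled", False)
    )
    if enabled:
        # The handler is registered only when the project opts in, so the
        # model never sees a tool this run cannot serve.
        tools.append("request_user_input")
    return tools
```

The check happens at planning time, before the model is shown anything, which is what separates a registered capability from a tool that is advertised and then refused.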

Hermes moves discovery off the startup choke point

NousResearch Hermes Agent shows the same pressure from the other side: startup. The title of commit 0c6e133c describes stopping eager MCP discovery from blocking agent-capable startup.

The patch adds a shared MCP startup helper, a cheap probe for configured MCP servers, one background discovery thread per process, and a bounded wait before the first tool snapshot. It also teaches the CLI startup path to skip inline MCP discovery for TUI chat launches and entrypoints that already have dedicated MCP startup paths.

The consequence is plain: a tool discovery system can become part of perceived reliability. If discovering tools freezes the beginning of an agent session, the menu is no longer passive metadata. It is on the critical path. Hermes is moving that work into the background and bounding how long startup waits for it, so the agent can become usable without pretending the tool snapshot is irrelevant.
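The shape of that change (a cheap probe, one background thread, a bounded wait before the first snapshot) can be sketched with the standard library. The `BackgroundDiscovery` class and its method names are hypothetical and do not reflect Hermes' actual helper; the `discover` callable stands in for whatever really talks to MCP servers.

```python
import threading


class BackgroundDiscovery:
    """Run tool discovery once, off the startup path (illustrative only)."""

    def __init__(self, discover, servers):
        self._discover = discover      # callable(servers) -> iterable of tool names
        self._servers = list(servers)
        self._tools = []
        self._done = threading.Event()
        self._lock = threading.Lock()
        self._thread = None

    def start(self):
        with self._lock:
            if self._thread is not None:
                return                 # at most one discovery thread
            if not self._servers:
                self._done.set()       # cheap probe: nothing configured
                return
            self._thread = threading.Thread(target=self._run, daemon=True)
            self._thread.start()

    def _run(self):
        try:
            found = list(self._discover(self._servers))
            with self._lock:
                self._tools = found
        finally:
            self._done.set()           # mark finished even if discovery fails

    def snapshot(self, wait_seconds):
        """Wait at most wait_seconds, then return (tools, discovery_finished)."""
        self._done.wait(timeout=wait_seconds)
        with self._lock:
            return list(self._tools), self._done.is_set()
```

Returning the `discovery_finished` flag alongside the tools lets a caller tell "no tools" apart from "still looking", which is the distinction a supervising user would need to see.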

Availability checks stop refreshing tokens by surprise

Hermes commit 6a72af04 tightens a different edge. The title says tool availability scans should stay off the Nous token-refresh path.

The patch adds a cheap peek_nous_access_token helper that reads an explicit env token or cached token without triggering refresh. It then changes managed-gateway readiness checks and provider availability paths so read-only scans can use the cheap probe, while real request/session paths still use refresh-aware token resolution.

That distinction matters for operators. A status paint, tool list, or availability scan should not quietly become a synchronous OAuth refresh just because the UI wants to know whether a tool can appear. Hermes is separating "is this likely present enough to show?" from "refresh credentials before making a real request."
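The split between a read-only peek and a refresh-aware resolve can be sketched as follows. The `TokenStore` class, the `EXAMPLE_ACCESS_TOKEN` variable name, and the simplified rule of refreshing only when no token is present are all assumptions for illustration; the real helper is peek_nous_access_token, and its internals are not shown in the commit summary.

```python
import os


class TokenStore:
    """Illustrative token holder; not Hermes' actual implementation."""

    def __init__(self, cached_token=None, refresher=None):
        self.cached_token = cached_token
        self.refresher = refresher     # callable that performs the slow refresh
        self.refresh_calls = 0

    def peek(self, env=None):
        # Read-only: explicit env token first, then the cache. Never refreshes.
        env = os.environ if env is None else env
        return env.get("EXAMPLE_ACCESS_TOKEN") or self.cached_token

    def resolve(self, env=None):
        # Request path: may do network work to obtain a usable token.
        token = self.peek(env)
        if token is None and self.refresher is not None:
            self.refresh_calls += 1
            self.cached_token = token = self.refresher()
        return token


def tool_visible(store, env=None):
    # Availability scans use the cheap probe only.
    return store.peek(env) is not None
```

A status paint calls `tool_visible`; only code about to make a real request calls `resolve`. The scan can therefore be wrong about a stale token, which is the trade accepted for keeping it cheap.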

Setup has to write the config the menu promises

A final Hermes commit, aa32edca, shows the setup version of the same problem. The commit says apply_nous_managed_defaults() was marking image and video generation tools as changed without writing the config values those tools needed.

That meant a tool could end up in a platform toolset list without the corresponding provider or gateway settings. The fix writes provider/use_gateway config values before marking the tools changed, matching the pattern used by other managed tools.

This is the quietest evidence in the set, and maybe the most practical. A tool marked enabled but missing runtime config is worse than a missing tool. It teaches the user, and the agent, to trust a menu that cannot keep its promise.
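The ordering the fix restores, writing the config first and only then marking the tool changed, can be sketched like this. The function names, the dictionary layout, and the choice to overwrite existing values are assumptions; only the provider and use_gateway keys come from the commit description.

```python
def apply_managed_defaults(config, tool_names, managed_defaults):
    """Write each tool's managed settings, then report which tools changed."""
    changed = set()
    for name in tool_names:
        defaults = managed_defaults.get(name)
        if defaults is None:
            continue                   # not a managed tool; leave it alone
        # Write the runtime config first (overwriting is an assumption here).
        config.setdefault(name, {}).update(defaults)
        # Only now is it safe to advertise the tool as changed/enabled.
        changed.add(name)
    return changed


def missing_runtime_config(config, enabled_tools, required_keys=("provider", "use_gateway")):
    """Return enabled tools whose config lacks a required key."""
    return sorted(
        tool for tool in enabled_tools
        if any(key not in config.get(tool, {}) for key in required_keys)
    )
```

A check like `missing_runtime_config` is the kind of invariant that would have caught the original bug: every tool in the enabled set should come back with nothing missing.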

The menu is becoming a contract

These commits do not prove a shared standard. They do not prove Codex and Hermes are coordinating. They do not prove tool discovery is solved.

They do show the same engineering pressure from multiple sides. Suggestions need relevance filters. Human-input tools need explicit registration. MCP discovery needs startup discipline. Availability checks need to avoid surprise auth work. Setup defaults need to write the config implied by the selected toolset.

That is why the humble tool menu is becoming runtime infrastructure. In an agent system, the list of tools is not just what the product can advertise. It is what the model may ask for, what the user may approve, what startup may wait on, what auth may touch, and what setup has actually made true.

The next thing to watch is whether agent tools make this contract visible to the people supervising them: why a tool is shown, why it is hidden, what it is waiting on, and whether "available" means ready to use or merely present in a catalog.
