
Agent Tool Menus Are Becoming Runtime Infrastructure

Fresh Codex and Hermes commits show a quieter agent shift: the visible list of tools is becoming a runtime contract about what can be suggested, enabled, discovered, and checked without breaking the session.

[Figure: rough zine-style diagram showing Codex and Hermes evidence cards feeding into a circled availability-contract conclusion, with a caveat stamp that the commits are independent and not a shared standard. Caption: the tool menu is becoming part of the runtime contract. Evidence from openai/codex and NousResearch/hermes-agent: 5 source signals, 2 repos, 5 linked commits.]
Evidence: 5 linked commits / May 31, 2026 / Daily Edition

The next trust problem in agent software may look like a menu.

A user sees a plugin suggestion, a tool toggle, a model asking for human input, a browser tool marked available, or an MCP server discovered at startup. Each one looks like interface chrome. But for builders, operators, and teams evaluating agent tools, the practical question is sharper: does the visible tool list match a real, relevant, non-blocking capability in the runtime?

Yesterday's edition followed control settings becoming session state. Today's evidence moves one layer outward. It is not only the settings that need to survive the work. The tool menu itself has to become truthful.

Codex narrows what the model can suggest

OpenAI Codex commit 8e5f5616 changes the plugin install suggestion path. The commit explains that list_available_plugins_to_install controls which plugins the model can trigger through request_plugin_install.

The important movement is from a broad allowlist toward a relevance filter. Codex keeps a starter fallback set for users with no installed plugins, allows candidates from trusted marketplaces, and then requires most marketplace candidates to share app connector IDs with plugins already installed by the user. It also keeps explicit configured discoverables as an override while omitting installed, disabled, and unavailable plugins.

That is a tool menu acting like infrastructure. A model-facing install suggestion is not just a helpful recommendation. It is a path that can ask a user to install something. Codex is adding code so that path is bounded by marketplace trust, installed app context, and availability state.
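
The filtering rules described above can be sketched in a few lines. This is an illustration under stated assumptions, not the Codex implementation: the Plugin record, the marketplace and starter names, and the suggestable_plugins function are all hypothetical, and the sketch applies the connector-overlap rule to every trusted-marketplace candidate, whereas the commit is described as applying it to most of them. It also assumes the installed, disabled, and unavailable exclusions apply even to explicitly configured entries.

```python
from dataclasses import dataclass

# Hypothetical names and values throughout; not taken from the Codex source.
STARTER_FALLBACK = {"starter-a", "starter-b"}
TRUSTED_MARKETPLACES = {"trusted-market"}


@dataclass(frozen=True)
class Plugin:
    plugin_id: str
    marketplace: str
    connector_ids: frozenset = frozenset()
    installed: bool = False
    disabled: bool = False
    available: bool = True


def suggestable_plugins(catalog, installed, configured_discoverables=frozenset()):
    """Return the plugin IDs the model may suggest installing."""
    installed_connectors = set()
    for plugin in installed:
        installed_connectors |= plugin.connector_ids

    result = []
    for plugin in catalog:
        # Installed, disabled, and unavailable plugins are never suggested.
        if plugin.installed or plugin.disabled or not plugin.available:
            continue
        if plugin.plugin_id in configured_discoverables:
            # Explicit configuration overrides the relevance filter.
            result.append(plugin.plugin_id)
        elif not installed and plugin.plugin_id in STARTER_FALLBACK:
            # Starter set applies only when the user has no plugins yet.
            result.append(plugin.plugin_id)
        elif (plugin.marketplace in TRUSTED_MARKETPLACES
              and plugin.connector_ids & installed_connectors):
            # Trusted marketplace and shares a connector with an installed plugin.
            result.append(plugin.plugin_id)
    return result
```

The design point is that every branch is a reason a suggestion is allowed. Anything that matches none of them never reaches the model.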

Human input becomes a registered capability

A second Codex commit, e0435afb, adds a config switch for the experimental request_user_input tool. The patch adds schema/config support for tools.experimental_request_user_input.enabled and changes tool planning so RequestUserInputHandler is registered only when that config allows it.

That is a small change with a useful signal. Asking the user a question is not only a conversational behavior. In an agent runtime, it is a tool with modes, clients, expectations, and product consequences. If a project can turn that capability on or off before the tool is registered, the menu is no longer a static list of everything the codebase knows how to do. It is a negotiated surface for this run.
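
A minimal sketch of config-gated registration, assuming a nested dictionary stands in for the parsed config. The key path tools.experimental_request_user_input.enabled and the handler name RequestUserInputHandler come from the commit description. The plan_tools function, the always-on shell entry, and the default of False are assumptions made for illustration.

```python
def plan_tools(config):
    """Build the tool registry for one run from a nested config dict."""
    # Placeholder for tools that are always registered (hypothetical).
    registry = {"shell": "ShellHandler"}

    enabled = (
        config.get("tools", {})
        .get("experimental_request_user_input", {})
        .get("enabled", False)  # default value is an assumption
    )
    if enabled:
        # The handler is registered only when config allows it, so a
        # disabled tool never appears in the model-facing menu at all.
        registry["request_user_input"] = "RequestUserInputHandler"
    return registry
```

With an empty config, plan_tools({}) returns a registry without request_user_input. The tool is absent from the menu instead of being present and refusing at call time.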

Hermes moves discovery off the startup choke point

NousResearch Hermes Agent shows the same pressure from the other side: startup. The title of commit 0c6e133c is about stopping eager MCP discovery from blocking agent-capable startup.

The patch adds a shared MCP startup helper, a cheap probe for configured MCP servers, one background discovery thread per process, and a bounded wait before the first tool snapshot. It also teaches the CLI startup path to skip inline MCP discovery for TUI chat launches and entrypoints that already have dedicated MCP startup paths.

The consequence is plain: a tool discovery system can become part of perceived reliability. If discovering tools freezes the beginning of an agent session, the menu is no longer passive metadata. It is on the critical path. Hermes is moving that work onto a background thread with a bounded wait, so the agent can become usable without pretending the tool snapshot is irrelevant.
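
The shape of that change can be sketched with the standard threading module. This is not the Hermes code: the McpDiscovery class, its method names, and the 0.5-second default are hypothetical, and "one thread per helper instance" stands in for the per-process guarantee the patch describes.

```python
import threading
import time


class McpDiscovery:
    """Background tool discovery with a bounded wait before the first snapshot."""

    def __init__(self, discover, configured_servers):
        self._discover = discover            # callable: servers -> list of tools
        self._servers = list(configured_servers)
        self._tools = []
        self._done = threading.Event()
        self._lock = threading.Lock()
        self._thread = None

    def has_servers(self):
        # Cheap probe: inspects config only, never contacts a server.
        return bool(self._servers)

    def start(self):
        with self._lock:
            # Start at most one discovery thread, and only if there is work.
            if self._thread is not None or not self.has_servers():
                return
            self._thread = threading.Thread(target=self._run, daemon=True)
            self._thread.start()

    def _run(self):
        try:
            found = self._discover(self._servers)
            with self._lock:
                self._tools = list(found)
        finally:
            self._done.set()

    def snapshot(self, max_wait_s=0.5):
        """Return the tools known after waiting at most max_wait_s seconds."""
        if self._thread is not None:
            self._done.wait(timeout=max_wait_s)
        with self._lock:
            return list(self._tools)
```

A caller runs start() early and snapshot() when it needs the first tool list. If discovery is slow, the snapshot comes back short instead of the session hanging, and a later snapshot picks up the rest.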

Availability checks stop refreshing tokens by surprise

Hermes commit 6a72af04 tightens a different edge. The title says tool availability scans should stay off the Nous token-refresh path.

The patch adds a cheap peek_nous_access_token helper that reads an explicit env token or cached token without triggering refresh. It then changes managed-gateway readiness checks and provider availability paths so read-only scans can use the cheap probe, while real request/session paths still use refresh-aware token resolution.

That distinction matters for operators. A status paint, tool list, or availability scan should not quietly become a synchronous OAuth refresh just because the UI wants to know whether a tool can appear. Hermes is separating "is this likely present enough to show?" from "refresh credentials before making a real request."
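
The split between a read-only probe and a refresh-aware resolver might look like the sketch below. The helper named in the commit is peek_nous_access_token; the function names, the EXAMPLE_ACCESS_TOKEN variable, and the dictionary cache here are hypothetical stand-ins. A real resolver would also check token expiry, which this sketch omits.

```python
import os


def peek_access_token(cache, env_var="EXAMPLE_ACCESS_TOKEN"):
    """Read-only probe: explicit env token or cached token, never a refresh."""
    return os.environ.get(env_var) or cache.get("access_token")


def resolve_access_token(cache, refresh, env_var="EXAMPLE_ACCESS_TOKEN"):
    """Request path: falls back to a refresh when no token is present."""
    token = peek_access_token(cache, env_var)
    if token is None:
        token = refresh()                 # the only place refresh can happen
        cache["access_token"] = token
    return token


def tool_is_listable(cache, env_var="EXAMPLE_ACCESS_TOKEN"):
    """Availability scan: decides whether to show the tool, with no auth I/O."""
    return peek_access_token(cache, env_var) is not None
```

Because tool_is_listable takes no refresh callable, a status paint cannot trigger a refresh even by accident. Only resolve_access_token can.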

Setup has to write the config the menu promises

A final Hermes commit, aa32edca, shows the setup version of the same problem. The commit says apply_nous_managed_defaults() was marking image and video generation tools as changed without writing the config values those tools needed.

That meant a tool could end up in a platform toolset list without the corresponding provider or gateway settings. The fix writes provider/use_gateway config values before marking the tools changed, matching the pattern used by other managed tools.

This is the quietest evidence in the set, and maybe the most practical. A tool marked enabled but missing runtime config is worse than a missing tool. It teaches the user, and the agent, to trust a menu that cannot keep its promise.
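
The ordering the fix describes, write the config and then mark the tool changed, can be sketched as follows. The provider and use_gateway keys come from the commit description. The function names, the tool names, the "managed" value, and the dictionary layout are assumptions for illustration and do not reproduce apply_nous_managed_defaults().

```python
def apply_managed_defaults(config, tools=("image_generation", "video_generation")):
    """Write each tool's runtime settings, then report the tool as changed."""
    changed = []
    for tool in tools:
        section = config.setdefault(tool, {})
        # Write the settings the tool needs at runtime first...
        section["provider"] = "managed"   # hypothetical value
        section["use_gateway"] = True
        # ...and only then mark the tool as changed.
        changed.append(tool)
    return changed


def missing_config(config, enabled_tools):
    """List enabled tools that lack the settings they need to run."""
    required = ("provider", "use_gateway")
    return [
        tool for tool in enabled_tools
        if any(key not in config.get(tool, {}) for key in required)
    ]
```

A check like missing_config is one way a setup flow could assert that the menu and the config agree before the tool list is shown to anyone.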

The menu is becoming a contract

These commits do not prove a shared standard. They do not prove Codex and Hermes are coordinating. They do not prove tool discovery is solved.

They do show the same engineering pressure from multiple sides. Suggestions need relevance filters. Human-input tools need explicit registration. MCP discovery needs startup discipline. Availability checks need to avoid surprise auth work. Setup defaults need to write the config implied by the selected toolset.

That is why the humble tool menu is becoming runtime infrastructure. In an agent system, the list of tools is not just what the product can advertise. It is what the model may ask for, what the user may approve, what startup may wait on, what auth may touch, and what setup has actually made true.

The next thing to watch is whether agent tools make this contract visible to the people supervising them: why a tool is shown, why it is hidden, what it is waiting on, and whether "available" means ready to use or merely present in a catalog.

Evidence Trail

Receipts below the story

The article above is the public narrative. This section keeps the source trail, limits, and reporting notes on the same page.

Edition

  • Date: May 31, 2026
  • Lane: Daily Edition
  • Confidence: 78%
  • Sources: 5
  • Repos: openai/codex, NousResearch/hermes-agent

Reporter Notes

The strongest story is not "more tools." It is that the tool surface is being made less naive:

  • Codex is narrowing plugin install suggestions so the model can only ask the user about candidates that are either fallback starters, explicitly configured, or relevant to installed app connector IDs.
  • Codex is making request_user_input registration depend on config, which treats a human-input tool as an intentional runtime capability.
  • Hermes is moving MCP discovery away from blocking startup paths and adding a bounded wait before the first tool snapshot.
  • Hermes is splitting cheap availability probes from refresh-aware request/session paths for managed gateway tools.
  • Hermes is fixing setup defaults so image/video tools are not marked enabled without the config needed for runtime availability.

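Primary Evidence

  • openai/codex commit 8e5f5616: plugin install suggestion filtering.
  • openai/codex commit e0435afb: config switch for the experimental request_user_input tool.
  • NousResearch/hermes-agent commit 0c6e133c: eager MCP discovery no longer blocks agent-capable startup.
  • NousResearch/hermes-agent commit 6a72af04: tool availability scans stay off the Nous token-refresh path.
  • NousResearch/hermes-agent commit aa32edca: managed defaults write provider/use_gateway config for image and video tools.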

Evidence Limits

  • These sources do not prove Codex and Hermes are coordinating on one shared tool-availability standard.
  • They do not prove every listed tool is secure, complete, or available in every deployment.
  • The Codex evidence is strongest for plugin install suggestion filtering and configurable registration of request_user_input.
  • The Hermes evidence is strongest for MCP startup behavior, managed-gateway availability checks, and setup defaults for image/video tools.
  • The story supports a pattern of tool availability becoming runtime infrastructure, not a claim that any project has solved tool discovery permanently.