Agent Atlas

How AI Agents Work — Chapter 1: Birth of an Agent

A lot of people think an AI agent begins when you type a prompt. The source code tells a different story: before an agent writes a line, opens a file, or delegates a task, the software has already decided what kind of being it is allowed to become.

Ask most people when an AI agent begins and they’ll point to the obvious moment: a human types a request, presses enter, and waits for the machine to respond.

That answer feels natural because it matches what we can see. There is a prompt box. There is a blinking cursor. There is a sentence that looks like the start of a conversation. From the outside, birth appears to happen at the surface.

But once you start reading the code of real agent systems, the illusion breaks almost immediately.

The prompt is not the birth of the agent. It is just the trigger.

Before an agent can answer, the system has to decide what this thing is. Is it a session inside a channel? A specialized role in a team? A declared agent definition with an execution context? A product profile with its own permissions and startup ceremony? By the time the model sees its first token, those decisions have already shaped the life it is about to live.

Agents are born through architecture: through the structures that decide what kind of actor a prompt is allowed to become.

The hidden moment before the first answer

That hidden moment is what this chapter is about.

In the systems we’ve studied, the beginning of an agent is never just “model + text.” It is always some combination of identity, scope, context, permissions, and route back to the human. Different repos choose different first commitments, and those choices ripple through everything that follows.

That is why two agent systems can both feel competent and still feel profoundly different in use. They were born differently.

OpenClaw: Birth begins with routing: a message becomes a session key, a queue lane, a run, and a workspace.
Codex: Birth begins with role and orchestration: the agent enters the world already shaped by a collaborative graph.
Gemini CLI: Birth begins with definitions and execution context: a declared agent becomes live inside a prepared runtime.
Mistral Vibe: Birth begins with bootstrap and profile: onboarding, config, startup mode, and explicit agent profiles decide the opening posture.

OpenClaw: an agent is born as a session

OpenClaw is a good place to start because it makes the mechanics unusually visible. In its architecture, an incoming message does not simply flow into a model. It first becomes part of a routing system.

The docs and runtime surfaces show that the system resolves a session key based on where the message came from and what kind of run it represents. A direct chat, a group channel, a cron job, and a webhook do not all produce the same kind of birth. They create different session identities, and those identities determine queueing, serialization, and execution context.
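To make the routing idea concrete, here is a minimal sketch of session-keyed birth. It is not OpenClaw's actual code: the `Inbound` type, the `source:origin_id` key format, and the `LaneQueue` class are all assumptions invented for illustration. What it shows is only the shape of the idea: the origin of a message decides its session identity, and each identity gets its own serialized lane.

```python
from collections import defaultdict, deque
from dataclasses import dataclass

KNOWN_SOURCES = {"direct", "group", "cron", "webhook"}


@dataclass(frozen=True)
class Inbound:
    """Hypothetical inbound message; field names are illustrative only."""
    source: str     # one of KNOWN_SOURCES
    origin_id: str  # e.g. a user id, channel id, job name, or hook name
    text: str


def resolve_session_key(msg: Inbound) -> str:
    """Derive a session identity from where the message came from."""
    if msg.source not in KNOWN_SOURCES:
        raise ValueError(f"unknown source: {msg.source}")
    # Assumed key format: different origins never share a session.
    return f"{msg.source}:{msg.origin_id}"


class LaneQueue:
    """One FIFO lane per session key, so runs within a session are serialized."""

    def __init__(self) -> None:
        self._lanes = defaultdict(deque)

    def enqueue(self, msg: Inbound) -> str:
        key = resolve_session_key(msg)
        self._lanes[key].append(msg)
        return key

    def next_run(self, key: str):
        """Pop the oldest pending message for a session, or None if idle."""
        lane = self._lanes.get(key)
        return lane.popleft() if lane else None
```

In this sketch, a direct chat from one user and a nightly cron job land in different lanes even if they arrive at the same instant, while two messages from the same chat are handled strictly in order. The model has not been called yet, and the run already has an address.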

That means an OpenClaw agent is born less like a chatbot and more like an operational process. It has a location in the system. It has a scope. It has a lane.

This is a subtle but important shift. In ordinary chatbot thinking, the model is the center. In OpenClaw’s design, the session is at least as important. The architecture is saying: before we ask what the agent thinks, we need to know where this run lives, what boundaries it belongs to, and how it should be serialized against the rest of the world.

Codex: an agent is born into a team

Codex reveals a different assumption. Its recent evolution points toward role systems, thread inheritance, and sub-agent coordination as first-class architecture. The commits tell a story of roles moving into TOML, role metadata becoming richer, and spawned agents retaining pieces of their parent configuration and session context.

That suggests that a Codex agent is often born not as an isolated actor, but as a specialized worker in an already structured environment. Its identity comes bundled with role, relationship, and constraints. A child agent is meaningful because there is already a parent thread, a broader task, and an orchestration fabric around it.
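A small sketch can show what "born into a team" might look like in code. Everything here is hypothetical: the `AgentConfig` fields, the `ROLES` table, and `spawn_child` are illustrative names, not Codex's schema or API. In a real system the role table might be loaded from a TOML file; a plain dictionary stands in for it here.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical agent configuration; not Codex's real schema."""
    role: str
    model: str
    sandbox: str
    instructions: str = ""
    parent_thread: Optional[str] = None


# Hypothetical role table. Each role overrides only what it needs to.
ROLES = {
    "reviewer": {
        "instructions": "Review the diff; do not edit files.",
        "sandbox": "read-only",
    },
    "implementer": {
        "instructions": "Make the requested change.",
    },
}


def spawn_child(parent: AgentConfig, parent_thread: str, role: str) -> AgentConfig:
    """Create a child config: inherit from the parent, then apply role overrides."""
    overrides = ROLES[role]
    return replace(parent, role=role, parent_thread=parent_thread, **overrides)
```

The design choice worth noticing is the direction of inheritance. The child starts as a copy of its parent and is then narrowed by its role, so a reviewer spawned from a write-capable lead keeps the lead's model but loses write access. Its identity is defined relative to the thread that created it.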

This changes the emotional picture of agency. OpenClaw’s birth feels infrastructural: a session comes to life inside a routed system. Codex’s birth feels organizational: a worker takes shape inside a team.

It is a reminder that “agent” does not always mean “one artificial person.” Sometimes it means “a temporary specialist invoked by a larger machine of cooperation.”

Gemini CLI: an agent is born from a definition

Gemini CLI’s evidence points somewhere else again. Here, the important early moves are declarative agent definitions, startup context generation, session context binding, and execution-scoped plumbing. The architecture puts a lot of emphasis on deciding what an agent is before it starts acting.

That makes the birth feel almost theatrical: first define the role, then prepare the stage, then let the actor walk on.

The significance of this approach is clarity. If agent identity is declared, configured, and bound to a context object, then the system can reason about it more cleanly. Discovery flows make sense. Enable/disable behavior makes sense. Policy attachment makes sense. Even subagents inherit a kind of conceptual dignity because they emerge from a definitional system rather than pure improvisation.

In Gemini CLI, birth is less about routing or collaboration and more about instantiation. A design becomes a running entity.
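The define-then-instantiate pattern can be sketched in a few lines. This is an illustration under stated assumptions, not Gemini CLI's implementation: `AgentDefinition`, `ExecutionContext`, `Registry`, and every field name below are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentDefinition:
    """Hypothetical declarative definition: what the agent is, before it runs."""
    name: str
    description: str
    tools: tuple = ()
    enabled: bool = True


@dataclass(frozen=True)
class ExecutionContext:
    """Hypothetical per-run context the definition is bound to."""
    session_id: str
    cwd: str


@dataclass(frozen=True)
class RunningAgent:
    definition: AgentDefinition
    context: ExecutionContext


class Registry:
    """Holds definitions so they can be discovered, toggled, and instantiated."""

    def __init__(self) -> None:
        self._defs = {}

    def register(self, definition: AgentDefinition) -> None:
        self._defs[definition.name] = definition

    def discover(self) -> list:
        """Names of all enabled definitions, sorted for stable output."""
        return sorted(n for n, d in self._defs.items() if d.enabled)

    def instantiate(self, name: str, context: ExecutionContext) -> RunningAgent:
        """Bind a definition to a context; refuse disabled definitions."""
        definition = self._defs[name]
        if not definition.enabled:
            raise PermissionError(f"agent '{name}' is disabled")
        return RunningAgent(definition, context)
```

Because the definition exists as data before anything runs, discovery and enable/disable checks become simple lookups, and a policy layer has a stable object to attach to.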

Mistral Vibe: an agent is born through product ritual

Mistral Vibe adds a fourth variation, and it matters because it brings the product layer into view. In its source, startup is explicit: bootstrap config files, run onboarding if needed, choose an initial agent profile, resume or preload session state, then enter either programmatic mode or the interactive TUI.

Here, the beginning of an agent is not just technical plumbing. It is also a user-facing ritual.

That matters because real interactions with agents do not begin in abstract architectural diagrams. They begin in a terminal, in a folder, with a first-run experience, a trust decision, a selected mode, and a visible interface that tells the human what kind of collaboration is about to happen.

Mistral Vibe makes that opening posture legible. The built-in profiles are not decorative presets. They are identity templates. They determine whether the system enters the world cautious, read-only, editing-forward, or aggressively auto-approved. In that sense, Mistral Vibe shows that an agent can be born not only through system architecture, but through a designed product ceremony that teaches both the software and the user how to relate to one another.
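The idea of a profile as an identity template can be sketched as follows. The profile names, the `Profile` fields, and the `start` and `needs_approval` functions are assumptions made for this example, loosely modeled on the four postures described above; they are not Mistral Vibe's actual configuration or code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:
    """Hypothetical agent profile: the posture an agent starts with."""
    name: str
    can_edit: bool
    auto_approve_edits: bool
    auto_approve_commands: bool


# Hypothetical built-in profiles, one per posture named in the text.
PROFILES = {
    "cautious": Profile("cautious", True, False, False),
    "read_only": Profile("read_only", False, False, False),
    "edit_forward": Profile("edit_forward", True, True, False),
    "auto_approve": Profile("auto_approve", True, True, True),
}


@dataclass(frozen=True)
class Startup:
    profile: Profile
    mode: str                       # "programmatic" or "interactive"
    ran_onboarding: bool
    resumed_session: Optional[str]


def start(config_exists: bool, profile_name: str = "cautious",
          prompt: Optional[str] = None, resume: Optional[str] = None) -> Startup:
    """Walk the startup steps in order and report the resulting posture."""
    ran_onboarding = not config_exists           # onboard only on first run
    profile = PROFILES[profile_name]             # choose the initial profile
    mode = "programmatic" if prompt is not None else "interactive"
    return Startup(profile, mode, ran_onboarding, resume)


def needs_approval(profile: Profile, action: str) -> bool:
    """Decide whether the human must confirm an action under this profile."""
    if action == "read":
        return False
    if action == "edit":
        if not profile.can_edit:
            raise PermissionError(f"profile '{profile.name}' cannot edit")
        return not profile.auto_approve_edits
    if action == "command":
        return not profile.auto_approve_commands
    raise ValueError(f"unknown action: {action}")
```

The same request to edit a file produces three different outcomes depending on the profile chosen at startup: a confirmation prompt, silent approval, or a refusal. That difference is set before the first prompt is read.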

Why this matters

All four systems are solving the same first problem: how to turn an incoming task into a bounded actor. But they solve it with different first commitments.

And those commitments matter because beginnings are not neutral. If a system starts by defining a session, it will think differently about routing and history. If it starts by defining a role, it will think differently about specialization and hierarchy. If it starts with a definition object, it will care about governance and configuration. If it starts through product bootstrap, it will care about modes, permissions, and trust at the moment of first contact.

The birth of an agent is where a philosophy becomes software.

The deeper reveal

This is the point where the story begins to widen.

Once you see that agents are born through architecture, a bigger question appears: what exactly is architecture deciding at that moment? Not just who the agent is, but what world it is about to perceive.

Because identity alone is not enough. The next thing that matters is what enters the agent’s field of view: files, logs, branches, skills, tools, memory, chat history, trusted documents, subagents, system prompts, project context. That act of curation may matter as much as the model itself.

If birth decides what kind of actor an agent may become, context decides what kind of world it is allowed to know.

That is where the story goes next.