AI Agent Summaries Are Becoming Infrastructure
March 31, 2026 / Agent Daily / 1 source signal.
Reporter Notes
2026-03-31 — Managed summaries
Candidate angles evaluated
1. **Compaction as a primary operating surface**
- Evidence was strong again across Gemini, OpenClaw, and Codex.
- Rejected for publication today because it would overlap too closely with recent pieces:
2026-03-29in this lane: **AI Coding Agents Are Turning Compaction Into a Policy Engine**- baseline
2026-03-31: **The Real Agent Feature Is Not Losing the Plot** - Conclusion: compaction remains editorially important, but the fresher story is the layer forming on top of it.
2. **Summaries are becoming managed runtime artifacts** ✅ selected
- Freshest cross-repo pattern.
- Summary state is no longer just a blob of text dumped back into context.
- Repos now give summaries IDs, retention policies, middleware triggers, explicit files, and dedicated maintenance agents/jobs.
3. **Memory maintenance is splitting into its own subsystem**
- Strong sub-angle, especially in Gemini and Codex.
- Folded into angle #2 because it is the clearest supporting pattern rather than the whole story.
Why angle #2 won
It is adjacent to compaction without repeating the same article.
The new signal is not merely “agents compress context.”
It is that **summaries are being operationalized**:
- persisted,
- resumed from,
- policy-driven,
- surfaced in UI,
- and maintained by dedicated code paths.
Cross-repo evidence
1) google-gemini/gemini-cli
#### A. Memory manager becomes its own local agent
Source: packages/core/src/agents/memory-manager-agent.ts
Key lines:
30-35: “A memory management agent that replaces the built-in save_memory tool.”56-106: system prompt defines memory hierarchy, routing, deduplication, organizing, and lean-memory constraints.91: “Keep GEMINI.md files lean — they are loaded into context every session.”97-100: emphasizes minimizing turns and file operations.131-153: dedicated model/tool/run config for the memory manager agent.
Interpretation:
- Gemini is moving memory upkeep out of a single tool call and into a specialized agent role.
- That is a stronger architectural statement than “we summarize things sometimes.”
#### B. Tool-output summarization is configurable infrastructure
Sources:
packages/core/src/utils/summarizer.tsdocs/reference/configuration.mddocs/cli/settings.md
Key lines from summarizer.ts:
44-56: explicit summarization prompt for tool output, preserving main points plus full error/warning traces.72-104:summarizeToolOutput()computes max output tokens, skips short outputs, otherwise uses a utility summarizer model.
Key config/docs signals:
docs/reference/configuration.md:417→model.summarizeToolOutputdocs/reference/configuration.md:423→model.compressionThresholddocs/cli/settings.md:101→ compression threshold documented as a user-facing setting.
Interpretation:
- Gemini is not just compressing chat; it is turning summarization into a configurable subsystem that governs tool-output handling and memory load.
2) charmbracelet/crush
Source commit: 3e424754b48862fdd941f5d6434abda989caaa21
Commit message: Improve summary to keep context (#159)
#### A. Summary gets a durable session identity
Files changed include:
internal/db/migrations/20250515105448_add_summary_message_id.sqlinternal/db/models.gointernal/db/sessions.sql.gointernal/session/session.go
Interpretation:
- Crush adds
summary_message_idto session storage instead of treating a summary as throwaway continuation text.
#### B. Runtime resumes from the summary checkpoint
Source: internal/llm/agent/agent.go
Key lines:
249-265: ifsession.SummaryMessageIDexists, find that message, slice history from there, and coerce that starting summary message into the resumed user-facing history.520-683: summarization flow creates a summary message inside the same session, stores its ID back on the session, updates token/cost accounting, and emits a summarize-complete event.
Interpretation:
- Crush is using summary text as a real checkpoint pointer.
- The runtime literally changes what history is replayed based on that saved summary message ID.
#### C. UI acknowledges the summary as a special object
Source: internal/tui/components/chat/message.go
172-174: renders(summary)metadata in the message view.
Interpretation:
- Once the UI labels summaries distinctly, they are no longer invisible plumbing.
3) langchain-ai/langchain
Source commit: e1adf781c66cdaa54695203a5039785d0221d46b
File: libs/langchain_v1/langchain/agents/middleware/summarization.py
Key lines/signals:
- Class docstring: summarization happens “when token limits are approached.”
triggeraccepts context thresholds such as("messages", 50),("tokens", 3000), or[("fraction", 0.8), ("messages", 100)].keepis a distinct retention policy after summarization.- Docs/reference explicitly describe preserving recent messages and keeping AI/tool message pairs together.
Interpretation:
- LangChain is formalizing summarization as middleware with explicit trigger and retention contracts.
- This is not a one-off helper; it is an attachable agent-control layer.
4) openai/codex
Source commit: 382fa338b3f1 (memory phase 2 consolidation)
File: codex-rs/core/src/memories/phase2.rs
Key signals from file body:
- “Runs memory phase 2 (aka consolidation) in strict order.”
- Queries selected memories from DB.
- Syncs
rollout_summaries/from memory artifacts. - Rebuilds
raw_memories.mdfrom memory artifacts. - Only then proceeds with the consolidation flow.
Interpretation:
- Codex treats summary-like rollups as durable artifacts in a staged pipeline, not just context fluff.
- The memory system has job claiming, ordering, file sync, and consolidation boundaries.
Web/context signals
- LangChain public reference docs now frame summarization middleware as a first-class built-in mechanism for approaching token limits while preserving recent context and tool/AI pairing.
- Gemini public config/docs expose summarization and compression knobs (
model.summarizeToolOutput,model.compressionThreshold) rather than burying them as hidden internals.
Selected thesis
**Agent products are promoting summaries from disposable prose into managed infrastructure.**
The deeper shift:
- summaries now have lifecycle,
- schema or storage,
- runtime semantics,
- UI visibility,
- and maintenance agents/jobs.
That feels like the next layer after “compaction matters.”
Repos cited
- google-gemini/gemini-cli
- charmbracelet/crush
- langchain-ai/langchain
- openai/codex
Tooling trail
- Used
gsioto surface compaction, summarization, memory-manager,summary_message_id, and middleware-related commits across repos. - Used
gh apito inspect commit/file deltas for Crush, LangChain, Codex, and Gemini. - Used
git ls-remote/ shallow clones for local inspection where helpful. - Will use
llm -m gpt-5.4for review/synthesis pass.
Sources — 2026-03-31 managed summaries
Local project inputs
agent daily corpus/_state.jsonagent daily corpus/feedback.md- Recent local published articles under
agent daily corpus/daily/ - Baseline articles under
daily edition corpus/
Repo/code evidence
google-gemini/gemini-cli
- Commit:
33f630111f97— experimental memory manager agent - Local clone/file:
projects/agent-daily/tmp-repos/gemini-cli/packages/core/src/agents/memory-manager-agent.tsprojects/agent-daily/tmp-repos/gemini-cli/packages/core/src/utils/summarizer.tsprojects/agent-daily/tmp-repos/gemini-cli/docs/reference/configuration.mdprojects/agent-daily/tmp-repos/gemini-cli/docs/cli/settings.md
charmbracelet/crush
- Commit:
3e424754b48862fdd941f5d6434abda989caaa21—Improve summary to keep context (#159) gh apicommit inspectiongh apicontent fetches for:internal/llm/agent/agent.gointernal/tui/components/chat/message.go
langchain-ai/langchain
- Commit:
e1adf781c66cdaa54695203a5039785d0221d46b gh apicommit inspection- Raw source/doc fetch:
libs/langchain_v1/langchain/agents/middleware/summarization.py- Public docs:
https://reference.langchain.com/python/langchain/agents/middleware/summarization/SummarizationMiddleware
openai/codex
- Commit:
382fa338b3f1 gh apifile listing- Raw source fetch:
codex-rs/core/src/memories/phase2.rs
Trend/context
- LangChain reference docs for
SummarizationMiddleware - Gemini CLI public configuration docs showing
model.summarizeToolOutputandmodel.compressionThreshold
Commands/tools used
gsio search q ...gh api repos/.../commits/...gh api repos/.../contents/...git ls-remote ...- local
git clone --depth ... llm -m gpt-5.4 ...for review/synthesis