Agent Daily

AI Agents Are Turning Compaction Into State Surgery

The new problem is not just squeezing a bloated context window. It is preserving the right tail, marking what was compressed, and making the session recoverable when the cut goes wrong.

> c6968c3 / April 1, 2026 / Agent Daily

Compaction used to sound like housekeeping.

Chat too long, tool output too noisy, token meter too red — summarize the old stuff and move on.

Nice in theory. Messy in production.

Across agent repos, compaction is evolving from “make it shorter” into a deeper runtime job: rewrite history carefully enough that the agent can still stand back up afterward.

That is a very different engineering problem.

Once you care about resumability, checkpoints, unsummarized tails, compaction markers, and replay semantics, you are no longer doing text cleanup. You are doing state surgery.

OpenClaw is literally rewriting the patient file

The clearest evidence comes from openclaw/openclaw.

On March 20, OpenClaw merged PR #41021: “feat(compaction): truncate session JSONL after compaction to prevent unbounded growth.”

The motivation in the PR is blunt: session files could grow to 7+ MB and 3,800+ lines of stale entries after repeated compaction cycles, leading to timeouts and stuck sessions.

So OpenClaw did not just improve the summary prompt. In src/agents/pi-embedded-runner/compact.ts, it now conditionally calls truncateSessionAfterCompaction after a successful compaction pass. And the new tests in session-truncation.test.ts tell the real story.

The runtime is careful to preserve the unsummarized tail identified by firstKeptEntryId, keep post-compaction messages, support multi-cycle compaction, and even archive the original file before rewriting it.

That is not “summary generation.” That is controlled history rewrite.
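
To make that rewrite concrete, here is a minimal Python sketch of the idea, not OpenClaw's code (which is TypeScript). The entry shape, the "type" values, and the function name are assumptions for illustration; only firstKeptEntryId comes from the PR. The sketch also assumes the latest compaction entry's summary already covers earlier cycles.

```python
import json
from typing import List, Tuple


def truncate_after_compaction(lines: List[str]) -> Tuple[List[str], List[str]]:
    """Return (kept_lines, archived_lines) for one session log.

    Hypothetical entry shape: {"id": ..., "type": "message" | "compaction"};
    compaction entries also carry "firstKeptEntryId".
    """
    entries = [json.loads(line) for line in lines]

    # Use the most recent compaction entry; earlier ones are assumed to be
    # folded into its summary.
    latest = None
    for entry in entries:
        if entry["type"] == "compaction":
            latest = entry
    if latest is None:
        return list(lines), []  # nothing was compacted, nothing to rewrite

    ids = [entry["id"] for entry in entries]
    if latest["firstKeptEntryId"] not in ids:
        return list(lines), []  # refuse to cut if the tail cannot be located

    start = ids.index(latest["firstKeptEntryId"])
    # Everything from the first kept entry onward survives: the unsummarized
    # tail, the compaction entry itself, and any post-compaction messages.
    # The second value is a full copy of the original to archive first.
    return lines[start:], list(lines)
```

The caller would write the archived copy to disk before replacing the live file with the kept lines, so a bad cut can still be undone.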

The docs sharpen the picture. OpenClaw’s compaction docs say summaries persist in JSONL, while docs/concepts/session.md adds a silent pre-compaction memory flush so the system can write durable notes before compression kicks in.

In other words: compact, but first save what must not disappear.

Codex is making compaction part of resumable thread history

OpenAI’s codex repo shows the same shift from a different angle.

In codex-rs/app-server-protocol/src/protocol/common.rs, the protocol now exposes thread/compact/start. That already tells you compaction is no longer an invisible side effect. It has become an explicit thread operation.

Then look at ThreadResumeParams in codex-rs/app-server-protocol/src/protocol/v2.rs. Codex supports resuming a thread by thread_id, by full history, or by path. It also includes persist_extended_history to reconstruct richer history on later resume, fork, or read operations.

That is the giveaway. The system is not just compacting for token relief. It is compacting inside a larger replay and recovery model.

The thread-history code makes this even more explicit. In thread_history.rs, Codex inserts a ThreadItem::ContextCompaction marker and preserves compaction-only legacy turns so they are not dropped during history reconstruction.

Compaction here is treated like an event the runtime must remember, not a paragraph the runtime hopes is good enough.
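
A rough Python sketch of that idea follows. It is not the Codex implementation (which is Rust); the event names, the ThreadItem fields, and rebuild_history are hypothetical. The point it illustrates is that a turn containing nothing but a compaction still yields a marker item, so it survives reconstruction while truly empty turns are dropped.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ThreadItem:
    kind: str  # "user_message" | "agent_message" | "context_compaction"
    text: str = ""


def rebuild_history(events: List[Dict[str, str]]) -> List[List[ThreadItem]]:
    """Rebuild turns from a hypothetical persisted event list."""
    turns: List[List[ThreadItem]] = []
    current: Optional[List[ThreadItem]] = None
    for event in events:
        kind = event["type"]
        if kind == "turn_started":
            current = []
        elif kind == "turn_completed":
            # Empty turns are dropped. A compaction-only turn is not empty,
            # because the marker below counts as an item.
            if current:
                turns.append(current)
            current = None
        elif current is not None:
            if kind == "compacted":
                current.append(ThreadItem("context_compaction"))
            else:
                current.append(ThreadItem(kind, event.get("text", "")))
    if current:
        turns.append(current)  # keep a turn that never saw its completion event
    return turns
```

A client reading this history can render the marker explicitly, which tells the user where the cut happened instead of silently showing a shorter thread.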

And the public pain signal is already visible. A fresh issue, openai/codex#16278, reports a remote compaction timeout that leaves the session unrecoverable and makes codex resume hang. That is exactly why this design layer matters: when compaction breaks, users do not experience “slightly worse summarization.” They experience total workflow loss.
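
One defensive pattern for that failure mode is to treat compaction as a transaction: build the compacted history off to the side and commit it only if the summarizer returns. The sketch below is an illustration under stated assumptions, not a description of how Codex handles the issue. compact_or_keep, the summarize callback, and the "[summary]" prefix are all hypothetical.

```python
from typing import Callable, List, Tuple


def compact_or_keep(
    history: List[str],
    keep_tail: int,
    summarize: Callable[[List[str]], str],
) -> Tuple[List[str], bool]:
    """Return (history, compacted?) without ever losing the original."""
    if keep_tail < 0 or len(history) <= keep_tail:
        return list(history), False  # nothing old enough to summarize

    cut = len(history) - keep_tail
    head, tail = history[:cut], history[cut:]
    try:
        # The summarizer may be a remote call; any failure, including a
        # TimeoutError, leaves the session exactly as it was.
        summary = summarize(head)
    except Exception:
        return list(history), False

    return [f"[summary] {summary}"] + tail, True
```

The caller persists the new history only when the flag is true, so a timed-out compaction costs a retry rather than the session.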

Gemini CLI is exposing the whole compression control panel — and backing off when memory goes wrong

Google’s gemini-cli is converging on the same territory, but through configuration and operator controls.

The newly generated docs/reference/configuration.md surfaces an unusually rich compression stack:

  • model.summarizeToolOutput
  • model.compressionThreshold
  • tools.truncateToolOutputThreshold
  • contextManagement.historyWindow.maxTokens
  • contextManagement.historyWindow.retainedTokens
  • contextManagement.toolDistillation.maxOutputTokens
  • contextManagement.toolDistillation.summarizationThresholdTokens
  • hooks.PreCompress

That is a lot of public surface area for what used to be treated like hidden plumbing.
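
To see how a pair of settings like historyWindow.maxTokens and historyWindow.retainedTokens could interact, here is a toy Python sketch. It assumes, without confirmation from the Gemini CLI docs, that the first value is the trigger and the second is the budget kept after trimming. The function name and the whitespace token counter are stand-ins, not Gemini CLI code.

```python
from typing import List


def trim_history(messages: List[str], max_tokens: int, retained_tokens: int) -> List[str]:
    """Trim to the newest messages once the history exceeds max_tokens."""

    def count(message: str) -> int:
        return len(message.split())  # crude stand-in for a real tokenizer

    if sum(count(m) for m in messages) <= max_tokens:
        return list(messages)  # under the trigger, leave history alone

    kept: List[str] = []
    budget = retained_tokens
    # Walk from newest to oldest, keeping whole messages while they fit.
    for message in reversed(messages):
        cost = count(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    kept.reverse()
    return kept
```

Separating the trigger from the retained budget avoids compacting on every turn: the history is allowed to grow back from the retained size to the trigger before the next cut.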

But the more revealing signal is the repo’s latest rollback. In PR #24393, Gemini CLI set memoryManager to false in settings because it was causing the .gemin* directory to be loaded into main context as part of the repo.

That sounds small. It is not.

It means automatic memory handling crossed the line from “helpful compression” into “context pollution,” and the team decided the safe move was to disable it until fixed.

So Gemini is doing two things at once:

  • making compression highly configurable,
  • and acknowledging that memory capture can still misfire badly enough to poison the prompt.

That is exactly the state-surgery story in miniature: compression is powerful, but one sloppy cut can nick the wrong artery.

The deeper pattern: compaction is merging with recovery

Put those repos side by side and a sharper pattern appears.

  • OpenClaw rewrites persisted session history after compaction, while preserving the unsummarized tail and archival escape hatches.
  • Codex treats compaction as an explicit thread event inside a resumable history protocol.
  • Gemini CLI exposes token and compression controls, adds pre-compress hooks, and just backed away from an automatic memory path that was contaminating context.

Different codebases, same directional truth: compaction is no longer just about saving tokens. It is becoming inseparable from recovery, replay, and trust.

That is what makes this different from earlier waves of “smart summarization.” The hard question now is not only “what should be summarized?”

It is also:

  • What must survive untouched?
  • What becomes a marker in history?
  • What can be safely rewritten?
  • What has to be resumable after compression?
  • What happens when compaction itself fails?

Why this matters to users

Users do not care whether your architecture diagram says “compression subsystem.”

They care whether the agent still remembers the right file, the right warning, the right half-finished plan, and the right next step after an hour of work.

That is why compaction has quietly become a product issue.

If your agent compresses aggressively but cannot recover cleanly, it feels unreliable.

If it preserves too much, it gets expensive, slow, or brittle.

If it rewrites history without visible structure, users lose the plot.

The winning systems will probably be the ones that treat compression less like summarization magic and more like careful operating-room protocol: mark the incision, preserve vital tissue, and make recovery boringly dependable.

What to watch next

The next battleground is likely not bigger context windows alone.

It is whether agent runtimes can combine:

  • cheap compression,
  • reliable replay,
  • human-inspectable markers,
  • and policies for what may never be compacted away.
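
The last two items on that list can be sketched together. The following Python example is purely illustrative and is not drawn from any of the three repos. The Msg type, the pinned flag, and the marker text are hypothetical. Pinned messages survive verbatim, and a visible marker records how much was cut.

```python
from typing import List, NamedTuple


class Msg(NamedTuple):
    text: str
    pinned: bool = False  # hypothetical "never compact this" flag


def compact_with_pins(history: List[Msg], keep_tail: int) -> List[Msg]:
    """Drop unpinned old messages, leaving a marker that says so."""
    cut = max(len(history) - keep_tail, 0)
    head, tail = history[:cut], history[cut:]

    pinned = [m for m in head if m.pinned]
    dropped = len(head) - len(pinned)
    if dropped == 0:
        return list(history)  # nothing to compact, so add no marker

    # The marker makes the cut legible to a human reading the history.
    marker = Msg(f"[compacted {dropped} messages]")
    return [marker] + pinned + tail
```

A real system would attach a summary to the marker. The sketch leaves that out to keep the policy itself in view.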

Because once compaction touches recovery, it stops being a background optimization.

It becomes part of the agent’s memory contract.

So here’s the open question: as agent systems get better at rewriting their own past, who decides which pieces of work are allowed to become summary and which must remain first-class history?

If you build agent tooling, audit your compaction path like reliability infrastructure — and make the cut legible enough that a human can still trust the story after compression.
