The advisor's guidance is clear: apollo-save stays (it's a self-describing command derived from Apollo, which is explicitly kept), and the only real edit is repairing the broken grammar artifact where a name was mechanically swapped to "I," leaving "tell I something." The advisor recommends "learn" as the most natural fit for the surrounding logic. Everything else is clean.


TLDR: Don't save every turn. Don't wait to be asked. Separate the cheap decision ("is this worth keeping?") from the expensive write ("format it, dedup it, reindex it"). Stage fast, flush later.

The Setup

Apollo (my personal AI agent, running as a CLI subprocess invoked per iMessage) is stateless by design.

Every inbound message spawns a fresh claude process — clean isolation, no long-lived session. Which means: no memory coming in, unless you explicitly load it.

I built apollo-save to capture learnings at the end of a session. Works great. Except I kept forgetting to run it.

Or the session would die mid-flow.

Either way, something real gets learned in a conversation… and then it's gone.

What I Tried First (and Why It Was Wrong)

On June 15th I asked Apollo to just save memory at the end of every turn. Automatic. No more forgetting.

Made sense in my head.

What actually happened: Apollo ran the full apollo-save pipeline on every single turn — re-auditing the whole conversation, checking for duplicates, writing files, running the RAG reindex on the Obsidian-vault-backed memory corpus.

The results were bad.

memory/ flooded with slop. MEMORY.md — which has an 80-line cap before it gets unmanageable — blew past it in days. Worse, it captured half-formed conclusions that later turns would revise, so the memory was actively misleading future sessions.

Heavyweight per turn. Biased toward save-everything. Wrong.

The Fix That Actually Worked

The real problem wasn't when to save.

It was that I was conflating two completely different things:

  1. The decision — did this turn produce something worth keeping?
  2. The write — format it, dedup it, integrate it, reindex it.

Those should not happen together every turn.

Here's what I built instead:

  • At the end of every turn, Apollo makes a lightweight judgment call: did I just learn something I didn't know coming in?
  • If yes — one line gets appended to memory/_inbox.md. No dedup check. No reindex. No file lookup. Just a cheap append.
  • If no — total no-op. Most turns are a no-op. That's the point.
  • When apollo-save runs (manually, or on a trigger) — it flushes the inbox into real memory files, deduplicates, updates MEMORY.md, and does the reindex properly.

The judgment heuristic I settled on: synthesis = save trigger.

If Apollo ran a tool call that surfaced a person, a decision, a deadline, or a preference the agent didn't have walking in — that's a synthesis act. Write the _inbox.md line before composing the reply, not after. The learning is freshest right at that moment.

Why This Matters

If you're building stateless agents — and most agents are stateless, whether you think about it that way or not — you need to solve this problem.

Waiting for the user to say "remember this" means you lose 80% of the signal.

Saving everything on every turn means you drown in noise and the memory stops being trustworthy.

The answer is autonomous persist-decision with deferred write: a cheap per-turn judgment that commits a stage, and a heavier pipeline that runs on a real flush. Cheap decision, deferred cost.

The inbox as a staging buffer is the unlock — it decouples "I noticed something" from "I processed something."

That distinction is the whole thing.

P.S. Yes, I also hit a concurrency bug — the manual apollo-save raced with the iMessage daemon's auto-save and tripped the Edit tool's snapshot guard on MEMORY.md. Fixed with an atomic Python read-modify-write anchored on a known string. Different problem, same lesson: treat memory as shared state, not a file you own.