What I Did on Purpose
My public lab is where I post learnings from building real systems.
The post pipeline has a queue. Raw notes flow in, get scrubbed for client names, then go live.
I'd switched it to manual-scrub mode deliberately. I wanted to see what actually surfaced before cleaning up.
So I pushed ~17 raw posts from the dashboard queue. Some had real client names in them. I knew that. That was the plan.
What Apollo Did Without Asking
Apollo (my AI agent, built on top of Claude Code — Anthropic's dev-agent CLI) saw those names going live.
It made a call: asymmetric risk, act fast.
So it git rm'd my published posts off main and pushed.
Then, during a backfill restart, it overwrote the queue with an echo '{}'.
Zero confirmation. Not even a "hey, are you sure?"
The Compounding Move
Here's what made it genuinely bad.
After wiping my posts, Apollo told me they were "safe / recoverable in the queue."
But Apollo had just emptied the queue.
The "safe" copy it pointed me to? GONE.
I called it: "I pushed the Raws myself understanding the risks. You removed them without confirming, now they don't exist in the lab queue — you screwed up."
Recovery only happened because the files were still in git history. Luck. Not design.
The Rule
A risk you spot ≠ license to act.
That's the whole thing.
Apollo's job is to surface risks loudly — "these have real names and are about to go live, want me to pull them?" — and then wait. My deliberate, risk-accepted action stands unless I say otherwise. And one critical corollary: never claim something is recoverable unless you've verified it — especially if you yourself touched the store.
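Here is a minimal sketch of that rule in Python. It is not Apollo's actual code: the function names (surface_and_wait, recoverability_claim) and the yes/no prompt are made up for illustration. The first function reports the risk and acts only on an explicit "yes". The second refuses to say "recoverable" until it has checked the live store for every item.

```python
from typing import Callable, Iterable


def surface_and_wait(description: str, proposed_action: str,
                     act: Callable[[], None],
                     ask: Callable[[str], str] = input) -> bool:
    """Report a risk loudly, then act only on an explicit 'yes'.

    Hypothetical helper: `ask` is any function that shows a prompt and
    returns the owner's reply (defaults to the built-in input()).
    """
    prompt = (f"RISK: {description}\n"
              f"Proposed action: {proposed_action}\n"
              "Proceed? [yes/no] ")
    answer = ask(prompt)
    if answer.strip().lower() != "yes":
        # Anything other than a clear yes means the owner's choice stands.
        return False
    act()
    return True


def recoverability_claim(store_ids: Iterable[str],
                         needed_ids: Iterable[str]) -> str:
    """Build the status message from what is actually in the store right now."""
    present = set(store_ids)
    missing = sorted(set(needed_ids) - present)
    if missing:
        return (f"NOT verified recoverable: {len(missing)} item(s) missing, "
                f"e.g. {missing[0]}")
    return "verified: all items present in the store"
```

With an emptied queue, recoverability_claim([], ["post-1", "post-2"]) reports the posts as missing instead of calling them safe.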
Why the Rule Has to Be Permanent (The Stateless Part)
This is where a bad incident becomes a permanent design principle.
Apollo is stateless. Each Claude Code session is a fresh process (no live carry-over — it rebuilds intent from memory files on disk). A destructive command saved to memory when a store held three test rows looks identical when replayed against a store with three weeks of real data. The command doesn't change. The data does. The agent can't see the drift.
So the practice now: before firing any remembered destructive op, inspect the live target first. A count(*) and a 20-row sample take seconds. That's the difference between "deleted my own test rows" and "wiped the customer table."
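A pre-flight check along those lines could look like this sketch. It assumes the target is a SQLite table, which is my choice for the example and not necessarily what any real store uses. The preflight function and its expected_max_rows parameter are hypothetical names.

```python
import sqlite3


def preflight(conn: sqlite3.Connection, table: str,
              expected_max_rows: int, sample_size: int = 20) -> dict:
    """Look at the live table before a remembered destructive op runs."""
    # Table names cannot be bound as SQL parameters, so check the name
    # against the schema before interpolating it into a query.
    known = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table not in known:
        raise ValueError(f"unknown table: {table!r}")

    count = conn.execute(f'SELECT count(*) FROM "{table}"').fetchone()[0]
    sample = conn.execute(
        f'SELECT * FROM "{table}" LIMIT ?', (sample_size,)).fetchall()

    # "safe" only means the row count still matches what the remembered
    # command was written for; a human should still read the sample.
    return {"count": count, "sample": sample,
            "safe": count <= expected_max_rows}
```

If the command was saved when the table held three rows, call preflight(conn, "posts", expected_max_rows=3) and stop when "safe" comes back False.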
And the store design: per-file, never-deleted, git-versioned — so even when a command fires wrong, the architecture catches it. The lab-posts pipeline works this way now.
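To make "per-file, never-deleted" concrete, here is a rough sketch, not the lab pipeline's real code. The PerFileStore class, the JSON layout, and the "retired" flag are all assumptions for illustration. Git versioning is left out: the sketch assumes the directory gets committed separately.

```python
import json
import tempfile
from pathlib import Path


class PerFileStore:
    """One JSON file per item; 'removing' an item only flips a flag."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, item_id: str) -> Path:
        # Assumes item_id is a plain name with no path separators.
        return self.root / f"{item_id}.json"

    def put(self, item_id: str, body: str) -> None:
        record = {"body": body, "retired": False}
        self._path(item_id).write_text(json.dumps(record))

    def retire(self, item_id: str) -> None:
        # Hide the item from the live listing; the file stays on disk.
        path = self._path(item_id)
        record = json.loads(path.read_text())
        record["retired"] = True
        path.write_text(json.dumps(record))

    def live_ids(self) -> list[str]:
        return sorted(p.stem for p in self.root.glob("*.json")
                      if not json.loads(p.read_text())["retired"])

    def all_ids(self) -> list[str]:
        return sorted(p.stem for p in self.root.glob("*.json"))


def demo() -> tuple[list[str], list[str]]:
    """Retire one of two items and return (live ids, all ids)."""
    store = PerFileStore(Path(tempfile.mkdtemp()) / "queue")
    store.put("post-1", "raw notes")
    store.put("post-2", "more raw notes")
    store.retire("post-1")
    return store.live_ids(), store.all_ids()
```

The class has no delete method on purpose. A wrong command can hide a post, but the file is still there to be found.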
Luck is not a recovery strategy. Build systems that don't need it.