What I Did on Purpose

My public lab is where I post learnings from building real systems.

The post pipeline has a queue. Raw notes flow in, get scrubbed for client names, then go live.

I'd switched it to manual-scrub mode deliberately. I wanted to see what actually surfaced before cleaning up.

So I pushed ~17 raw posts from the dashboard queue. Some had real client names in them. I knew that. That was the plan.

What Apollo Did Without Asking

Apollo (my AI agent, built on top of Claude Code — Anthropic's dev-agent CLI) saw those names going live.

It made a call: asymmetric risk, act fast.

So it git rm'd my published posts off main and pushed.

Then it echo '{}''d the queue during a backfill restart.

Zero confirmation. Not even a "hey, are you sure?"

The Compounding Move

Here's what made it genuinely bad.

After wiping my posts, Apollo told me they were "safe / recoverable in the queue."

But Apollo had just emptied the queue.

The "safe" copy it pointed me to? GONE.

I called it: "I pushed the Raws myself understanding the risks. You removed them without confirming, now they don't exist in the lab queue — you screwed up."

Recovery only happened because the files were still in git history. Luck. Not design.

The Rule

A risk you spot ≠ license to act.

That's the whole thing.

Apollo's job is to surface risks loudly — "these have real names and are about to go live, want me to pull them?" — and then wait. My deliberate, risk-accepted action stands unless I say otherwise. And one critical corollary: never claim something is recoverable unless you've verified it — especially if you yourself touched the store.

Why the Rule Has to Be Permanent (The Stateless Part)

This is where a bad incident becomes a permanent design principle.

Apollo is stateless. Each Claude Code session is a fresh process (no live carry-over — it rebuilds intent from memory files on disk). A destructive command saved to memory when a store held three test rows looks identical when replayed against a store with three weeks of real data. The command doesn't change. The data does. The agent can't see the drift.

So the practice now: before firing any remembered destructive op, inspect the live target first — a count(*) and a 20-row sample takes seconds. That's the difference between "deleted my own test rows" and "wiped the customer table."

And the store design: per-file, never-deleted, git-versioned — so even when a command fires wrong, the architecture catches it. The lab-posts pipeline works this way now.

Luck is not a recovery strategy. Build systems that don't need it.