Three targeted changes:
the my wife firewall→an internal firewall(project codename → what it is, rule 3)scanner/in both the prose and the code block →the agent's project repo/the-agent-project/(repo name, rule 3)$0→free(dollar amount, rule 4)
TLDR:
launchctl bootstrapfrom a repo path doesn't survive a reboot. Copy the plist to~/Library/LaunchAgents/. But the actual lesson is deeper — if your autonomous agent can be dead for three weeks and you don't notice, you need a heartbeat, not a better plist.
The Build
Five days. Five phases. One autonomous pipeline that silently watched my calendar and wrote pre-call research notes to Obsidian (my personal note vault) before every meeting.
The phases:
- Phase 1 — Poll 5 Google calendars every 5 minutes via launchd (macOS's background-job scheduler), fire 15 minutes before a meeting starts
- Phase 2 — Resolve the counterparty: company, relationship history, prior meeting notes from the vault
- Phase 3 — Research pipeline gated behind an internal firewall (an allowlist controlling exactly which tools the agent can reach)
- Phase 4 — Obsidian writer drops a formatted prep note in the right folder before you walk in
- Phase 5 — State management:
precall_seen.json, atomic writes, 7-day prune so it doesn't re-research the same meeting twice
Each research run spawned a headless claude -p --model sonnet subprocess with a scoped --allowedTools whitelist. About 60–120 seconds per meeting. Free, because I'm on Claude Max.
It went LIVE on May 13th. My system map said so: "LIVE since May 13." I was pleased with myself.
What Killed It
May 14th, 19:31 — it died. I didn't know for three weeks.
The bug is embarrassingly simple.
I bootstrapped the launchd plist (the config file that tells macOS to run a script on a schedule) directly from its path inside the agent's project repo:
launchctl bootstrap gui/$(id -u) /path/to/the-agent-project/com.apollo.precall.plist
That works this session. It does NOT survive a reboot. The plist has to live in ~/Library/LaunchAgents/ for macOS to persist it.
I rebooted. The agent vanished. MEMORY.md still said LIVE.
Three Weeks of Silence
Three weeks of external 1:1s. No prep notes hitting my vault. I never once thought huh, where are my notes?
When I finally did the forensics on June 4th, it got worse — the last logs before death weren't clean exits. They were calendar fetch error business=* message=timeout. The agent was already sick before the reboot finished it off.
The thing I'd been trusting as LIVE had been limping for at least a day before it died.
Why I Retired It Instead of Fixing It
The plist fix is a one-liner. I chose not to.
Because the real problem isn't the path. It's that my system had no way to tell me it was dead. No heartbeat. No dead-man's-switch. No log line I would have spotted. The gap between claims-LIVE and is-running only closed because I happened to audit it six weeks after launch.
An autonomous agent that fails silently isn't saving you time. It's accumulating a lie.
What to Wire In Before You Deploy
If you're building anything autonomous — launchd agent, cron, scheduled AI pipeline — wire the liveness signal first:
- Copy the plist to
~/Library/LaunchAgents/— never bootstrap from the repo path - Emit a heartbeat: one log line per successful cycle, greppable and alertable
- Add a dead-man's-switch: no heartbeat for N hours → ping yourself
- Record retirement reason when you shut something down, then unload the plist so it can't resurrect on the next reboot
The architecture held up. The deployment didn't.
And "nobody missed it for three weeks" turned out to be the most useful signal of all — it told me the thing I'd spent five days building wasn't actually load-bearing enough to fight for.