TLDR: Cold-start bugs don't live in your shell — they live in the real launch environment. You can't find them until you run there.
The Setup
Apollo — my AI agent fleet, running as launchd (macOS's background service runner) daemons — built a local Rize-style auto time tracker overnight while I slept.
I woke up to four running daemons, a FastAPI server, a menubar app.
The whole thing was live at 8:20am.
Then I watched the first five minutes happen.
The Three Bugs That Weren't in Development
They surfaced immediately, in the real launchd environment:
httpx 308 redirects— the capture agent was silently bouncing off a redirect it wouldn't follow- Claude binary path — couldn't autodetect when launched headlessly as a background service
- SQLite
busy_timeout— database locking under concurrent daemon writes
None of these showed up during development.
Because I never tested in the actual launch environment.
All three diagnosed, fixed, committed — about 15 minutes. Which felt good! But the obvious question nagged at me: why didn't I catch these earlier?
The Same Lesson, Harder
A few days later I got my answer.
I shipped a new "activity axis" feature to the time tracker, ran it fine in my terminal, committed it, wired it into launchd.
TCC popup storm. Immediately.
TCC (macOS's privacy permission system — the dialogs asking "can this app access your contacts/calendar?") does NOT behave the same way for headless claude -p (Claude running without a UI) subprocesses launched from launchd as it does for your interactive shell. It just fires modal after modal until you kill everything and revert the commit.
So I did.
The shell is LYING to you. That's the real lesson. The launch environment is a different environment, and no amount of local testing proves your code works there.
A Third Data Point: the Things 3 MCP Race
On June 11, the CEO briefing daemon spun up a fresh headless session. The Things 3 MCP (my tool bridge to Things 3, my task manager) was still connecting when the agent's first read fired.
get_today returned nothing.
The Task Pulse section in the briefing came back completely empty.
But the MCP finished loading mid-run — so a write call a few minutes later worked fine. One session: missed the read, landed the write.
I'd conflated "MCP not loaded yet" with "no tasks." They are not the same thing. Don't make that call from a single early read in a cold headless session.
What I Do Differently Now
Three concrete changes:
-
Test in the real launch context. Drop a temp
~/Library/LaunchAgentsplist, bootstrap it withlaunchctl, run with--output-format stream-json --verbose. That's the environment that matters — not your shell. -
Warm up MCP connections before the first real read. In headless daemon flows, fire a throwaway read up front or add a retry. An early empty result ≠ "no data." It might just mean "not loaded yet."
-
Budget a first-fifteen-minutes fix pass after any overnight build. Expect it. It's not a failure — it's the normal shape of shipping into a context you couldn't fully simulate.
That last one took me a while to accept. Autonomous agents don't ship clean. They ship close — and the gap closes in the first run. Plan for the gap.