Why I Budget Fifteen Minutes After Every Overnight Agent Build

TLDR: Cold-start bugs don't live in your shell — they live in the real launch environment. You can't find them until you run there.

The Setup

Apollo — my AI agent fleet, running as launchd (macOS's background service runner) daemons — built a local Rize-style auto time tracker overnight while I slept.

I woke up to four running daemons, a FastAPI server, a menubar app.

The whole thing was live at 8:20am.

Then I watched the first five minutes happen.

The Three Bugs That Weren't in Development

They surfaced immediately, in the real launchd environment:

httpx 308 redirects — the capture agent was silently bouncing off a redirect it wouldn't follow
Claude binary path — couldn't autodetect when launched headlessly as a background service
SQLite busy_timeout — database locking under concurrent daemon writes

None of these showed up during development.

Because I never tested in the actual launch environment.

All three diagnosed, fixed, committed — about 15 minutes. Which felt good! But the obvious question nagged at me: why didn't I catch these earlier?

The Same Lesson, Harder

A few days later I got my answer.

I shipped a new "activity axis" feature to the time tracker, ran it fine in my terminal, committed it, wired it into launchd.

TCC popup storm. Immediately.

TCC (macOS's privacy permission system — the dialogs asking "can this app access your contacts/calendar?") does NOT behave the same way for headless claude -p (Claude running without a UI) subprocesses launched from launchd as it does for your interactive shell. It just fires modal after modal until you kill everything and revert the commit.

So I did.

The shell is LYING to you. That's the real lesson. The launch environment is a different environment, and no amount of local testing proves your code works there.

A Third Data Point: the Things 3 MCP Race

On June 11, the CEO briefing daemon spun up a fresh headless session. The Things 3 MCP (my tool bridge to Things 3, my task manager) was still connecting when the agent's first read fired.

get_today returned nothing.

The Task Pulse section in the briefing came back completely empty.

But the MCP finished loading mid-run — so a write call a few minutes later worked fine. One session: missed the read, landed the write.

I'd conflated "MCP not loaded yet" with "no tasks." They are not the same thing. Don't make that call from a single early read in a cold headless session.

What I Do Differently Now

Three concrete changes:

Test in the real launch context. Drop a temp ~/Library/LaunchAgents plist, bootstrap it with launchctl, run with --output-format stream-json --verbose. That's the environment that matters — not your shell.
Warm up MCP connections before the first real read. In headless daemon flows, fire a throwaway read up front or add a retry. An early empty result ≠ "no data." It might just mean "not loaded yet."
Budget a first-fifteen-minutes fix pass after any overnight build. Expect it. It's not a failure — it's the normal shape of shipping into a context you couldn't fully simulate.

That last one took me a while to accept. Autonomous agents don't ship clean. They ship close — and the gap closes in the first run. Plan for the gap.