TLDR: A health check that says
oktells you the process is alive — not that it's running the code you think it is. Make it report identity.
The Setup
The Apollo Dashboard (my local cockpit app — Python HTTP server on port :7842, launched from a .command file pinned to my Dock) had been humming all through June 22nd.
I was deep in a session the next morning, shipping new features, polishing the email reader, removing dead panels.
And nothing was showing up.
The Wall
I'd click the Dock icon, the dashboard would open, and I'd be staring at… the old version.
Every single time.
I checked the server. Running. Port :7842. Process ID 20266. /health returned ok.
So I assumed the problem was in my code. Spent real time debugging "bugs" that weren't bugs — they were features I'd already shipped. I was fighting yesterday's code while thinking I was on today's.
What I Tried (And Why It Failed)
The Dock launcher did the obvious thing: curl -sf /health. Got ok. Server's up. Open it.
That's the trap.
PID 20266 had been running since June 22 at 19:24 — predating every single commit from June 23's session. My auto-reload would have caught new edits… but auto-reload only activates when the process loads. That process was too old to know about it. Classic chicken-and-egg.
The health check was lying by omission. "I'm alive" is not the same as "I'm running what you just wrote."
The Fix That Actually Worked
Two changes. One in the server. One in the launcher.
Server side: /health now returns {"ok": true, "boot": <epoch>}.
The _BOOT_TS value in http_app is per-process — it resets every time os.execv (the re-exec that powers auto-reload) fires. So it reflects the actual load time of the running code, not a file timestamp, not a deploy marker. The running process signs its own birth certificate.
Launcher side (~/Developer/Dock Shortcuts/1. Apollo Dashboard.command): now reads that boot value and knows whether it's talking to a fresh server or a stale one. If stale, restart. No guessing.
Commit c3a72d2. Done.
The Cleanup That Followed
While I was in there, I found the dead-code version of the same problem.
Three references in the Python source — _slack_row, _controls, _SOURCE_SECTIONS — were pointing at UI panels that no longer existed. Not throwing errors. Not causing crashes. Just quietly lying about what the interface contained, eating maintenance attention every time someone read the file.
Pulled all three.
Why This Matters
Stale things lie to you — and they lie quietly.
The day-old process said ok. The dead UI references compiled fine. Neither was malicious. Both cost time.
The rule I take from this: health checks should report identity, not just liveness. A boot timestamp is four characters of JSON and it completely eliminates a whole class of "why isn't my change showing up" sessions.
If you have a local launcher anywhere in your workflow — Dock shortcut, Alfred, launchd, whatever — go check what its health check actually proves. ok might not be enough.
P.S. The stale-binary trap hits differently when you're moving fast. The faster you ship, the more likely a long-lived process is out of date. Boot timestamps are cheap insurance.