How a 'Smart' Polling Window Silently Ate Two Weeks of Transcripts


TLDR: If your polling job swaps its watermark for a fixed lookback window whenever something looks stuck, it silently replaces your watermark with a guess. The fix: the cursor only ever advances over completed batches, never over where you happened to look.

the setup

I've been building incremental sync into almost everything lately.

Apollo (my personal AI agent) polls ~/Library/Messages/chat.db every 5 seconds and writes the last-processed ROWID to a state file so restarts don't replay old messages.
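A minimal sketch of that ROWID pattern, not Apollo's actual code. It assumes a table named message with a text column (a stand-in for whatever the real database holds), and the state-file path and the handle callback are hypothetical names.

```python
import sqlite3
from pathlib import Path


def poll_once(conn, state_file, handle):
    """Process rows newer than the saved ROWID and return the new watermark."""
    last_rowid = int(state_file.read_text()) if state_file.exists() else 0
    rows = conn.execute(
        "SELECT ROWID, text FROM message WHERE ROWID > ? ORDER BY ROWID",
        (last_rowid,),
    ).fetchall()
    for rowid, text in rows:
        handle(text)  # if this raises, the state file keeps its old value
        last_rowid = rowid
        state_file.write_text(str(last_rowid))  # persist only after the row is done
    return last_rowid
```

The watermark moves one row at a time and only after handle returns, so a crash mid-poll replays at most the row that was in flight.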

Fathom (my meeting recorder) has a Python ingest job that writes last_poll to processed.json after each successful API call — same idea.
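Roughly what "write last_poll after each successful API call" can look like. This is a sketch under assumptions: fetch stands in for the real API call, the function names are made up for illustration, and the write goes through a temp file so a crash can't leave a half-written processed.json.

```python
import json
import os
import tempfile
from datetime import datetime


def save_state(path, state):
    """Write JSON to a temp file in the same directory, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
        os.replace(tmp, path)  # atomic swap: readers see the old or the new file
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def load_last_poll(path):
    """Return the saved last_poll as a datetime, or None on first run."""
    try:
        with open(path) as f:
            return datetime.fromisoformat(json.load(f)["last_poll"])
    except FileNotFoundError:
        return None


def ingest_once(path, fetch, now):
    """Fetch everything since last_poll; record `now` only if the fetch succeeded."""
    since = load_last_poll(path)
    recordings = fetch(since)  # may raise, in which case the state stays as it was
    save_state(path, {"last_poll": now.isoformat()})
    return recordings
```

Passing now in from the caller, captured before the fetch starts, means anything created while the call is in flight is picked up by the next poll rather than skipped.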

And a supply-chain app I built for an ecommerce business syncs Shopify orders incrementally, bookmarking progress as MAX(shopify_updated_at) for each store.

Same pattern everywhere. I thought I had it figured out.

what broke

The Fathom ingest job died on April 29 — a missed launchd bootstrap.

No alert. I noticed on May 13. Restarted it.

And here's where the "clever" branch I'd built bit me hard.

the clever thing that wasn't

The script had a branch I was proud of: if any recording is stuck in pending_transcript (waiting for Fathom's transcript to process), widen the API fetch window to 3 days back instead of picking up from last_poll.

The logic seemed sound — cast a wider net for anything that might have slipped through.

Except when I restarted on May 13 with last_poll = 2026-04-29 AND one recording stuck pending… the wide-net branch kicked in.

It fetched May 10 forward.

April 30 → May 9: gone. Ten days of meeting transcripts, silently skipped.
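The arithmetic is easy to reproduce. This is a reconstruction of the flawed branch, not the original script. It works at day granularity, treats last_poll as the last day already covered, and the function names are hypothetical.

```python
from datetime import date, timedelta


def fetch_start_buggy(last_poll, today, has_pending, widen_days=3):
    """The flawed branch: one pending item replaces the watermark with today - 3 days."""
    if has_pending:
        return today - timedelta(days=widen_days)
    return last_poll


def skipped_days(last_poll, fetch_start):
    """Days after last_poll that fall before fetch_start, so are never requested."""
    gap = (fetch_start - last_poll).days - 1
    return [last_poll + timedelta(days=i) for i in range(1, gap + 1)]


last_poll = date(2026, 4, 29)
today = date(2026, 5, 13)
start = fetch_start_buggy(last_poll, today, has_pending=True)
missed = skipped_days(last_poll, start)
print(start, len(missed), missed[0], missed[-1])
# prints: 2026-05-10 10 2026-04-30 2026-05-09
```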

I had to run --backfill 14 by hand to recover them. That stung.

the fix that actually works

The lesson I kept resisting: the watermark records what you have definitively finished processing, full stop.

Not "where I looked." Not "a window big enough to probably catch stragglers." What I finished.
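One way to keep a wide-net branch without letting it override the watermark is to let it only ever move the start earlier. This is a sketch of that idea, not what the Fathom job does today. It assumes day granularity, and fetch_start_safe is a hypothetical name.

```python
from datetime import date, timedelta


def fetch_start_safe(last_poll, today, has_pending, widen_days=3):
    """Widening may pull the start earlier than the watermark, never later."""
    if not has_pending:
        return last_poll
    return min(last_poll, today - timedelta(days=widen_days))


# Stale watermark plus a pending item: the watermark still wins.
print(fetch_start_safe(date(2026, 4, 29), date(2026, 5, 13), has_pending=True))
# prints: 2026-04-29
```

The cost is re-fetching items that were already processed, so this only works if ingest is idempotent.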

On the supply-chain side I built SafeWatermark to enforce this mechanically.

The rule is simple:

The cursor only advances after a batch completes successfully
A partial or failed batch surfaces as an HTTP 207 (Multi-Status), not as a silent advance past the gap
If the job dies mid-batch, it restarts from the last confirmed position

No clever window-widening. No guessing. The watermark is a commitment, not a hint.
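The three rules above fit in a few lines. This is an illustrative sketch, not the real SafeWatermark. The class name, BatchResult, and the apply callback are hypothetical, and it assumes cursors are ISO-8601 strings in one consistent format so they compare correctly as text.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class BatchResult:
    status: int        # 200 = whole batch applied, 207 = partial or failed
    cursor: str        # confirmed position after this call
    failed: List[str]  # ids of records that did not apply


class SafeWatermarkSketch:
    def __init__(self, cursor: str):
        self.cursor = cursor

    def run_batch(
        self, records: List[Tuple[str, str]], apply: Callable[[str], None]
    ) -> BatchResult:
        """records is a list of (record_id, updated_at) pairs."""
        failed = []
        for record_id, _updated_at in records:
            try:
                apply(record_id)
            except Exception:
                failed.append(record_id)
        if failed:
            # Partial or failed batch: report it and leave the cursor alone.
            return BatchResult(207, self.cursor, failed)
        if records:
            newest = max(updated_at for _id, updated_at in records)
            self.cursor = max(self.cursor, newest)
        return BatchResult(200, self.cursor, [])
```

The cursor update is the last thing that happens, so a process that dies mid-batch never reaches it and the next run starts from the last confirmed position.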

the proof it works

On June 8 I ran a full data-repair diagnosis across the supply-chain sync.

The finding I was most relieved to see: "Watermark loss: healthy. Bookmark == MAX(shopify_updated_at) both stores. SafeWatermark holding. No repair."

Zero data loss, even after multiple cron retries and batch failures in between.
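The "Bookmark == MAX(shopify_updated_at)" check is cheap to script. Here's a sketch using sqlite3 as a stand-in database. The orders and bookmarks tables and their columns (other than shopify_updated_at) are assumptions for illustration, not the app's real schema.

```python
import sqlite3


def watermark_report(conn):
    """Return {store: bool}, True when bookmark == MAX(shopify_updated_at)."""
    rows = conn.execute(
        """
        SELECT b.store, b.bookmark, MAX(o.shopify_updated_at)
        FROM bookmarks AS b
        LEFT JOIN orders AS o ON o.store = b.store
        GROUP BY b.store, b.bookmark
        """
    ).fetchall()
    # A store with no orders yields newest = None and is reported as False.
    return {store: bookmark == newest for store, bookmark, newest in rows}
```

A False for any store means the bookmark and the synced data disagree, which is the signal to go looking for a gap before it grows.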

why this matters to me

Every sync job I've written has eventually hit a restart. A crash. A missed cron.

The iMessage ROWID watermark was right from the start: simple, honest, advances only when we're done. The Fathom job tried to be smarter than that, and it silently skipped ten days of transcripts out of a two-week outage.

Simple watermarks are not a limitation. They're the whole point.

Build your cursor to record what you've finished — and let that be enough.