TLDR: If you have a work queue feeding a slow, expensive consumer, bound it with FIFO eviction. The oldest items are the most stale. An unbounded queue that erodes the reliability of your state layer is a worse failure than dropping items you were never going to get to anyway.

the setup

I've been building a two-stage AI scanner pipeline: a system that watches for MCP (Model Context Protocol, the open protocol that connects AI applications to external tools and data sources) security events, classifies each one with Sonnet, then queues the flagged items for a deeper Opus review.

The work queue between those two stages lives in state.py as a list called pending_opus_review.

Simple enough. Except I never bounded it.

the rule that said don't do this

I have a rule I wrote for myself (it's in my notes, I refer to it constantly): before you cap anything, ask whether the cap drops distinct information or just duplication. If it's distinct info — don't cap.

A list of items waiting for Opus review sure looks like distinct information. Every entry is a flagged security event. Dropping one means it never gets the deep-review pass.

So my own rule said: leave it alone.

why i did it anyway

The hardening sprint that forced my hand was all about making state.py bulletproof — atomic writes via os.replace, batched O(1) state updates, per-entry crash guards on prune_processed.

Somewhere in that work it clicked: the whole state object gets written to disk on every cycle.

An unbounded pending_opus_review means that list could grow to thousands of entries. Which means the "crash-safe atomic write" I'd just spent a day hardening… gets slower and heavier every single run. The very thing I was protecting was being quietly undermined by the thing I wasn't looking at.
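
To make that cost concrete, here's a rough sketch of a write-temp-then-os.replace pattern that reports how many bytes land on disk as the queue grows. The helper name atomic_write_json, the JSON format, and the shape of each event are assumptions for illustration; the real state.py may serialize differently.

```python
import json
import os
import tempfile


def atomic_write_json(path: str, state: dict) -> int:
    """Write state to path atomically and return the number of bytes written."""
    payload = json.dumps(state).encode("utf-8")
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # readers see the old file or the new one, never half
    except BaseException:
        # Something failed before the swap: remove the leftover temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return len(payload)


def fake_event(i: int) -> dict:
    # Hypothetical event shape; real flagged events are likely larger.
    return {"id": i, "summary": "flagged MCP event", "severity": "high"}


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "state.json")
    sizes = {}
    for n in (50, 500, 5000):
        state = {"pending_opus_review": [fake_event(i) for i in range(n)]}
        sizes[n] = atomic_write_json(path, state)

for n, size in sizes.items():
    print(f"{n:>5} queued items -> {size} bytes per write")
```

The write stays atomic at any size, but the payload grows linearly with the queue, and every cycle pays for all of it.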

And here's the real insight that broke the rule: stale queue entries aren't distinct information. They're just old backlog.

If the Opus consumer is running behind — which it will, Opus is slower and more expensive than Sonnet — the oldest items in that queue are the least actionable. Something flagged three scan cycles ago that hasn't been reviewed yet is noise at this point. The freshest flags are what I actually care about.

FIFO eviction (drop the oldest first when you hit the cap) is exactly right here. The oldest items are the ones I'm least likely to act on, so they're the cheapest to lose.

the fix

One conditional in state.py. If the list is already at the cap (MAX_PENDING_REVIEW, set to 50), pop from the front before appending to the back.

if len(self.pending_opus_review) >= MAX_PENDING_REVIEW:
    self.pending_opus_review.pop(0)  # FIFO: evict oldest
self.pending_opus_review.append(item)

50 items. That's the cap. Not 500. Not "let's see how it goes."
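
For comparison, the same behavior can be sketched with collections.deque, which evicts from the left on its own once maxlen is reached. This is an alternative shape, not what state.py does: the ReviewQueue class, the dropped counter, and to_list are hypothetical names added for illustration.

```python
from collections import deque
from typing import Any, Optional

MAX_PENDING_REVIEW = 50  # the cap from the post


class ReviewQueue:
    """Bounded FIFO buffer: when full, the oldest entry is evicted."""

    def __init__(self, maxlen: int = MAX_PENDING_REVIEW) -> None:
        if maxlen < 1:
            raise ValueError("maxlen must be at least 1")
        self.items: deque = deque(maxlen=maxlen)
        self.dropped = 0  # hypothetical counter, handy when tuning the cap

    def push(self, item: Any) -> Optional[Any]:
        """Append item; return the evicted entry, or None if nothing was dropped."""
        evicted = None
        if len(self.items) == self.items.maxlen:
            evicted = self.items[0]  # oldest entry, about to be discarded
            self.dropped += 1
        self.items.append(item)  # a full deque drops its leftmost element here
        return evicted

    def to_list(self) -> list:
        """Plain list copy, e.g. for JSON serialization (deque is not JSON-serializable)."""
        return list(self.items)


queue = ReviewQueue()
for event_id in range(60):
    queue.push(event_id)

print(len(queue.items), queue.dropped, queue.to_list()[0])
```

At 50 entries the list.pop(0) version is perfectly fine. The deque mostly buys a place to count evictions, which is the number to watch when deciding whether the cap is too tight.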

why this matters

The real lesson isn't "use FIFO." It's that a queue feeding a slow consumer will grow without bound if you let it, and that growth is never free. It slows every write, bloats state, and makes recovery after a crash unpredictable.
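
The arithmetic behind that is simple enough to simulate. In the sketch below the rates (5 items flagged per cycle, 2 reviewed per cycle) are made-up numbers, not measurements from the pipeline, and simulate_backlog is a hypothetical helper.

```python
from typing import List, Optional


def simulate_backlog(
    cycles: int,
    flagged_per_cycle: int,
    reviewed_per_cycle: int,
    cap: Optional[int] = None,
) -> List[int]:
    """Return the queue length at the end of each cycle."""
    backlog = 0
    history = []
    for _ in range(cycles):
        backlog += flagged_per_cycle  # producer (fast stage) enqueues
        if cap is not None:
            backlog = min(backlog, cap)  # FIFO eviction trims down to the cap
        backlog = max(0, backlog - reviewed_per_cycle)  # consumer (slow stage) drains
        history.append(backlog)
    return history


# Assumed rates for illustration only.
unbounded = simulate_backlog(cycles=100, flagged_per_cycle=5, reviewed_per_cycle=2)
capped = simulate_backlog(cycles=100, flagged_per_cycle=5, reviewed_per_cycle=2, cap=50)

print("unbounded backlog after 100 cycles:", unbounded[-1])
print("capped backlog after 100 cycles:", capped[-1])
```

Whenever the producer outpaces the consumer, the unbounded backlog grows by the difference every cycle, forever. The capped one flattens out just under the cap.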

My own rule about not capping distinct information is still right. But I had the wrong mental model of what was in that queue. I was thinking about the items as important flags. I should've been thinking about them as a rate-mismatch buffer. Those are different things, and they get different treatment.

Bound your buffers. Pick FIFO when recency matters. And when your own rules seem to say no, check whether you're applying the right frame to the thing in front of you.

P.S. The cap is a config constant now. If the Opus throughput changes significantly, I'll tune it. But starting at 50 and measuring beats starting at ∞ every time.