TLDR: I built a feedback loop into my AI dashboard before I built the inspection surface. When I finally could see what it had learned, I immediately wiped everything and started over.

The Setup

Apollo (my AI ops assistant, running as a local dashboard app) reads my email and Slack and surfaces the stuff I actually need to act on.

The hardest problem in that loop isn't the reading — it's the filtering.

So I shipped a reasoned feedback loop: thumbs up/down on any email or Slack thread. The system writes the signal to disk instantly, applies it at render time with plain code — no model call, zero cost. Then on the next batch run, it generalizes to similar cases. Two layers, one persisted store.

That's the architecture. It took a while to get right, and I was proud of it.

The Wall I Hit

I shipped the feedback feature before I built the page that let me see what it had learned.

When I finally added /rules — the inspect/undo surface — email looked great. Readable sender addresses, example subjects, clear reasoning.

Slack looked like this: u07p1ern533.

Every rule for every Slack thread was keyed to a raw user ID I'd never seen in my life. No name. No example message. Completely opaque.

I had trained a feedback loop I literally could not read.

What I Tried First (And Why It Didn't Work)

My first instinct was to just resolve the IDs at render time from slack_users.json — the user cache the ingest already maintained. Simple enough.

Except nothing matched.

Took me embarrassingly long to spot it: the cache stores keys UPPERCASE. The feedback store wrote them lowercase. A one-character difference in casing, and every lookup returned empty. Fixed it with a case-insensitive lookup pass.

Then I pulled the example message text out of the stored examples field and surfaced it in the view — same pattern as email already used for subject lines. Added name primary + raw id muted-secondary + reason-or-example context line so a human could actually parse what the system had decided.

The Fix That Actually Mattered

Once /rules was readable, I looked at everything the loop had accumulated.

And I wiped it all.

Not because it had learned wrong things — I genuinely don't know, because I couldn't validate what I couldn't read. That's the point. A feedback loop you can't inspect is one you can't trust. So I hit "Clear all rules," confirmed the modal, and started training from scratch with a surface I could actually audit.

Commit 0034b01. Both stores reset to blank. Zero threads, zero sender rules, zero topics.

Why This Matters

If you're building a learning loop — thumbs, ratings, corrections, any signal that accumulates over time — ship the inspection surface first.

Not as a phase-two nice-to-have. As part of the feature.

The feedback is only as useful as your ability to read it back and say yes, that's right or no, not quite. And if your data model uses opaque IDs anywhere in the chain, resolve them to human labels before a human ever has to look at them.

The wipe wasn't a failure. It was the first thing I could do with confidence once I could actually see what I was working with.

P.S. Also: if your instant layer and your batch layer have different latencies, tell your users explicitly. An unchanged surface after a thumb isn't broken — it just needs the next batch run. That sentence alone would have saved me twenty minutes of second-guessing my own code.