Why Prompts Don't Hold When One Tool Talks to Five Businesses

TLDR: When one MCP server exposes five businesses and you only want three of them, a prompt exclusion is not a wall. Put the allowlist in code, at the boundary where the output is consumed.

The Setup

Apollo (my AI orchestrator) runs an hourly scanner.

Its job: check Slack, Gmail, and Calendar across my active client workspaces — an ecommerce business, a second client, and a third client — via Arcade (my MCP auth provider), then surface anything that needs attention.

I built it carefully. collector.py had a whole ## OUT OF SCOPE — NEVER ATTEMPT section. A former client workspace was Signal-only — it never had a Slack, period. The prompt was explicit.

The Problem at 10:29 AM

On April 24th, Sonnet (the Claude model doing the scanning) filed this in auth_status:

slack_former_client: needs_reauth

A former client. Slack. A workspace that didn't exist, for a client that never had Slack.

main.py didn't know that. It just saw a key that looked like a legit auth failure and created a "Fix Apollo Auth: slack_former_client" task in Things (my task manager).

Noise. Phantom task. Completely made up by the model.

I noticed it, killed the task — and started thinking harder about what went wrong.

What I Thought Would Work (Didn't)

I figured the prompt would hold.

"The former client has no Slack/Gmail/Calendar/Notion/Drive — Signal only." Clear as day, right there in the context.

But here's the thing: Sonnet drifted anyway. Not every run, not every time. Just... occasionally. And "occasionally" is enough to create garbage in production when your downstream consumer (in this case, main.py) trusts the model's output at face value.
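A minimal sketch of that failure mode, assuming auth_status arrives as a plain dict of key to status string. The function name is hypothetical and not lifted from the real main.py; the point is only that nothing here questions whether a key should exist.

```python
# Hypothetical reconstruction of a consumer that trusts the model's keys.
# Assumes auth_status is a dict like {"slack_ecomm": "ok", ...}.

def tasks_from_auth_status(auth_status: dict[str, str]) -> list[str]:
    """Naive consumer: every key the model flags becomes a task title."""
    return [
        f"Fix Apollo Auth: {key}"
        for key, status in auth_status.items()
        if status == "needs_reauth"
    ]


# The model reports a key for a workspace that does not exist...
reported = {"slack_ecomm": "ok", "slack_former_client": "needs_reauth"}

# ...and the consumer turns it into a task anyway.
print(tasks_from_auth_status(reported))
# ['Fix Apollo Auth: slack_former_client']
```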

The prompt is advisory. The code is law.

The Fix That Held

Two layers, belt-and-suspenders:

Prompt-level — kept the ## OUT OF SCOPE section; also mirrored the scope rules into rules.md so Sonnet gets them in per-scan context.
Code-level — added an IN_SCOPE allowlist to main.py:

# Auth keys that are allowed to become Things tasks.
# Anything the model reports outside this set is ignored.
IN_SCOPE = {
    "gmail_ecomm", "gmail_client_b", "gmail_client_c", "gmail_client_d",
    "slack_ecomm", "slack_client_b", "slack_client_c",
    "calendar_ecomm", "calendar_client_b", "calendar_client_c", "calendar_client_d",
    "signal",
}

Now, before any auth key can become a Things task, it has to be in that set.

Anything outside it gets logged as "Ignored out-of-scope auth keys" and dropped before it can create a task.

slack_former_client never reaches Things again. Doesn't matter what Sonnet reports.
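Roughly, the check could look like the sketch below. It is an illustration under assumptions, not the actual main.py: the function names, the logger name, and the create_task callable (a stand-in for whatever talks to Things) are all hypothetical, and the allowlist is passed in as a parameter so the example stays short.

```python
import logging

logger = logging.getLogger("apollo.auth")  # logger name is an assumption


def filter_auth_keys(auth_status, in_scope):
    """Split model-reported keys into in-scope failures and ignored keys."""
    needs_fix = []
    ignored = []
    for key, status in auth_status.items():
        if key not in in_scope:
            # The model reported a key we never asked about: drop it.
            ignored.append(key)
        elif status == "needs_reauth":
            needs_fix.append(key)
    if ignored:
        logger.info("Ignored out-of-scope auth keys: %s", sorted(ignored))
    return needs_fix, ignored


def create_auth_tasks(auth_status, in_scope, create_task):
    """Create one task per in-scope auth failure; return the keys acted on."""
    needs_fix, _ = filter_auth_keys(auth_status, in_scope)
    for key in needs_fix:
        create_task(f"Fix Apollo Auth: {key}")
    return needs_fix


# Usage: a list's append method stands in for the Things integration.
created = []
create_auth_tasks(
    {"slack_ecomm": "needs_reauth", "slack_former_client": "needs_reauth"},
    {"slack_ecomm", "gmail_ecomm"},  # a small subset of IN_SCOPE, for the demo
    created.append,
)
print(created)  # ['Fix Apollo Auth: slack_ecomm']
```

The filter runs on the key before the status is even looked at, so a hallucinated key cannot produce a task regardless of what status the model attaches to it.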

Why This Matters to Me

I used to think scope isolation was a prompting problem.

Write the right instructions, the model stays in its lane. And mostly that's true. But "mostly" isn't a safety guarantee when downstream code acts on whatever the model reports.

The pattern is simple: enforce at the consumer boundary, not just the prompt boundary. When one MCP tool has five clients in it and you only mean three, the allowlist in your code is the only thing you can actually trust.

The prompt tells the model what to do. The code decides what's allowed to happen.

P.S. Arcade's arcade-business server exposed an ecommerce business, a second client, a third client, and a former client through the same shared Slack provider. If your auth layer bundles multiple tenants, this is the pattern — not tighter prompting.