TLDR: I assumed I knew the shape of an external API response. I was wrong. 849 locations got corrupted before I noticed. The fix is boring — run the live query first, read the JSON — but I keep skipping it. So now I've made it a rule.

The Setup

I was building an oncology center directory, a site that helps people locate cancer treatment facilities.

The location data lives in TakeShape (a headless CMS with a GraphQL API, my content backend for the project).

I wrote a parser to pull physicalLocations — the nested field that holds each facility's address data — and feed it downstream into the search index.

I wrote TypeScript interfaces. They looked reasonable. I ran the import. It worked — or so I thought.

What Actually Happened

The shape I assumed wasn't the shape on the wire.

The parser fell back silently to "Unknown" for every location it couldn't resolve.

That's the killer detail: I had a fallback. It swallowed the error cleanly. No exception, no red log line, no failing test. Just 849 facility records sitting in the database with corrupted location data, quietly waiting for someone to notice.
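
To make the failure mode concrete, here is a minimal sketch of a fallback that still returns "Unknown" but refuses to stay silent. It is not my real parser: the city field and the makeLocationResolver helper are hypothetical names chosen for illustration. The only point is that the first fallback hit leaves a trace and every hit is counted.

```typescript
// Hypothetical guess at the record shape. The guess being wrong is the scenario.
type AssumedRecord = { physicalLocations?: Array<{ city?: unknown }> };

function makeLocationResolver(warn: (msg: string) => void = console.warn) {
  const stats = { fallbackCount: 0 };

  function resolve(record: unknown): string {
    const city = (record as AssumedRecord | null | undefined)
      ?.physicalLocations?.[0]?.city;
    if (typeof city === "string" && city.length > 0) return city;

    // Fallback path: count every hit, and warn loudly on the first one.
    stats.fallbackCount += 1;
    if (stats.fallbackCount === 1) {
      warn(`location fallback fired; sample record: ${JSON.stringify(record)}`);
    }
    return "Unknown";
  }

  return { resolve, stats };
}
```

Checking stats.fallbackCount at the end of an import would have turned 849 silent fallbacks into one obvious number.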

It took a while.

What I Tried First (The Wrong Fix)

My first instinct was to re-run the full import.

I almost did it. The full import is a single command — simple, familiar, fast.

But a full reimport nukes manually set fields: weight, hubspot_link, editorial overrides that don't live in TakeShape. The blast radius was far larger than the actual problem.

I stopped, wrote a targeted backfill script that only touched the broken column, and ran that instead.

Same outcome. A fraction of the risk.
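
A rough sketch of the idea, not the script I actually ran: plan the backfill as a list of patches that name only the broken column, so applying them cannot overwrite anything else. The row shape, the location column name, and planLocationBackfill are assumptions for illustration; weight and hubspot_link are the manually set fields mentioned above.

```typescript
// Hypothetical row shape. weight and hubspot_link are manually set and must survive.
interface FacilityRow {
  id: string;
  location: string;
  weight: number;
  hubspot_link: string | null;
}

// A patch carries exactly one column, so it cannot touch the others.
interface LocationPatch {
  id: string;
  location: string;
}

function planLocationBackfill(
  rows: FacilityRow[],
  corrected: Map<string, string>, // id -> location re-parsed from the live response
): LocationPatch[] {
  const patches: LocationPatch[] = [];
  for (const row of rows) {
    if (row.location !== "Unknown") continue; // already fine, leave it alone
    const fixed = corrected.get(row.id);
    if (fixed === undefined) continue; // no corrected value, skip rather than guess
    patches.push({ id: row.id, location: fixed });
  }
  return patches;
}
```

Each patch would then become a single-column update, and the length of the returned list is a cheap sanity check before anything is written.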

The Fix That Worked

Before writing a single line of parser code:

  1. Run the actual live query
  2. console.log(JSON.stringify(data, null, 2))
  3. Read every nested level yourself

That's it. Thirty seconds. The real shape was right there.

The TS interfaces I'd written were hypotheses. The wire is the spec — always.
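
The three steps above can be sketched roughly as follows. This assumes a runtime with a global fetch (Node 18+ or a browser) and a GraphQL endpoint that accepts a JSON POST; the endpoint, headers, and both function names are placeholders, not anything TakeShape-specific. shapeOf is an optional extra of my own for step 3: it collapses a large response into the distinct shapes it contains, so a field that is sometimes a string and sometimes an object stands out.

```typescript
// Steps 1 and 2: send the real query and pretty-print exactly what comes back.
async function dumpLiveResponse(
  endpoint: string,
  query: string,
  headers: Record<string, string> = {},
): Promise<unknown> {
  const res = await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json", ...headers },
    body: JSON.stringify({ query }),
  });
  const data: unknown = await res.json();
  console.log(JSON.stringify(data, null, 2));
  return data;
}

type Shape = string | Shape[] | { [key: string]: Shape };

// Step 3 helper: summarize the shape of any parsed JSON value.
// Arrays are reduced to the list of distinct element shapes they contain.
function shapeOf(value: unknown): Shape {
  if (value === null) return "null";
  if (Array.isArray(value)) {
    const distinct = new Map<string, Shape>();
    for (const item of value) {
      const s = shapeOf(item);
      distinct.set(JSON.stringify(s), s);
    }
    return Array.from(distinct.values());
  }
  if (typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const out: { [key: string]: Shape } = {};
    for (const key of Object.keys(obj)) out[key] = shapeOf(obj[key]);
    return out;
  }
  return typeof value; // "string", "number", "boolean"
}
```

Passing the value returned by dumpLiveResponse to shapeOf and printing the result gives a compact summary to compare against the hand-written interfaces. It does not replace reading the raw JSON.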

It Wasn't a One-Off

The same week I was fixing TakeShape, I hit this twice more on the QuickBooks MCP server (an MCP tool that talks to the QuickBooks Online API, my accounting integration).

AccountRef needs to be nested inside AccountBasedExpenseLineDetail. I had it flat. Bill creation silently failed.
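
Here is the nesting as I now understand it, written as a small TypeScript sketch. The interface names and buildBill are mine, and the field layout is my reading of the QuickBooks Online Bill reference, so treat it as a hypothesis to confirm against a live response rather than as the spec.

```typescript
interface Ref {
  value: string; // the entity Id, as a string
  name?: string;
}

interface AccountBasedExpenseLine {
  DetailType: "AccountBasedExpenseLineDetail";
  Amount: number;
  // AccountRef lives inside the detail object, not on the line itself.
  AccountBasedExpenseLineDetail: { AccountRef: Ref };
}

interface BillCreate {
  VendorRef: Ref;
  Line: AccountBasedExpenseLine[];
}

function buildBill(vendorId: string, accountId: string, amount: number): BillCreate {
  return {
    VendorRef: { value: vendorId },
    Line: [
      {
        DetailType: "AccountBasedExpenseLineDetail",
        Amount: amount,
        AccountBasedExpenseLineDetail: { AccountRef: { value: accountId } },
      },
    ],
  };
}
```

My flat version put AccountRef directly on the line, one level too high.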

SyncToken, DocNumber, and TxnDate were missing entirely from my schema. I didn't guess wrong; I didn't guess at all.

Then there's Recharge (a Shopify subscription billing API). I assumed external_product_id was a string. It's a nested object. That assumption burned an entire backfill run — 28,700 orphan rows — before we caught it.
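
A small runtime guard would have stopped that run at the first row. The sketch below is generic: expectString is a hypothetical helper, and it deliberately makes no claim about what keys the nested object contains. It only reports what actually arrived.

```typescript
// Return the value if it really is a string; otherwise fail loudly and
// describe what arrived instead, including its keys when it is an object.
function expectString(value: unknown, path: string): string {
  if (typeof value === "string") return value;
  const actual =
    value === null ? "null" : Array.isArray(value) ? "array" : typeof value;
  const keys =
    actual === "object"
      ? ` with keys [${Object.keys(value as object).join(", ")}]`
      : "";
  throw new Error(`${path}: expected string, got ${actual}${keys}`);
}
```

Calling it at the parser boundary, for example expectString(row.external_product_id, "row.external_product_id"), turns a wrong type assumption into an immediate error instead of thousands of orphan rows.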

Three different APIs. Three different projects. Same category of mistake.

The Rule I'm Now Enforcing

  • Go to the primary source: the API itself. Run a live query before writing any parser.
  • Treat every hand-written TS interface against an external API as a hypothesis, not a spec.
  • For GraphQL unions specifically: query for __typename alongside the data and check what shapes actually come back — the type definitions and the wire can disagree.
  • When you add a silent fallback, log a warning the first time it fires. If "Unknown" shows up in production and you never saw the warning, you don't know how broken things are.
  • When fixing a schema mismatch that touched many rows: targeted backfill, not a full reimport. Lower blast radius, same outcome.
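
For the __typename rule, one possible sketch: add __typename to the selection set of the union field, then tally what actually comes back. countTypenames is a hypothetical helper that walks any parsed JSON response; it assumes nothing about the schema beyond the standard __typename meta field.

```typescript
// Walk a parsed GraphQL response and count every __typename value found,
// at any depth. Unexpected names or missing ones show up in the tally.
function countTypenames(
  value: unknown,
  counts: Record<string, number> = {},
): Record<string, number> {
  if (Array.isArray(value)) {
    for (const item of value) countTypenames(item, counts);
  } else if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const name = obj.__typename;
    if (typeof name === "string") counts[name] = (counts[name] ?? 0) + 1;
    for (const key of Object.keys(obj)) countTypenames(obj[key], counts);
  }
  return counts;
}
```

If the tally contains a type the parser has no branch for, that is the mismatch, found before the import rather than after.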

Why This Matters to Me

The thing I keep underestimating is how confident I feel before the mistake.

I've read the docs. I've written the types. The import ran without throwing. Everything feels verified.

But the code I wrote reflects what I expected the API to look like — not what it actually is. Those are two different things, and I keep confusing them.

The live query takes thirty seconds. The backfill took hours. That math is not complicated.