How a Single Batching Change Cut Our Shopify Sync to 1/50th the API Calls

TLDR: Switching from row-by-row upserts to 100-row batches in our Shopify → Supabase orders sync cut API round-trips by ~50×. The fix was five lines. The lesson took longer to internalize.

The Setup

We're building a supply chain OS for an ecommerce business — pulling Shopify orders, inventory levels, and refund data into a Postgres database (via Supabase, the hosted Postgres layer I've been using a lot lately) to power reporting dashboards and velocity analysis.

The first version of our orders sync script worked. It pulled orders from Shopify and upserted them one at a time into Supabase.

One. At. A. Time.

What Slow Actually Looks Like

It wasn't broken. The data was correct. But every single order was its own HTTP round-trip to PostgREST (the REST layer over Postgres that Supabase exposes).

Dozens of orders? Fine. Thousands of orders? You sit there waiting. And if the sync crashes halfway through — network blip, timeout, whatever — you don't know where it stopped. You restart from zero. Again.

That's two problems: it's slow, and it's fragile.

The Fix That Worked

Batching. Genuinely that simple.

Instead of upsert-one-wait-upsert-one-wait, collect 100 rows and upsert them in a single call. ~50× fewer round-trips. Not 50% fewer — fifty times fewer.

The same day, I wired in a bookmark: after every 50 orders processed, persist a last_synced cursor to the database. Now if it crashes, it picks up from the last checkpoint instead of restarting from scratch.

That's the whole performance story. Batch your writes. Checkpoint your progress. Ship it.

The Trap Batching Introduces

Here's where it gets interesting — and where you can lose an afternoon if you don't know this.

When you batch-upsert a heterogeneous array in supabase-js (rows with different keys), PostgREST normalizes them to a uniform column shape on the wire. Any key present on any row gets added to all rows — with NULL for the ones that didn't have it.

The problem: NULL is not the same as "use the schema default." Schema defaults only kick in for omitted columns. An explicit NULL on a NOT NULL column will trip a constraint. An explicit NULL on a column with a CHECK constraint might sneak through (since NULL is UNKNOWN, not FALSE) — but you can't count on it.

Defensive pattern: for every NOT NULL column without a JS-side default, normalize it explicitly before the batch hits the wire. row.status = row.status ?? 'ACTIVE'. One line per required column. Done.

The other gotcha: by default, .update() and .upsert() in supabase-js swallow constraint and RLS rejections and return a null error. The row doesn't change. Your app thinks it succeeded. You find out on a dashboard three days later.

Fix: add Prefer: return=representation to your upsert calls. Forces PostgREST to surface rejections instead of silently dropping them.

.upsert(rows, { onConflict: 'store,shopify_order_id' })
// internally becomes a PATCH with Prefer: return=representation

Why This Matters to Me

I keep reaching for the "simplest thing that works" in early builds — and row-by-row upserts are the simplest thing. They're easy to reason about. But simplest is not always safe when you're syncing thousands of rows against a real schema.

The 50× speedup is nice. What I care about more is the combination: fast, resumable, and loud when it breaks. That's a sync you can actually trust.

P.S. The bookmark pattern turned out to matter more than the batching when a Shopify API timeout hit mid-sync the following week. Picked up from row 1,050. Would have been row 0 without it.