TLDR: Strip sensitive content before it hits your vector DB — not at query time. One LLM pass at ingest beats a thousand prompt instructions at retrieval.

the build

An internal frozen SOP knowledge base built on Next.js 16 needed a way to actually answer questions from the docs.

Not keyword search. Answer. With citations. In a bottom-drawer chat that pops open with Cmd+K.

The source material was a mix of node .md SOP files and raw Loom, Zoom, and Make (an automation platform) call transcripts.

Which meant before any of it touched a database, we had a problem.

the pipeline

Four stages: chunk → strip → embed → store.

Chunk: break each source file into coherent pieces — speaker turns, paragraph breaks, ~600 words per chunk. Nothing exotic here.
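A minimal sketch of the chunk stage, assuming paragraphs and speaker turns are separated by blank lines. The function name chunkText and the greedy packing strategy are illustrative assumptions, not the code from this build.

```typescript
// Hypothetical helper: pack paragraphs into chunks of roughly `maxWords` words.
function chunkText(source: string, maxWords = 600): string[] {
  // Split on blank lines; a speaker turn is assumed to be its own paragraph.
  const paragraphs = source
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current: string[] = [];
  let currentWords = 0;

  for (const paragraph of paragraphs) {
    const words = paragraph.split(/\s+/).length;
    // Close the current chunk if this paragraph would push it over budget.
    if (current.length > 0 && currentWords + words > maxWords) {
      chunks.push(current.join("\n\n"));
      current = [];
      currentWords = 0;
    }
    // A single paragraph longer than maxWords is kept whole as an oversized chunk.
    current.push(paragraph);
    currentWords += words;
  }
  if (current.length > 0) chunks.push(current.join("\n\n"));
  return chunks;
}
```

Packing whole paragraphs keeps a speaker turn from being cut mid-thought, at the cost of chunks that vary in size.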

Strip: this is the interesting one.

Every chunk goes through a Sonnet 4.6 LLM pass before the embedding call. The prompt is blunt: drop anything that isn't systems-level content. Team member names, personal tangents, strategy, opinions — gone. What stays is the operational system knowledge, and nothing else.
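Roughly, the strip stage can be sketched like this. The prompt text paraphrases the rules above and is not the original wording, and Complete is a hypothetical stand-in for whatever LLM client you wire in (it is not a real SDK type). The EMPTY sentinel is also an assumption, added so chunks with nothing operational left can be dropped.

```typescript
// Hypothetical signature for an LLM call: system prompt + user text in, text out.
type Complete = (system: string, user: string) => Promise<string>;

// Illustrative prompt; assumption, not the prompt used in this build.
const STRIP_SYSTEM_PROMPT = [
  "You clean text before it is indexed for retrieval.",
  "Keep only systems-level operational content: steps, tools, settings, rules.",
  "Remove team member names, personal tangents, strategy, and opinions.",
  "Return the cleaned text only. If nothing operational remains, return exactly: EMPTY",
].join("\n");

async function stripChunk(chunk: string, complete: Complete): Promise<string | null> {
  const cleaned = (await complete(STRIP_SYSTEM_PROMPT, chunk)).trim();
  // A chunk with no operational content is dropped, so it is never embedded.
  return cleaned === "" || cleaned === "EMPTY" ? null : cleaned;
}

async function stripAll(chunks: string[], complete: Complete): Promise<string[]> {
  const results = await Promise.all(chunks.map((c) => stripChunk(c, complete)));
  return results.filter((c): c is string => c !== null);
}
```

Only the output of stripAll should ever reach the embedding call. The raw chunk stays on disk and never enters the index.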

Embed: OpenAI text-embedding-3-small, 1536 dimensions. (Started with Voyage — swapped per preference, not a failure. Cleaner single-vendor stack.)
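A sketch of the embed call against OpenAI's REST embeddings endpoint, using plain fetch rather than an SDK. The function name embedChunks, the injectable fetchImpl parameter, and the dimension check are assumptions added for illustration and testability.

```typescript
const EMBEDDING_MODEL = "text-embedding-3-small";

// Minimal shape of the fetch function this sketch needs (assumption, for easy faking).
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function embedChunks(
  texts: string[],
  apiKey: string,
  fetchImpl: FetchLike = fetch,
  expectedDims = 1536,
): Promise<Float32Array[]> {
  const res = await fetchImpl("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model: EMBEDDING_MODEL, input: texts }),
  });
  if (!res.ok) throw new Error(`Embedding request failed with status ${res.status}`);

  const body = (await res.json()) as { data: { index: number; embedding: number[] }[] };
  // Sort by index so vectors line up with the input order.
  return [...body.data]
    .sort((a, b) => a.index - b.index)
    .map((item) => {
      if (item.embedding.length !== expectedDims) {
        throw new Error(`Expected ${expectedDims} dimensions, got ${item.embedding.length}`);
      }
      return Float32Array.from(item.embedding);
    });
}
```

Failing loudly on a dimension mismatch is cheap insurance: a vector table declared with a fixed width will not accept anything else.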

Store: a single data/rag.db file via sqlite-vec (a SQLite extension that stores and queries float vectors). Committed to the repo. Frozen at launch. No external vector service to wrangle.

the wall

Getting this to actually run on Vercel nearly wrecked me.

better-sqlite3 is a native Node addon and sqlite-vec ships as a compiled SQLite extension. Vercel's serverless bundler doesn't know what to do with either. It quietly leaves them out, and the runtime errors you get are completely opaque about why.

What did I try first? Blamed the import. Blamed the module version. Blamed my Next.js config. (Wrong, wrong, wrong.)

the fix

Two things, once I understood what was actually happening:

  1. Mark native modules external in next.config.ts (the serverExternalPackages option) so the bundler stops trying to inline them
  2. Add data/rag.db to outputFileTracingIncludes to explicitly tell Vercel to pull the committed SQLite file into the function bundle
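The two changes can be sketched as a single next.config.ts. The option names are the top-level ones in recent Next.js releases (older releases kept them under experimental). The "/api/chat" key is a hypothetical route path; use whichever route actually opens rag.db.

```typescript
// next.config.ts
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  // Keep native packages out of the server bundle so they are
  // required from node_modules at runtime instead of being inlined.
  serverExternalPackages: ["better-sqlite3", "sqlite-vec"],

  // Copy the committed database into the traced output of the route
  // that reads it. "/api/chat" is an assumed route path.
  outputFileTracingIncludes: {
    "/api/chat": ["./data/rag.db"],
  },
};

export default nextConfig;
```

File tracing only follows imports, and rag.db is opened by path at runtime, so it is invisible to the tracer unless you list it here.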

Both feel obvious in hindsight. Neither is where you'd naturally look in the docs.

why this matters

The strip step is the thing I hadn't thought through until we were mid-build.

Here's the real problem with a "don't reveal private information" system prompt at query time: it's a request. The private content is already embedded. Retrieval can surface it. Synthesis can leak it under the right question — or the wrong one.

Strip at ingest and none of that content exists in the vector space to begin with.

It's the difference between "please don't look at that file" and shredding the file. One is a policy. The other is a fact. The strip step is the shredder.

Build the pipeline in the right order and you never have to worry about it again.

P.S. The whole rag.db is committed and bundled: no external vector service, no vector-store API key, no infra to break at 2am. For a frozen read-only knowledge base, that's EXACTLY the right architecture.