TLDR: An HTTP 200 from an LLM API tells you the request arrived. It says nothing about whether the model finished. Always check stop_reason and content length before trusting the response.
the setup
I've been building Apollo, my personal AI operating layer, and part of that is a subagent delegation system.
Big tasks go out to MiniMax M2.7 via api.minimax.io/anthropic (my default delegate), with GLM-5.1 via Z.AI as the fallback when MiniMax rate-limits or flakes.
Both expose Anthropic-compatible endpoints — same shape, familiar SDK, easy to wire up.
So I wired them up, shipped, and moved on.
the wall
A while later I'd notice: the delegate was called, I got a 200 back, Apollo reported "done"… and nothing happened.
No file changes. No output. The task just… evaporated.
My first instinct: retry. Same result. Retry again. Still nothing.
I was debugging the wrong layer entirely.
what was actually going on
The 200 was real. The task was a ghost.
Three specific failure shapes I eventually catalogued:
- ZAI: returns 200 with stop_reason: "model_context_window_exceeded" and content[0].text completely empty. The model hit its context ceiling and said so, quietly, inside a 200.
- ZAI (again): returns 200 with content[0].text of zero length and no clear stop reason at all. Root cause still not fully characterized, possibly content-specific. I genuinely don't know why this one happens and I'm not pretending otherwise.
- MiniMax: returns 200 where content has a thinking block but NO text block, and stop_reason: "max_tokens", because I had set max_tokens absurdly low at first. The model used its whole budget just thinking and ran out before it could answer.
That last one I earned.
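Here is a minimal sketch of how those three shapes could be told apart in code. It assumes the response body has already been parsed into a dict with the Anthropic-style stop_reason and content fields described above. The function name and the returned labels are hypothetical, made up for illustration.

```python
from typing import Optional


def classify_failure(body: dict) -> Optional[str]:
    """Return a label for one of the three silent-failure shapes, or None.

    `body` is the parsed JSON of a 200 response. Labels are illustrative only.
    """
    stop_reason = body.get("stop_reason")
    blocks = body.get("content") or []

    # Join every text block; thinking blocks carry no "text" we care about.
    text = "".join(b.get("text", "") for b in blocks if b.get("type") == "text")
    has_thinking = any(b.get("type") == "thinking" for b in blocks)

    if text.strip():
        return None  # visible text exists, so none of the three shapes apply

    if stop_reason == "model_context_window_exceeded":
        return "context_window_exceeded"
    if stop_reason == "max_tokens" and has_thinking:
        return "thinking_ate_budget"
    return "empty_text_unknown_cause"
```

A None here only means "not one of these three shapes". It is not proof that the task succeeded.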
the fix that actually worked
Stop trusting the envelope. Verify the artifact.
I now run this checklist on every delegate call, no exceptions:
- HTTP status is 200 ✓
- stop_reason is end_turn, not model_context_window_exceeded, not max_tokens
- content[].text is non-empty and non-trivial
- If a file was supposed to change, verify the file actually changed
- Stderr log at /tmp/<provider>-*.log is clean
If anything in that list fails: do not retry blind. Diagnose. The 200 is a red herring. The real signal is in stop_reason and content length.
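The checklist could be sketched roughly like this. Treat it as an illustration, not the exact code Apollo runs: the function names, the 20-character threshold for "non-trivial", comparing SHA-256 digests to detect a file change, and reading "clean" as "the log file is empty" are all assumptions.

```python
import glob
import hashlib
import os
from typing import List, Optional

MIN_TEXT_CHARS = 20  # assumed cutoff for "non-trivial"; tune per task


def file_digest(path: str) -> Optional[str]:
    """SHA-256 of the file's bytes, or None if the file does not exist."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def check_delegate_result(
    status: int,
    body: dict,
    target_file: Optional[str] = None,
    digest_before: Optional[str] = None,
    log_glob: Optional[str] = None,
) -> List[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []

    if status != 200:
        problems.append(f"HTTP status was {status}")

    stop_reason = body.get("stop_reason")
    if stop_reason != "end_turn":
        problems.append(f"stop_reason was {stop_reason!r}")

    blocks = body.get("content") or []
    text = "".join(b.get("text", "") for b in blocks if b.get("type") == "text")
    if len(text.strip()) < MIN_TEXT_CHARS:
        problems.append("text content is empty or trivial")

    # Take digest_before with file_digest() before the delegate call.
    if target_file is not None and file_digest(target_file) == digest_before:
        problems.append(f"{target_file} did not change")

    # Assumption: a "clean" stderr log is an empty one.
    if log_glob is not None:
        for path in glob.glob(log_glob):
            if os.path.getsize(path) > 0:
                problems.append(f"stderr log not empty: {path}")

    return problems
```

Returning a list of problems instead of a bare boolean keeps the "diagnose, don't retry blind" step cheap: the caller can log exactly which check failed.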
why this matters to me
I'd been thinking about "did the API call succeed" when I should have been thinking about "did the task succeed."
Those are different questions. HTTP is a transport layer. It knows nothing about whether the LLM finished what you asked.
Every builder who delegates to an LLM subagent — whether it's MiniMax, ZAI, or any Anthropic-compatible endpoint — will hit this exact trap. The response looks fine. The job isn't done.
Check stop_reason. Check the content. Then trust the result.
P.S. The claude -p CLI has the same trap: exit code 0 is not a finished turn. Check is_error == false AND stop_reason in (None, "end_turn") AND a non-empty .result before you parse anything.
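
A rough sketch of that P.S. check, assuming the CLI's stdout has been captured as a single JSON object with the is_error, stop_reason, and result fields named above. The function name is hypothetical, and the exact output shape may differ between CLI versions.

```python
import json


def parse_claude_result(stdout: str) -> str:
    """Return the result text, or raise ValueError if the turn did not finish."""
    payload = json.loads(stdout)

    # A missing is_error is treated as a failure, same as is_error == true.
    if payload.get("is_error") is not False:
        raise ValueError("is_error is not false")

    stop_reason = payload.get("stop_reason")
    if stop_reason not in (None, "end_turn"):
        raise ValueError(f"unexpected stop_reason: {stop_reason!r}")

    result = payload.get("result")
    if not isinstance(result, str) or not result.strip():
        raise ValueError("result is missing or empty")

    return result
```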