stream: true on any /v1/chat/completions request.
Always set
stream_options.include_usage: true when streaming. Without it you cannot bill the request correctly on your side, and per-key spend tracking on ours silently loses information. Every server-sent event framework we recommend below does this by default.Example
Anatomy of the stream
The response istext/event-stream. Each event is a line beginning with data: followed by a JSON object, then a blank line:
delta.contentis the incremental piece — concatenate them to reconstruct the full response.- The penultimate chunk carries
finish_reason(stop,length,tool_calls, orcontent_filter). Itsdeltais typically empty. - The final-before-
[DONE]chunk carriesusage(only when you setstream_options.include_usage: true). This is the one source of truth for token counts; do not try to count tokens client-side.
Tool calls while streaming
Tool call arguments arrive asdelta.tool_calls[].function.arguments fragments that you accumulate the same way as delta.content. See the tool use guide for a full example.