Create a chat completion
Given a list of messages, returns a model-generated response. Supports streaming (SSE), tool use, and (on some models) vision input.
Streaming: set stream: true and stream_options: { include_usage: true }
to receive token-by-token deltas. The final chunk (the one with finish_reason
set) carries the usage block; you need it for accurate cost tracking.
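The streaming flow described above can be sketched as follows. This is a minimal illustration, not the official client: it assumes the request field is named messages, that chunks follow the common OpenAI-style shape (choices[].delta.content plus a top-level usage object), and that the stream ends with a data: [DONE] sentinel. None of these details are confirmed by this page, and the helper names build_streaming_body and collect_stream are hypothetical.

```python
import json
from typing import Iterable, Optional, Tuple


def build_streaming_body(model: str, messages: list) -> dict:
    """Request body with streaming and usage reporting enabled."""
    return {
        "model": model,
        "messages": messages,  # assumed field name for the message list
        "stream": True,
        "stream_options": {"include_usage": True},
    }


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Accumulate text deltas and capture the usage block from SSE lines."""
    parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # assumed chunk shape: each choice carries a partial "delta"
            text = (choice.get("delta") or {}).get("content")
            if text:
                parts.append(text)
        if chunk.get("usage"):
            usage = chunk["usage"]  # keep the most recent usage block seen
    return "".join(parts), usage
```

Feeding collect_stream the decoded lines of the HTTP response body yields the full text and the usage block in a single pass, so cost tracking does not need a second request.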
POST
Authorizations
Virtual API key. Create one from the FlexAI dashboard and pass it as Authorization: Bearer sk-xxxx.
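As a rough sketch of how the bearer header might be attached, the snippet below builds (but does not send) a POST request with Python's standard library. The URL is a placeholder: the real host and path are not given on this page, and build_request is a hypothetical helper name.

```python
import json
import urllib.request


def build_request(
    api_key: str,
    body: dict,
    # Placeholder URL: substitute the actual FlexAI endpoint.
    url: str = "https://api.example.com/v1/chat/completions",
) -> urllib.request.Request:
    """Build a JSON POST request carrying the virtual API key as a bearer token."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. Reading the key from an environment variable rather than hard-coding it keeps it out of source control.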
Body
application/json
Model id. See GET /v1/models for available models.
Example:
"Meta-Llama-3.1-8B-Instruct-FP8"
Required range:
0 <= x <= 2
Required range:
0 <= x <= 1
Required range:
x >= 1
Available options:
none, auto, required