Given a list of messages, returns a model-generated response. Supports streaming (SSE), tool use, and (on some models) vision input.
Streaming: set stream: true and stream_options: { include_usage: true }
to receive token-by-token deltas. The final chunk (the one with finish_reason
set) carries the usage block; you need it for accurate cost tracking.
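A minimal sketch of the streaming flow described above, in Python. It assumes the stream follows the common OpenAI-style SSE layout: each event is a line starting with data:, the text lives in choices[].delta.content, and the stream ends with a data: [DONE] sentinel. That chunk layout and the helper names build_request and consume_stream are illustrative assumptions, not taken from this reference.

```python
import json
from typing import Iterable, Optional, Tuple


def build_request(model: str, messages: list) -> dict:
    """Request body with streaming and usage reporting enabled."""
    return {
        "model": model,
        "messages": messages,
        "stream": True,
        "stream_options": {"include_usage": True},
    }


def consume_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Collect streamed text and the usage block from SSE lines.

    Assumes OpenAI-style chunks; adjust the field names if the
    actual payload differs.
    """
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            piece = (choice.get("delta") or {}).get("content")
            if piece:
                text_parts.append(piece)
        if chunk.get("usage"):
            # Keep the usage block for cost tracking.
            usage = chunk["usage"]
    return "".join(text_parts), usage
```

In practice the lines would come from the HTTP response body (for example, a line iterator from your HTTP client). Store the returned usage dict alongside the request for cost accounting.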
Virtual API key. Create one from the FlexAI dashboard. Pass as Authorization: Bearer sk-xxxx.
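A small sketch of building the header in Python. The environment variable name FLEXAI_API_KEY and the helper auth_headers are hypothetical, chosen only for illustration; the Content-Type header is an assumption for a JSON request body.

```python
import os
from typing import Optional


def auth_headers(api_key: Optional[str] = None) -> dict:
    """Return request headers carrying the virtual API key as a Bearer token.

    Falls back to the FLEXAI_API_KEY environment variable (a hypothetical
    name) when no key is passed explicitly.
    """
    key = api_key or os.environ.get("FLEXAI_API_KEY")
    if not key:
        raise ValueError("no API key provided")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",  # assumed: JSON request body
    }
```

Reading the key from the environment keeps it out of source control; pass it explicitly only in tests.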
Model id. See GET /v1/models for available models.
"Qwen/Qwen2.5-32B-Instruct"
0 <= x <= 2
0 <= x <= 1
x >= 1
none, auto, required