Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.flex.ai/llms.txt

Use this file to discover all available pages before exploring further.

There are two ways to enumerate the catalog at runtime, and they cover different audiences.
EndpointAuthReturnsUse when
GET /api/modelsNone (public)Every model in the catalog — text, code, reasoning, embeddings, and vision/multimodal. Includes per-token pricing, capability tags, and live health_status.You want one public call that surfaces the whole catalog with capabilities and pricing.
GET /v1/modelsBearer keyText, chat, and embedding models in OpenAI’s Model shape.Your client already speaks OpenAI’s /v1/models.
/v1/models returns the OpenAI-shaped list. For richer metadata — capability tags, health, and per-token pricing — use /api/models.

Calling /api/models

No auth, no headers, no params. Returns a JSON array of model objects.
curl https://tokens.flex.ai/api/models | jq '.[0]'
Each entry follows this shape:
{
  "slug": "qwen-qwen2-5-32b-instruct",
  "model_name": "Qwen/Qwen2.5-32B-Instruct",
  "display_name": "Qwen 2.5 32B Instruct",
  "description": "Qwen 2.5 32B — instruction-tuned text model ...",
  "category": "text",
  "context_window": 32768,
  "max_output": 8192,
  "input_per_mtok": 0.80,
  "output_per_mtok": 0.80,
  "supports": ["chat", "tool_use", "streaming"],
  "pricing": [],
  "health_status": "healthy",
  "provider": "Qwen"
}
The fields you’ll filter on most often:
  • category — one of text, code, reasoning, multimodal, embedding. Coarse-grained.
  • supports[] — capability tags like chat, tool_use, streaming, vision, reasoning, embeddings. This is what to filter on when you care about a specific capability.
  • model_name — the value to pass as model in subsequent API calls. (slug is the dashboard URL form; model_name is the API form.)
  • input_per_mtok / output_per_mtok — per-token pricing (USD per million tokens).
A few filter examples:
# All tool-calling models
curl -s https://tokens.flex.ai/api/models \
  | jq '[.[] | select(.supports | index("tool_use"))]'

# All vision-capable models (accept image input on /v1/chat/completions)
curl -s https://tokens.flex.ai/api/models \
  | jq '[.[] | select(.supports | index("vision"))]'

# Just the names and prices of embedding models
curl -s https://tokens.flex.ai/api/models \
  | jq '[.[] | select(.supports | index("embeddings")) | {model_name, input_per_mtok}]'

Per-capability discovery flow

Each subsection: how to find the models, then the endpoint to call once you have a model_name.

Chat & text

curl -s https://tokens.flex.ai/api/models \
  | jq '[.[] | select(.supports | index("chat")) | .model_name]'
Call with POST /v1/chat/completions. See streaming and tool use for the common patterns.

Vision (image input on chat)

curl -s https://tokens.flex.ai/api/models \
  | jq '[.[] | select(.supports | index("vision")) | .model_name]'
Call with POST /v1/chat/completions and pass image parts in the content array. The full pattern lives in the vision guide.

Embeddings

curl -s https://tokens.flex.ai/api/models \
  | jq '[.[] | select(.supports | index("embeddings")) | .model_name]'
Call with POST /v1/embeddings. See the embeddings guide.

See also

  • Model catalog — human-readable table of every hosted model, with capabilities and pricing.
  • OpenAI compatibility — what’s in and out of /v1/*, including the scope of /v1/models.
  • Billing — how per-token pricing maps to charges.