Point your OpenAI client at https://tokens.flex.ai/v1 and your existing code works unchanged.
- Quickstart. Get a key and make your first request in under two minutes.
- Model catalog. Browse every model we host, with context windows and pricing.
- Streaming. Stream responses token by token, with usage tracking.
- Vision. Send images to multimodal models in the OpenAI image_url shape.
- Tool use. Call functions from model responses.
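As a first-request sketch: the request below follows the OpenAI chat-completions shape against the production base URL. The FLEX_API_KEY environment variable and the "example-model" id are placeholders, not names defined by this API; pick a real id from the model catalog.

```python
import json
import os
import urllib.request

# Production base URL; the key comes from your dashboard.
API_BASE = "https://tokens.flex.ai/v1"
API_KEY = os.environ.get("FLEX_API_KEY", "sk-...")  # sk-... is a placeholder

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-shaped chat completions request against the Flex API."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("example-model", "Say hello")  # placeholder model id
# Sending it requires a real key and model id:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```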
## What you get
- One API key, every model. Chat, completions, images, video, and audio all authenticate with the same sk-… bearer token.
- Per-key budgets and rate limits. Every key has its own spend cap and requests-per-minute limit. Exceed either and you get an unambiguous 402 or 429 with headers explaining why.
- Dollar-denominated credits. Top up with a credit card; spend down at per-model rates. No token packs, no subscriptions.
- $10 of free credit on signup so you can evaluate the API before adding a card.
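Since 402 and 429 mean different things here (exhausted spend cap versus a per-minute limit), a client can branch on them. A minimal sketch, assuming only the standard HTTP Retry-After header; the doc above does not specify the exact diagnostic headers returned:

```python
def next_action(status: int, headers: dict[str, str]) -> str:
    """Map key-level billing/limit responses to a client action.

    402 -> the key's spend cap is exhausted: top up credits.
    429 -> the key's requests-per-minute limit was hit: back off and retry.
    Retry-After is the standard HTTP header; the exact headers this API
    returns alongside 402/429 are not specified here.
    """
    if status == 402:
        return "top-up"
    if status == 429:
        wait = headers.get("Retry-After", "1")
        return f"retry-after-{wait}s"
    return "ok"
```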
## OpenAI compatibility
The surface is intentionally identical to OpenAI’s for the endpoints we support. If your code runs against api.openai.com, swap the base_url and the model id; that’s the whole migration. See the compatibility matrix for what’s in and out at launch.
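Concretely, the base_url swap looks like this with the official OpenAI Python client. The FLEX_API_KEY variable and "example-model" id are placeholders for illustration:

```python
import os

# The entire migration from api.openai.com: change base_url and pick a model
# id from the Flex catalog. Everything else in your client code stays as-is.
FLEX_BASE_URL = "https://tokens.flex.ai/v1"

def client_kwargs() -> dict:
    """Constructor arguments for the OpenAI Python client, pointed at Flex."""
    return {
        "base_url": FLEX_BASE_URL,
        "api_key": os.environ.get("FLEX_API_KEY", "sk-..."),  # placeholder key
    }

# With the `openai` package installed, usage is unchanged from OpenAI:
# from openai import OpenAI
# client = OpenAI(**client_kwargs())
# resp = client.chat.completions.create(
#     model="example-model",  # placeholder: pick an id from the model catalog
#     messages=[{"role": "user", "content": "Hello"}],
# )
```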
## Where things live
| Surface | URL |
|---|---|
| Production API | https://tokens.flex.ai |
| Staging API | https://tokens.flexsystems.ai |
| Dashboard (keys, billing, usage) | https://tokens.flex.ai/dashboard |
| Status | https://status.flex.ai |
| Docs (this site) | https://docs.flex.ai/inference-api |