Every model you and your agent need — one OpenAI-compatible key.
FlexAI Token Factory is an OpenAI-compatible inference API for open text, code, reasoning, vision, and embedding models. Point any OpenAI SDK at tokens.flex.ai and your existing code works unchanged.
Add a billing address and card to create your API key, then pay per token — no packs, no subscriptions. See billing.
Get an API keyRead the quickstartDrop-in OpenAI compatibility
Already have code that calls OpenAI? Change the base URL and key — nothing else.Why Token Factory
OpenAI-compatible
A drop-in for the OpenAI SDK. Swap the base URL and model id — keep your code, prompts, and tools.
Every model, discoverable
Text, code, reasoning, vision, and embedding models behind one key. Filter the live catalog from code with model discovery.
Transparent per-token pricing
Dollar-denominated, per-token billing — no token packs, no subscriptions. Account-level budgets and rate limits.
Start building
Overview
What the API is, what you get, and where everything lives.
Quickstart
Get a key and make your first request in under two minutes.
Guides
Streaming, tool use, vision, embeddings, model discovery, batching, and concurrency.
API reference
Endpoints, the compatibility matrix, errors, and billing.
Scale beyond serverless
The serverless API is the fastest way to start. When you need more, the FlexAI platform also offers dedicated inference endpoints, fine-tuning, training, and private deployments.Explore the platform
Dedicated endpoints, fine-tuning, training, and platform services.
flex.ai
Pricing, the full product story, and scaling to private AI cloud.
Need help? Email support@flex.ai, join our Slack community, or check status.flex.ai.