# Billing & Quotas

> Free credit, how we price requests, rate limits, and what happens when your balance runs out.

## Getting started

Every new account starts with **\$10 of free credit** to evaluate the models. Before you can create API keys, add a billing address and a credit card from the [dashboard](https://tokens.flex.ai/dashboard/billing) (see [Adding a card](#adding-a-card)). You're billed per token at per-model rates, drawing down your free credit first; once it's used up, top up to keep requests flowing.

## How we bill

Billing is dollar-denominated and metered per request. We don't sell token bundles or subscriptions; each request is charged for the tokens it uses:

| Model type                            | Unit                                                 |
| ------------------------------------- | ---------------------------------------------------- |
| Text / chat / completions / reasoning | Per million input tokens, per million output tokens. |
| Embeddings                            | Per million input tokens.                            |

Per-model pricing is in [the catalog](https://flex.ai/models). Spend appears in the dashboard within a few minutes of each request.
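
To make the per-million-token units concrete, here is a minimal sketch of estimating what a single request costs. The function name and the rates are hypothetical, chosen only for illustration; real rates come from the catalog.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: str, output_rate: str = "0") -> Decimal:
    """Estimate one request's cost in dollars.

    Rates are dollars per million tokens, passed as strings so Decimal
    keeps them exact. Embeddings bill input tokens only, so output_rate
    defaults to 0.
    """
    input_cost = Decimal(input_tokens) * Decimal(input_rate)
    output_cost = Decimal(output_tokens) * Decimal(output_rate)
    return (input_cost + output_cost) / TOKENS_PER_MILLION


# Hypothetical rates, NOT real catalog prices.
chat_cost = estimate_cost(12_000, 3_000, input_rate="0.50", output_rate="1.50")
embedding_cost = estimate_cost(200_000, 0, input_rate="0.02")
```

With these made-up rates, the chat request works out to about \$0.0105 and the embedding request to \$0.004.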

<Note>
  Streaming a chat completion and stopping early still bills for the tokens that were generated before you closed the connection — the model did the work. Set `max_tokens` to cap worst-case spend.
</Note>
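
One way to choose a `max_tokens` value is to work backwards from a per-request budget. The sketch below assumes output tokens are the only part of the bill that `max_tokens` bounds; the helper name and the rates are hypothetical.

```python
from decimal import Decimal, ROUND_FLOOR

TOKENS_PER_MILLION = Decimal(1_000_000)


def max_tokens_for_budget(budget: str, input_tokens: int,
                          input_rate: str, output_rate: str) -> int:
    """Largest max_tokens whose worst-case cost stays within budget.

    budget is in dollars; rates are dollars per million tokens.
    Returns 0 when the input tokens alone already use up the budget.
    """
    out_rate = Decimal(output_rate)
    if out_rate <= 0:
        raise ValueError("output_rate must be positive")
    input_cost = Decimal(input_tokens) * Decimal(input_rate) / TOKENS_PER_MILLION
    remaining = Decimal(budget) - input_cost
    if remaining <= 0:
        return 0
    tokens = remaining * TOKENS_PER_MILLION / out_rate
    # Round down so the cap never exceeds the budget.
    return int(tokens.to_integral_value(rounding=ROUND_FLOOR))


# Hypothetical rates: $0.50 per million input, $1.50 per million output.
cap = max_tokens_for_budget("0.01", input_tokens=2_000,
                            input_rate="0.50", output_rate="1.50")
```

Here a one-cent budget with a 2,000-token prompt leaves room for a `max_tokens` of 6,000 at the assumed rates.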

## Rate limits

Rate limits apply per API key, per minute, and are set by tier:

| Tier                     | Requests per minute                                        |
| ------------------------ | ---------------------------------------------------------- |
| Free (default on signup) | 10                                                         |
| Paid                     | 100                                                        |
| Custom                   | [Contact FlexAI](mailto:support@flex.ai) for higher limits |

The tier is stored on the key. If you need more than the free tier, request an upgrade from the dashboard or email [support@flex.ai](mailto:support@flex.ai).
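
If you'd rather not hit the ceiling at all, you can pace requests on the client. The sketch below spaces calls evenly at `60 / RPM` seconds; the `Pacer` class is hypothetical and assumes nothing about how the server counts its window beyond the per-minute limit above.

```python
import time


class Pacer:
    """Spaces calls evenly so a key stays at or under its RPM ceiling."""

    def __init__(self, requests_per_minute: int,
                 clock=time.monotonic, sleep=time.sleep):
        if requests_per_minute <= 0:
            raise ValueError("requests_per_minute must be positive")
        self.interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self) -> float:
        """Block until the next request may be sent; return seconds slept."""
        now = self._clock()
        if self._next_allowed is None:
            delay = 0.0
        else:
            delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        self._next_allowed = now + delay + self.interval
        return delay
```

Call `pacer.wait()` before each request. With `Pacer(10)` (the free tier), requests go out no faster than one every 6 seconds. Even pacing is conservative: it gives up bursts in exchange for never tripping the limit.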

Every response from the API carries rate-limit headers:

| Header                           | Meaning                                |
| -------------------------------- | -------------------------------------- |
| `x-ratelimit-limit-requests`     | RPM ceiling for this key.              |
| `x-ratelimit-remaining-requests` | Requests left in the current window.   |
| `x-ratelimit-reset-requests`     | Unix timestamp when the window resets. |
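
As an illustration, these headers can be read into a small summary on the client. This is a sketch, not part of any SDK: `read_rate_limit` is a hypothetical helper, and it assumes the header values are plain numeric strings as described in the table.

```python
import time
from typing import Mapping, Optional


def read_rate_limit(headers: Mapping[str, str],
                    now: Optional[float] = None) -> dict:
    """Summarise the rate-limit headers from one API response."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    now = time.time() if now is None else now
    reset_at = float(lowered["x-ratelimit-reset-requests"])  # Unix timestamp
    return {
        "limit": int(lowered["x-ratelimit-limit-requests"]),
        "remaining": int(lowered["x-ratelimit-remaining-requests"]),
        "seconds_until_reset": max(0.0, reset_at - now),
    }
```

When `remaining` reaches 0, pausing for `seconds_until_reset` before the next call avoids a `429` altogether.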

When you exceed the limit you get `429 Too Many Requests` with a `Retry-After` header (seconds to wait). Don't retry in a tight loop; wait at least that many seconds before sending the next request.
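
A minimal sketch of honoring `Retry-After` on a `429` might look like the following. `send` is a hypothetical callable standing in for your HTTP client; it is assumed to return a `(status, headers, body)` tuple. The attempt cap and the fallback wait are illustrative choices, not values the API prescribes.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def call_with_backoff(send: Callable[[], Response],
                      max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep,
                      default_wait: float = 1.0) -> Response:
    """Call send(), waiting out Retry-After whenever it returns a 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        status, headers, _body = response
        # Return anything that is not a 429, or give up after the last try.
        if status != 429 or attempt == max_attempts:
            return response
        lowered = {name.lower(): value for name, value in headers.items()}
        try:
            wait = float(lowered.get("retry-after", default_wait))
        except ValueError:
            wait = default_wait  # header present but not a number of seconds
        sleep(max(0.0, wait))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.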

## When credit runs out

When your balance reaches \$0, the API returns `402 Payment Required` with this body:

```json
{
  "error": {
    "message": "Insufficient credit. Add funds at https://tokens.flex.ai/dashboard/billing to continue.",
    "type": "insufficient_quota",
    "code": "402"
  }
}
```

Once you add credit, subsequent requests succeed immediately — no key regeneration needed.
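
Because retrying a `402` won't help until credit is added, it can be useful to tell it apart from transient errors. The sketch below checks for the error shape shown above; `is_out_of_credit` is a hypothetical helper, and it assumes the body is the raw JSON string from the response.

```python
import json


def is_out_of_credit(status: int, body: str) -> bool:
    """True when a response matches the insufficient-credit error above."""
    if status != 402:
        return False
    try:
        error = json.loads(body).get("error", {})
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object.
        return False
    return isinstance(error, dict) and error.get("type") == "insufficient_quota"
```

A caller might stop its retry loop and surface the `message` field to an operator when this returns `True`.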

## Adding a card

From [dashboard/billing](https://tokens.flex.ai/dashboard/billing):

1. Click **Add payment method**.
2. Enter card details (processed by Stripe; we never see the number).
3. Top up by any amount ≥ \$5. The balance is added to your account immediately.

There's no auto-recharge at launch. You'll get an email when your balance drops below \$5 and again when it hits \$0.