# Billing & Quotas

> Free credit, how we price requests, rate limits, and what happens when your balance runs out.

## Free credit

Every new account is granted **\$10 of free credit** on signup. No card required — you can evaluate text, vision, and embedding models before paying anything.

When the free credit runs out, your next API call returns `402 Payment Required`. Add a card from the [dashboard](https://tokens.flex.ai/dashboard/billing) to continue; any credit you add is spent down at the same per-model rates.

## How we bill

Dollar-denominated, per-request. We don't sell token bundles or subscriptions; we meter what you use:

| Model type                            | Unit                                                 |
| ------------------------------------- | ---------------------------------------------------- |
| Text / chat / completions / reasoning | Per million input tokens, per million output tokens. |
| Embeddings                            | Per million input tokens.                            |

Per-model pricing is in [the catalog](https://flex.ai/models). Spend appears in the dashboard within a few minutes of each request.

<Note>
  Streaming a chat completion and stopping early still bills for the tokens that were generated before you closed the connection — the model did the work. Set `max_tokens` to cap worst-case spend.
</Note>
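
As a rough illustration of the per-million-token arithmetic, the sketch below estimates the cost of a single request. The prices in the example are made-up placeholders, not catalog rates, and `estimate_cost` is a hypothetical helper rather than part of any SDK.

```python
from decimal import Decimal

TOKENS_PER_PRICING_UNIT = Decimal(1_000_000)  # prices are quoted per million tokens


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price: Decimal,
    output_price: Decimal,
) -> Decimal:
    """Return the dollar cost of one request.

    input_price and output_price are dollars per million tokens.
    For embeddings, pass output_tokens=0.
    """
    total = Decimal(input_tokens) * input_price + Decimal(output_tokens) * output_price
    return total / TOKENS_PER_PRICING_UNIT


# Placeholder prices for illustration only: $0.50 per million input tokens,
# $1.50 per million output tokens.
cost = estimate_cost(2_000, 500, Decimal("0.50"), Decimal("1.50"))
print(f"${cost}")
```

Passing your `max_tokens` value as `output_tokens` gives an upper bound on what a single request can cost under these assumptions.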

## Rate limits

Per-API-key, per-minute. Limits are set by tier:

| Tier                      | Requests per minute |
| ------------------------- | ------------------- |
| Free (default on signup)  | 10                  |
| Elevated (approved users) | 60                  |
| Paid                      | 100                 |

The tier is stored on the key. If you need more than the free tier, request an upgrade from the dashboard or email [support@flex.ai](mailto:support@flex.ai).

Every response from the API carries rate-limit headers:

| Header                           | Meaning                                |
| -------------------------------- | -------------------------------------- |
| `x-ratelimit-limit-requests`     | RPM ceiling for this key.              |
| `x-ratelimit-remaining-requests` | Requests left in the current window.   |
| `x-ratelimit-reset-requests`     | Unix timestamp when the window resets. |
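
One way to use these headers is to pause proactively once the window is exhausted. The sketch below assumes the header values arrive as plain strings in a mapping with lowercase keys (most HTTP clients expose a case-insensitive mapping instead). The helper names are hypothetical.

```python
import time
from typing import Mapping, Optional


def window_exhausted(headers: Mapping[str, str]) -> bool:
    """True when no requests are left in the current rate-limit window."""
    return int(headers["x-ratelimit-remaining-requests"]) <= 0


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Seconds from `now` until the window resets, never negative.

    The reset header is a Unix timestamp; `now` defaults to the current time
    and is injectable so the function is easy to test.
    """
    current = time.time() if now is None else now
    reset_at = float(headers["x-ratelimit-reset-requests"])
    return max(0.0, reset_at - current)


# Example header values, invented for illustration.
example = {
    "x-ratelimit-limit-requests": "10",
    "x-ratelimit-remaining-requests": "0",
    "x-ratelimit-reset-requests": "1700000060",
}
if window_exhausted(example):
    print(seconds_until_reset(example, now=1700000000.0))
```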

When you exceed the limit you get `429 Too Many Requests` with a `Retry-After` header (seconds to wait). Don't retry in a tight loop; wait at least that many seconds before the next attempt.
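
A minimal sketch of honoring `Retry-After` follows. It is written against a hypothetical `send` callable that returns `(status, headers, body)`, so it works with any HTTP client. The attempt cap and the 1-second fallback for a missing header are assumptions of this example, not documented behavior.

```python
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers, response body): a stand-in for whatever
# your HTTP client returns.
Response = Tuple[int, Mapping[str, str], str]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(); on 429, wait Retry-After seconds, then try again.

    Returns the first non-429 response, or the last 429 once
    max_attempts is reached.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        status, headers, _body = response
        if status != 429 or attempt == max_attempts:
            return response
        # Retry-After is documented as seconds to wait. The fallback of
        # 1 second when the header is missing is an assumption.
        sleep(float(headers.get("Retry-After", "1")))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper testable without real delays; in production the default `time.sleep` applies.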

## When credit runs out

Our API returns `402 Payment Required` with this body:

```json theme={null}
{
  "error": {
    "message": "Insufficient credit. Add funds at https://tokens.flex.ai/dashboard/billing to continue.",
    "type": "insufficient_quota",
    "code": "402"
  }
}
```

Once you add credit, subsequent requests succeed immediately — no key regeneration needed.
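
To tell an out-of-credit response apart from other failures, a client can check the status code and the `type` field of the body shown above. The sketch below is one way to do that; `credit_error_message` is a hypothetical helper name.

```python
import json
from typing import Optional


def credit_error_message(status: int, body: str) -> Optional[str]:
    """Return the error message if this is an out-of-credit response, else None."""
    if status != 402:
        return None
    try:
        payload = json.loads(body)
    except ValueError:
        # Body was not valid JSON, so we cannot confirm the error type.
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if isinstance(error, dict) and error.get("type") == "insufficient_quota":
        return error.get("message")
    return None


body = json.dumps({
    "error": {
        "message": "Insufficient credit. Add funds at https://tokens.flex.ai/dashboard/billing to continue.",
        "type": "insufficient_quota",
        "code": "402",
    }
})
message = credit_error_message(402, body)
if message is not None:
    print(message)  # surface this to the operator instead of retrying
```

Unlike a `429`, this condition does not clear on its own, so retrying without adding credit will keep failing.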

## Adding a card

From [dashboard/billing](https://tokens.flex.ai/dashboard/billing):

1. Click **Add payment method**.
2. Enter card details (processed by Stripe; we never see the number).
3. Top up by any amount ≥ \$5. The balance is added to your account immediately.

There's no auto-recharge at launch. You'll get an email when your balance drops below \$5 and again when it hits \$0.
