# inference scale

> Scale the number of replicas for an inference endpoint

Defines the scaling policy for an Inference Endpoint by setting the minimum and maximum number of replicas it can run.

## Usage

```bash
flexai inference scale <inference_endpoint_name> [flags]
```

## Arguments

| Argument                  | Type   | Required | Description                                  |
| ------------------------- | ------ | -------- | -------------------------------------------- |
| `inference_endpoint_name` | string | Yes      | The name of the Inference Endpoint to scale. |

## Flags

| Flag             | Short | Type    | Description                                                       |
| ---------------- | ----- | ------- | ----------------------------------------------------------------- |
| `--help`         | `-h`  | boolean | Displays this help page.                                          |
| `--max-replicas` |       | integer | The maximum number of replicas to scale to.                       |
| `--min-replicas` |       | integer | The minimum number of replicas to maintain.                       |
| `--verbose`      | `-v`  | boolean | Provides more detailed output when scaling an Inference Endpoint. |
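
As an illustration of how the argument and flags above fit together, the sketch below assembles a scale command and prints it for review instead of running it. The endpoint name `my-endpoint`, the replica values, and the helper function `build_scale_command` are hypothetical and exist only for this example. Passing both replica flags in a single invocation is also an assumption, not something this page confirms.

```bash
#!/usr/bin/env bash
# Hypothetical values for illustration only; substitute your own endpoint name and bounds.
ENDPOINT_NAME="my-endpoint"
MIN_REPLICAS=1
MAX_REPLICAS=4

# Hypothetical helper: formats the documented argument and flags into one command line.
# It only prints the command; it does not call the flexai CLI.
build_scale_command() {
  printf 'flexai inference scale %s --min-replicas %s --max-replicas %s\n' "$1" "$2" "$3"
}

# Print the assembled command so it can be checked before it is run by hand.
build_scale_command "$ENDPOINT_NAME" "$MIN_REPLICAS" "$MAX_REPLICAS"
```

Appending `--verbose` (or `-v`) to the printed command requests more detailed output when scaling.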