
Defines scaling policies for an Inference Endpoint.

Usage

flexai inference scale <inference_endpoint_name> [flags]

Arguments

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| inference_endpoint_name | string | Yes | The name of the Inference Endpoint to scale. |

Flags

| Flag | Short | Type | Description |
| --- | --- | --- | --- |
| --help | -h | boolean | Displays this help page. |
| --max-replicas | | integer | The maximum number of replicas to scale to. |
| --min-replicas | | integer | The minimum number of replicas to maintain. |
| --verbose | -v | boolean | Provides more detailed output when scaling an Inference Endpoint. |
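
Example

As an illustrative sketch (the endpoint name my-endpoint and the replica counts are hypothetical), setting a scaling policy of one to four replicas might look like:

```shell
# Scale the hypothetical Inference Endpoint "my-endpoint":
# keep at least 1 replica warm and allow scaling up to 4.
flexai inference scale my-endpoint --min-replicas 1 --max-replicas 4

# The same command with detailed output enabled.
flexai inference scale my-endpoint --min-replicas 1 --max-replicas 4 --verbose
```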