Defines the scaling policy for an Inference Endpoint by setting the minimum and maximum number of replicas.

Usage

flexai inference scale <inference_endpoint_name> [flags]

Arguments

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| inference_endpoint_name | string | Yes | The name of the Inference Endpoint to scale. |

Flags

| Flag | Short | Type | Description |
| --- | --- | --- | --- |
| --help | -h | boolean | Displays this help page. |
| --max-replicas | | integer | The maximum number of replicas to scale to. |
| --min-replicas | | integer | The minimum number of replicas to maintain. |
| --verbose | -v | boolean | Provides more detailed output when scaling an Inference Endpoint. |
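
Example

The following is a minimal sketch of how the argument and the two replica flags might be combined. The endpoint name `my-endpoint` and the replica counts are hypothetical placeholders, and the sketch assumes both flags can be passed in a single invocation. It only prints the command so that it can be reviewed before being run.

```shell
# Hypothetical values: replace with your own endpoint name and replica bounds.
ENDPOINT_NAME="my-endpoint"
MIN_REPLICAS=1
MAX_REPLICAS=4

# Print the scale command for an endpoint name, a minimum, and a maximum.
# Assumes --min-replicas and --max-replicas may be given together.
build_scale_cmd() {
  printf 'flexai inference scale %s --min-replicas %s --max-replicas %s\n' "$1" "$2" "$3"
}

# Show the command without executing it.
build_scale_cmd "$ENDPOINT_NAME" "$MIN_REPLICAS" "$MAX_REPLICAS"
```

Once the printed command looks right, it can be run directly in a terminal, optionally with `--verbose` for more detailed output.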