Changed
- FlexAI Inference vLLM Arguments: The `flexai inference serve` command's optional `vLLM_arguments` have switched from a "Supported argument list" schema to an "Unsupported argument list" schema. This lets you safely pass most vLLM Engine Arguments when creating an Inference Endpoint. Currently, the only unsupported argument is `--device`, whose value is handled by FlexAI.
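For illustration, a minimal sketch of what this change allows is shown below. The endpoint name (`my-endpoint`) and the `--` separator before the vLLM arguments are assumptions about the CLI syntax, not confirmed usage; check the FlexAI CLI reference for the exact invocation. The flags after the separator are standard vLLM Engine Arguments.

```bash
# Hypothetical sketch: "my-endpoint" and the "--" separator are assumed,
# not confirmed CLI syntax. --max-model-len and --tensor-parallel-size
# are standard vLLM Engine Arguments, now accepted under the
# "Unsupported argument list" schema.
flexai inference serve my-endpoint -- \
  --max-model-len 8192 \
  --tensor-parallel-size 2

# Would be rejected: --device remains the only unsupported argument,
# since its value is handled by FlexAI.
# flexai inference serve my-endpoint -- --device cuda
```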