Changed

  • FlexAI Inference vLLM Arguments: The `flexai inference serve` command’s optional `vLLM_arguments` have switched from a “Supported argument list” schema to an “Unsupported argument list” schema. This lets you safely pass most vLLM Engine Arguments when creating an Inference Endpoint. Currently, the only unsupported argument is `--device`, whose value is managed by FlexAI. See the sketch below.
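
A minimal sketch of what this enables. The endpoint name and checkpoint below are placeholders, and the exact positional syntax of `flexai inference serve` is assumed here; `--max-model-len` and `--dtype` are standard vLLM Engine Arguments:

```bash
# Hypothetical invocation: pass vLLM Engine Arguments straight through
# when creating an Inference Endpoint. Under the new schema, any argument
# not on the unsupported list is forwarded to the vLLM engine.
flexai inference serve my-endpoint --max-model-len 4096 --dtype bfloat16

# --device is the one unsupported argument: FlexAI manages its value,
# so passing it would be rejected rather than forwarded.
```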