Changed

  • FlexAI Inference vLLM Arguments: The `flexai inference serve` command’s optional `vLLM_arguments` have switched from a “Supported argument list” schema to an “Unsupported argument list” schema. This lets you safely pass most vLLM Engine Arguments when creating an Inference Endpoint. Currently, the only unsupported argument is `--device`, whose value is managed by FlexAI. See the sketch below.
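
A minimal sketch of what this enables. The endpoint name and checkpoint below are placeholders, and the exact positional syntax of `flexai inference serve` is assumed here; `--max-model-len` and `--dtype` are standard vLLM Engine Arguments:

```bash
# Hypothetical invocation: pass vLLM Engine Arguments straight through
# when creating an Inference Endpoint. Under the new schema, any argument
# not on the unsupported list is forwarded to the vLLM engine.
flexai inference serve my-endpoint --max-model-len 4096 --dtype bfloat16

# --device is the one unsupported argument: FlexAI manages its value,
# so passing it would be rejected rather than forwarded.
```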