Latest Release: 2025-05-19

Changed

  • FlexAI Inference vLLM Arguments: The flexai inference serve command's optional vLLM_arguments have switched from a "Supported argument list" schema to an "Unsupported argument list" schema. This lets you safely pass most vLLM Engine Arguments when creating an Inference Endpoint.

    Currently, the only unsupported argument is --device, whose value is managed by FlexAI.
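    As an illustration of the new schema, a sketch of passing vLLM Engine Arguments through to an endpoint is shown below. The endpoint name, model reference, and the specific vLLM flags (`--max-model-len`, `--gpu-memory-utilization`) are hypothetical placeholders; consult the FlexAI CLI reference for the exact syntax of flexai inference serve in your version.

    ```shell
    # Hypothetical example: most vLLM Engine Arguments can now be
    # forwarded directly when creating an Inference Endpoint.
    # --device is the one exception; FlexAI sets it for you.
    flexai inference serve my-endpoint \
      --hf-token-secret my-hf-token \
      -- \
      --max-model-len 8192 \
      --gpu-memory-utilization 0.90
    ```

    Arguments after the `--` separator are passed to the vLLM engine; any argument not on the unsupported list (currently just --device) goes through unchanged.
    
    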