Inference-ready Checkpoints

FlexAI Checkpoints can be used not only to resume Training and Fine-tuning jobs but also to deploy Inference Endpoints.

When a Checkpoint is marked as Inference Ready, it means that the Checkpoint contains all the necessary files and metadata required to deploy an Inference Endpoint directly from it.

Currently, the FlexAI Checkpoint Manager automatically extracts metadata from Checkpoints created with the Hugging Face Transformers library. Such Checkpoints include the trainer_state.json and config.json files, which describe the training process and the model configuration. The Checkpoint Manager uses this metadata to determine whether a Checkpoint contains the information required to be marked as Inference Ready and deployed as an Inference Endpoint:

  • STEP, TRAIN LOSS & EVAL LOSS: Extracted from trainer_state.json’s log_history field (last entry).
  • MODEL: Determined from config.json’s architectures field.
  • VERSION: Retrieved from config.json’s transformers_version field.
  • INFERENCE READY: Set to true if the architectures field is present in config.json.
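The extraction rules above can be sketched in a few lines of Python. The file names (trainer_state.json, config.json) and field names (log_history, architectures, transformers_version) come from the documentation; the function name and the returned dictionary shape are illustrative assumptions, not part of the FlexAI API.

```python
import json
from pathlib import Path

def extract_checkpoint_metadata(checkpoint_dir: str) -> dict:
    """Illustrative sketch of the metadata extraction described above."""
    ckpt = Path(checkpoint_dir)
    metadata = {}

    # STEP, TRAIN LOSS & EVAL LOSS: last entry of log_history in trainer_state.json.
    trainer_state = json.loads((ckpt / "trainer_state.json").read_text())
    last_entry = trainer_state.get("log_history", [{}])[-1]
    metadata["step"] = last_entry.get("step")
    metadata["train_loss"] = last_entry.get("loss")
    metadata["eval_loss"] = last_entry.get("eval_loss")

    # MODEL and VERSION: architectures and transformers_version in config.json.
    config = json.loads((ckpt / "config.json").read_text())
    architectures = config.get("architectures")
    metadata["model"] = architectures[0] if architectures else None
    metadata["version"] = config.get("transformers_version")

    # INFERENCE READY: true when the architectures field is present.
    metadata["inference_ready"] = architectures is not None

    return metadata
```

For example, a checkpoint directory whose config.json lists `"architectures": ["LlamaForCausalLM"]` would be reported as Inference Ready with MODEL set to LlamaForCausalLM.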

A step-by-step guide for deploying an Inference Endpoint from an Inference-ready Checkpoint will be available soon. For now, please use the FlexAI CLI.