# Inference-ready Checkpoints

> Mark and manage checkpoints ready for inference deployment

FlexAI Checkpoints can be used not only to resume Training and Fine-tuning jobs but also to deploy Inference Endpoints.

A Checkpoint marked as **Inference Ready** contains all the files and metadata required to deploy an Inference Endpoint directly from it.

## Checkpoint Metadata Extraction Process

Currently, the FlexAI Checkpoint Manager automatically extracts metadata from Checkpoints created with the Hugging Face Transformers library. This metadata determines whether a Checkpoint can be marked as **Inference Ready** and deployed as an Inference Endpoint.

### Hugging Face Transformers Checkpoints

The FlexAI runtime supports Hugging Face Transformers checkpoints, which include the `trainer_state.json` and `config.json` files. These files hold metadata about the training process and the model configuration, and the Checkpoint fields are derived from them as follows:

* `STEP`, `TRAIN LOSS` and `EVAL LOSS`: Extracted from the last entry of the `log_history` field in `trainer_state.json`.
* `MODEL`: Determined from `config.json`'s `architectures` field.
* `VERSION`: Retrieved from `config.json`'s `transformers_version` field.
* `INFERENCE READY`: Set to `true` if the `architectures` field is present in `config.json`.
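
The mapping above can be sketched in a few lines of Python. This is a minimal illustration, not the Checkpoint Manager's actual implementation: the function name `extract_checkpoint_metadata` is hypothetical, the `step`, `loss` and `eval_loss` keys are the ones the Hugging Face `Trainer` usually writes to `log_history`, and taking the first entry of `architectures` as the model name is an assumption.

```python
import json
from pathlib import Path


def extract_checkpoint_metadata(checkpoint_dir):
    """Derive the listing fields from a Hugging Face Transformers checkpoint directory.

    Hypothetical helper: it mirrors the rules described in this section only.
    """
    root = Path(checkpoint_dir)
    meta = {
        "step": None,
        "train_loss": None,
        "eval_loss": None,
        "model": None,
        "version": None,
        "inference_ready": False,
    }

    # STEP, TRAIN LOSS and EVAL LOSS come from the last log_history entry.
    state_path = root / "trainer_state.json"
    if state_path.is_file():
        log_history = json.loads(state_path.read_text()).get("log_history") or []
        if log_history:
            last = log_history[-1]
            meta["step"] = last.get("step")
            meta["train_loss"] = last.get("loss")      # assumed key name
            meta["eval_loss"] = last.get("eval_loss")  # assumed key name

    # MODEL, VERSION and INFERENCE READY come from config.json.
    config_path = root / "config.json"
    if config_path.is_file():
        config = json.loads(config_path.read_text())
        architectures = config.get("architectures")
        if architectures:
            meta["model"] = architectures[0]  # assumption: first entry names the model
        meta["version"] = config.get("transformers_version")
        # Ready as soon as the `architectures` field is present.
        meta["inference_ready"] = "architectures" in config

    return meta
```

In this sketch, a directory without a `config.json` leaves `inference_ready` as `False` and the `model` and `version` fields empty.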

## Deploying an Inference-ready Checkpoint

To deploy an Inference Endpoint from an Inference-ready Checkpoint, follow these steps:

<Tabs>
  <Tab title="Using the FlexAI Console">
    Available soon. For now, please use the FlexAI CLI.
  </Tab>

  <Tab title="Using the FlexAI CLI">
    <Steps>
      1. List the Checkpoints associated with a Training or Fine-tuning job.

         ```bash title="Listing Checkpoints for a Training Job" theme={null}
         flexai training checkpoints <training_job_name>
         ```

         The command returns output similar to the following:

         ```text title="List of Checkpoints for a Training Job" {5} theme={null}
             ID                               │ NAME              │ NODE │ STEP │ TRAIN LOSS │ EVAL LOSS │ MODEL              │ VERSION │ INFERENCE READY │ TIMESTAMP
         ─────────────────────────────────────┼───────────────────┼──────┼──────┼────────────┼───────────┼────────────────────┼─────────┼─────────────────┼──────────────────────────
         a494f07f-e183-4a53-a6e6-e7116ca177fd │ checkpoint-250    │ 0    │ 250  │ 0.8438     │           │                    │         │ false           │ 2025-09-25 02:39:42 (1d)
         3784735b-d7b6-4978-bd76-c6e9158d2ecc │ checkpoint-300    │ 0    │ 300  │ 0.7895     │           │                    │         │ false           │ 2025-09-25 02:41:02 (1d)
         7f7fe96a-c649-4c94-bfc7-218e17d392ba │ hf_checkpoint     │ 0    │ 300  │ 0.7895     │           │ MistralForCausalLM │ 4.44.2  │ true            │ 2025-09-25 02:41:02 (1d)
         ```
      2. Create an Inference Endpoint using the `flexai inference serve` command, specifying the Checkpoint's name or UUID with the `--checkpoint` flag.
         The general form of the command is:

         ```bash title="Creating an Inference Endpoint from a Checkpoint" {2} theme={null}
         flexai inference serve <inference_endpoint_name> \
           --checkpoint <checkpoint_name_or_uuid> \
           [<other_inference_args> ...] \
           -- [<vLLM_specific_args> ...]
         ```

         For example, using the UUID of the `hf_checkpoint` row from the listing above:

         ```bash title="Creating an Inference Endpoint from a Checkpoint" {2} theme={null}
         flexai inference serve mistral7b-inference \
           --checkpoint 7f7fe96a-c649-4c94-bfc7-218e17d392ba \
           --model-type mistral \
           --accels 2
         ```
    </Steps>
  </Tab>
</Tabs>
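
When scripting the CLI steps, the Inference-ready rows can be picked out of the listing text. The sketch below is a hypothetical helper, not part of the FlexAI CLI: it assumes the `│`-separated table layout shown in the example output, which may change between CLI versions.

```python
def inference_ready_checkpoints(listing):
    """Return (id, name) pairs for rows whose INFERENCE READY column is 'true'.

    Assumes the table layout from the example output: a header row, a
    separator row, and data rows with cells separated by '│'.
    """
    # The separator row uses '┼' and '─', so it contains no '│' and is skipped.
    rows = [line for line in listing.splitlines() if "│" in line]
    if not rows:
        return []

    header = [cell.strip() for cell in rows[0].split("│")]
    id_col = header.index("ID")
    name_col = header.index("NAME")
    ready_col = header.index("INFERENCE READY")

    ready = []
    for line in rows[1:]:
        cells = [cell.strip() for cell in line.split("│")]
        if len(cells) != len(header):
            continue  # ignore rows that do not match the header layout
        if cells[ready_col] == "true":
            ready.append((cells[id_col], cells[name_col]))
    return ready
```

Applied to the example listing, this would return only the `hf_checkpoint` row, whose UUID can then be passed to the `--checkpoint` flag.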

<Card title="Want to learn more about FlexAI Inference?" href="/core-services/inference">
  Check out the FlexAI Inference Endpoints documentation for more details.
</Card>
