Command: checkpoint
Checkpoints are a FlexAI entity that represents a snapshot of a model’s state at a given point in time.
Checkpoints capture a model’s state at various stages of training. These snapshots include model weights, optimizer state, and other relevant training data. This allows you to resume training from a specific point, preventing data loss and enabling experimentation with different training paths while helping you avoid unnecessarily repeating training iterations.
Checkpoints can be pushed to FlexAI directly from the host machine running the FlexAI CLI or from a Remote Storage Provider Connection, such as Amazon S3, Cloudflare R2, or GCP Cloud Storage, among others. They can be individual files or entire directories.
You will find more information about Managed Checkpoints and the benefits they bring to your AI training workflows in the Managed Checkpoints page.
You can manage Checkpoints using the flexai checkpoint
set of subcommands.
Available subcommands
Section titled “Available subcommands”flexai checkpoint delete
- Deletes a Checkpoint.flexai checkpoint export
- Exports a Checkpoint to a Storage Provider.flexai checkpoint fetch
- Fetches a Checkpoint.flexai checkpoint inspect
- Inspects a Checkpoint’s metadata.flexai checkpoint list
- Lists Checkpoints.flexai checkpoint push
- Pushes a Checkpoint to FlexAI.