You're ready to get started!
You’ve learned how to upload a Dataset and then use it to run a Training Job using training code hosted on a public GitHub repository.
You can retrieve checkpoints generated by FlexAI’s Managed Checkpoints at any point, which lets you return to an earlier state of a Training Job to resume training, test your model, or use it for inference.
You can list all available checkpoints for a specific Training Job by running the flexai training checkpoints command:
flexai training checkpoints quickstart-training-job
This will return a table with a list of Checkpoint IDs and their corresponding creation timestamps, similar to the following:
 ID                                   │ TIMESTAMP
──────────────────────────────────────┼────────────────────────────────────
 50e5ec69-32b6-e483-9c49-38a73cc34294 │ 2025-06-30 12:42:55.214 +0100 WEST
 82d21263-8ba8-dd73-9c61-732d3b7b0adc │ 2025-06-30 12:43:01.77 +0100 WEST
 32d07a60-61cc-4598-b4f6-2073a4f8d0af │ 2025-06-30 12:43:14.734 +0100 WEST
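If you want to script this step, the Checkpoint IDs can be pulled out of the table with standard tools. The following is a minimal sketch, not part of the FlexAI CLI: checkpoint_ids is a hypothetical helper name, and it assumes the IDs are printed as lowercase UUIDs, as in the sample output above.

```shell
# Hypothetical helper (not a FlexAI command): read the checkpoints table
# from stdin and print one Checkpoint ID per line, in the order listed.
# Assumes IDs are lowercase UUIDs, as in the sample output.
checkpoint_ids() {
  # -a: treat input as text, -o: print only the match, -E: extended regex
  grep -aoE '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'
}

# Example usage, assuming the last row is the most recent checkpoint
# (as it is in the sample output):
#   flexai training checkpoints quickstart-training-job | checkpoint_ids | tail -n 1
```

The helper matches only the ID pattern, so it does not depend on the exact column separators or widths of the table.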
Once you have the desired Checkpoint ID, you can download it to your host machine using the flexai checkpoint fetch command:
flexai checkpoint fetch 32d07a60-61cc-4598-b4f6-2073a4f8d0af
Writing in: /home/diego/ckpt.pt
Progress: 0.4% (1.31 MB / 343.79 MB)
// ...
Progress: 100% (343.79 MB / 343.79 MB)
You can use this checkpoint file to resume training from the exact point it was saved, or to evaluate the model’s performance on a validation dataset.
Any data written to the /output directory will be compressed into a zip file and made available to you via the flexai training fetch command:
flexai training fetch quickstart-training-job
This will download a .zip file to the current working directory on your host machine.
Once extracted, you’ll get a local directory named output. It will contain any files written to the /output directory by the training scripts.
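Extraction can be scripted as well. The following is a minimal sketch, assuming the standard unzip utility is installed on your host machine. extract_training_output is a hypothetical helper name, and the archive name in the usage comment is a placeholder, since the actual file name depends on your Training Job.

```shell
# Hypothetical helper (not a FlexAI command): extract a fetched archive
# into a target directory (defaults to the current directory) and list
# the contents of the resulting output directory.
extract_training_output() {
  archive="$1"
  dest="${2:-.}"
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi
  # -q: quiet, -o: overwrite existing files, -d: destination directory
  unzip -q -o "$archive" -d "$dest" && ls -R "$dest/output"
}

# Example usage; replace the placeholder with the name of the downloaded file:
#   extract_training_output <downloaded-file>.zip
```

The check for a missing archive is there so that a failed or skipped download produces a clear message instead of an unzip error.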
You now have the knowledge required to create and run your own Training Jobs on FlexAI by integrating your own public or private Code Repositories and loading your datasets.
Private Code Repositories
You can use any public or private GitHub repository as the source of your training code when using the --repository-url flag.
However, you can also use the flexai code-registry command to connect your GitHub account to FlexAI and use any of your private repositories as well.
Dataset Upload
FlexAI makes it easy to upload Datasets from your host machine through the flexai dataset push command.
But wait, there’s more! You can also push Datasets from remote sources, such as S3, GCS, MinIO or R2.
Interactive Training
With FlexAI you can use the flexai training debug-ssh command to run an “Interactive Training Job session” that allows you to SSH into a Training Environment where you have access to the entire system.
This is useful for debugging and testing purposes, allowing you to test your training code in the environment it’ll be running on, reducing iteration times.
CLI Command Reference
Explore the CLI Command Reference pages to learn about all the ways you can use the FlexAI CLI to manage your workloads.
You will find a page for each CLI Command along with each of its subcommands, example usage, recommendations, flags you can use, output messages, and more!