Checking the Training Job's Details

To view a Training Job's details, select the gear icon ⚙️ (labeled Configure) in the Actions field of the Training Jobs list page. This opens a “Details” drawer with the Details tab selected by default, showing all the relevant information about your Training Job.

| Field | Description |
| --- | --- |
| Name | The name you assigned to the Training Job. |
| Status | The current status of the Training Job (e.g., pending, scheduling, building, in progress, succeeded, failed, stopped, etc.). |
| Created At | The timestamp when the Training Job was created. |

| Field | Description |
| --- | --- |
| Dashboard URL | The URL of the Training Job dashboard, where you can monitor the performance and resource usage of your Training Job. |
| Tensorboard Dashboard URL | The URL of the FlexAI-hosted TensorBoard dashboard, where you can visualize the training process of your models. |
| Node Count | The number of nodes allocated to the Training Job. |
| Accelerator Count | The number of accelerators (GPUs) allocated to the Training Job. |
| Repository URL | The URL of the Git repository containing your training code. |
| Repository Revision | The specific commit or branch of the repository that was used to create the Training Job. |
| Repository Revision SHA | The SHA hash of the commit that was used to create the Training Job, as identified by the Repository Revision. |
| Entry Point | The entry point script along with its arguments. |
| Datasets | The datasets that were attached to the Training Job. |
| Environment | The environment variables and secrets that were set for the Training Job. Displayed in a Key-Value pair format, where the Key is the name of the environment variable within the Training Runtime and the Value is either the raw value (for Environment Variables) or the name of the FlexAI secret containing the secret value. |
| Checkpoints | The checkpoints that were created during the Training Job. These are stored in the FlexAI object storage and can be used to resume training or to create an Inference Endpoint (depending on the type of model). |
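Because each Key in the Environment field is the name of an environment variable within the Training Runtime, an entry point script can read it with the standard library. The following is a minimal sketch, not FlexAI-specific code: `read_setting`, `BATCH_SIZE`, and `HF_TOKEN` are hypothetical names chosen for illustration, and it assumes that secrets reach the script as ordinary environment variables holding the secret's value.

```python
import os


def read_setting(name, default=None, env=None):
    """Return the value of an environment variable, or a default if unset.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to exercise outside the Training Runtime.
    """
    env = os.environ if env is None else env
    value = env.get(name, default)
    if value is None:
        # No value and no default: fail early with a clear message.
        raise KeyError(f"Required environment variable {name!r} is not set")
    return value


if __name__ == "__main__":
    # BATCH_SIZE and HF_TOKEN are hypothetical keys, used only as examples.
    batch_size = int(read_setting("BATCH_SIZE", default="32"))
    token_is_set = "HF_TOKEN" in os.environ  # never print secret values
    print(f"batch_size={batch_size}, HF_TOKEN set: {token_is_set}")
```

Comparing what the script reads against the Key-Value pairs shown in the Environment field is a quick way to confirm that a Training Job was created with the configuration you intended.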

The Details drawer also contains a Logs tab, which shows real-time logs from your Training Job so you can monitor its activity and troubleshoot any issues that arise. See the “Monitoring a Training Job’s Progress” page for more information on how to use the logs and on further monitoring options.