Monitoring a Fine-tuning Job's Progress
Logs
Select the gear icon ⚙️ (labeled Configure) in the Actions field of the Training or Fine-tuning Jobs list page to open the “Details” panel.
Select the Logs tab to view the messages that the runtime environment build process and your own code write to standard output (stdout).
Use the Search bar to filter the logs by keyword and quickly find the relevant entries.
Once a Training Job begins, you can retrieve the messages that the runtime environment build process and your own code write to standard output (stdout) by running the flexai training logs command:
flexai training logs quickstart-training-job
This will output a stream of logs that includes both the FlexAI runtime execution logs and your code’s messages.
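The keyword filtering offered by the Search bar can be approximated on the command line by piping the log stream through grep. The sketch below is a minimal example, not part of the FlexAI CLI: filter_logs is a hypothetical helper name, and it assumes that flexai training logs writes its output to stdout so that it can be piped.

```shell
# Hypothetical helper (not part of the FlexAI CLI): reads log lines from
# stdin and keeps only those containing the given keyword, ignoring case.
filter_logs() {
  grep -i -- "$1"
}

# Usage, assuming the flexai CLI is installed and authenticated and that
# the job from this tutorial exists:
#   flexai training logs quickstart-training-job | filter_logs "error"
```

Because the helper only reads stdin, it works the same way on a saved log file, for example by redirecting the file into filter_logs.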
Infrastructure Metrics
You can monitor the infrastructure metrics of your Fine-tuning Job using the FlexAI Infrastructure Monitor. It gives you insight into the job's resource usage, such as CPU and memory usage, disk I/O, and network traffic.
Access the Infrastructure Monitor at https://dashboards.flex.ai/. See the FlexAI Infrastructure Monitor page to learn more.
TensorBoard
You can also use FlexAI's hosted TensorBoard to visualize the fine-tuning process of your model. TensorBoard provides a suite of tools for inspecting and understanding your Fine-tuning Job's evolution.
Visit https://dashboards.flex.ai/tensorboard and log in using your credentials. Learn more at the FlexAI TensorBoard page.
Next Steps
After a few minutes, your Fine-tuning Job should have completed successfully!
The next step of this Quickstart Tutorial will guide you through the process of getting its outputs or results.