Displays detailed information about a Training Job, including its name, ID, creator, owner, configuration, status, and execution details.
flexai training inspect <training_job_name> [--json]
<training_job_name>: The name of the Training Job to inspect. Example: gpt2training-1
--json: Output the information in JSON format.
flexai training inspect quickstart-training-job
This outputs a detailed view of the Training Job in YAML:
kind: Training
metadata:
  name: quickstart-training-job
  id: 75179cc2-ec63-4f93-b4da-44e49ea86049
  creatorUserID: 16e2894c-c81b-4a15-91d9-0e2aae00a317
  ownerOrgID: 108dddec-e922-49b8-a466-4d7ed5dcc746
config:
  device: nvidia
  nodes: 1
  accelerator: 1
  entrypoint:
    - train.py
    - config/train_shakespeare_char.py
    - --out_dir=/output-checkpoint
    - --max_iters=1500
  datasetsNames:
    - nanoGPT-dataset
  checkpointName: ""
  sourceName: ""
  repositoryURL: https://github.com/flexaihq/nanogpt
  repositoryRevision: main
  secrets: []
  environment: []
runtime:
  status: succeeded
  queuePosition: 0
  repositoryRevisionSha: 116799dbae7b0fe33caf1b90f73a72f84bc32adc
  selectedAgentId: k8s-training-sesterce-001-CLIENT-PROD-client-prod
  lifecycleEvents:
    - type: AgentSelection
      status: ResponseReceived
      message: |-
        Cluster Scheduling result{
         Name: aws-cloud
         AgentID: k8s-training-aws-001-CLIENT-PROD-client-prod
         Response: NoAnswer
         Conditions: [NonSchedulable: NoAnswer]
        }
      raisedAt: "2025-06-30T11:41:54Z"
    - type: AgentSelection
      status: ResponseReceived
      message: |-
        Cluster Scheduling result{
         Name: sesterce-h100-bm-01
         AgentID: k8s-training-sesterce-001-CLIENT-PROD-client-prod
         Response: OK
         Conditions: []
        }
      raisedAt: "2025-06-30T11:41:54Z"
    - type: AgentSelection
      status: ResponseReceived
      message: |-
        Cluster Scheduling result{
         Name: sesterce-h200-bm-01
         AgentID: k8s-training-sesterce-002-CLIENT-PROD-client-prod
         Response: NoAnswer
         Conditions: [NonSchedulable: NoAnswer]
        }
      raisedAt: "2025-06-30T11:41:54Z"
    - type: AgentSelection
      status: ResponseReceived
      message: |-
        Cluster Scheduling result{
         Name: sesterce-l40s-bm-01
         AgentID: k8s-training-sesterce-003-CLIENT-PROD-client-prod
         Response: NoAnswer
         Conditions: [NonSchedulable: NoAnswer]
        }
      raisedAt: "2025-06-30T11:41:54Z"
    - type: AgentSelection
      status: ResponseReceived
      message: |-
        Cluster Scheduling result{
         Name: sesterce-a100-bm-01
         AgentID: k8s-training-sesterce-004-CLIENT-PROD-client-prod
         Response: NoAnswer
         Conditions: [NonSchedulable: NoAnswer]
        }
      raisedAt: "2025-06-30T11:41:54Z"
    - type: AgentSelection
      status: ResponseReceived
      message: |-
        Cluster Scheduling result{
         Name: k8s-training-smc-001
         AgentID: k8s-training-smc-001-CLIENT-PROD-client-prod
         Response: NoAnswer
         Conditions: [NonSchedulable: NoAnswer, OrgNotAuthorized]
        }
      raisedAt: "2025-06-30T11:41:54Z"
    - type: AgentSelection
      status: Completed
      message: Selected agent k8s-training-sesterce-001-CLIENT-PROD-client-prod
      raisedAt: "2025-06-30T11:41:54Z"
    - type: BuildSubmission
      status: Succeeded
      message: Build request sent to flex-agent
      raisedAt: "2025-06-30T11:41:54Z"
    - type: BuildExecution
      status: Succeeded
      message: Build completed with image rg.fr-par.scw.cloud/paas-trainings-client-prod/9f9c379c-8d46-419b-8bf5-d0b0986a6dd9-arch_nvidia-1x1@sha256:0d854f75f698a549d2a8a0e024e930383b885bdac2863ee0cf74ebdc8a8f358c
      raisedAt: "2025-06-30T11:41:54Z"
    - type: TrainingPreparation
      status: Succeeded
      message: Training trainings-client-prod/training-75b79cc2-ec63-4f93-b4da-44e49a4a6049-zqg6d created
      raisedAt: "2025-06-30T11:41:54Z"
    - type: TrainingExecution
      status: InProgress
      message: Training in progress
      raisedAt: "2025-06-30T11:42:00Z"
    - type: TrainingExecution
      status: Succeeded
      message: Training complete, output available
      raisedAt: "2025-06-30T11:43:48Z"
  createdAt: "2025-06-30T11:41:54Z"
  lastUpdate: "2025-06-30T11:43:48Z"
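Each AgentSelection event with status ResponseReceived carries one cluster's scheduling response as free text in its message field. The following is a minimal sketch of pulling the Name, AgentID, Response, and Conditions fields out of such a message with a regular expression. It assumes the message layout shown in this example is stable, which the documentation does not state; `parse_scheduling_message` is a hypothetical helper, not part of the FlexAI CLI.

```python
import re

# Matches "Key: value" lines inside a "Cluster Scheduling result{ ... }" message.
# Assumption: the four keys below are the only ones of interest, as in the example output.
_FIELD = re.compile(r"^[ \t]*(Name|AgentID|Response|Conditions):[ \t]*(.*)$", re.MULTILINE)


def parse_scheduling_message(message: str) -> dict:
    """Return the fields of one scheduling-result message as a dict of strings."""
    return {key: value.strip() for key, value in _FIELD.findall(message)}


# Message text copied from the example output for cluster sesterce-h100-bm-01.
sample = (
    "Cluster Scheduling result{\n"
    " Name: sesterce-h100-bm-01\n"
    " AgentID: k8s-training-sesterce-001-CLIENT-PROD-client-prod\n"
    " Response: OK\n"
    " Conditions: []\n"
    "}"
)

fields = parse_scheduling_message(sample)
print(fields["Name"], fields["Response"])
```

The Conditions value is kept as a raw string (for example `[NonSchedulable: NoAnswer, OrgNotAuthorized]`) because its grammar is not documented here.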
flexai training inspect quickstart-training-job --json
This outputs the same information in JSON format:
{
  "kind": "Training",
  "metadata": {
    "name": "quickstart-training-job",
    "id": "75179cc2-ec63-4f93-b4da-44e49ea86049",
    "creatorUserID": "16e2894c-c81b-4a15-91d9-0e2aae00a317",
    "ownerOrgID": "108dddec-e922-49b8-a466-4d7ed5dcc746"
  },
  "config": {
    "device": "nvidia",
    "nodes": 1,
    "accelerator": 1,
    "entrypoint": [
      "train.py",
      "config/train_shakespeare_char.py",
      "--out_dir=/output-checkpoint",
      "--max_iters=1500"
    ],
    "datasetsNames": [
      "nanoGPT-dataset"
    ],
    "checkpointName": "",
    "sourceName": "",
    "repositoryURL": "https://github.com/flexaihq/nanogpt",
    "repositoryRevision": "main",
    "secrets": [],
    "environment": []
  },
  "runtime": {
    "status": "succeeded",
    "queuePosition": 0,
    "repositoryRevisionSha": "116799dbae7b0fe33caf1b90f73a72f84bc32adc",
    "selectedAgentId": "k8s-training-sesterce-001-CLIENT-PROD-client-prod",
    "lifecycleEvents": [
      {
        "type": "AgentSelection",
        "status": "ResponseReceived",
        "message": "Cluster Scheduling result{\n Name: aws-cloud\n AgentID: k8s-training-aws-001-CLIENT-PROD-client-prod\n Response: NoAnswer\n Conditions: [NonSchedulable: NoAnswer]\n}",
        "raisedAt": "2025-06-30T11:41:54Z"
      },
      {
        "type": "AgentSelection",
        "status": "ResponseReceived",
        "message": "Cluster Scheduling result{\n Name: sesterce-h100-bm-01\n AgentID: k8s-training-sesterce-001-CLIENT-PROD-client-prod\n Response: OK\n Conditions: []\n}",
        "raisedAt": "2025-06-30T11:41:54Z"
      },
      {
        "type": "AgentSelection",
        "status": "ResponseReceived",
        "message": "Cluster Scheduling result{\n Name: sesterce-h200-bm-01\n AgentID: k8s-training-sesterce-002-CLIENT-PROD-client-prod\n Response: NoAnswer\n Conditions: [NonSchedulable: NoAnswer]\n}",
        "raisedAt": "2025-06-30T11:41:54Z"
      },
      {
        "type": "AgentSelection",
        "status": "ResponseReceived",
        "message": "Cluster Scheduling result{\n Name: sesterce-l40s-bm-01\n AgentID: k8s-training-sesterce-003-CLIENT-PROD-client-prod\n Response: NoAnswer\n Conditions: [NonSchedulable: NoAnswer]\n}",
        "raisedAt": "2025-06-30T11:41:54Z"
      },
      {
        "type": "AgentSelection",
        "status": "ResponseReceived",
        "message": "Cluster Scheduling result{\n Name: sesterce-a100-bm-01\n AgentID: k8s-training-sesterce-004-CLIENT-PROD-client-prod\n Response: NoAnswer\n Conditions: [NonSchedulable: NoAnswer]\n}",
        "raisedAt": "2025-06-30T11:41:54Z"
      },
      {
        "type": "AgentSelection",
        "status": "ResponseReceived",
        "message": "Cluster Scheduling result{\n Name: k8s-training-smc-001\n AgentID: k8s-training-smc-001-CLIENT-PROD-client-prod\n Response: NoAnswer\n Conditions: [NonSchedulable: NoAnswer, OrgNotAuthorized]\n}",
        "raisedAt": "2025-06-30T11:41:54Z"
      },
      {
        "type": "AgentSelection",
        "status": "Completed",
        "message": "Selected agent k8s-training-sesterce-001-CLIENT-PROD-client-prod",
        "raisedAt": "2025-06-30T11:41:54Z"
      },
      {
        "type": "BuildSubmission",
        "status": "Succeeded",
        "message": "Build request sent to flex-agent",
        "raisedAt": "2025-06-30T11:41:54Z"
      },
      {
        "type": "BuildExecution",
        "status": "Succeeded",
        "message": "Build completed with image rg.fr-par.scw.cloud/paas-trainings-client-prod/9f9c379c-8d46-419b-8bf5-d0b0986a6dd9-arch_nvidia-1x1@sha256:0d854f75f698a549d2a8a0e024e930383b885bdac2863ee0cf74ebdc8a8f358c",
        "raisedAt": "2025-06-30T11:41:54Z"
      },
      {
        "type": "TrainingPreparation",
        "status": "Succeeded",
        "message": "Training trainings-client-prod/training-75b79cc2-ec63-4f93-b4da-44e49a4a6049-zqg6d created",
        "raisedAt": "2025-06-30T11:41:54Z"
      },
      {
        "type": "TrainingExecution",
        "status": "InProgress",
        "message": "Training in progress",
        "raisedAt": "2025-06-30T11:42:00Z"
      },
      {
        "type": "TrainingExecution",
        "status": "Succeeded",
        "message": "Training complete, output available",
        "raisedAt": "2025-06-30T11:43:48Z"
      }
    ],
    "createdAt": "2025-06-30T11:41:54Z",
    "lastUpdate": "2025-06-30T11:43:48Z"
  }
}