Number of accelerators/GPUs to use.
training debug-ssh
Starts an Interactive Training Job that allows connecting through SSH or VSCode to the Training Runtime, useful for fast test iterations.
The --vscode flag is optional, but highly recommended to leverage the full potential of the Interactive Training Runtime.
  flexai training debug-ssh \
    --repository-url <repository_url> \
    [--dataset <dataset_name>...] \
    [--repository-revision <branch_name>] \
    [--checkpoint <checkpoint_id>] \
    [--env <env_var>=<value>...] \
    [--secret <env_var>=<secret_value>...] \
    [--nodes <node_count>] \
    [--accels <accelerator_count>] \
    [--device-arch <device_architecture>] \
    [--git-author-name <git_author_name>] \
    [--git-author-email <git_author_email>] \
    [--dotfiles <dotfiles_repository>] \
    [--authorized-keys <ssh_public_key>] \
    [--session-timeout <timeout_in_seconds>] \
    [--vscode]
- -a, --accels
- --affinity
  Affinity rules for the workload.
- --authorized-keys
  List of SSH public keys allowed to connect to the interactive environment. If not provided, keys will be gathered from the local ssh-agent, if available.
- --build-secret
  FlexAI Secrets to make available during the image build process. Format: <flexai_secret_name>=<environment_variable_name>
  - --build-secret build_config_secret=SECRET_ENV_VAR_TO_USE
- -C, --checkpoint
  A Checkpoint to mount on the runtime environment. One of:
  - The name of a previously pushed Checkpoint. Use flexai checkpoint list to see available Checkpoints.
    - --checkpoint Mixtral-8x7B-v0_1
    - --checkpoint gemma-3n-E4B-it
  - The UUID of an Inference Ready Checkpoint generated during the execution of a Training or Fine-tuning job. Use flexai training checkpoints to see available Checkpoints.
    - --checkpoint 3fa85f64-5717-4562-b3fc-2c963f66afa6
- -D, --dataset
  - Dataset to mount on the runtime environment.
    - --dataset open_web
    - --dataset fineweb-edu
  - Datasets to mount on the runtime environment using a custom mount path.
    - --dataset open_web=data/train/ow --dataset fineweb-edu=/data/train/fineweb-edu
- -d, --device-arch
  The architecture of the device to run the workload on. One of: nvidia, amd, tt.
  - --device-arch nvidia
- --dotfiles
  URL of a GitHub dotfiles repository to install in the home directory of the interactive environment.
  - --dotfiles https://github.com/funnierinspanish/dotfiles.git
- -E, --env
  Environment variables to set in the interactive environment.
  - --env WANDB_ENTITY=georgec123 --env WANDB_PROJECT=gppt-j
- --git-author-email
  The Git commit author email to use in the interactive training environment.
  - diego@flex.ai
  - george@vandelay-industries.biz
- --git-author-name
  The Git commit author name to use in the interactive training environment.
- -h, --help
  Displays this help page.
- --no-queuing
  Disables queuing for this workload: if no resources are available, the workload will fail immediately instead of waiting for resources to become available.
- -n, --nodes
  The number of nodes across which to distribute the workload. Selecting more than 1 node overrides the value provided in the --accels flag, setting it to 8 accelerators per node.
  - --nodes 1
  - --nodes 4
- -b, --repository-revision
  - The branch name of the repository. main by default.
    - --repository-revision secondary
    - --repository-revision testing
  - A commit SHA hash to use.
    - --repository-revision 9fceb02
    - --repository-revision e5bd391
  - A tag name to use.
    - --repository-revision v1.0.0
    - --repository-revision release-2024
- -u, --repository-url
  Git repository URL containing code to mount on the workload environment. Will be mounted on the /workspace directory.
  - --repository-url https://github.com/flexaihq/nanoGPT/
  - --repository-url https://github.com/flexaihq/nanoGPT.git
- -q, --requirements-path
  Path to a pip requirements.txt file in the repository.
  - --requirements-path code/project/requirements.txt
- -r, --runtime
  Name of the runtime to use.
- -S, --secret
  Environment variables that will be set in the Training or Fine-tuning Runtime.
  Secrets are sensitive values like API keys, tokens, or credentials that need to be accessed by your Training Job but should not be exposed in logs or command history. When using the --secret flag, the actual secret values are retrieved from the Secrets Storage and injected into the environment at runtime.
  Syntax: <env_var_name>=<flexai_secret_name>
  Where <env_var_name> is the name of the environment variable to set, and <flexai_secret_name> is the name of the Secret containing the sensitive value.
  - --secret HF_TOKEN=hf-token-dev
  - --secret WANDB_API_KEY=wandb-key
- --session-timeout
  Timeout in seconds after which the interactive training session will be stopped if no activity is detected.
  - --session-timeout 666
  - --session-timeout 3600
- -v, --verbose
  Provides more detailed output when running a debug-ssh session.
- --vscode
  Opens the Visual Studio Code editor connected to the runtime environment via SSH. If Visual Studio Code is not installed, the runtime will still be started and accessible via SSH.
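
As a rough illustration of how the flags in this reference combine, the sketch below assembles one debug-ssh invocation and prints it instead of running it, so it can be inspected without the flexai CLI installed. The repository, dataset, Secret name, and timeout are placeholder values borrowed from the examples on this page, and the helper name build_debug_ssh_cmd is hypothetical, not part of the CLI.

```shell
#!/bin/sh
# Dry-run sketch: build a `flexai training debug-ssh` command line and print it.
# Every value below is a placeholder taken from the examples in this reference.

build_debug_ssh_cmd() {
  # Load the command words into the function's positional parameters.
  set -- flexai training debug-ssh \
    --repository-url https://github.com/flexaihq/nanoGPT.git \
    --repository-revision main \
    --dataset open_web \
    --env WANDB_PROJECT=gppt-j \
    --secret HF_TOKEN=hf-token-dev \
    --session-timeout 3600 \
    --vscode
  # "$*" joins the words with single spaces (the default IFS).
  printf '%s\n' "$*"
}

build_debug_ssh_cmd
```

To start a real session, the printed line can be pasted into a terminal where the flexai CLI is installed and authenticated; note that --secret takes the environment variable name first and the Secret name second.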
 
Default values:
- --accels: 1
- --authorized-keys: gathered from ssh-agent
- --device-arch: nvidia
- --git-author-email: Git config user.email value
- --git-author-name: Git config user.name value
- --nodes: 1
- --repository-revision: main
- --session-timeout: 600
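
The interaction between --nodes and --accels can be worked through with a little arithmetic. The sketch below assumes the behavior is exactly as described in this reference (more than 1 node forces 8 accelerators per node, and both flags default to 1); the function names are hypothetical helpers, not part of the flexai CLI.

```shell
#!/bin/sh
# Sketch of the documented --nodes / --accels rule; not an official implementation.

effective_accels_per_node() {
  # $1 = --nodes value (default 1), $2 = --accels value (default 1)
  if [ "${1:-1}" -gt 1 ]; then
    echo 8              # multi-node: --accels is replaced by 8 per node
  else
    echo "${2:-1}"      # single node: --accels is used as given
  fi
}

total_accels() {
  # Total accelerators = node count * effective accelerators per node.
  per_node="$(effective_accels_per_node "${1:-1}" "${2:-1}")"
  echo $(( ${1:-1} * $per_node ))
}

total_accels 1 4   # prints 4
total_accels 4 2   # prints 32
```

Under these assumptions, passing --accels together with --nodes greater than 1 has no effect on the resources requested.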