Starts an Interactive Training Job whose Training Runtime you can connect to over SSH or from Visual Studio Code, which is useful for fast test iterations. The --vscode flag is optional, but highly recommended to get the most out of the Interactive Training Runtime.

Usage

flexai training debug-ssh \
  --repository-url <repository_url> \
  [--dataset <dataset_name>...] \
  [--repository-revision <branch_name>] \
  [--checkpoint <checkpoint_id>] \
  [--env <env_var>=<value>...] \
  [--secret <env_var>=<flexai_secret_name>...] \
  [--nodes <node_count>] \
  [--accels <accelerator_count>] \
  [--device-arch <device_architecture>] \
  [--git-author-name <git_author_name>] \
  [--git-author-email <git_author_email>] \
  [--dotfiles <dotfiles_repository>] \
  [--authorized-keys <ssh_public_key>] \
  [--session-timeout <timeout_in_seconds>] \
  [--vscode]
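
As an illustration of the synopsis above, here is a minimal sketch of a single-node session that opens in Visual Studio Code. The repository URL and the timeout value are hypothetical placeholders, not values from this page; only flags listed in this reference are used. The script prints the assembled command by default and runs it only when RUN=1 is set, so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical placeholder values; substitute your own.
REPO_URL="https://github.com/example-org/example-training.git"
SESSION_TIMEOUT=1800   # seconds of inactivity before the session stops

# Assemble the command as positional parameters so quoting is preserved.
set -- flexai training debug-ssh \
  --repository-url "$REPO_URL" \
  --repository-revision main \
  --accels 1 \
  --session-timeout "$SESSION_TIMEOUT" \
  --vscode

# Print the command for review; set RUN=1 to actually execute it.
echo "$*"
if [ "${RUN:-0}" = "1" ]; then
  "$@"
fi
```

Without --authorized-keys, the flag table below states that keys are gathered from the local ssh-agent when one is available, so this sketch assumes an agent with at least one key loaded.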

Flags

| Flag | Short | Type | Default | Description |
| --- | --- | --- | --- | --- |
| --accels | -a | integer | 1 | Number of accelerators/GPUs to use. |
| --affinity | | key=value | | Affinity rules for the workload. |
| --authorized-keys | | string | ssh-agent keys | List of SSH public keys to allow connecting to the interactive environment. If not provided, keys will be gathered from the local ssh-agent, if available. |
| --build-secret | | key=value | | FlexAI Secrets to make available during the image build process. Format: <flexai_secret_name>=<environment_variable_name> |
| --checkpoint | -C | string | | A Checkpoint to mount on the runtime environment. Can be a name or UUID. |
| --dataset | -D | string | | Dataset to mount on the runtime environment. Can be repeated for multiple datasets. Optionally specify mount path with name=path format. |
| --device-arch | -d | string | nvidia | The architecture of the device to run on. |
| --dotfiles | | string | | GitHub dotfiles repository URL that will be installed in the home directory of the interactive environment. |
| --env | -E | key=value | | Environment variables to set in the interactive environment. Can be repeated. |
| --git-author-email | | string | git config | The Git commit author email to use in the interactive training environment. |
| --git-author-name | | string | git config | The Git commit author name to use in the interactive training environment. |
| --help | -h | boolean | | Displays this help page. |
| --no-queuing | | boolean | | Disables queuing for this workload: if no resources are available, the workload will fail immediately. |
| --nodes | -n | integer | 1 | The number of nodes across which to distribute the workload. Selecting more than 1 node will set 8 accelerators per node. |
| --repository-revision | -b | string | main | The branch name, commit SHA, or tag of the repository to use. |
| --repository-url | -u | string | | Git repository URL containing code to mount on the workload environment. Will be mounted on /workspace. |
| --requirements-path | -q | string | | Path to a pip requirements.txt file in the repository. |
| --runtime | -r | string | | Name of the runtime to use. |
| --secret | -S | key=value | | Environment variables set from FlexAI Secrets. Format: <env_var_name>=<flexai_secret_name>. Can be repeated. |
| --session-timeout | | integer | 600 | Timeout in seconds after which the interactive session will be stopped if no activity is detected. |
| --verbose | -v | boolean | | Provides more detailed output when running a debug-ssh session. |
| --vscode | | boolean | | Opens Visual Studio Code connected to the runtime environment via SSH. |
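
The repeatable flags and the key=value formats described above can be combined as in the following sketch. Every name in it (the repository URL, the dataset my-dataset and its mount path, the variables LOG_LEVEL and BATCH_SIZE, and the secret hf-token) is a hypothetical placeholder and assumes the dataset and secret already exist in your account. --accels is left out on purpose, since selecting more than 1 node sets 8 accelerators per node. As a precaution, the script only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical placeholder values; substitute your own.
REPO_URL="https://github.com/example-org/example-training.git"

# Assemble the command as positional parameters so quoting is preserved.
#   --dataset uses the optional name=path form to choose the mount path.
#   --env is repeated, once per environment variable.
#   --secret maps an environment variable name to a FlexAI Secret name.
#   --nodes 2 implies 8 accelerators per node, so --accels is omitted.
set -- flexai training debug-ssh \
  --repository-url "$REPO_URL" \
  --dataset my-dataset=/data/my-dataset \
  --env LOG_LEVEL=debug \
  --env BATCH_SIZE=8 \
  --secret HF_TOKEN=hf-token \
  --nodes 2 \
  --session-timeout 1200

# Print the command for review; set RUN=1 to actually execute it.
echo "$*"
if [ "${RUN:-0}" = "1" ]; then
  "$@"
fi
```

Note the direction of each mapping: --secret takes <env_var_name>=<flexai_secret_name>, while --build-secret takes <flexai_secret_name>=<environment_variable_name>.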