training run

-a, --accels

<integer>

Optional

Default Value: 1

Integer

Number of accelerators to use for the workload.

Examples

--accels 4

-b, --repository-revision

<string>

Optional

Default Value: main

String

The branch name of the repository.

Examples

--repository-revision main

String

The commit hash of the repository.

Examples

--repository-revision 53f6b645fc5d039152aef884def64288e3eeb56b

String

The tag name of the repository.

Examples

--repository-revision v1.0.0
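
A branch name can move between runs, whereas a full commit hash always points at the same code. The sketch below is one way to check whether a value intended for --repository-revision looks like a full 40-character SHA-1 commit hash. The helper name is_commit_hash is hypothetical and is not part of the flexai CLI.

```shell
# Hypothetical helper: succeeds only if the argument looks like a full
# SHA-1 commit hash (exactly 40 lowercase hexadecimal characters).
is_commit_hash() {
  case $1 in
    *[!0-9a-f]*) return 1 ;;   # contains a non-hex character
  esac
  [ "${#1}" -eq 40 ]
}

for rev in main v1.0.0 53f6b645fc5d039152aef884def64288e3eeb56b; do
  if is_commit_hash "$rev"; then
    printf '%s: commit hash\n' "$rev"
  else
    printf '%s: branch or tag name\n' "$rev"
  fi
done
```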

-C, --checkpoint

<string>

Optional

The identifier of a Checkpoint to use as the starting point for the Training Job.

Resource Name

The name of a user-provided Checkpoint (see flexai checkpoint).

Examples

--checkpoint mistral-500-checkpoint

UUID

The ID of a Checkpoint generated during a Training Job's execution (see flexai training checkpoints).

Examples

--checkpoint a1b18a7f-9b85-4c74-91a9-6aca526e8ce4

-D, --dataset

<string> | <key=value>

Required

A Dataset name or a key=value pair representing a Dataset and a custom mount point on the Training Runtime.

Multiple Datasets can be used within a single Training Job. Depending on the value format passed (Resource Name or Key Value Path Mapping), each Dataset is mounted at one of the following paths:

/input/<dataset_name>
/input/<dataset_mount_path>

See the available value format options below.

Resource Name

The name of a Dataset (see flexai dataset list).

Examples

--dataset wikitext-2-raw-v1

Key Value Path Mapping

A key=value pair representing a Dataset to use and its destination mount path on the Training Runtime.

Syntax:

<dataset_name>=<dataset_mount_path>

Examples

--dataset wikitext-2-raw-v1=/wikitext2/v1
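
The mount rule described above can be sketched as a small shell function that prints where a given --dataset value would be mounted. The function name dataset_mount_point is hypothetical, and the handling of a leading slash in the custom path (joined under /input without doubling the slash) is an assumption, not documented behavior.

```shell
# Hypothetical helper: print the mount point implied by a --dataset value.
dataset_mount_point() {
  value=$1
  case $value in
    *=*)
      # Key Value Path Mapping: <dataset_name>=<dataset_mount_path>
      path=${value#*=}
      # Assumption: a leading slash in the custom path is not doubled.
      printf '/input/%s\n' "${path#/}"
      ;;
    *)
      # Resource Name: mounted under the Dataset's own name.
      printf '/input/%s\n' "$value"
      ;;
  esac
}

dataset_mount_point wikitext-2-raw-v1
dataset_mount_point wikitext-2-raw-v1=/wikitext2/v1
```

Under these assumptions the two calls print /input/wikitext-2-raw-v1 and /input/wikitext2/v1.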

-d, --device-arch

<option_list>

Optional

Default Value: nvidia

Option list

nvidia

Examples

--device-arch nvidia

-E, --env

<key=value>

Optional

Key Value Mapping

Environment variables that will be set in the Training Runtime.

Examples

--env BATCH_SIZE=32
--env WANDB_PROJECT=my-project-123

-n, --nodes

<integer>

Optional

Default Value: 1

Integer

Number of nodes to use for the workload.

Examples

--nodes 4

-q, --requirements-path

<string>

Optional

Default Value: ./

String

Path to the requirements.txt file that will be used to install the dependencies in the Training Runtime.

This path is relative to the root of the repository (specified by the --repository-url flag).

Examples

--requirements-path path/to/requirements.txt

-S, --secret

<key=value>

Optional

Key Value Mapping

Environment variables that will be set in the Training Runtime from Secrets. Each mapping names the Secret whose value the variable will hold (see flexai secret list).

Secrets are sensitive values like API keys, tokens, or credentials that need to be accessed by your Training Job but should not be exposed in logs or command history. When using the --secret flag, the actual secret values are retrieved from the Secrets Storage and injected into the environment at runtime.

Syntax:

<env_var_name>=<secret_name>

Where <env_var_name> is the name of the environment variable to set, and <secret_name> is the name of the Secret to use as the value.

Examples

--secret HF_TOKEN=hf-token-dev
--secret WANDB_API_KEY=wandb-key
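
Because the secret values only appear as environment variables at runtime, a training entry point may want to fail early when one is missing. The following is a minimal sketch of such a guard. The function name require_env is hypothetical, and the variable names come from the examples above.

```shell
# Hypothetical guard: fail if any named environment variable is unset or empty.
# The secret values themselves are never printed, only the variable names.
require_env() {
  for name in "$@"; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${$name-}"
    if [ -z "$value" ]; then
      printf 'missing environment variable: %s\n' "$name" >&2
      return 1
    fi
  done
}
```

A wrapper script could call require_env HF_TOKEN WANDB_API_KEY || exit 1 before starting the training code.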

-u, --repository-url

<string>

Required

Git Repository

The URL of the Git repository containing the training code.

Examples

--repository-url https://github.com/flexaihq/nanoGPT/
--repository-url https://github.com/flexaihq/nanoGPT.git
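
To show how the flags above might fit together, the sketch below collects one possible combination and prints it for review. It assumes the flags can be combined in a single invocation. All values are taken from the examples in this section. The positional arguments of training run are not covered here and are left out.

```shell
# Collect the flags in the positional parameters so each value stays one word.
set -- \
  --repository-url https://github.com/flexaihq/nanoGPT.git \
  --repository-revision v1.0.0 \
  --requirements-path path/to/requirements.txt \
  --dataset wikitext-2-raw-v1=/wikitext2/v1 \
  --env BATCH_SIZE=32 \
  --secret HF_TOKEN=hf-token-dev \
  --nodes 1 \
  --accels 4

flag_count=$#
flags="$*"

# Print one flag and its value per line (printf reuses the format for each pair).
printf '%s %s\n' "$@"
```

The printed pairs would be appended to a flexai training run invocation together with its positional arguments.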
