Uploading a Dataset

Download The Sample Dataset Files

Using the FlexAI Console

You can upload files or directories from your local machine to create a new Dataset. This quickstart tutorial uses two pre-generated files, train.bin and val.bin, which you can download to your machine using the links below:

train.bin: Training dataset
val.bin: Validation dataset for model evaluation

Using the FlexAI CLI

Download the pre-generated training dataset files for this quickstart tutorial:

curl -L --remote-name-all https://github.com/flexaihq/nanoGPT/raw/refs/heads/prepare_dataset/data/shakespeare_char/dataset/{train.bin,val.bin}

This downloads two binary files:

train.bin: Training dataset (Shakespeare character-level data)
val.bin: Validation dataset for model evaluation
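A failed download can leave an error page or a truncated file on disk, so it can be worth checking the files before uploading them. The sketch below assumes the byte sizes reported in the sample inspect output of this tutorial (2007708 for train.bin and 223080 for val.bin); the has_size helper is a hypothetical name, and the expected sizes will no longer match if the upstream files change.

```sh
# Succeeds when the file exists and has exactly the expected size in bytes.
# has_size is an illustrative helper, not part of the FlexAI CLI.
has_size() {
  [ -f "$1" ] && [ "$(wc -c < "$1" | tr -d ' ')" -eq "$2" ]
}

# Expected sizes are taken from the sample inspect output in this tutorial.
for pair in train.bin:2007708 val.bin:223080; do
  file=${pair%%:*}
  expected=${pair##*:}
  if has_size "$file" "$expected"; then
    echo "$file: OK"
  else
    echo "$file: missing or unexpected size"
  fi
done
```

Run it from the directory where the files were downloaded.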

Uploading files to the FlexAI Dataset Manager Service

Using the FlexAI Console

Navigate to the Dataset Manager in the FlexAI Console:

1. Visit the “Add Dataset” section of the FlexAI Console.

2. Enter a name for your dataset: nanoGPT-dataset. The name must follow the FlexAI Resource Naming conventions.

3. Select the “Local” option for “Upload Origin”.

4. Select the + Upload Item button to open the “Upload Items” dialog.

5. Use the “Select file” option to open a file browser dialog, then select both train.bin and val.bin from your local machine.

   - Depending on your system and browser, you might need to hold down the Ctrl/Cmd key while selecting multiple files.
   - When selecting multiple files, your browser might ask you to confirm that you want to allow multiple file selection.
   - You can also select files individually if you prefer.

6. In the “Destination Path” field, enter shakespeare_char.

7. Select the Add button to confirm the file selection and destination mapping.

The “Upload Items” dialog closes and returns you to the “Add a Dataset” form, where the files you just added are listed under the “Upload Items” section, similar to the list below:

shakespeare_char/
├── train.bin (1.91MB)
└── val.bin (217.85KB)

Below the file list, a + Add items button reopens the “Upload Items” dialog in case you want to add more files or directories to the Dataset.

Finally, select the Add Dataset button to start the upload process.

Using the FlexAI CLI

Use the flexai dataset push command to create and upload your Dataset:

flexai dataset push nanoGPT-dataset \
  --file train.bin=shakespeare_char/train.bin \
  --file val.bin=shakespeare_char/val.bin

The command above has the following components:

Component	Value	Description
Dataset Name	`nanoGPT-dataset`	Name for the Dataset in FlexAI. Must follow the FlexAI Resource Naming conventions.
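File Mapping	`train.bin=shakespeare_char/train.bin`	Maps the local file train.bin to the path shakespeare_char/train.bin in the Dataset
File Mapping	`val.bin=shakespeare_char/val.bin`	Maps the local file val.bin to the path shakespeare_char/val.bin in the Dataset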
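When a Dataset has more than a couple of files, writing each mapping by hand gets tedious. The sketch below prints a push command that uses the same --file LOCAL=DESTINATION pattern as the command above. The build_push_command helper is a hypothetical name, not part of the FlexAI CLI, and it only prints the command so you can review it before running it. It assumes file names without spaces.

```sh
# Print a `flexai dataset push` command that maps every given local file
# into one destination directory of the Dataset.
# Arguments: <dataset-name> <destination-dir> <local-file>...
# build_push_command is an illustrative helper, not a FlexAI command.
build_push_command() {
  dataset_name=$1
  dest_dir=$2
  shift 2
  printf 'flexai dataset push %s' "$dataset_name"
  for local_file in "$@"; do
    # Keep only the file name on the Dataset side of the mapping.
    printf ' --file %s=%s/%s' "$local_file" "$dest_dir" "$(basename "$local_file")"
  done
  printf '\n'
}

build_push_command nanoGPT-dataset shakespeare_char train.bin val.bin
```

For the two tutorial files this prints the same command shown above, on a single line.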

Dataset structure

Using the FlexAI Console

Once the upload is complete, select the gear icon ⚙️ (labeled Configure) in the Actions field of the Dataset list page. This opens the Dataset “Details” panel, which shows the Dataset’s name, status, creation date, and the list of files uploaded as part of the Dataset.

Summary

Name: nanoGPT-dataset
Status: Ready
Creation Time: 8/5/2025, 2:13:07 PM

Details

shakespeare_char/
├── train.bin (1.91MB)
└── val.bin (217.85KB)

To learn more about the ways your workloads can access Datasets, check out the Runtime Access section of the Dataset Manager overview page.

Using the FlexAI CLI

Check that your dataset was uploaded successfully:

flexai dataset list

Expected output:

 NAME                │ FILES COUNT │ TOTAL SIZE │ STATUS    │ CREATED AT
─────────────────────┼─────────────┼────────────┼───────────┼────────────────────────────
 nanoGPT-dataset     │ 2           │ 2.13 MB    │ available │ 2025-08-05 13:13:07 (1h)
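In a script, the STATUS column of a listing like the one above can be read with awk. This is only a sketch: it assumes the column layout of the sample output (name in the first column, status in the fourth, columns separated by │), which may differ between CLI versions. The dataset_status helper is a hypothetical name.

```sh
# Read a `flexai dataset list` table on stdin and print the STATUS value
# of the row whose NAME matches $1. Prints nothing if no row matches.
# dataset_status is an illustrative helper, not a FlexAI command.
dataset_status() {
  awk -F'│' -v name="$1" '
    {
      row_name = $1
      row_status = $4
      # Trim the padding around each cell.
      gsub(/^[ \t]+|[ \t]+$/, "", row_name)
      gsub(/^[ \t]+|[ \t]+$/, "", row_status)
      if (row_name == name) { print row_status; exit }
    }'
}

# Only query the CLI when it is installed.
if command -v flexai >/dev/null 2>&1; then
  flexai dataset list | dataset_status nanoGPT-dataset
fi
```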

The status should show available when the upload is complete and the dataset is ready for training.

You can run the flexai dataset inspect <DATASET_NAME> command to get more detailed information about your Dataset:

flexai dataset inspect nanoGPT-dataset

This outputs something like the following:

YAML Output

kind: Dataset
metadata:
  name: nanoGPT-dataset
  id: 1b541f62-2faf-4e32-8fd1-a6bc27e26b58
  creatorUserID: 16e289cc-c81b-4a15-91d9-0e2aae00a317
  ownerOrgId: 270a5476-b91a-442f-8a13-852ef7bb5b9c
spec:
  fromLocalFiles:
    - train.bin
    - val.bin
  storageProvider: ""
  sourcePath: ""
status:
  status: available
  storageProviderID: 00000000-0000-0000-0000-000000000000
  size: 2.13 MB
  files:
    - path: shakespeare_char/train.bin
      size: 1.91 MB
    - path: shakespeare_char/val.bin
      size: 217.85 KB
  createdAt: 2025-08-05 14:13:07 (57d)
  updatedAt: 2025-08-05 14:13:08 (57d)
  dataSyncs: []

JSON Output

{
  "kind": "Dataset",
  "metadata": {
    "name": "nanoGPT-dataset",
    "id": "1b541f62-2faf-4e32-8fd1-a6bc27e26b58",
    "creatorUserID": "15e2894c-c81b-4a15-91d5-0e2aae00a317",
    "ownerOrgID": "270a5476-b91a-442f-8a13-852ef7bb5b9c"
  },
  "spec": {
    "fromLocalFiles": [
      "train.bin",
      "val.bin"
    ],
    "storageProvider": "",
    "sourcePath": ""
  },
  "status": {
    "status": "available",
    "storageProviderID": "00000000-0000-0000-0000-000000000000",
    "size": 2230788,
    "files": [
      {
        "path": "shakespeare_char/train.bin",
        "size": 2007708
      },
      {
        "path": "shakespeare_char/val.bin",
        "size": 223080
      }
    ],
    "createdAt": "2025-08-05T13:13:07Z",
    "updatedAt": "2025-08-05T13:13:08Z",
    "dataSyncs": []
  }
}
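The JSON output reports sizes in bytes, while the YAML output and the dataset list show rounded units. The sample values line up if those units are 1024-based, which is an inference from the numbers in this tutorial rather than documented behavior. The format_bytes helper below is a hypothetical name used to check the arithmetic.

```sh
# Format a byte count using 1024-based units; $2 is "KB" or "MB".
# format_bytes is an illustrative helper, not part of the FlexAI CLI.
format_bytes() {
  awk -v bytes="$1" -v unit="$2" 'BEGIN {
    divisor = (unit == "MB") ? 1024 * 1024 : 1024
    printf "%.2f %s\n", bytes / divisor, unit
  }'
}

format_bytes 2007708 MB                    # train.bin -> 1.91 MB
format_bytes 223080 KB                     # val.bin   -> 217.85 KB
format_bytes "$((2007708 + 223080))" MB    # total     -> 2.13 MB
```

The two file sizes add up to 2230788 bytes, the same value as the size field in the JSON output.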
