
Uploading a Dataset

Download The Sample Dataset Files

You can upload files or directories from your local machine to create a new Dataset.

For this quickstart tutorial, we will use two pre-generated files, train.bin and val.bin, which you can download to your machine using the links below:


Uploading files to the FlexAI Dataset Manager Service

Navigate to the Dataset Manager in the FlexAI Console:

  1. Visit the “Add Dataset” section of the FlexAI Console
  2. Enter a name for your dataset: nanoGPT-dataset
  3. Select the “Local” option for “Upload Origin”
  4. Select the + Upload Item button to open the “Upload Items” dialog

Use the “Select file” option to open a file browser dialog:

  1. Select both train.bin and val.bin files from your local machine
    • Depending on your system and browser, you might need to hold down the Ctrl / Cmd key while selecting multiple files
    • When selecting multiple files, your browser might prompt you with a confirmation message asking you to allow multiple file selection.
    • You can also select files individually if you prefer
  2. In the “Destination Path” field, enter shakespeare_char
  3. Select the Add button to confirm the file selection and destination mapping
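The destination mapping in the steps above can be pictured with a small sketch. It is only an illustration of how each selected file ends up under the “Destination Path”; the function name map_to_destination is hypothetical and is not part of any FlexAI tool or API.

```python
from pathlib import PurePath, PurePosixPath


def map_to_destination(local_files, destination_path):
    """Return {local file: path inside the Dataset} for one upload batch.

    Hypothetical helper for illustration only: it mirrors what the
    "Upload Items" dialog does when files share one Destination Path.
    """
    dest = PurePosixPath(destination_path)
    # Only the file name is kept; the local directory is not part of the Dataset path.
    return {str(f): str(dest / PurePath(f).name) for f in local_files}


mapping = map_to_destination(["train.bin", "val.bin"], "shakespeare_char")
for local, remote in mapping.items():
    print(f"{local} -> {remote}")
```

Both files land in the same shakespeare_char directory, which is the structure the file list shows after you select Add.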

The “Upload Items” dialog will close and you will return to the “Add a Dataset” form, where the files you just added are listed under the “Upload Items” section, similar to the list below:

  • shakespeare_char/ # The “Destination Path” you specified
    • train.bin 1.91MB
    • val.bin 217.85KB
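If you want to confirm that the files you downloaded match the sizes in the list above before starting the upload, a local check along these lines may help. This is a minimal sketch: the helper names are hypothetical, and it assumes the Console reports sizes in binary units (1 KB = 1024 bytes) with two decimal places, which this page does not state.

```python
from pathlib import Path


def format_size(num_bytes):
    """Format a byte count as e.g. '1.91MB' (assumed: binary units, 2 decimals)."""
    if num_bytes >= 1024 ** 2:
        return f"{num_bytes / 1024 ** 2:.2f}MB"
    if num_bytes >= 1024:
        return f"{num_bytes / 1024:.2f}KB"
    return f"{num_bytes}B"


def describe_local_files(directory, names=("train.bin", "val.bin")):
    """Return {name: formatted size}, with None for any file that is missing."""
    result = {}
    for name in names:
        path = Path(directory) / name
        result[name] = format_size(path.stat().st_size) if path.is_file() else None
    return result


if __name__ == "__main__":
    # Run from the directory where the sample files were downloaded.
    for name, size in describe_local_files(".").items():
        print(name, size or "missing")
```

A file reported as missing, or with a size that differs noticeably from the list, suggests an incomplete download.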

Below the file list you will find a + Add items button that reopens the “Upload Items” dialog, in case you want to add more files or directories to the Dataset.

Finally, select the Add Dataset button to start the upload process.


Dataset structure

Once the upload is complete, select the gear icon ⚙️ (labeled Configure) in the Actions field of the Dataset list page. This opens the Dataset “Details” panel, which shows the Dataset’s name, status, creation date, and the list of files that were uploaded as part of the Dataset.

Summary

  • Name: nanoGPT-dataset
  • Status: Ready
  • Creation Time: 8/5/2025, 2:13:07 PM

Details

  • shakespeare_char/
    • train.bin 1.91MB
    • val.bin 217.85KB

To learn more about the ways your workloads can access Datasets, check out the Runtime Access section of the Dataset Manager overview page.
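As a rough idea of what a workload might do once it can reach the Dataset, the sketch below decodes the first few values of one of the .bin files. It rests on two assumptions that this page does not confirm: that the files follow the nanoGPT data-preparation convention of raw 16-bit unsigned token ids in little-endian order, and that the file is reachable at the path you pass in. The function name and the example path are hypothetical placeholders; the actual location depends on how Datasets are exposed at runtime.

```python
import struct
from pathlib import Path


def read_token_ids(path, limit=None):
    """Decode a .bin file as little-endian uint16 values (assumed format)."""
    data = Path(path).read_bytes()
    count = len(data) // 2  # two bytes per value
    if limit is not None:
        count = min(count, limit)
    return list(struct.unpack(f"<{count}H", data[: count * 2]))


if __name__ == "__main__":
    # Hypothetical location; replace with wherever the Dataset is available.
    sample = Path("shakespeare_char") / "train.bin"
    if sample.is_file():
        print(read_token_ids(sample, limit=10))
    else:
        print(f"{sample} not found")
```

Passing limit keeps the decoded list short; the whole file is still read into memory first, which is acceptable for files of the size used in this tutorial.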