
Uploading Datasets to FCS

Datasets are essential for training or fine-tuning your models, so FCS lets you upload them from a growing list of sources.

This guide walks you through uploading your datasets to FCS. Before looking at the upload methods, it helps to understand how a dataset is structured in FCS.

Datasets in FCS

Location

Datasets in FCS are mounted as read-only directories under the /input/ path of the Training Job's runtime environment. Because a Training Job can run with multiple Datasets, each one is mounted in its own subdirectory, named after the Dataset.

For example, the contents of a Dataset called my_dataset will be mounted into the /input/my_dataset/ path.

This means that your training scripts should be configured to read the dataset files from this location.
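As a minimal sketch of what this looks like in a training script, the helper below lists every file in a mounted dataset. The function name list_dataset_files and its input_root parameter are hypothetical and not part of FCS; the only things taken from this guide are the /input/ mount path and the example Dataset name my_dataset.

```python
from pathlib import Path

# Mount root described in this guide; each Dataset appears as a subdirectory.
INPUT_ROOT = Path("/input")


def list_dataset_files(dataset_name, input_root=INPUT_ROOT):
    """Return the sorted paths of every regular file in a mounted dataset.

    `list_dataset_files` and `input_root` are illustrative names, not an
    FCS API. `input_root` is a parameter so the helper can be tried locally.
    """
    dataset_dir = Path(input_root) / dataset_name
    if not dataset_dir.is_dir():
        raise FileNotFoundError(f"Dataset directory not found: {dataset_dir}")
    # rglob("*") walks subdirectories too, so nested layouts are covered.
    return sorted(p for p in dataset_dir.rglob("*") if p.is_file())
```

Inside a Training Job, calling list_dataset_files("my_dataset") would then return the files under /input/my_dataset/. Since the mount is read-only, anything the script writes should go to a different location.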

File types

Datasets can contain any type of file, such as images, text files, video, binaries, or any other data that your model requires for training.

Dataset files can also be encrypted or compressed. In that case, your training script is responsible for decrypting or decompressing them.
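For compressed files, one possible approach is to decompress while reading, which avoids writing anything back to the read-only mount. The sketch below assumes gzip-compressed UTF-8 text identified by a .gz suffix. The helper name open_text is hypothetical, and other formats or encrypted files would need their own handling.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a dataset file as text, decompressing on the fly if it is gzip.

    Assumption: compressed files are gzip archives with a ".gz" suffix and
    contain UTF-8 text. `open_text` is an illustrative helper, not an FCS API.
    """
    path = Path(path)
    if path.suffix == ".gz":
        # "rt" makes gzip.open return a text-mode file object.
        return gzip.open(path, "rt", encoding="utf-8")
    return path.open("r", encoding="utf-8")
```

Both branches return a file object that works as a context manager, so the calling code can read plain and compressed files the same way.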

File structure

The file structure of a dataset in FCS is completely flexible. You can use a flat structure with all files in the root directory or a more complex structure with subdirectories.

Datasets are immutable: once a dataset is created, it cannot be modified. However, because you can mount multiple datasets into the same Training Job, you can create a new dataset that contains only the additional files or directories you need. This saves you from modifying and re-uploading the original dataset.
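A training script can then treat an original dataset and a later, supplementary one as a single collection. The following is a sketch under assumptions: the helper name collect_files is hypothetical, and letting later datasets take precedence when two files share a relative path is a design choice made for this example, not documented FCS behavior.

```python
from pathlib import Path


def collect_files(dataset_names, input_root="/input"):
    """Merge the files of several mounted datasets into one mapping.

    Keys are paths relative to each dataset's root; values are full paths.
    Assumption: if two datasets contain the same relative path, the dataset
    listed later wins. `collect_files` is illustrative, not an FCS API.
    """
    merged = {}
    for name in dataset_names:
        dataset_dir = Path(input_root) / name
        for path in sorted(dataset_dir.rglob("*")):
            if path.is_file():
                merged[path.relative_to(dataset_dir).as_posix()] = path
    return merged
```

For example, collect_files(["my_dataset", "my_dataset_extra"]) would combine the original files with those from a second dataset, where my_dataset_extra is a made-up name for the supplementary dataset.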

Uploading Datasets

There are two types of sources from which you can upload Datasets to FCS: