Google Cloud Storage

A Storage Provider Connection for Google Cloud Storage

To upload Datasets from Google Cloud Storage (GCS) to FCS, you first need to create a Storage Provider Connection that holds the information required to connect to your GCS bucket. A Storage Provider Connection for GCS authenticates with a Google Cloud Platform Service Account JSON key file. The Google Cloud documentation explains how to create a Service Account JSON key file (pick the Console tab for instructions that use the GCP web console).

Prerequisites

Before creating a Storage Provider Connection, make sure you have the following:

  • A Google Cloud Service Account JSON key file with sufficient permissions to access the target GCS bucket.
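
A quick sanity check before storing the key can catch a wrong file early. The sketch below assumes the standard key layout that Google Cloud produces, with type, client_email, and private_key fields. check_sa_key is a hypothetical helper, not part of the flexai CLI, and it only pattern-matches the file's text rather than parsing the JSON or verifying bucket permissions.

```shell
# Hypothetical helper (not part of the flexai CLI): rough check that a file
# looks like a Google Cloud Service Account JSON key.
check_sa_key() {
  key_file="$1"

  # The file must exist and be non-empty.
  if [ ! -s "$key_file" ]; then
    echo "missing or empty: $key_file" >&2
    return 1
  fi

  # Service account keys carry "type": "service_account".
  if ! grep -Eq '"type"[[:space:]]*:[[:space:]]*"service_account"' "$key_file"; then
    echo "not a service account key: $key_file" >&2
    return 1
  fi

  # They also include the account email and the private key.
  for field in client_email private_key; do
    if ! grep -q "\"$field\"" "$key_file"; then
      echo "missing field: $field" >&2
      return 1
    fi
  done

  echo "ok: $key_file"
}

# Usage (file name taken from the example in this guide):
#   check_sa_key gcp-service-account.json
```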

Storing Your Google Cloud Service Account Key

To store your Google Cloud Service Account JSON key file as an FCS Secret, run the following command:

cat <service_account_key.json> | flexai secret create <secret_name> --value-stdin

For example, if your Google Cloud Service Account JSON key file is named gcp-service-account.json, you can store it as an FCS Secret named gcp-sa by running the following command:

cat gcp-service-account.json | flexai secret create gcp-sa --value-stdin

This command reads the contents of your Google Cloud Service Account JSON key file and securely stores its entirety as an FCS Secret named gcp-sa.
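
If you store keys often, the same command can be wrapped in a small function that fails early when the file cannot be read. This is only a sketch: store_sa_key is a hypothetical helper name, and it uses shell input redirection instead of cat, which feeds the same bytes to --value-stdin.

```shell
# Hypothetical wrapper around the `flexai secret create` command shown above.
store_sa_key() {
  key_file="$1"      # path to the Service Account JSON key file
  secret_name="$2"   # name of the FCS Secret to create

  # Stop before calling the CLI if the key file is not readable.
  if [ ! -r "$key_file" ]; then
    echo "cannot read: $key_file" >&2
    return 1
  fi

  # Input redirection replaces `cat <file> |`; the CLI still reads stdin.
  flexai secret create "$secret_name" --value-stdin < "$key_file"
}

# Usage (names taken from the example in this guide):
#   store_sa_key gcp-service-account.json gcp-sa
```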

Creating the Storage Provider Connection

With the Google Cloud Service Account JSON key file stored as an FCS Secret, you can now create a Storage Provider Connection for GCS with the flexai storage create command, as shown below:

flexai storage create <storage_provider_connection_name> \
--provider gcs \
--service-account-file-name <secret_with_the_service_account_key_json_file_contents>

For example, creating a Storage Provider Connection named gcs-conn that has the Service Account Key JSON details stored in the gcp-sa FCS Secret would look like this:

flexai storage create gcs-conn \
--provider gcs \
--service-account-file-name gcp-sa

After running the command, the Storage Provider Connection gcs-conn will be created.

Uploading Datasets from Google Cloud Storage to FCS

Now you can use the gcs-conn Storage Provider Connection to upload Datasets from a GCS bucket to FCS by using the flexai dataset push command as shown below:

flexai dataset push <dataset_name> \
--storage-provider <storage_provider_connection_name> \
--source-path <gcs_bucket_name>/<gcs_object_key>

For instance, creating an FCS Dataset named gcs-dataset-audio from a GCS bucket named data-sets with the object key files/wav-files using the gcs-conn Storage Provider Connection would look like this:

flexai dataset push gcs-dataset-audio \
--storage-provider gcs-conn \
--source-path data-sets/files/wav-files

After running the command, FCS starts syncing the gcs-dataset-audio Dataset by asynchronously copying the contents of data-sets/files/wav-files from the GCS bucket into the root of the Dataset.
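
GCS locations are often written as gs:// URIs, while the examples in this guide pass --source-path as the bucket name followed by the object key, with no scheme. Assuming that format is what the CLI expects, a minimal sketch for converting one into the other follows. gcs_uri_to_source_path is a hypothetical helper, not a flexai command.

```shell
# Hypothetical helper: turn a gs:// URI into <gcs_bucket_name>/<gcs_object_key>.
gcs_uri_to_source_path() {
  uri="$1"
  # Remove a leading "gs://" if present; other input is returned unchanged.
  printf '%s\n' "${uri#gs://}"
}

# Usage (names taken from the example in this guide):
#   flexai dataset push gcs-dataset-audio \
#     --storage-provider gcs-conn \
#     --source-path "$(gcs_uri_to_source_path gs://data-sets/files/wav-files)"
```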

Monitoring the Dataset Upload Progress

You can monitor the progress of the Dataset upload with the flexai dataset inspect command:

flexai dataset inspect <dataset_name>

For our example, that would look like this:

flexai dataset inspect gcs-dataset-audio
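
Because the copy runs asynchronously, you may want to rerun the inspect command at intervals instead of typing it repeatedly. The sketch below does that a fixed number of times. poll_dataset is a hypothetical helper, and since this guide does not show the output format of flexai dataset inspect, the loop prints each result without trying to detect completion.

```shell
# Hypothetical helper: rerun `flexai dataset inspect` at a fixed interval.
poll_dataset() {
  dataset="$1"
  attempts="${2:-10}"   # how many times to run inspect (default 10)
  interval="${3:-30}"   # seconds to wait between runs (default 30)

  i=1
  while [ "$i" -le "$attempts" ]; do
    echo "--- inspect attempt $i of $attempts ---"
    # Stop polling if the CLI itself reports an error.
    flexai dataset inspect "$dataset" || return 1
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
}

# Usage (dataset name taken from the example in this guide):
#   poll_dataset gcs-dataset-audio 20 15
```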