# Create a Remote Storage Connection

> Set up a new remote storage connection with credentials and provider configuration

Creating a Remote Storage Connection involves two main steps:

<Steps>
  1. **Store your access credentials** securely using the FlexAI Secret Manager.

  2. **Create a Remote Storage Connection** that uses the stored credentials.
</Steps>

## Storing your Access Credentials

<Tabs>
  <Tab title="Using the FlexAI Console">
    Visit the [Create Secret](https://console.flex.ai/s/secrets/new-secret) page in the FlexAI Console to create a new secret.

    > Note that currently, the FlexAI Console only supports creating secrets with text values. If you need to create a secret with a file value (e.g., a JSON key file for Google Cloud Storage), please use the FlexAI CLI instead.
  </Tab>

  <Tab title="Using the FlexAI CLI">
    <Tabs>
      <Tab title="Amazon S3">
        To upload Datasets from Amazon S3 to FlexAI, you first need to create a Storage Provider Connection that holds the necessary information to connect to your Amazon S3 bucket.

        You will find this entry from the AWS Security Blog useful: [How to quickly find and update your access keys \[...\]](https://aws.amazon.com/blogs/security/how-to-find-update-access-keys-password-mfa-aws-management-console/).

        You will need the following:

        * Your Amazon S3 Secret Access Key
        * Your Amazon S3 Access Key ID
        * The Amazon S3 region
        * The endpoint URL associated with your Amazon S3 region

        <Steps>
          <Step title="Store Your Credentials using the FlexAI Secret Manager">
            To store your Amazon S3 Secret Access Key as a FlexAI Secret, run the following command:

            ```bash theme={null}
            flexai secret create s3-secret-access-key
            ```

            You will be prompted to enter your Amazon S3 Secret Access Key (pasting it in works too). Once you have entered it, press Enter, and the Secret `s3-secret-access-key` will be created.
          </Step>
        </Steps>
      </Tab>

      <Tab title="Google Cloud Storage">
        To upload Datasets from Google Cloud Storage (GCS) to FlexAI, you first need to create a Storage Provider Connection that holds the necessary information to connect to your GCS bucket. Setting up a Storage Provider Connection for GCS leverages Google Cloud Platform's Service Account JSON key file.

        You can find more information on how to create a Service Account JSON key file in the [Google Cloud documentation](https://cloud.google.com/iam/docs/keys-create-delete#iam-service-account-keys-create-console). Pick the Console tab on that page for instructions on how to obtain the file using the GCP web console.

        You will need the following:

        * A Google Cloud Service Account JSON key file with sufficient permissions to access the target GCS bucket.

        <Steps>
          <Step title="Store Your Credentials using the FlexAI Secret Manager">
            To store your Google Cloud Service Account JSON key file as a FlexAI Secret, run the following command:

            ```bash theme={null}
            cat <service_account_key_file_path> | flexai secret create <secret_name> --value-stdin
            ```

            For example, if your Google Cloud Service Account JSON key file is named `gcp-service-account.json`, you can store it as a FlexAI Secret named `gcp-sa` by running the following command:

            ```bash theme={null}
            cat gcp-service-account.json | flexai secret create gcp-sa --value-stdin
            ```

            This command reads the contents of your Google Cloud Service Account JSON key file and securely stores the entire file as a FlexAI Secret named `gcp-sa`.
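
            Before storing the key, it can help to confirm that the file really is a service account key. The sketch below is an illustration and not part of the FlexAI CLI: `looks_like_sa_key` is a hypothetical helper that only looks for the `"type": "service_account"` field that Google writes into these key files. It does not validate the full JSON syntax or the account's permissions.

            ```bash
            # Hypothetical helper (not a FlexAI command): succeeds only if the file is
            # readable and contains the "type": "service_account" field.
            looks_like_sa_key() {
              [ -r "$1" ] && grep -Eq '"type"[[:space:]]*:[[:space:]]*"service_account"' "$1"
            }
            ```

            For instance, `looks_like_sa_key gcp-service-account.json && cat gcp-service-account.json | flexai secret create gcp-sa --value-stdin` would store the Secret only when the check passes.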
          </Step>
        </Steps>
      </Tab>

      <Tab title="Hugging Face">
        To upload Datasets from Hugging Face Hub to FlexAI, you first need to create a Storage Provider Connection that holds the necessary information to connect to your Hugging Face Hub account.

        Setting up a Storage Provider Connection for Hugging Face Hub requires a [Hugging Face Access Token](https://huggingface.co/settings/tokens).

        You can find more information on how to create a Hugging Face Access Token in the [Hugging Face documentation](https://huggingface.co/docs/huggingface_hub/how-to-authenticate).

        <Steps>
          <Step title="Store Your Hugging Face Access Token using the FlexAI Secret Manager">
            To store your Hugging Face Access Token as a FlexAI Secret, run the following command:

            ```bash theme={null}
            flexai secret create hf_token
            ```

            You will be prompted to enter your Hugging Face Access Token. Once you have entered it, hit Enter, and the Secret `hf_token` will be created.
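
            In scripted setups the interactive prompt can be inconvenient. One possible alternative is to pipe the token in, assuming the `--value-stdin` flag shown for file-based secrets in the Google Cloud Storage tab also accepts plain text values. `store_secret` below is a hypothetical wrapper, not a FlexAI command.

            ```bash
            # Hypothetical wrapper: pipe a secret value to `flexai secret create` via stdin.
            # Assumption: --value-stdin works for text values as well as file contents.
            store_secret() {
              # $1: secret name, $2: secret value
              if [ -z "$2" ]; then
                echo "refusing to store an empty value for secret '$1'" >&2
                return 1
              fi
              printf '%s' "$2" | flexai secret create "$1" --value-stdin
            }
            ```

            With the token exported in an environment variable of your choosing (here `HF_TOKEN`), the call would be `store_secret hf_token "$HF_TOKEN"`.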
          </Step>
        </Steps>
      </Tab>
    </Tabs>
  </Tab>
</Tabs>

## Creating a Remote Storage Connection

<Tabs>
  <Tab title="Using the FlexAI Console">
    Currently, creating Remote Storage Connections via the FlexAI Console is not supported. Please use the FlexAI CLI instead.
  </Tab>

  <Tab title="Using the FlexAI CLI">
    <Tabs>
      <Tab title="Amazon S3">
        <Steps>
          <Step title="Create the Storage Provider Connection">
            With the Amazon S3 Secret Access Key stored as a FlexAI Secret, you can now create a Storage Provider Connection for Amazon S3 using the `flexai storage create` command, following the template below:

            ```bash theme={null}
            flexai storage create <storage_provider_connection_name> \
              --provider s3 \
              --region <s3_region> \
              --endpoint <s3_endpoint> \
              --access-key-id <access_key_id> \
              --secret-access-key-name <name_of_the_secret_with_the_secret_access_key>
            ```

            > Note that the value of `--endpoint` will depend on the *region* where your Amazon S3 bucket is located. You can find the official [list of Amazon S3 endpoints here](https://docs.aws.amazon.com/general/latest/gr/s3.html).

            A Remote Storage Connection for an Amazon S3 bucket located in the `eu-west-1` region with the endpoint `s3.eu-west-1.amazonaws.com` and an Access Key ID `AKIAIOSFODIN7AAF89GU` would look like this:

            ```bash theme={null}
            flexai storage create aws-storage-conn-eu \
              --provider s3 \
              --region eu-west-1 \
              --endpoint s3.eu-west-1.amazonaws.com \
              --access-key-id AKIAIOSFODIN7AAF89GU \
              --secret-access-key-name s3-secret-access-key
            ```
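
            For most commercial AWS regions, the regional S3 endpoint follows the pattern `s3.<region>.amazonaws.com`, so the `--endpoint` value can often be derived from the region. The helper below, `s3_endpoint_for`, is a hypothetical convenience sketch. Regions outside that pattern (for example the China regions, which use a different domain suffix) should be looked up in the endpoint list linked above.

            ```bash
            # Hypothetical helper: build the standard regional S3 endpoint for a region.
            # Only valid for regions that follow the s3.<region>.amazonaws.com pattern.
            s3_endpoint_for() {
              printf 's3.%s.amazonaws.com\n' "$1"
            }
            ```

            It could then be used inline, as in `--endpoint "$(s3_endpoint_for eu-west-1)"`.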
          </Step>

          <Step title="Upload Datasets from Amazon S3 to FlexAI">
            Now you can use your newly created `aws-storage-conn-eu` Storage Provider Connection to upload Datasets from an Amazon S3 bucket directly to FlexAI by using the `flexai dataset push` command, as shown below:

            ```bash theme={null}
            flexai dataset push <dataset_name> \
              --storage-provider aws-storage-conn-eu \
              --source-path <s3_bucket_name>/<s3_object_key>
            ```

            For instance, creating a FlexAI Dataset named `s3-dataset-audio` from an Amazon S3 bucket named `data-sets` with the object key `files/wav-files` using the `aws-storage-conn-eu` Storage Provider Connection would look like this:

            ```bash theme={null}
            flexai dataset push s3-dataset-audio \
              --storage-provider aws-storage-conn-eu \
              --source-path data-sets/files/wav-files
            ```
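
            The `--source-path` values in these examples are written as `<bucket>/<key>` and not as `s3://` URIs. Whether the CLI also accepts URIs is not stated here, so if your scripts keep locations as URIs, a small conversion like the sketch below may be useful. `to_source_path` is a hypothetical helper name.

            ```bash
            # Hypothetical helper: convert s3://bucket/key (or bucket/key/) to bucket/key.
            to_source_path() {
              src="${1#s3://}"   # drop the URI scheme if present
              src="${src#/}"     # drop a single leading slash
              src="${src%/}"     # drop a single trailing slash
              printf '%s\n' "$src"
            }
            ```

            For the example above, `to_source_path s3://data-sets/files/wav-files` yields `data-sets/files/wav-files`.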
          </Step>

          <Step title="Monitor the Dataset Upload Progress">
            The progress of the Dataset upload can be monitored by using the `inspect` subcommand from `flexai dataset`:

            ```bash theme={null}
            flexai dataset inspect <dataset_name>
            ```

            For our example, the command would look like this:

            ```bash theme={null}
            flexai dataset inspect s3-dataset-audio
            ```
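
            Because the upload runs asynchronously, you may want to re-run the inspection at intervals instead of by hand. The output format of `flexai dataset inspect` is not described here, so the sketch below does not parse it. It only repeats a command a fixed number of times with a pause in between. `repeat_cmd` is a hypothetical helper.

            ```bash
            # Hypothetical helper: run a command COUNT times, sleeping INTERVAL seconds
            # between runs. Usage: repeat_cmd COUNT INTERVAL command [args...]
            repeat_cmd() {
              count="$1"
              interval="$2"
              shift 2
              i=1
              while [ "$i" -le "$count" ]; do
                "$@"
                # Sleep between runs, but not after the final one.
                if [ "$i" -lt "$count" ]; then
                  sleep "$interval"
                fi
                i=$((i + 1))
              done
            }
            ```

            For example, `repeat_cmd 10 30 flexai dataset inspect s3-dataset-audio` would print the Dataset details ten times, 30 seconds apart.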
          </Step>
        </Steps>
      </Tab>

      <Tab title="Google Cloud Storage">
        <Steps>
          <Step title="Create the Storage Provider Connection">
            With the Google Cloud Service Account JSON key file stored as a FlexAI Secret, you can now create a Storage Provider Connection for GCS using the `flexai storage create` command, following the template below:

            ```bash theme={null}
            flexai storage create <storage_provider_connection_name> \
              --provider gcs \
              --service-account-file-name <secret_with_the_service_account_key_json_file_contents>
            ```

            For example, creating a Storage Provider Connection named `gcs-conn` that has the Service Account Key JSON details stored in the `gcp-sa` FlexAI Secret would look like this:

            ```bash theme={null}
            flexai storage create gcs-conn \
              --provider gcs \
              --service-account-file-name gcp-sa
            ```

            After running the command, the Storage Provider Connection `gcs-conn` will be created.
          </Step>

          <Step title="Upload Datasets from Google Cloud Storage to FlexAI">
            Now you can use the `gcs-conn` Storage Provider Connection to upload Datasets from a GCS bucket to FlexAI by using the `flexai dataset push` command, as shown below:

            ```bash theme={null}
            flexai dataset push <dataset_name> \
              --storage-provider <storage_provider_connection_name> \
              --source-path <gcs_bucket_name>/<gcs_object_key>
            ```

            For instance, creating a FlexAI Dataset named `gcs-dataset-audio` from a GCS bucket named `data-sets` with the object key `files/wav-files` using the `gcs-conn` Storage Provider Connection would look like this:

            ```bash theme={null}
            flexai dataset push gcs-dataset-audio \
              --storage-provider gcs-conn \
              --source-path data-sets/files/wav-files
            ```

            After you run the command, the Dataset `gcs-dataset-audio` starts syncing: the contents of the GCS bucket resource `data-sets/files/wav-files` are copied asynchronously into the root of the Dataset.
          </Step>

          <Step title="Monitor the Dataset Upload Progress">
            The progress of the Dataset upload can be monitored by using the `inspect` subcommand from `flexai dataset`:

            ```bash theme={null}
            flexai dataset inspect <dataset_name>
            ```

            For our example, the command would look like this:

            ```bash theme={null}
            flexai dataset inspect gcs-dataset-audio
            ```
          </Step>
        </Steps>
      </Tab>

      <Tab title="Hugging Face">
        <Steps>
          <Step title="Create the Storage Provider Connection">
            With the Hugging Face Access Token stored as a FlexAI Secret, you can now create a Storage Provider Connection for Hugging Face Hub using the `flexai storage create` command, as shown below:

            ```bash theme={null}
            flexai storage create hf-conn \
              --provider huggingface \
              --hf-token-name hf_token
            ```

            After running the command, the Storage Provider Connection `hf-conn` will be created.
          </Step>

          <Step title="Bring a Dataset from the Hugging Face Hub to FlexAI">
            Now you can use the `hf-conn` Storage Provider Connection to push a Hugging Face Hub Dataset to the FlexAI DatasetManager by using the `flexai dataset push` command, as shown below:

            ```bash theme={null}
            flexai dataset push hf-finepdfs \
              --storage-provider hf-conn \
              --source-path HuggingFaceFW/finepdfs
            ```

            This example creates a FlexAI Dataset named `hf-finepdfs` from the Hugging Face repository `HuggingFaceFW/finepdfs` using the `hf-conn` Storage Provider Connection, pulling the contents from the `dataset` directory in the repo.

            After you run the command, the Dataset `hf-finepdfs` starts syncing: the contents are copied asynchronously from Hugging Face Hub into the root of the Dataset.
          </Step>

          <Step title="Monitor the Dataset Upload Progress">
            The progress of the Dataset upload can be monitored by using the `inspect` subcommand from `flexai dataset`:

            ```bash theme={null}
            flexai dataset inspect hf-finepdfs
            ```
          </Step>
        </Steps>
      </Tab>
    </Tabs>
  </Tab>
</Tabs>

## Next Steps

Monitoring the push process of Datasets and Checkpoints:
