
Quickstart: FlexAI Inference Endpoints

In this guide, we will walk you through deploying a FlexAI Inference Endpoint, querying it, and managing it.

Prerequisites:

  • A FlexAI account. If you don’t have one, you can sign up for free.
  • A Hugging Face Access Token, which may be required depending on the model you source from the Hugging Face Model Hub. You can create one by following the instructions in the Hugging Face documentation.
We will cover:

  1. Deploying a FlexAI Inference Endpoint from a model hosted on the Hugging Face Hub.

    • Creating a FlexAI Secret to securely store your Hugging Face Access Token.
  2. Deploying a FlexAI Inference Endpoint from a model fine-tuned with FlexAI.

    • Pushing a checkpoint to the FlexAI Checkpoint Manager to create an Inference Endpoint from a custom model.
  3. Querying a FlexAI Inference Endpoint.

    • Making an HTTP request to the Inference Endpoint with common tools and libraries.
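As a preview of step 3, the sketch below shows how an Inference Endpoint could be queried over HTTP from Python using only the standard library. The endpoint URL, API key, model name, and OpenAI-style chat payload are placeholder assumptions for illustration, not values confirmed by this guide; substitute the actual URL and credentials shown for your endpoint in the FlexAI Console.

```python
import json
import urllib.request

# Placeholder values -- replace with your endpoint's URL and API key.
ENDPOINT_URL = "https://your-endpoint.example.com/v1/chat/completions"  # assumption
API_KEY = "your-api-key"  # assumption


def build_request(prompt: str) -> urllib.request.Request:
    """Build a POST request carrying an OpenAI-style chat payload (assumed format)."""
    payload = {
        "model": "your-model",  # hypothetical model identifier
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        ENDPOINT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


req = build_request("Hello, endpoint!")
# To send the request and read the JSON response:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

The same request can be made with `curl`, `requests`, or any HTTP client; only the URL, the bearer token, and the JSON body matter.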