Diffusers library.
The goal is to specialize the model for generating content related to Naruto.
Fine-tuning enables the model to better produce outputs tailored to the themes and characteristics of the dataset, which may include unique styles, characters, or captions associated with the Naruto universe.
Note: If you haven’t already connected FlexAI to GitHub, you’ll need to run flexai code-registry connect to set up a code registry connection. This allows FlexAI to pull repositories directly using the -u flag in training commands.
Prepare the Dataset
We will be using the lambdalabs/naruto-blip-captions dataset. You can download the pre-processed version of the dataset by running the following command:
If you’d like to reproduce the pre-processing steps yourself to use a different dataset or simply to learn more about the process, you can refer to the Manual Dataset Pre-processing section below.
Next, push the contents of the sdxl-tokenized-naruto/ directory as a new FlexAI dataset:
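Before pushing, it can be worth confirming that the directory exists and is not empty. The following is a minimal sketch using only the Python standard library. The helper name summarize_dataset_dir is hypothetical and is not part of FlexAI or Diffusers.

```python
from pathlib import Path


def summarize_dataset_dir(path):
    """Return (file_count, total_bytes) for all regular files under `path`.

    Hypothetical helper for a quick pre-push sanity check; it only reads
    file metadata and does not inspect the dataset contents.
    """
    root = Path(path)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} is not a directory")
    # Walk the tree recursively and keep regular files only.
    files = [p for p in root.rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)
```

Calling summarize_dataset_dir("sdxl-tokenized-naruto") and checking that the file count is non-zero gives a quick signal that the download completed before the push.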
Training
To start the Training Job, run the following command:
Optional Extra Steps
Manual Dataset Pre-processing
If you’d prefer to perform the dataset pre-processing step yourself, you can follow these instructions. You can run these in a FlexAI Interactive Session or in a local env (e.g. pipenv install --python 3.10), if you have hardware that’s capable of doing inference.
Clone this repository
If you haven’t already, clone this repository on your host machine:
Install the dependencies
Depending on your environment, you might need to install the experiments’ dependencies, if you haven’t already, by running:
Dataset preparation
Prepare the dataset by running the following command:
The pre-processed dataset is written to the sdxl-tokenized-naruto/ directory.
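To give a rough idea of what caption pre-processing for SDXL can involve, here is a minimal sketch, not the repository's actual script. It assumes captions live in a "text" column (as in lambdalabs/naruto-blip-captions) and that each caption is encoded by two tokenizers, since SDXL uses two text encoders. The function name tokenize_captions and the output column names are hypothetical.

```python
def tokenize_captions(batch, tokenizer_one, tokenizer_two,
                      caption_column="text", max_length=77):
    """Encode a batch of captions with two tokenizers.

    `batch` is a dict of column name -> list of values. Each tokenizer is
    any callable following the Hugging Face tokenizer call convention
    (a list of strings in, a mapping with "input_ids" out).
    The output column names are an assumption made for this sketch.
    """
    captions = [str(c) for c in batch[caption_column]]

    def encode(tokenizer):
        # Pad or truncate every caption to exactly `max_length` token ids.
        return tokenizer(
            captions,
            padding="max_length",
            max_length=max_length,
            truncation=True,
        )["input_ids"]

    return {
        "input_ids_one": encode(tokenizer_one),
        "input_ids_two": encode(tokenizer_two),
    }
```

With the Hugging Face datasets library, a function like this would typically be applied with Dataset.map(..., batched=True), passing the two tokenizers through fn_kwargs. This sketch covers caption tokenization only; the repository's script may do more.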