Fine-tuning Mistral 7B with Alpaca2k
Introduction
In this blog post, I document the steps I took and my experiences while fine-tuning the Mistral model using Axolotl. This guide is intended to serve as a reference for future projects and to reflect on the lessons learned throughout the process.
Accessing the Mistral Model
To access the Mistral model on Hugging Face, you must first accept the usage conditions.
Dataset
For this fine-tuning exercise, I used the Alpaca2k dataset available on Hugging Face. I chose it because it is small and suits the intended purpose: getting everything set up and running a test fine-tune of a Mistral model.
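For reference, a dataset is declared in the datasets section of the Axolotl config. A minimal sketch, assuming the mhenrichsen/alpaca_2k_test dataset on the Hugging Face Hub and the standard Alpaca prompt format:

datasets:
  - path: mhenrichsen/alpaca_2k_test   # the 2k-sample Alpaca subset
    type: alpaca                       # use the Alpaca prompt template
val_set_size: 0.05                     # hold out 5% of it for evaluation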
Setting Up the Environment
I used Jarvislabs to fine-tune the Mistral model. Jarvislabs offers a variety of instances and templates, including the Axolotl template. As described on their website:
Axolotl is a famous library that helps in fine-tuning various LLMs like LLaMA, Falcon, and MPT. It is built on the HF Transformers library and allows you to fine-tune LLMs on your dataset with simple changes to a YAML file.
Selecting an Instance
I selected the RTX6000Ada instance for this task. Once the instance is running, you can choose to use the API, Jupyter, or VSCode. I opted for Jupyter for its ease of use and versatility.
Accessing Axolotl
Navigate to the terminal within Jupyter and change directory to the Axolotl examples folder:
cd axolotl/examples
Here, you will find various model options available for fine-tuning. You can explore the examples on GitHub to understand the configuration files and scripts.
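For instance, each model family has its own folder of example configs, and the Mistral ones (LoRA and QLoRA variants, among others) live under mistral/. The exact contents depend on your Axolotl version:

ls            # e.g. falcon/  llama-2/  mistral/  openllama-3b/  ...
ls mistral    # e.g. config.yml  lora.yml  qlora.yml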
Logging In to Hugging Face
Before proceeding, log in to your Hugging Face account with your access token via the CLI. The token can be found on the settings page of your Hugging Face account.
huggingface-cli login
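If you prefer a non-interactive login (for example, inside a script), the token can also be passed directly; here it is assumed to be stored in an HF_TOKEN environment variable:

huggingface-cli login --token $HF_TOKEN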
Fine-Tuning the Model
To initiate the fine-tuning process, run the following command:
accelerate launch -m axolotl.cli.train lora.yml
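The lora.yml referenced here is the Mistral LoRA example config. The parts that define the base model and the LoRA adapter look roughly like the sketch below; the exact values ship with the example and may differ between Axolotl versions:

base_model: mistralai/Mistral-7B-v0.1
model_type: MistralForCausalLM
tokenizer_type: LlamaTokenizer

adapter: lora               # train a LoRA adapter instead of full weights
lora_r: 32                  # rank of the low-rank update matrices
lora_alpha: 16              # scaling factor applied to the adapter
lora_dropout: 0.05
lora_target_linear: true    # attach LoRA to all linear layers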
If you encounter an error such as:
ValueError: eval dataset split is too small for sample_packing. You should set 'eval_sample_packing: False'
you should set eval_sample_packing to False in the configuration file. Update the YAML file accordingly:
eval_sample_packing: False
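In context, the packing-related options of the updated config then look roughly like this: sample_packing stays enabled for the training split, and only the small evaluation split opts out:

sequence_len: 8192
sample_packing: true         # pack several short samples into one sequence for training
eval_sample_packing: false   # the eval split is too small to pack
pad_to_sequence_len: true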
Running Inference
After fine-tuning, you can run inference using the following command:
accelerate launch -m axolotl.cli.inference examples/mistral/lora.yml --lora_model_dir="./outputs/lora-out" --gradio
This command launches the inference script with Gradio, a user-friendly interface for interacting with machine learning models.
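If you prefer to interact from the terminal instead of the web UI, the same command can be run without the --gradio flag:

accelerate launch -m axolotl.cli.inference examples/mistral/lora.yml --lora_model_dir="./outputs/lora-out"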