This repository contains:
- A brief overview of Parameter Efficient Fine Tuning (PEFT)
- The process for fine-tuning/training with LoRA (Low-Rank Adaptation) and a link to a video tutorial
- The process of fine-tuning with QLoRA and a link to a video tutorial
- PEFT updates only a small subset of the model's parameters, rather than all of them, which makes training far more efficient than fine-tuning the whole model.
- It minimises the number of trainable parameters in a neural network, making the resulting models more adaptable and memory-efficient.
- Main methods: Prefix-Tuning, LoRA (Low-Rank Adaptation), and QLoRA (Quantised LoRA)
- LoRA:
  - Allows small adapters to be tailored to specific datasets or users.
  - Needs less memory for loading and processing.
  - Introduces new trainable parameters only during training; the base model's weights stay frozen.
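The low-rank idea behind LoRA can be sketched numerically. This is a toy NumPy illustration of the maths, not the `peft` library's implementation; all names and sizes here are made up:

```python
import numpy as np

# Instead of updating a full d_out x d_in weight matrix W, LoRA trains
# two small low-rank matrices B (d_out x r) and A (r x d_in) and uses
#   W_eff = W + (alpha / r) * B @ A
d_in, d_out, r, alpha = 512, 512, 8, 16

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable adapter half
B = np.zeros((d_out, r))                   # zero-init, so W_eff == W at the start

W_eff = W + (alpha / r) * B @ A

full_params = d_out * d_in        # parameters a full update would train
lora_params = r * (d_in + d_out)  # parameters LoRA actually trains
print(full_params, lora_params)   # 262144 vs 8192
```

With rank 8 the adapter trains roughly 3% of the parameters of the full matrix, which is why the adapters stay small enough to swap per dataset or per user.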
- The first section follows the process of simple fine-tuning with the BERT uncased model; see this link: Fine-tuning LLMs Locally
- Main installs are in `install_libraries.py`.
- Set up the LoRA configuration as in `lora-config.py`.
- Set up the training arguments to pass to the Trainer method, as in `training_args.py`.
- Set up the `Trainer` arguments as in `trainer.py`.
- Train/fine-tune the model by calling `train()` on the Trainer object (e.g., `my_trainer` in the above file).
- Save the model by calling `save_pretrained("name")` on the PEFT model.
- Optional: merge the base model with the adapter using `merge_and_unload()`.
- Below is the link to the video tutorial with LoRA:
- Follow the steps in the previous section (fine-tuning with LoRA) first; here the PEFT model is built in a quantised version.
- Set up the bitsandbytes configuration for quantisation as in `bnb_config.py`.
- If needed, you can prepare the model's embedding layers for gradient updates after setting the bitsandbytes parameters.
- Instead of the `Trainer()` method (as in full fine-tuning and fine-tuning with LoRA), use the `SFTTrainer()` method.
- Below is the link to the video tutorial of fine-tuning with QLoRA:
- For learning and practice purposes, start with a lightweight model, which is faster to load and train, and move to larger models only when you need better performance.
- Before using any dataset with any model (LLM), make sure the dataset's input format matches the model's data template; otherwise, write a function to reformat the data. An example of an input/output format for decoder models such as LLaMA-2 is in `example_format.png`.
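Such a formatting function might look like the following sketch for the LLaMA-2-chat prompt style. The helper name and the exact template are assumptions; always check your model's tokenizer or model card for the template it was trained with:

```python
# Hypothetical helper that wraps a sample in LLaMA-2-chat style markers.
def format_llama2_prompt(system: str, instruction: str, response: str) -> str:
    return (
        f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
        f"{instruction} [/INST] {response} </s>"
    )

sample = format_llama2_prompt(
    system="You are a helpful assistant.",
    instruction="Summarise PEFT in one sentence.",
    response="PEFT fine-tunes only a small set of extra parameters.",
)
print(sample)
```

A function like this can be applied over a whole dataset (e.g. with `datasets.Dataset.map`) so every example matches the template before training.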