Fine-tuning with LoRA & QLoRA

Fine-tuning Large Language Models (LLMs) with Quantised Low-Rank Adaptation

This repository contains:

  1. A brief overview of Parameter-Efficient Fine-Tuning (PEFT)
  2. The process for fine-tuning/training with LoRA (Low-Rank Adaptation) and a link to a video tutorial
  3. The process of fine-tuning with QLoRA and a link to a video tutorial

Fine-tuning with PEFT (Parameter-Efficient Fine-Tuning) Methods:

  • PEFT updates only a small subset of the model's parameters rather than all of them, making it more efficient than training the whole model.
  • Minimising the number of trainable parameters makes the model more adaptable and memory-efficient.
  • Main methods: Prefix-Tuning, LoRA (Low-Rank Adaptation), and QLoRA (Quantised LoRA).
  • LoRA (see the sketch below):
    • allows small adapters to be tailored to specific datasets or users
    • needs less memory for loading and processing
    • introduces new parameters only during training
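
To make the idea concrete, below is a minimal, self-contained sketch of the low-rank update from the LoRA paper (illustrative only; not code from this repository). The frozen weight W is left untouched, and a trainable rank-r product BA, scaled by alpha/r, is added to its output; all names and sizes are made-up placeholders.

    import torch

    # Frozen pretrained weight (d_out x d_in); never updated during LoRA training.
    d_in, d_out, r, alpha = 1024, 1024, 8, 16   # r << d_in is the low-rank bottleneck
    W = torch.randn(d_out, d_in)

    # The only trainable parameters: A (r x d_in) and B (d_out x r).
    # Per the LoRA paper, A starts as a small Gaussian and B as zeros,
    # so the adapted layer matches the base layer exactly at step 0.
    A = torch.randn(r, d_in) * 0.01
    B = torch.zeros(d_out, r)

    x = torch.randn(d_in)
    h = W @ x + (alpha / r) * (B @ (A @ x))   # adapted forward pass: Wx + (alpha/r)*BAx

    # Trainable parameters drop from d_out*d_in (~1.05M) to r*(d_in+d_out) (~16K).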

Fine-tuning process with LoRA

  1. First, follow the process of simple fine-tuning with the BERT uncased model in this link: Fine-tuning LLMs Locally
  2. Install the main libraries listed in install_libraries.py.
  3. Set up the LoRA configuration as in lora-config.py (a combined sketch of steps 3-8 appears after the video link below).
  4. Set up the training arguments to pass to the Trainer, as in training_args.py.
  5. Set up the Trainer arguments as in trainer.py.
  6. Train/fine-tune the model by calling train() on the Trainer object (e.g., my_trainer in the above file).
  7. Save the model by calling save_pretrained("name") on the PEFT model.
  8. Optional: merge the base model with the adapter using merge_and_unload().
  9. The video tutorial on fine-tuning with LoRA is linked below:

Watch the video on fine-tuning LLMs with LoRA PEFT with code
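
Because lora-config.py, training_args.py, and trainer.py are not reproduced here, the following is a minimal sketch of what steps 3-8 typically look like with the Hugging Face transformers and peft libraries. The hyperparameters, output paths, and the toy dataset are placeholder assumptions, not the repository's actual values.

    import torch
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)
    from peft import LoraConfig, TaskType, get_peft_model
    from datasets import Dataset

    # Base model: BERT uncased, as in the simple fine-tuning walkthrough (step 1).
    model_name = "bert-base-uncased"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

    # Step 3: LoRA configuration (values are illustrative, not from lora-config.py).
    lora_config = LoraConfig(
        task_type=TaskType.SEQ_CLS,         # sequence classification head for BERT
        r=8,                                # rank of the low-rank update
        lora_alpha=16,                      # scaling factor alpha
        lora_dropout=0.1,
        target_modules=["query", "value"],  # attention projections to adapt
    )
    peft_model = get_peft_model(model, lora_config)
    peft_model.print_trainable_parameters()  # confirms how few parameters train

    # Tiny toy dataset so the sketch runs end to end.
    enc = tokenizer(["great movie", "terrible movie"], truncation=True, padding=True)
    my_dataset = Dataset.from_dict({**enc, "labels": [1, 0]})

    # Step 4: training arguments passed to the Trainer.
    training_args = TrainingArguments(
        output_dir="lora-bert-out",
        num_train_epochs=3,
        per_device_train_batch_size=16,
        learning_rate=2e-4,
    )

    # Step 5: the Trainer itself.
    my_trainer = Trainer(model=peft_model, args=training_args, train_dataset=my_dataset)

    # Steps 6-8: train, save the adapter, and optionally merge it into the base model.
    my_trainer.train()
    peft_model.save_pretrained("lora-bert-adapter")
    merged_model = peft_model.merge_and_unload()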

Fine-tuning with QLoRA

  • Follow the steps in the previous section (fine-tuning with LoRA) first; the difference is that here the PEFT model is built on top of a quantised base model.
  • Set up the bitsandbytes configuration for quantisation as in bnb_config.py.
  • If needed, prepare the model's embedding layers for gradient updates after setting the bitsandbytes parameters.
  • Instead of the Trainer class (as in full fine-tuning and fine-tuning with LoRA), use the SFTTrainer class from the trl library, as in the sketch below.
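
As bnb_config.py is not reproduced here, the following is a minimal sketch of a typical 4-bit QLoRA setup, using BitsAndBytesConfig from transformers, prepare_model_for_kbit_training from peft, and SFTTrainer from trl. The model name, dataset, and hyperparameters are placeholder assumptions, and the exact SFTTrainer keyword arguments vary across trl versions.

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, prepare_model_for_kbit_training
    from trl import SFTTrainer
    from datasets import Dataset

    # bitsandbytes 4-bit quantisation config (values illustrative, not from bnb_config.py).
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                      # load base weights in 4-bit
        bnb_4bit_quant_type="nf4",              # NormalFloat4 quantisation
        bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype for matmuls
        bnb_4bit_use_double_quant=True,         # also quantise the quantisation constants
    )

    model_name = "meta-llama/Llama-2-7b-hf"     # placeholder decoder model
    model = AutoModelForCausalLM.from_pretrained(
        model_name, quantization_config=bnb_config, device_map="auto"
    )

    # Prepare the quantised model for training (casts norm and embedding layers
    # to full precision and enables input gradients).
    model = prepare_model_for_kbit_training(model)

    lora_config = LoraConfig(
        task_type="CAUSAL_LM", r=16, lora_alpha=32, lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],
    )

    # Toy dataset with the "text" field SFTTrainer expects by default.
    my_dataset = Dataset.from_dict(
        {"text": ["<s>[INST] What is QLoRA? [/INST] LoRA on a quantised base model. </s>"]}
    )

    # SFTTrainer instead of Trainer; it applies the LoRA config to the quantised model.
    trainer = SFTTrainer(model=model, train_dataset=my_dataset, peft_config=lora_config)
    trainer.train()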

Below is the link to the video tutorial on fine-tuning with QLoRA:

Watch the video on fine-tuning LLMs with QLoRA PEFT with code

Practical Considerations:

  • For learning and practice purposes, start with a lightweight model that is faster to load and train, and move on to larger models only when you need their performance.
  • Before using any dataset with a model (LLM), make sure the dataset's input format matches the model's data template; otherwise, you need to write a function to format the data. An example of an input/output format for decoder models such as LLaMA-2 is example_format.png; a sketch of such a formatting function follows below.
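
Since example_format.png is not reproduced here, the following is a hypothetical formatting function assuming the common LLaMA-2 instruction template ([INST] ... [/INST]); check your model's documentation for its exact template.

    # Hypothetical formatter for instruction/response pairs, assuming the
    # common LLaMA-2 chat template; adjust to your model's documented format.
    def format_example(example: dict) -> dict:
        prompt = (f"<s>[INST] {example['instruction']} [/INST] "
                  f"{example['response']} </s>")
        return {"text": prompt}

    # Usage: map the formatter over a dataset with instruction/response columns,
    # e.g. formatted = raw_dataset.map(format_example)
    sample = {"instruction": "Summarise LoRA in one sentence.",
              "response": "LoRA trains small low-rank adapters on a frozen model."}
    print(format_example(sample)["text"])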
