Official PyTorch implementation of "Mini but Mighty: Finetuning ViTs with Mini Adapters" (WACV 2024).
📣 Published as a conference paper at WACV 2024
Vision Transformers (ViTs) have become one of the dominant architectures in computer vision, and pre-trained ViT models are commonly adapted to new tasks via finetuning. Recent works proposed several parameter-efficient transfer learning methods, such as adapters, to avoid the prohibitive training and storage cost of finetuning. In this work, we observe that adapters perform poorly when the dimension of adapters is small, and we propose MiMi, a training framework that addresses this issue. We start with large adapters which can reach high performance, and iteratively reduce their size. To enable automatic estimation of the hidden dimension of every adapter, we also introduce a new scoring function, specifically designed for adapters, that compares neuron importance across layers. Our method outperforms existing methods in finding the best trade-off between accuracy and trained parameters across the three dataset benchmarks DomainNet, VTAB, and Multi-task, for a total of 29 datasets.
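For background, an adapter is a small bottleneck module inserted into each transformer block, and MiMi prunes the hidden neurons of these bottlenecks. Below is a minimal PyTorch sketch; the class name, dimensions, and the scoring heuristic are illustrative, not the repository's actual implementation:

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-projection, non-linearity, up-projection,
    wrapped in a residual connection."""

    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.down = nn.Linear(dim, hidden_dim)  # compress to the adapter hidden dimension
        self.act = nn.GELU()
        self.up = nn.Linear(hidden_dim, dim)    # project back to the model dimension

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))


def neuron_importance(adapter: Adapter) -> torch.Tensor:
    # One plausible per-neuron score (not necessarily the paper's exact
    # criterion): the product of the weight norms entering and leaving
    # each hidden neuron. Normalizing within each adapter makes scores
    # comparable across layers, so the globally least important neurons
    # can be pruned first.
    w_in = adapter.down.weight.norm(dim=1)   # shape: (hidden_dim,)
    w_out = adapter.up.weight.norm(dim=0)    # shape: (hidden_dim,)
    scores = w_in * w_out
    return scores / scores.sum()


adapter = Adapter(dim=768, hidden_dim=64)  # 768 = ViT-B/16 embedding dimension
print(neuron_importance(adapter).shape)    # torch.Size([64])
```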
Install the conda environment using the provided yml file:
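For example, assuming the file is named `environment.yml` (adjust to the actual filename shipped in this repository):

```bash
conda env create -f environment.yml
conda activate mimi  # hypothetical environment name; see the "name" field in the yml file
```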
Please follow the settings in the `exps` folder to prepare your JSON config files (an illustrative sketch is given below the commands), and then run:
```bash
python main.py --config $CONFIG_FILE            # training / finetuning
python few_shot_prune.py --config $CONFIG_FILE  # adapter pruning (few-shot setting)
```
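The exact configuration schema is defined by the JSON files in `exps`; purely as an illustration (all keys below are hypothetical), a config might contain fields such as:

```json
{
  "dataset": "domainnet",
  "model": "vit_base_patch16_224",
  "adapter_hidden_dim": 64,
  "epochs": 50,
  "lr": 1e-3
}
```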
If you find this work helpful, please cite our paper:
```bibtex
@InProceedings{Marouf_2024_WACV,
    author    = {Marouf, Imad Eddine and Tartaglione, Enzo and Lathuili\`ere, St\'ephane},
    title     = {Mini but Mighty: Finetuning ViTs With Mini Adapters},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {1732-1741}
}
```