This repository contains the implementation of our Exemplar-Free Continual Transformer with Convolutions (ConTraCon) approach for continual learning with a transformer backbone.
Anurag Roy, Vinay K. Verma, Sravan Voonna, Kripabandhu Ghosh, Saptarshi Ghosh, Abir Das, "Exemplar-Free Continual Transformer with Convolutions"
If you use the code and models from this repo, please cite our work. Thanks!

```bibtex
@InProceedings{roy_2023_ICCV,
    author    = {Roy, Anurag and Verma, Vinay and Voonna, Sravan and Ghosh, Kripabandhu and Ghosh, Saptarshi and Das, Abir},
    title     = {Exemplar-Free Continual Transformer with Convolutions},
    booktitle = {International Conference on Computer Vision (ICCV)},
    year      = {2023}
}
```
## Requirements

The code is written for Python 3.8.16, but should work for other versions with some modifications.

```
pip install -r requirements.txt
```
## Datasets

- Download the datasets to the root directory `Datasets`.
- The CIFAR100 dataset will be downloaded automatically, while ImageNet100 and TinyImageNet require manual download.
- Overview of the dataset root directory:

```
├── cifar100
│   └── cifar-100-python
├── tinyimagenet
│   └── tiny-imagenet-200
│       ├── train
│       ├── val
│       └── test
└── imagenet-100
    ├── imagenet-r
    ├── train_list.txt
    └── val_list.txt
```
**NOTE** -- After downloading and extracting the TinyImageNet dataset inside the `Datasets` folder, run

```
python val_format.py
```

This changes the way the test dataset is stored for TinyImageNet.
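The provided `val_format.py` handles this step; for reference, here is a minimal sketch of the standard TinyImageNet validation-set reorganization it likely performs, assuming the usual `tiny-imagenet-200/val` layout with a `val_annotations.txt` file (paths and details below are illustrative, not the repo's exact script):

```python
import os
import shutil

# Illustrative path; adjust to your setup. The repo's val_format.py may differ.
VAL_DIR = "./Datasets/tinyimagenet/tiny-imagenet-200/val"

# val_annotations.txt maps each image filename to its class id (wnid),
# one tab-separated record per line.
with open(os.path.join(VAL_DIR, "val_annotations.txt")) as f:
    for line in f:
        fname, wnid = line.strip().split("\t")[:2]
        class_dir = os.path.join(VAL_DIR, wnid)
        os.makedirs(class_dir, exist_ok=True)
        # Move each image from the flat val/images folder into a
        # per-class subfolder, mirroring the train/ layout.
        shutil.move(os.path.join(VAL_DIR, "images", fname),
                    os.path.join(class_dir, fname))
```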
## Code overview

- `auto_run.py` -- Contains the training and inference code for the ConTraCon approach.
- `src/*` -- Contains the source code for the backbone transformer architecture and the convolutional task-adaptation mechanisms.
- `src/utils/model_parts.py` -- Contains the task-specific adaptation classes and functions.
- `incremental_dataloader.py` -- Contains the dataloaders for the different datasets.
## Arguments

- `ker_sz`: Kernel size of the convolution kernels applied to the key, query, and value weight matrices of the MHSA layers (see the sketch after this list).
- `num_tasks`: Number of tasks to split the given dataset into. The classes in the dataset are split equally among the tasks (e.g., 100 classes over 10 tasks gives 10 classes per task).
- `nepochs`: Number of training epochs for each task.
- `is_task0`: Denotes whether the first task is being trained. For the first task, the entire backbone transformer is trained from scratch.
- `use_saved`: Use saved weights and resume training from the next task. For example, in a 10-task setup, if trained up to task 2, you can resume training from task 3 by using this flag. If training on all tasks has completed, this flag can be used to re-evaluate the trained model.
- `dataset`: Denotes the dataset.
- `data_path`: The path to the dataset.
- `scenario`: Evaluation scenario. We have evaluated our models in two scenarios -- `til` (task incremental learning) and `cil` (class incremental learning).
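For intuition about `ker_sz`, here is a minimal, self-contained sketch of the underlying idea: a small learned 2D convolution is applied over a frozen attention weight matrix to adapt it to a new task. All names below are hypothetical; the actual implementation lives in `src/utils/model_parts.py`.

```python
import torch
import torch.nn as nn

class ConvWeightAdapter(nn.Module):
    """Illustrative (hypothetical) module: adapt a frozen (d x d) weight
    matrix with a small learned 2D convolution, as controlled by --ker_sz."""

    def __init__(self, base_weight: torch.Tensor, ker_sz: int = 15):
        super().__init__()
        # The task-0 weight stays frozen; only the conv kernel is learned
        # per task, which keeps the per-task parameter overhead small.
        self.register_buffer("base_weight", base_weight)
        self.conv = nn.Conv2d(1, 1, kernel_size=ker_sz,
                              padding=ker_sz // 2, bias=False)

    def adapted_weight(self) -> torch.Tensor:
        # Treat the weight matrix as a 1-channel image and convolve it,
        # preserving its (d, d) shape via same-padding (odd ker_sz).
        w = self.base_weight.unsqueeze(0).unsqueeze(0)  # (1, 1, d, d)
        return self.conv(w).squeeze(0).squeeze(0)       # (d, d)

# Hypothetical usage: adapt the query projection of one MHSA layer.
d = 256
q_weight = torch.randn(d, d)             # frozen task-0 query weights
adapter = ConvWeightAdapter(q_weight, ker_sz=15)
x = torch.randn(8, d)                    # a batch of token embeddings
q = x @ adapter.adapted_weight().t()     # queries under the adapted weights
```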
## Training

- For training in the `x%` labeled data scenario, the first task needs to be trained first. This can be done by using the `--is_task0` flag.
- For training on subsequent tasks, run without the `--is_task0` flag.
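For reference, the flags described above could be wired up roughly as follows (a hypothetical sketch; the actual argument definitions in `auto_run.py` may differ in names, defaults, and details):

```python
import argparse

# Hypothetical sketch of the CLI described above.
parser = argparse.ArgumentParser(description="ConTraCon training/evaluation")
parser.add_argument("--ker_sz", type=int, default=15,
                    help="conv kernel size applied to MHSA K/Q/V weights")
parser.add_argument("--num_tasks", type=int, default=10,
                    help="number of tasks to split the dataset into")
parser.add_argument("--nepochs", type=int, default=500,
                    help="training epochs per task")
parser.add_argument("--is_task0", action="store_true",
                    help="train the first task (backbone from scratch)")
parser.add_argument("--use_saved", action="store_true",
                    help="resume from saved weights, or re-evaluate")
parser.add_argument("--dataset", type=str, default="imagenet100")
parser.add_argument("--data_path", type=str, default="./Datasets/imagenet-100/")
parser.add_argument("--scenario", type=str, choices=["til", "cil"], default="til")
args = parser.parse_args()
```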
The commands to train ConTraCon on the ImageNet-100 dataset are as follows:

- Training the first task:
```
python auto_run.py --ker_sz 15 --nepochs 500 --dataset imagenet100 --data_path ./Datasets/imagenet-100/ --num_tasks 10 --is_task0 --scenario til
```
- Training the rest of the tasks:
```
python auto_run.py --ker_sz 15 --nepochs 500 --dataset imagenet100 --data_path ./Datasets/imagenet-100/ --num_tasks 10 --scenario til
```
- To evaluate ConTraCon in the `til` setup, run:
```
python auto_run.py --ker_sz 15 --nepochs 500 --dataset imagenet100 --data_path ./Datasets/imagenet-100/ --num_tasks 10 --use_saved --scenario til
```
- To evaluate ConTraCon in the `cil` setup, run:
```
python auto_run.py --ker_sz 15 --nepochs 500 --dataset imagenet100 --data_path ./Datasets/imagenet-100/ --num_tasks 10 --use_saved --scenario cil
```
## Acknowledgements

The implementation reuses some portions from CCT [1].

[1] Ali Hassani, Steven Walton, Nikhil Shah, Abulikemu Abuduweili, Jiachen Li, Humphrey Shi. "Escaping the Big Data Paradigm with Compact Transformers." arXiv preprint, 2021.