This repository contains the official implementation of the AAAI 2020 paper Context-Transformer: Tackling Object Confusion for Few-Shot Detection.
To tackle the object confusion problem in few-shot detection, we propose a novel Context-Transformer within a concise deep transfer framework. Specifically, Context-Transformer can effectively leverage source-domain object knowledge as guidance, and automatically formulate relational context clues to enhance the detector's generalization capacity to the target domain. It can be flexibly embedded in the popular SSD-style detectors, which makes it a plug-and-play module for end-to-end few-shot learning. For more details, please refer to our original paper.
Method | 1shot | 5shot |
---|---|---|
Prototype | 22.8 | 39.8 |
Imprinted | 24.5 | 40.9 |
Non-local | 25.2 | 41.0 |
Baseline | 21.5 | 39.4 |
Ours | 27.0 | 43.8 |
News: We now support instance shots for the COCO60 to VOC20 transfer setting, denoted by the suffix -IS below.
Method | 1shot | 5shot |
---|---|---|
Baseline-IS | 19.2 | 35.7 |
Ours-IS | 27.1 | 40.4 |
Note:
- The instance shots are the same as those used in the incremental setting, which differ from the image shots we originally used in the transfer setting. This is why the 1-shot result of Ours-IS (27.1) can be comparable to that of Ours (27.0).
Method (1-shot) | Split1 | Split2 | Split3 |
---|---|---|---|
Shmelkov2017 | 23.9 | 19.2 | 21.4 |
Kang2019 | 14.8 | 15.7 | 19.2 |
Ours | 39.8 | 32.5 | 34.0 |
Method (5-shot) | Split1 | Split2 | Split3 |
---|---|---|---|
Shmelkov2017 | 38.8 | 32.5 | 31.8 |
Kang2019 | 33.9 | 30.1 | 40.6 |
Ours | 44.2 | 36.3 | 40.8 |
Note:
- The results here are higher than those reported in the paper due to an adjusted training strategy.
Context-Transformer is released under the MIT License (refer to the LICENSE file for details).
If you find Context-Transformer useful in your research, please consider citing:
@inproceedings{yang2020context,
title={Context-Transformer: Tackling Object Confusion for Few-Shot Detection.},
author={Yang, Ze and Wang, Yali and Chen, Xianyu and Liu, Jianzhuang and Qiao, Yu},
booktitle={AAAI},
pages={12653--12660},
year={2020}
}
- Clone this repository. This repository is mainly based on RFBNet and Detectron2; many thanks to them.
- Install anaconda and the requirements:
  - python 3.6
  - PyTorch 1.4.0
  - CUDA 10.0
  - gcc 5.4
  - cython
  - opencv
  - matplotlib
  - tabulate
  - termcolor
  - tensorboard

  You can set up the entire environment using conda:
  conda create -n CT python=3.6 && conda activate CT
  conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
  conda install cython opencv matplotlib tabulate termcolor tensorboard
- Compile the nms and coco tools:
  sh make.sh
Note:
- Check your GPU architecture support in utils/build.py, line 131. The default is:
  'nvcc': ['-arch=sm_61',
- Ensure that the CUDA environment is fully installed, including the compiler, tools and libraries. Also make sure that the cudatoolkit version in the conda environment matches the one you compile with. To check, run nvcc -V and conda list | grep cudatoolkit; both should report the same version.
- We have tested the code on PyTorch 1.4.0 and Python 3.6. It may run on other versions, but this is not guaranteed.
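The version check described in the note can be scripted. The following is a minimal sketch, assuming that nvcc -V prints a line containing "release X.Y" and that conda list prints a row whose first column is cudatoolkit; the function names are hypothetical and not part of this repository.

```python
import re


def nvcc_release(nvcc_output):
    """Return the 'major.minor' CUDA release found in `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return "{}.{}".format(match.group(1), match.group(2)) if match else None


def conda_cudatoolkit_release(conda_output):
    """Return the 'major.minor' cudatoolkit version from `conda list` output, or None."""
    for line in conda_output.splitlines():
        fields = line.split()
        # Assumed row layout: <package name> <version> <build> [<channel>]
        if len(fields) >= 2 and fields[0] == "cudatoolkit":
            match = re.match(r"(\d+)\.(\d+)", fields[1])
            if match:
                return "{}.{}".format(match.group(1), match.group(2))
    return None


def cuda_versions_match(nvcc_output, conda_output):
    """True only if both versions were found and agree on major.minor."""
    nvcc = nvcc_release(nvcc_output)
    conda = conda_cudatoolkit_release(conda_output)
    return nvcc is not None and nvcc == conda
```

The two text arguments can be captured from the commands named in the note, for example with subprocess.run and stdout=subprocess.PIPE. Only major.minor is compared here, since the patch level printed by the two tools need not be formatted identically.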
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2007.sh # <directory>
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2012.sh # <directory>
ln -s /path/to/VOCdevkit data/VOCdevkit
Move the Main2007.zip and Main2012.zip under the data/ folder to data/VOCdevkit/VOC2007/ImageSets/ and data/VOCdevkit/VOC2012/ImageSets/ respectively, and unzip them. Make sure that the .txt files contained in each zip file end up under the corresponding path/to/Main/ folder.
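The unzip step and the final check can also be done from Python. This is a rough sketch under one labeled assumption: each archive contains a top-level Main/ directory holding the .txt files. The internal layout of the archives is not stated here, so the helper (a hypothetical name, not part of the repository) returns whatever it finds and an empty list signals that the files landed somewhere else.

```python
import zipfile
from pathlib import Path


def unpack_main_lists(zip_path, imagesets_dir):
    """Extract a Main*.zip into ImageSets/ and return the .txt files under Main/."""
    imagesets_dir = Path(imagesets_dir)
    imagesets_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(str(zip_path)) as archive:
        # Assumption: the archive already contains a top-level Main/ directory.
        archive.extractall(str(imagesets_dir))
    # An empty result means the .txt files are not where the loader expects them.
    return sorted((imagesets_dir / "Main").glob("*.txt"))
```

For example, unpack_main_lists("data/Main2007.zip", "data/VOCdevkit/VOC2007/ImageSets") should return a non-empty list if the extraction matches the layout described above.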
Download the MS COCO dataset from the official website to data/COCO/ (or make a symlink: ln -s /path/to/coco data/COCO). All annotation files (.json) should be placed under the COCO/annotations/ folder. The directory should have this basic structure:
$COCO/
$COCO/cache/
$COCO/annotations/
$COCO/images/
$COCO/images/train2014/
$COCO/images/val2014/
Note: The current COCO dataset has released new train2017 and val2017 sets which are just new splits of the same image sets.
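Before training, it can help to confirm that the directory layout listed above is in place. The snippet below is a small sketch that only checks for the listed folders; the helper name is hypothetical and it does not validate the contents of any folder.

```python
from pathlib import Path

# Sub-directories taken from the structure listed above, relative to $COCO/.
EXPECTED_SUBDIRS = [
    "cache",
    "annotations",
    "images",
    "images/train2014",
    "images/val2014",
]


def missing_coco_dirs(coco_root):
    """Return the expected sub-directories that do not exist under coco_root."""
    root = Path(coco_root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]
```

Calling missing_coco_dirs("data/COCO") returns an empty list when every listed folder exists, and otherwise names the ones still to be created or linked.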
Run the following command to obtain nonvoc/voc split annotation files (.json):
python data/split_coco_dataset_voc_nonvoc.py
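To illustrate what a nonvoc/voc split of a COCO annotation file involves, here is a simplified sketch. It is not the repository's script: it assumes the split is made per annotation by category name, using the COCO names of the 20 VOC classes, and it keeps an image in a subset whenever the image has at least one annotation in that subset. The actual script may treat images containing both kinds of objects differently.

```python
# COCO category names corresponding to the 20 PASCAL VOC classes.
VOC_NAMES_IN_COCO = {
    "airplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "dining table", "dog", "horse", "motorcycle", "person",
    "potted plant", "sheep", "couch", "train", "tv",
}


def split_voc_nonvoc(coco):
    """Partition a COCO-style annotation dict into (voc, nonvoc) dicts by category."""
    voc_ids = {c["id"] for c in coco["categories"] if c["name"] in VOC_NAMES_IN_COCO}

    def subset(keep_voc):
        cats = [c for c in coco["categories"] if (c["id"] in voc_ids) == keep_voc]
        anns = [a for a in coco["annotations"] if (a["category_id"] in voc_ids) == keep_voc]
        # Assumption: an image belongs to a subset if it has any annotation there.
        used = {a["image_id"] for a in anns}
        imgs = [i for i in coco["images"] if i["id"] in used]
        return {"images": imgs, "annotations": anns, "categories": cats}

    return subset(True), subset(False)
```

The input is the dict obtained by loading an instances .json file with json.load; each returned dict can be written back with json.dump.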
First download the fc-reduced VGG-16 PyTorch base network weights at https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth or from BaiduYun Driver, and place them under the weights/ directory.
To pretrain RFBNet on source domain dataset COCO60:
python train.py --save-folder weights/COCO60_pretrain -d COCO -p 1
To pretrain RFBNet on VOC split1 (simply change --split for other splits):
python train.py --save-folder weights/VOC_split1_pretrain -d VOC -p 1 -max 50000 --steps 30000 40000 --checkpoint-period 4000 --warmup-iter 1000 --setting incre --split 1
Note:
- To make reproduction easier, you can download the above pretrained RFBNet models directly from BaiduYun Driver or OneDrive.
To finetune on VOC dataset (1 shot):
python train.py --load-file weights/COCO60_pretrain/model_final.pth --save-folder weights/fewshot/transfer/VOC_1shot -d VOC -p 2 --shot 1 --method ours -max 2000 --steps 1500 1750 --checkpoint-period 200 --warmup-iter 0 --no-mixup-iter 750 -b 20
To finetune on VOC dataset (5 shot):
python train.py --load-file weights/COCO60_pretrain/model_final.pth --save-folder weights/fewshot/transfer/VOC_5shot -d VOC -p 2 --shot 5 --method ours -max 4000 --steps 3000 3500 --checkpoint-period 500 --warmup-iter 0 --no-mixup-iter 1500
To finetune on VOC dataset for split1 setting (1 shot):
python train.py -d VOC --split 1 --setting incre -p 2 -m ours --shot 1 --save-folder weights/fewshot/incre/VOC_split1_1shot --load-file weights/VOC_split1_pretrain/model_final.pth -max 200 --steps 150 --checkpoint-period 50 --warmup-iter 0 --no-mixup-iter 100
To finetune on VOC dataset for split1 setting (5 shot):
python train.py -d VOC --split 1 --setting incre -p 2 -m ours --shot 5 --save-folder weights/fewshot/incre/VOC_split1_5shot --load-file weights/VOC_split1_pretrain/model_final.pth -max 400 --steps 350 --checkpoint-period 50 --warmup-iter 0 --no-mixup-iter 100
Note:
- Simply change --split for other split settings.
- For other shot settings, feel free to adjust --shot, -max, --steps and --no-mixup-iter to obtain satisfactory results.
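As a starting point for other shot settings in the transfer setting, note that the two transfer commands above place the learning-rate steps at 75% and 87.5% of -max and set --no-mixup-iter to 37.5% of -max. The sketch below simply reuses those ratios to build an argument list. This is a heuristic read off the two commands, not a recommendation from the authors, and the helper name is hypothetical. The incremental commands do not follow these ratios.

```python
def transfer_finetune_args(shot, max_iter, checkpoint_period, load_file, save_folder):
    """Build a train.py argument list, reusing the ratios of the transfer commands."""
    steps = [int(max_iter * 0.75), int(max_iter * 0.875)]  # e.g. 3000, 3500 for 4000
    no_mixup_iter = int(max_iter * 0.375)                  # e.g. 1500 for 4000
    return [
        "python", "train.py",
        "--load-file", load_file,
        "--save-folder", save_folder,
        "-d", "VOC", "-p", "2",
        "--shot", str(shot), "--method", "ours",
        "-max", str(max_iter),
        "--steps", str(steps[0]), str(steps[1]),
        "--checkpoint-period", str(checkpoint_period),
        "--warmup-iter", "0",
        "--no-mixup-iter", str(no_mixup_iter),
    ]
```

With shot=5, max_iter=4000 and checkpoint_period=500 this reproduces the 5-shot transfer command above; the 1-shot command additionally passes -b 20, which is left out here. The list can be passed to subprocess.run or joined with spaces for a shell.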
To evaluate the pretrained model on COCO minival set:
python test.py -d COCO -p 1 --save-folder weights/COCO60_pretrain --resume
To evaluate the pretrained model on VOC2007 test set (specify your target split via --split):
python test.py -d VOC --split 1 --setting incre -p 1 --save-folder weights/VOC_split1_pretrain --resume
To evaluate the transferred model on VOC2007 test set:
python test.py -d VOC -p 2 --save-folder weights/fewshot/transfer/VOC_5shot --resume
To evaluate the incremental model on VOC2007 test set (specify your target split via --split):
python test.py -d VOC --split 1 --setting incre -p 2 --save-folder weights/fewshot/incre/VOC_split1_5shot --resume
Note:
- --resume: load the model from the last checkpoint in the folder given by --save-folder.

If you would like to manually specify the path of the model to load, use --load-file path/to/model.pth instead of --resume.
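When using --load-file, a checkpoint path has to be chosen by hand. The sketch below picks the most recently modified .pth file in a folder. This is only one possible reading of "last checkpoint": how --resume selects its checkpoint internally is not described here, and the helper name is hypothetical.

```python
from pathlib import Path


def newest_checkpoint(save_folder):
    """Return the most recently modified .pth file in save_folder, or None."""
    candidates = list(Path(save_folder).glob("*.pth"))
    if not candidates:
        return None
    # Assumption: "last" means latest modification time, not highest iteration.
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

The returned path can then be passed as --load-file, for example the result of newest_checkpoint("weights/fewshot/transfer/VOC_5shot").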
Should you have any questions regarding this repo, feel free to email me at [email protected].