Inconsistency-Based Data-Centric Active Open-Set Annotation

An implementation of the NEAT batch active learning algorithm for Open-Set Annotation. Details are provided in our paper: https://ojs.aaai.org/index.php/AAAI/article/view/28213

1. Requirements

Environments

The following packages are required. (We use CUDA == 11.6, python == 3.9, pytorch == 1.13.0, torchvision == 0.14.0, scikit-learn == 1.2.2, matplotlib == 3.7.1, numpy == 1.23.5.)

CUDA 10.1+
python 3.7.9+
pytorch 1.7.1+
torchvision 0.8.2+
scikit-learn 0.24.0+
matplotlib 3.3.3+
numpy 1.19.2+
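
To confirm that an environment satisfies the minimum versions listed above, a small check along the following lines may help. It is only a sketch: it assumes Python 3.8+ (for importlib.metadata), assumes the pip distribution names torch and scikit-learn for pytorch and scikit-learn, and does not check CUDA or the Python version itself. The helper names are illustrative and are not part of this repository.

```python
import re
from importlib import metadata  # available from Python 3.8

# Minimum versions copied from the list above.
# Keys are the assumed pip distribution names (pytorch is published as "torch").
MINIMUMS = {
    "torch": "1.7.1",
    "torchvision": "0.8.2",
    "scikit-learn": "0.24.0",
    "matplotlib": "3.3.3",
    "numpy": "1.19.2",
}


def version_tuple(text):
    """Turn '1.13.0+cu116' into (1, 13, 0), ignoring local or pre-release suffixes."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unparseable version: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Compare two version strings after padding them to the same length."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def report():
    """Return one status line per required package."""
    lines = []
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            lines.append(f"{name}: not installed (need >= {minimum})")
            continue
        status = "ok" if meets_minimum(installed, minimum) else "too old"
        lines.append(f"{name}: {installed} ({status}, need >= {minimum})")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```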

Datasets

For CIFAR10 and CIFAR100, we provide a function that automatically downloads and preprocesses the data. You can also download the datasets manually from the link; in that case, place them in the ~/data folder.
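
If you prefer to fetch CIFAR ahead of time, one possible way is through torchvision's dataset classes, as sketched below. This is not the repository's own download function: the helper name download_cifar is hypothetical, and it is an assumption that the repository's loader accepts the directory layout torchvision writes under ~/data.

```python
import os

# Target folder named in the Datasets section.
DATA_ROOT = os.path.expanduser("~/data")


def download_cifar(name="cifar10", root=DATA_ROOT):
    """Download the train and test splits of CIFAR10 or CIFAR100 into root.

    Hypothetical helper, not part of this repository.
    """
    # Imported lazily so this module loads even without torchvision installed.
    from torchvision import datasets

    classes = {"cifar10": datasets.CIFAR10, "cifar100": datasets.CIFAR100}
    dataset_cls = classes[name]
    os.makedirs(root, exist_ok=True)
    # download=True fetches the archive only if it is not already in root.
    train = dataset_cls(root=root, train=True, download=True)
    test = dataset_cls(root=root, train=False, download=True)
    return train, test
```

Calling download_cifar("cifar100") would fetch CIFAR100 into the same folder.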

2. Get started

$ cd Active-OpenSet-NEAT

We provide scripts that automatically download the CIFAR10 and CIFAR100 datasets. Tiny-Imagenet and Imagenet must be downloaded yourself, either with the following commands or manually from the link.

$ mkdir data
$ cd data
$ wget http://cs231n.stanford.edu/tiny-imagenet-200.zip
$ unzip tiny-imagenet-200.zip

Once all the datasets are available, run extract_features.py to extract features with CLIP for every dataset. To extract features for specific datasets only, edit the script.

$ mkdir features
$ python extract_features.py

3. Training all the active learning strategies mentioned in our paper

Run the following command in the terminal (example).
You can adjust the arguments according to the options listed under "Option" below.

$ python NEAT_main.py --gpu 1 --k 10 --save-dir log_AL/ --query-strategy NEAT --init-percent 1 --known-class 2 --query-batch 400 --seed 2 --model resnet18 --dataset cifar10

Option
--dataset: cifar10, cifar100, and Tiny-Imagenet.
--known-class: 2, 20, and 40 for cifar10, cifar100, Tiny-Imagenet respectively in our experiments.
--init-percent: 1, 8, 8 for cifar10, cifar100, Tiny-Imagenet respectively in our experiments.
--query-batch: 400 in our experiments.
--model: 'resnet18', 'resnet34', 'resnet50', and 'vgg16'
--query-strategy: 'random', 'uncertainty', 'AV_temperature', 'NEAT_passive', 'NEAT', "BGADL", "OpenMax", "Core_set", 'BADGE_sampling', "certainty", "hybrid-BGADL", "hybrid-OpenMax", "hybrid-Core_set", "hybrid-BADGE_sampling", "hybrid-uncertainty"
--workers: default is 4 in our setup; if you only have one GPU, please set it to 0.
--max-epoch: 100 in our experiments.
--max-query: 10 in our experiments.
--k: number of neighbors, 10 in our experiments; you can change it based on your research requirements.
--pre-type: default is clip; you can plug in your own pre-trained model instead.
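
The per-dataset values in the option list above can be collected into a small helper that assembles the training command, which is convenient when sweeping over datasets, strategies, or seeds. This is a sketch rather than part of the repository: build_command and SETTINGS are hypothetical names, and the dataset strings are taken as written in the option list, so the exact spelling that NEAT_main.py accepts for Tiny-Imagenet is an assumption.

```python
import shlex

# Per-dataset settings copied from the option list above.
# The Tiny-Imagenet spelling accepted by NEAT_main.py is an assumption.
SETTINGS = {
    "cifar10": {"known_class": 2, "init_percent": 1},
    "cifar100": {"known_class": 20, "init_percent": 8},
    "Tiny-Imagenet": {"known_class": 40, "init_percent": 8},
}


def build_command(dataset, strategy="NEAT", seed=2, gpu=1, model="resnet18", k=10):
    """Return the NEAT_main.py invocation as an argument list."""
    cfg = SETTINGS[dataset]
    return [
        "python", "NEAT_main.py",
        "--gpu", str(gpu),
        "--k", str(k),
        "--save-dir", "log_AL/",
        "--query-strategy", strategy,
        "--init-percent", str(cfg["init_percent"]),
        "--known-class", str(cfg["known_class"]),
        "--query-batch", "400",
        "--seed", str(seed),
        "--model", model,
        "--dataset", dataset,
    ]


if __name__ == "__main__":
    # Print one command per dataset; nothing is executed here.
    for name in SETTINGS:
        print(shlex.join(build_command(name)))
```

With the defaults, build_command("cifar10") reproduces the example command shown earlier. The returned list can be passed to subprocess.run(cmd, check=True) to launch a run.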

4. Evaluation

To evaluate the performance of NEAT, we provide a set of Python plotting scripts.

Option
The following Jupyter notebook plots Accuracy, Precision, and Recall for all the active learning methods mentioned in our paper.

$ jupyter notebook plot_baseline_batch_size_400.ipynb

The following Python file plots Accuracy, Precision, and Recall for different numbers of neighbors.

$ python plot_different_k.py

The following Python file plots Accuracy, Precision, and Recall for different pre-trained models.

$ python plot_pretrained_model.py

The following Python file plots Accuracy, Precision, and Recall for all active learning methods with batch sizes of 600 and 800.

$ python plot_batch_600_and_800.py

The following Python file plots the t-SNE representations comparing NEAT and LFOSA.

$ python T-sne_plot_NEAT_and_LFOSA.py
