We study how well-structured iBOT's learned feature space is, using Linear Probing, K-Nearest Neighbors, K-Means, and Agglomerative Clustering.
## System Requirements

- Python 3.7.9
- CUDA 11.0
Install the required packages by running

```bash
pip install -r requirements.txt
```
Make sure to download the PASCAL VOC dataset and the models pretrained on ImageNet-22K.
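As a starting point, the segmentation split of PASCAL VOC can be fetched through torchvision; this is a hedged sketch on our side, and the path `data/voc` is a placeholder rather than a location the evaluation scripts expect.

```python
# Sketch: download PASCAL VOC 2012 segmentation data via torchvision.
# "data/voc" is a placeholder path, not one required by the scripts.
from torchvision.datasets import VOCSegmentation

voc = VOCSegmentation(root="data/voc", year="2012",
                      image_set="trainval", download=True)
print(f"{len(voc)} image/mask pairs downloaded")
```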
Each method can be evaluated by running its respective script together with the desired settings:

```bash
python eval_linear.py
python eval_knn.py
python eval_kmeans.py
python eval_agglomerative.py
```

For further details, either run a script with the `--help` flag or refer to our provided example bash scripts.
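The tables below compare four feature types per architecture: the output of an intermediate transformer block, and the query/key/value projections of its self-attention layer. As a rough sketch of how such features can be pulled out of a timm-style ViT (an assumption on our side; the iBOT repo ships its own model code, and the model name and block index here are arbitrary illustrative choices):

```python
# Sketch: extract intermediate and query/key/value features from a
# timm-style ViT with forward hooks. Model name and block index are
# illustrative assumptions, not values used by the evaluation scripts.
import torch
import timm

model = timm.create_model("vit_base_patch16_224", pretrained=False).eval()
feats = {}

def grab_qkv(module, inputs, output):
    # The fused qkv projection emits (B, N, 3 * dim): split it back up.
    q, k, v = output.chunk(3, dim=-1)
    feats.update(query=q, key=k, value=v)

block = model.blocks[11]  # last block of ViT-Base; an arbitrary choice
block.attn.qkv.register_forward_hook(grab_qkv)
block.register_forward_hook(lambda m, i, o: feats.update(intermediate=o))

with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))

print({name: tuple(t.shape) for name, t in feats.items()})
```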
## Linear Probing

| Arch | Intermediate 10% | Intermediate 50% | Intermediate 100% | Query 10% | Query 50% | Query 100% | Key 10% | Key 50% | Key 100% | Value 10% | Value 50% | Value 100% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ViT-Base | 0.520 | 0.623 | 0.654 | 0.409 | 0.562 | 0.603 | 0.289 | 0.293 | 0.364 | 0.408 | 0.527 | 0.577 |
| ViT-Large | 0.517 | 0.655 | 0.675 | 0.462 | 0.603 | 0.619 | 0.322 | 0.450 | 0.448 | 0.478 | 0.615 | 0.637 |
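`eval_linear.py` implements the actual protocol; as a minimal illustration of what linear probing measures, the sketch below fits a linear classifier on frozen features with scikit-learn. All arrays are random placeholders, and 21 classes corresponds to PASCAL VOC's 20 classes plus background.

```python
# Sketch: a linear probe is a linear classifier trained on frozen features.
# X_*/y_* are random placeholders standing in for extracted features.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(500, 768)), rng.integers(0, 21, 500)
X_val, y_val = rng.normal(size=(100, 768)), rng.integers(0, 21, 100)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("val accuracy:", probe.score(X_val, y_val))
```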
## K-Nearest Neighbor

| Arch | Intermediate 10% | Intermediate 50% | Intermediate 100% | Query 10% | Query 50% | Query 100% | Key 10% | Key 50% | Key 100% | Value 10% | Value 50% | Value 100% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ViT-Base | 0.460 | 0.528 | 0.544 | 0.357 | 0.450 | 0.474 | 0.296 | 0.389 | 0.398 | 0.432 | 0.488 | 0.502 |
| ViT-Large | 0.511 | 0.580 | 0.601 | 0.451 | 0.544 | 0.575 | 0.454 | 0.540 | 0.559 | 0.477 | 0.551 | 0.574 |
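Analogously, a k-NN probe classifies each validation feature by majority vote among its nearest training features; a minimal scikit-learn sketch follows (`n_neighbors=20` and the cosine metric are assumptions, not necessarily what `eval_knn.py` uses).

```python
# Sketch: k-NN probe on frozen features. k and the metric are assumptions.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(500, 768)), rng.integers(0, 21, 500)
X_val, y_val = rng.normal(size=(100, 768)), rng.integers(0, 21, 100)

knn = KNeighborsClassifier(n_neighbors=20, metric="cosine")
knn.fit(X_train, y_train)
print("val accuracy:", knn.score(X_val, y_val))
```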
## K-Means

| Arch | Intermediate 10% | Intermediate 50% | Intermediate 100% | Query 10% | Query 50% | Query 100% | Key 10% | Key 50% | Key 100% | Value 10% | Value 50% | Value 100% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ViT-Base | 0.440 | 0.477 | 0.480 | 0.324 | 0.332 | 0.341 | 0.358 | 0.381 | 0.395 | 0.415 | 0.453 | 0.450 |
| ViT-Large | 0.475 | 0.505 | 0.512 | 0.434 | 0.457 | 0.462 | 0.449 | 0.469 | 0.472 | 0.449 | 0.457 | 0.467 |
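Unlike the probes above, K-Means is fully unsupervised, so cluster IDs must first be matched to ground-truth classes before scoring. Below is a minimal sketch using Hungarian matching; this scoring scheme is an assumption, and `eval_kmeans.py` defines the actual metric.

```python
# Sketch: cluster frozen features with K-Means, then match cluster IDs to
# classes via Hungarian matching before computing accuracy.
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X, y = rng.normal(size=(600, 768)), rng.integers(0, 21, 600)

pred = KMeans(n_clusters=21, n_init=10, random_state=0).fit_predict(X)

# Contingency table: rows are clusters, columns are ground-truth classes.
table = np.zeros((21, 21), dtype=int)
for p, t in zip(pred, y):
    table[p, t] += 1
rows, cols = linear_sum_assignment(-table)  # negate to maximize agreement
print("matched accuracy:", table[rows, cols].sum() / len(y))
```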
## Agglomerative Clustering

| Arch | Intermediate 10% | Intermediate 50% | Intermediate 100% | Query 10% | Query 50% | Query 100% | Key 10% | Key 50% | Key 100% | Value 10% | Value 50% | Value 100% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ViT-Base | 0.483 | 0.483 | 0.509 | 0.357 | 0.339 | 0.357 | 0.391 | 0.380 | 0.395 | 0.426 | 0.447 | 0.454 |
| ViT-Large | 0.506 | 0.511 | 0.537 | 0.402 | 0.414 | 0.417 | 0.406 | 0.436 | 0.451 | 0.445 | 0.463 | 0.486 |
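Agglomerative clustering can be scored the same way; the sketch below mirrors the K-Means one, with Ward linkage as an assumed default rather than a setting taken from `eval_agglomerative.py`.

```python
# Sketch: agglomerative clustering of frozen features, scored with the
# same Hungarian matching as the K-Means sketch above.
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
X, y = rng.normal(size=(600, 768)), rng.integers(0, 21, 600)

pred = AgglomerativeClustering(n_clusters=21, linkage="ward").fit_predict(X)

table = np.zeros((21, 21), dtype=int)
for p, t in zip(pred, y):
    table[p, t] += 1
rows, cols = linear_sum_assignment(-table)
print("matched accuracy:", table[rows, cols].sum() / len(y))
```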