# Model Zoo

The current self-supervised learning benchmark results are based on MMSelfSup and solo-learn. We will rerun the experiments and update with more reliable results soon.

## Supported sample mixing policies

## ImageNet-1K pre-trained models

The training details are provided in the config files. Click a method's name for more information.

| Method | Config | Download |
|---|---|---|
| Relative Location | r50_8xb64_step_ep70 | model |
| Rotation Prediction | r50_8xb64_step_ep70 | model |
| DeepCluster | r50_sobel_8xb64_step_ep200 | model |
| NPID | r50_4xb64_step_ep200 | model |
| ODC | r50_8xb64_step_ep440 | model |
| SimCLR | r50_8xb64_cos_lr0_6_fp16_ep200 | model |
|  | r50_16xb256_cos_lr4_8_fp16_ep200 | model |
| MoCoV2 | r50_4xb64_cos | model |
| BYOL | r50_8xb64_accu8_cos_lr4_8_fp16_ep200 | model |
|  | r50_8xb64_accu8_cos_lr4_8_fp16_ep300 | model |
| SwAV | r50_8xb64_accu8_cos_lr9_6-mcrop-224_2-96_6_fp16_ep200 | model |
| DenseCL | r50_4xb64_cos | model |
| SimSiam | r50_4xb64_cos_lr0_05_ep100 | model |
|  | r50_4xb64_cos_lr0_05_ep200 | model |
| BarlowTwins | r50_8xb64_accu4_cos_lr1_6_ep300 | model |
| MoCoV3 | vit_small_8xb64_accu8_cos_fp16_ep300 | model |
| MAE | vit_base_dec8_dim512_8xb128_accu4_cos_fp16_ep400 | model |
| SimMIM | swin_base_sz192_8xb128_accu2_cos_ep100 | model |
|  | vit_base_rgb_m_sz224_8xb128_accu2_step_fp16_ep800 | model |
| MaskFeat | vit_base_hog_108_sz224_8xb128_accu2_cos_fp16_ep300 | model |
| CAE | vit_base_sz224_8xb64_accu4_cos_fp16_ep300 | model |
| A2MIM | r50_l3_sz224_init_8xb256_cos_ep100 | model |
|  | r50_l3_sz224_init_8xb256_cos_ep300 | model |
|  | vit_base_l0_sz224_8xb128_accu2_step_ep800 | model |
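Each config filename in the table encodes its training settings. The following sketch decodes a name into those settings; the naming convention (`<backbone>_<gpus>xb<batch>[_accu<k>]_<schedule>[_lr<lr>][_fp16]_ep<epochs>`) is inferred from the filenames themselves, not from an official specification, so treat the parser as illustrative:

```python
import re

def parse_config_name(name):
    """Decode a config filename from the table above into training settings.

    The naming convention is an assumption inferred from the listed
    filenames, e.g. r50_8xb64_cos_lr0_6_fp16_ep200 means a ResNet-50
    trained on 8 GPUs x 64 images, cosine schedule, lr 0.6, fp16, 200 epochs.
    """
    info = {"fp16": "fp16" in name}
    if m := re.search(r"(\d+)xb(\d+)", name):
        info["gpus"] = int(m.group(1))
        info["batch_per_gpu"] = int(m.group(2))
        info["total_batch"] = info["gpus"] * info["batch_per_gpu"]
    if m := re.search(r"lr(\d+(?:_\d+)?)", name):
        # Underscores stand in for decimal points, e.g. lr0_6 -> 0.6.
        info["base_lr"] = float(m.group(1).replace("_", "."))
    if m := re.search(r"ep(\d+)", name):
        info["epochs"] = int(m.group(1))
    if m := re.search(r"accu(\d+)", name):
        # Gradient accumulation over k steps scales the effective batch size.
        info["effective_batch"] = info.get("total_batch", 0) * int(m.group(1))
    return info
```

For example, `parse_config_name("r50_8xb64_accu8_cos_lr4_8_fp16_ep200")` reports 8 GPUs × 64 images with 8-step gradient accumulation, i.e. an effective batch size of 4096, which matches the BYOL batch size reported in the linear evaluation table below.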

## ImageNet-1K Linear Evaluation

**Note**

- If not specifically indicated, the testing GPUs are NVIDIA Tesla V100 on MMSelfSup and OpenMixup, and the pre-training and fine-tuning image size is $224\times 224$.
- The table records the implementors, who implemented the methods (either from scratch or by refactoring them from other repos), and the experimenters, who ran the experiments and reproduced the results. The experimenters are responsible for the evaluation results on all benchmarks, and the implementors for the implementation and its results; if no experimenter is indicated, the implementor is the experimenter by default.
- We use the config `r50_multihead` for ImageNet multi-head evaluation and `r50_linear` for evaluating the global average pooled features.
| Methods | Remarks | Batch size | Epochs | Protocol | Linear |
|---|---|---|---|---|---|
| PyTorch | torchvision | 256 | 90 | MoCo | 76.17 |
| Random | kaiming | - | - | MoCo | 4.35 |
| Relative-Loc | ResNet-50 | 512 | 70 | MoCo | 38.83 |
| Rotation | ResNet-50 | 128 | 70 | MoCo | 47.01 |
| DeepCluster | ResNet-50 | 512 | 200 | MoCo | 46.92 |
| NPID | ResNet-50 | 256 | 200 | MoCo | 56.60 |
| ODC | ResNet-50 | 512 | 440 | MoCo | 53.42 |
| SimCLR | ResNet-50 | 256 | 200 | SimSiam | 62.56 |
|  | ResNet-50 | 4096 | 200 | SimSiam | 66.66 |
| MoCoV1 | ResNet-50 | 256 | 200 | MoCo | 61.02 |
| MoCoV2 | ResNet-50 | 256 | 200 | MoCo | 67.69 |
| BYOL | ResNet-50 | 4096 | 200 | SimSiam | 71.88 |
|  | ResNet-50 | 4096 | 300 | SimSiam | 72.93 |
| SwAV | ResNet-50 | 512 | 200 | SimSiam | 70.47 |
| DenseCL | ResNet-50 | 256 | 200 | MoCo | 63.62 |
| SimSiam | ResNet-50 | 512 | 100 | SimSiam | 68.28 |
|  | ResNet-50 | 512 | 200 | SimSiam | 69.84 |
| BarlowTwins | ResNet-50 | 2048 | 300 | BarlowTwins | 71.66 |
| MoCoV3 | ViT-Small | 4096 | 400 | MoCoV3 | 73.19 |
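As a reference for what the linear protocol measures, here is a minimal numpy sketch of the evaluation step: frozen backbone features are global-average-pooled and fed to a single trainable linear head, and Top-1 accuracy is computed on the predictions. Shapes and variable names are illustrative only, not taken from any of the repositories:

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen backbone features for a batch: (N, C, H, W),
# e.g. the last-stage output of a ResNet-50.
feats = rng.standard_normal((8, 2048, 7, 7))

# Global average pooling collapses the spatial dimensions -> (N, C).
pooled = feats.mean(axis=(2, 3))

# The only trainable parameters in linear evaluation: one linear head
# mapping pooled features to the 1000 ImageNet classes.
W = rng.standard_normal((2048, 1000)) * 0.01
b = np.zeros(1000)
logits = pooled @ W + b

# Top-1 accuracy: fraction of samples whose argmax matches the label.
labels = rng.integers(0, 1000, size=8)
top1 = float((logits.argmax(axis=1) == labels).mean())
```

In the real benchmark only `W` and `b` are optimized (with the backbone kept frozen); the sketch shows the forward pass and the metric, not the training loop.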

## ImageNet-1K End-to-end Fine-tuning Evaluation

**Note**

- All compared methods adopt ResNet-50 or ViT-B architectures and are pre-trained on ImageNet-1K. The pre-training and fine-tuning image size is $224\times 224$, except for SimMIM with Swin-Base, which uses $192\times 192$. The fine-tuning protocols are RSB A3 and RSB A2 for ResNet-50 and BEiT for ViT-B.
- You can find the pre-training code of the compared methods in OpenMixup, VISSL, solo-learn, and the official repositories. Fine-tuned models can be downloaded from a2mim-in1k-weights or Baidu Cloud (3q5i).
| Methods | Backbone | Source | Batch size | PT epochs | FT protocol | FT top-1 |
|---|---|---|---|---|---|---|
| PyTorch | ResNet-50 | PyTorch | 256 | 90 | RSB A3 | 78.8 |
| Inpainting | ResNet-50 | OpenMixup | 512 | 70 | RSB A3 | 78.4 |
| Relative-Loc | ResNet-50 | OpenMixup | 512 | 70 | RSB A3 | 77.8 |
| Rotation | ResNet-50 | OpenMixup | 128 | 70 | RSB A3 | 77.7 |
| SimCLR | ResNet-50 | VISSL | 4096 | 100 | RSB A3 | 78.5 |
| MoCoV2 | ResNet-50 | OpenMixup | 256 | 100 | RSB A3 | 78.5 |
| BYOL | ResNet-50 | OpenMixup | 4096 | 100 | RSB A3 | 78.7 |
|  | ResNet-50 | Official | 4096 | 300 | RSB A3 | 78.9 |
|  | ResNet-50 | Official | 4096 | 300 | RSB A2 | 80.1 |
| SwAV | ResNet-50 | VISSL | 4096 | 100 | RSB A3 | 78.9 |
|  | ResNet-50 | Official | 4096 | 400 | RSB A3 | 79.0 |
|  | ResNet-50 | Official | 4096 | 400 | RSB A2 | 80.2 |
| BarlowTwins | ResNet-50 | solo-learn | 2048 | 100 | RSB A3 | 78.5 |
|  | ResNet-50 | Official | 2048 | 300 | RSB A3 | 78.8 |
| MoCoV3 | ResNet-50 | Official | 4096 | 100 | RSB A3 | 78.7 |
|  | ResNet-50 | Official | 4096 | 300 | RSB A3 | 79.0 |
|  | ResNet-50 | Official | 4096 | 300 | RSB A2 | 80.1 |
| A2MIM | ResNet-50 | OpenMixup | 2048 | 100 | RSB A3 | 78.8 |
|  | ResNet-50 | OpenMixup | 2048 | 300 | RSB A3 | 78.9 |
|  | ResNet-50 | OpenMixup | 2048 | 300 | RSB A2 | 80.4 |
| MAE | ViT-Base | OpenMixup | 4096 | 400 | BEiT (MAE) | 83.1 |
| SimMIM | Swin-Base | OpenMixup | 2048 | 100 | BEiT (SimMIM) | 82.9 |
|  | ViT-Base | OpenMixup | 2048 | 800 | BEiT (SimMIM) | 83.9 |
| CAE | ViT-Base | OpenMixup | 2048 | 300 | BEiT (CAE) | 83.2 |
| MaskFeat | ViT-Base | OpenMixup | 2048 | 300 | BEiT (MaskFeat) | 83.5 |
| A2MIM | ViT-Base | OpenMixup | 2048 | 800 | BEiT (SimMIM) | 84.3 |

## Downstream Task Benchmarks

### Places205 Linear Classification

**Note**

- In this benchmark, we use the config files `r50_mhead` and `r50_mhead_sobel`. For DeepCluster, use the corresponding config with `_sobel`.
- Places205 evaluates features of around 9k dimensions from different layers. The Top-1 accuracy of the last epoch is reported.

### ImageNet Semi-Supervised Classification

**Note**

- In this benchmark, the necks and heads are removed and only the backbone CNN is evaluated, with a linear classification head appended. All parameters are fine-tuned. We use the config files under `imagenet_per_1` for 1% of the data and `imagenet_per_10` for 10%.
- When training with 1% of ImageNet, we find that hyper-parameters, especially the learning rate, greatly influence performance. Hence, we prepare a list of settings with the base learning rate from {0.001, 0.01, 0.1} and the head learning rate multiplier from {1, 10, 100}, and choose the best-performing setting for each method. Please use `--deterministic` in this benchmark.
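The sweep described above can be enumerated as a simple grid; the snippet below is a sketch of that enumeration (the dictionary keys are illustrative names, only the numeric values come from the note):

```python
from itertools import product

# Base learning rates for the backbone and multipliers for the
# classification head, as listed in the note above.
base_lrs = [0.001, 0.01, 0.1]
head_mults = [1, 10, 100]

# 3 x 3 = 9 candidate settings; the best-performing one is reported
# per method.
settings = [
    {"base_lr": lr, "head_lr": lr * mult}
    for lr, mult in product(base_lrs, head_mults)
]
```

Decoupling the head's learning rate from the backbone's matters here because the randomly initialized head typically needs much larger updates than the pre-trained backbone when only 1% of the labels are available.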

### PASCAL VOC07+12 Object Detection

**Note**

- This benchmark follows the evaluation protocol set up by MoCo. Refer to model_zoo in MMSelfSup for results.
- Config: `benchmarks/detection/configs/pascal_voc_R_50_C4_24k_moco.yaml`.
- Please follow here to run the evaluation.

### COCO2017 Object Detection

**Note**

- This benchmark follows the evaluation protocol set up by MoCo. Refer to model_zoo in MMSelfSup for results.
- Config: `benchmarks/detection/configs/coco_R_50_C4_2x_moco.yaml`.
- Please follow here to run the evaluation.