Improving robustness against common corruptions by covariate shift adaptation

Steffen Schneider*, Evgenia Rusak*, Luisa Eck, Oliver Bringmann, Wieland Brendel, Matthias Bethge

This repository contains evaluation code for the paper "Improving robustness against common corruptions by covariate shift adaptation". We will release the code in the upcoming weeks. Watch and/or star this repository to get notified of updates!

Today's state-of-the-art machine vision models are vulnerable to image corruptions like blurring or compression artefacts, limiting their performance in many real-world applications. We here argue that popular benchmarks to measure model robustness against common corruptions (like ImageNet-C) underestimate model robustness in many (but not all) application scenarios. The key insight is that in many scenarios, multiple unlabeled examples of the corruptions are available and can be used for unsupervised online adaptation. Replacing the activation statistics estimated by batch normalization on the training set with the statistics of the corrupted images consistently improves the robustness across 25 different popular computer vision models. Using the corrected statistics, ResNet-50 reaches 62.2% mCE on ImageNet-C compared to 76.7% without adaptation. With the more robust AugMix model, we improve the state of the art from 56.5% mCE to 51.0% mCE. Even adapting to a single sample improves robustness for the ResNet-50 and AugMix models, and 32 samples are sufficient to improve the current state of the art for a ResNet-50 architecture. We argue that results with adapted statistics should be included whenever reporting scores in corruption benchmarks and other out-of-distribution generalization settings.
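To make the idea of replacing batch normalization statistics concrete, here is a minimal, framework-free sketch. It is not the code from this repository. The function names (`channel_stats`, `adapt_stats`, `batch_norm`) and the pseudo-count weighting between training and corrupted-image statistics are illustrative assumptions, and the sketch does not claim to reproduce how the "partial adapt" and "full adapt" numbers below were computed.

```python
from statistics import fmean, pvariance


def channel_stats(batch):
    """Per-channel mean and biased variance of a batch of feature vectors."""
    channels = list(zip(*batch))  # transpose: one tuple per channel
    means = [fmean(c) for c in channels]
    variances = [pvariance(c, mu) for c, mu in zip(channels, means)]
    return means, variances


def adapt_stats(source, target, n_source, n_target):
    """Blend (means, variances) of training and corrupted data.

    n_source is a hypothetical pseudo-count for the training statistics.
    With n_source = 0 the training statistics are replaced entirely.
    """
    w = n_target / (n_source + n_target)
    return [
        [(1 - w) * s + w * t for s, t in zip(src, tgt)]
        for src, tgt in zip(source, target)
    ]


def batch_norm(batch, means, variances, eps=1e-5):
    """Normalize each channel with the given statistics (affine part omitted)."""
    return [
        [(x - m) / (v + eps) ** 0.5 for x, m, v in zip(row, means, variances)]
        for row in batch
    ]


# Toy example with made-up activations: 3 corrupted samples, 2 channels.
train_stats = ([0.0, 0.0], [1.0, 1.0])
corrupted = [[2.0, -1.0], [4.0, -3.0], [3.0, -2.0]]
target_stats = channel_stats(corrupted)

# Full replacement of the training statistics.
full_means, full_vars = adapt_stats(train_stats, target_stats, 0, len(corrupted))
normalized = batch_norm(corrupted, full_means, full_vars)

# Equal weighting of training and corrupted statistics.
half_means, half_vars = adapt_stats(train_stats, target_stats, 3, len(corrupted))
```

In a real network the same recalculation would be applied to the running mean and variance of every batch normalization layer, while the learned weights stay untouched.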

Main results

Results for vanilla trained and robust models on ImageNet-C

With a simple recalculation of batch normalization statistics, we improve the mean Corruption Error (mCE) of all commonly tested robust models.

Model	mCE, w/o adapt [%] ↘	mCE, partial adapt [%] ↘	mCE, full adapt [%] ↘
Vanilla ResNet50	76.7	65.0	62.2
SIN	69.3	61.5	59.5
ANT	63.4	56.1	53.6
ANT+SIN	60.7	55.3	53.6
AugMix	65.3	55.4	51.0
AssembleNet	52.3	--	50.1
DeepAugment	60.4	52.3	49.4
DeepAugment+AugMix	53.6	48.4	45.4
DeepAug+AM+RNXt101	44.5	40.7	38.0
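
For readers unfamiliar with the metric, the following sketch shows how an mCE value can be computed, assuming the standard ImageNet-C definition: for each corruption type, the model's error summed over severity levels is divided by the summed error of a baseline model (AlexNet in the original benchmark), and the ratios are averaged over corruption types. The function names and all error rates here are hypothetical placeholders, not measured results.

```python
from statistics import fmean


def corruption_error(model_err, baseline_err):
    """CE for one corruption: summed model error / summed baseline error."""
    return sum(model_err) / sum(baseline_err)


def mean_corruption_error(model_errs, baseline_errs):
    """mCE in percent: CE averaged over all corruption types."""
    ces = [
        corruption_error(model_errs[name], baseline_errs[name])
        for name in model_errs
    ]
    return 100.0 * fmean(ces)


# Placeholder top-1 error rates for five severity levels per corruption.
model_errs = {
    "blur": [0.3, 0.4, 0.5, 0.6, 0.7],
    "jpeg": [0.2, 0.3, 0.4, 0.5, 0.6],
}
baseline_errs = {
    "blur": [0.6, 0.7, 0.8, 0.9, 1.0],
    "jpeg": [0.8, 0.8, 0.8, 0.8, 0.8],
}

mce = mean_corruption_error(model_errs, baseline_errs)
```

Because of the normalization by the baseline, mCE values can exceed 100%, and lower values are better.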

Results for models trained with Fixup and GroupNorm on ImageNet-C

Models trained with Fixup or GroupNorm (GN) perform better than non-adapted BatchNorm (BN) models but worse than adapted BN models.

Model	Fixup, mCE [%] ↘	GroupNorm, mCE [%] ↘	BatchNorm, mCE [%] ↘	BatchNorm+adapt, mCE [%] ↘
ResNet-50	72.0	72.4	76.7	62.2
ResNet-101	68.2	67.6	69.0	59.1
ResNet-152	67.6	65.4	69.3	58.0

News

The paper was accepted for poster presentation at NeurIPS 2020.
A shorter workshop version of our paper was accepted for oral presentation at the Uncertainty & Robustness in Deep Learning Workshop at ICML '20.
