Shapley values are a theoretically grounded model explanation approach, but their exponential computational cost makes them difficult to use with large deep learning models. This package implements ViT-Shapley, an approach that makes Shapley values practical for vision transformer (ViT) models. The key idea is to learn an amortized explainer model that generates explanations in a single forward pass.
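To make the exponential cost concrete, here is a minimal sketch of exact Shapley value computation by brute-force enumeration. It is not part of this package: `exact_shapley` and `toy_value` are hypothetical names, and the toy game is an assumed stand-in for a model's output on an image with only some patches visible. Each player requires a pass over all subsets of the remaining players, so the number of value function calls grows as 2^n. For a ViT with a 14 x 14 grid of patches (196 players), this enumeration is out of reach.

```python
from itertools import combinations
from math import factorial


def exact_shapley(num_players, value_fn):
    """Exact Shapley values by enumerating every coalition (exponential in num_players)."""
    players = range(num_players)
    values = [0.0] * num_players
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = (
                factorial(size)
                * factorial(num_players - size - 1)
                / factorial(num_players)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of player i when added to coalition s
                values[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return values


# Hypothetical toy game: each "patch" has a fixed contribution,
# and patches 0 and 1 earn a bonus when both are present.
def toy_value(subset):
    contributions = {0: 1.0, 1: 2.0, 2: 0.5, 3: 0.0}
    bonus = 1.0 if {0, 1} <= subset else 0.0
    return sum(contributions[p] for p in subset) + bonus


phi = exact_shapley(4, toy_value)
print(phi)  # the interaction bonus is split evenly between patches 0 and 1
```

The resulting values sum to the difference between the value of the full set and the value of the empty set, which is the efficiency property of Shapley values.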
The high-level workflow for using ViT-Shapley is the following:
- Obtain your initial ViT model
- If your model was not trained to accommodate held-out image patches, fine-tune it with random masking
- Train an explainer model using ViT-Shapley's custom loss function (often by fine-tuning parameters of the original ViT)
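The random masking step can be pictured with a small sketch. This is an illustration only, not the repository's implementation: `sample_patch_mask` and `apply_mask` are hypothetical helpers, the choice to draw the number of kept patches uniformly is an assumption, and replacing held-out patches with zeros is just one possible way to remove them (the actual code may handle held-out patches differently, see the files under scripts).

```python
import random


def sample_patch_mask(num_patches, rng):
    """Sample a binary mask: 1 keeps a patch, 0 holds it out.

    Assumption: the number of kept patches is drawn uniformly from
    0..num_patches, then that many patches are chosen at random.
    """
    num_kept = rng.randint(0, num_patches)
    kept = set(rng.sample(range(num_patches), num_kept))
    return [1 if p in kept else 0 for p in range(num_patches)]


def apply_mask(patches, mask, fill=0.0):
    """Replace held-out patches with a fill value (zero is an assumed baseline)."""
    return [
        list(patch) if keep else [fill] * len(patch)
        for patch, keep in zip(patches, mask)
    ]


rng = random.Random(0)
# Four toy "patches" with three values each, standing in for image patches
patches = [[float(p)] * 3 for p in range(1, 5)]
mask = sample_patch_mask(len(patches), rng)
masked = apply_mask(patches, mask)
print(mask, masked)
```

During fine-tuning, a fresh mask would be sampled for each training example so that the model learns to make reasonable predictions from any subset of patches.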
Please see our paper for more details, as well as the work that ViT-Shapley builds on (KernelSHAP, FastSHAP).
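For readers new to this line of work, the sketch below illustrates the weighted least squares view of Shapley values that KernelSHAP relies on: among attribution vectors that sum to the difference between the full-set and empty-set values, the Shapley values minimize a kernel-weighted squared error over coalitions. This is not the loss implemented in this package (see the paper for that). The function names and the toy game are hypothetical, and the Shapley values of the toy game, `[1.5, 2.5, 0.5, 0.0]`, are hard-coded.

```python
from itertools import combinations
from math import comb


def shapley_kernel_weight(n, size):
    """Shapley kernel weight for a coalition of the given size (0 < size < n)."""
    return (n - 1) / (comb(n, size) * size * (n - size))


def weighted_ls_objective(phi, value_fn, n):
    """Kernel-weighted squared error of the additive approximation
    value_fn(empty) + sum(phi[i] for i in S), over all proper non-empty coalitions S."""
    base = value_fn(frozenset())
    total = 0.0
    for size in range(1, n):
        weight = shapley_kernel_weight(n, size)
        for subset in combinations(range(n), size):
            s = frozenset(subset)
            approx = base + sum(phi[i] for i in s)
            total += weight * (value_fn(s) - approx) ** 2
    return total


# Hypothetical toy game: fixed per-patch contributions plus a bonus
# when patches 0 and 1 are both present.
def toy_value(subset):
    contributions = {0: 1.0, 1: 2.0, 2: 0.5, 3: 0.0}
    bonus = 1.0 if {0, 1} <= subset else 0.0
    return sum(contributions[p] for p in subset) + bonus


shapley = [1.5, 2.5, 0.5, 0.0]    # Shapley values of toy_value
perturbed = [1.7, 2.5, 0.3, 0.0]  # same sum, so it still satisfies efficiency

loss_shapley = weighted_ls_objective(shapley, toy_value, 4)
loss_perturbed = weighted_ls_objective(perturbed, toy_value, 4)
print(loss_shapley, loss_perturbed)  # the Shapley values give the smaller objective
```

Amortized approaches in this family train a network to output attributions that minimize an objective of this kind across many inputs, rather than solving the problem separately for each input.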
git clone https://github.com/chanwkimlab/vit-shapley.git
cd vit-shapley
pip install -r requirements.txt
Commands for training and testing the models are available in the files under the scripts directory.
- scripts/training_classifier.md
- scripts/training_surrogate.md
- scripts/training_explainer.md
- scripts/training_classifier_masked.md
- Run notebooks/2_1_benchmarking.ipynb to obtain results.
- Run notebooks/2_2_ROAR.ipynb to run retraining-based ROAR benchmarking.
- Run notebooks/3_plotting.ipynb to plot the results.
Pretrained model weights for vit-base models are available here.
You can try out ViT-Shapley using Colab.
If you use any part of this code or the pretrained weights in your own work, please cite our paper.
- Ian Covert (Paul G. Allen School of Computer Science and Engineering @ University of Washington)
- Chanwoo Kim (Paul G. Allen School of Computer Science and Engineering @ University of Washington)
- Su-In Lee (Paul G. Allen School of Computer Science and Engineering @ University of Washington)