This repository contains code for:
Shared Interest: Measuring Human-AI Alignment to Identify Recurring Patterns in Model Behavior
Authors: Angie Boggust, Benjamin Hoover, Arvind Satyanarayan, and Hendrik Strobelt
Shared Interest is a method to quantify model behavior by comparing human and model decision making. In Shared Interest, human decision making is approximated via ground truth annotations, and model decision making is approximated via saliency. By quantifying every instance in a dataset, Shared Interest enables large-scale analysis of model behavior.
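As a rough illustration of the idea (not the package's implementation), one way to compare a ground truth annotation with a saliency map is an intersection-over-union style score between the annotated region and the most salient pixels. The function name `iou_coverage` and the 0.5 threshold below are illustrative assumptions.

```python
import numpy as np

def iou_coverage(ground_truth_mask, saliency_map, threshold=0.5):
    """Illustrative IoU-style alignment score between a binary ground truth
    mask and a saliency map (both HxW arrays). Not the package's API."""
    # Binarize the saliency map: keep only the most salient pixels.
    saliency_mask = saliency_map >= threshold
    ground_truth_mask = ground_truth_mask.astype(bool)

    intersection = np.logical_and(ground_truth_mask, saliency_mask).sum()
    union = np.logical_or(ground_truth_mask, saliency_mask).sum()
    return intersection / union if union > 0 else 0.0
```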
Install the package locally for use in other development projects. It can then be referenced as shared_interest both within this package and in other projects.
cd shared-interest
pip install -e git+https://github.com/mitvis/shared-interest.git#egg=shared_interest
Shared Interest relies on saliency methods to compute model behavior. The examples in this repo rely on the interpretability_methods repo. If you plan to run the example notebook as is, install interpretability_methods; otherwise, you can skip this step.
pip install git+https://github.com/aboggust/interpretability-methods.git
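The interpretability_methods API is not documented here. As a library-agnostic sketch of what a saliency method produces, the snippet below computes a vanilla-gradient saliency map for a single image with plain PyTorch; the model and input are placeholders.

```python
import torch
import torchvision.models as models

# Placeholder model and input; in practice, use your own model and data.
# Weights are omitted here since this is only a sketch.
model = models.resnet18().eval()
image = torch.rand(1, 3, 224, 224, requires_grad=True)

# Vanilla-gradient saliency: gradient of the top class score w.r.t. the input.
output = model(image)
score = output.max(dim=1).values
score.backward()
saliency_map = image.grad.abs().max(dim=1).values.squeeze(0)  # HxW saliency map
```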
Requirements are listed in requirements.txt. Install via:
pip install -r requirements.txt
See the notebook for example usage!
The ImageNet file structure is incompatible with PyTorch's ImageFolder dataset. To convert the ImageNet file structure, see imagenet_download_util/.
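For context, ImageFolder expects one subdirectory per class. The snippet below sketches the expected layout and how a restructured ImageNet split might be loaded with torchvision; the paths and transform settings are placeholder assumptions.

```python
from torchvision import datasets, transforms

# ImageFolder expects: <root>/<class_name>/<image>.JPEG
# e.g., imagenet/val/n01440764/ILSVRC2012_val_00000293.JPEG  (placeholder path)
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
dataset = datasets.ImageFolder("imagenet/val", transform=transform)
```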