GenderBias_CheXNet

This tutorial contains all the steps and instructions you need to reproduce the experiments performed in "Gender Imbalance in Medical Imaging Datasets Produces Biased Classifiers for Computer-aided Diagnosis" by Agostina Larrazabal, Nicolás Nieto, Victoria Peterson, Diego H. Milone, and Enzo Ferrante, Proceedings of the National Academy of Sciences, May 2020.

https://www.pnas.org/content/early/2020/05/19/1919012117

This code is based on the following publicly available implementation of CheXNet using Keras: https://github.com/brucechou1983/CheXNet-Keras

Step 0: If this is your first time coding in Python 3, you will have to install it. We recommend installing the Anaconda Distribution:

You can find straightforward instructions in the following tutorial (up to Step 8):

https://www.digitalocean.com/community/tutorials/how-to-install-anaconda-on-ubuntu-18-04-quickstart

We use conda 4.7.12.

Step 1 - Download the GenderBias_CheXNet repository:

In this repository you will find all the scripts needed to reproduce our experiments.

Step 2 - Download the X-ray images (If you already have the dataset skip this step):

  • Open a Terminal

  • Set the terminal path to the unzipped GenderBias_CheXNet folder

(base)>> python batch_download_zips.py

This may take a while.
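For reference, the download script essentially fetches and saves a list of archive files. The sketch below shows the general idea only; the URLs are placeholders, and the real download links live inside batch_download_zips.py.

import urllib.request

# Placeholder URLs: the real download links are listed in batch_download_zips.py.
links = [
    "https://nihcc.app.box.com/placeholder/images_001.tar.gz",  # hypothetical
    "https://nihcc.app.box.com/placeholder/images_002.tar.gz",  # hypothetical
]

for idx, link in enumerate(links, start=1):
    fname = "images_%03d.tar.gz" % idx
    print("Downloading", fname, "...")
    urllib.request.urlretrieve(link, fname)  # saves the archive in the current folder

print("Download complete.")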

If you prefer to download the data on your own, you can find all the files here:

https://nihcc.app.box.com/v/ChestXray-NIHCC/folder/37178474737

Step 3 - Create a Python environment:

1- Open a Terminal in the repository's path.

2- Run the following command:

(base)>> conda env create --name your_env_name --file requirements.txt

Some packages cannot be installed by conda, so we have to install them with pip inside your environment:

(base)>> source activate your_env_name

(your_env_name)>> pip install pillow==4.2.0

(your_env_name)>> pip install opencv-python==4.1.0.25

(your_env_name)>> pip install imgaug==0.2.9
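To confirm that these pip packages landed in the right environment, you can print their versions from Python (a small sanity check, not part of the repository):

import PIL
import cv2
import imgaug

# Pillow 4.x exposes PILLOW_VERSION; newer releases use __version__.
print("pillow:", getattr(PIL, "__version__", getattr(PIL, "PILLOW_VERSION", "?")))
print("opencv-python:", cv2.__version__)
print("imgaug:", imgaug.__version__)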

Step 4 - Check CUDA version compatibility:

Check your system's CUDA version:

(your_env_name)>> nvcc --version

Update your environment's CUDA version to match:

(your_env_name)>> conda install cudatoolkit==your_cuda_version
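After installing a matching cudatoolkit, it is worth confirming that the GPU is visible from Python. Assuming the environment uses a TensorFlow backend for Keras (as in the CheXNet-Keras implementation this code is based on), a quick check is:

from tensorflow.python.client import device_lib

# Lists the CPU and GPU devices visible to TensorFlow; a GPU entry confirms
# that the installed cudatoolkit is compatible with your driver.
for device in device_lib.list_local_devices():
    print(device.name, device.device_type)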

Step 5 - Activate the environment with the following command:

(base)>> source activate your_env_name

You will see your environment name in the command line:

(your_env_name)>>

Step 6 - Training the network:

First, make sure that in "config_file.ini" the image_source_dir entry contains the path where you have downloaded the dataset.
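If you want to double-check that value from Python, a small configparser snippet will do (the section name below is an assumption; check config_file.ini for its actual layout):

from configparser import ConfigParser

cp = ConfigParser()
cp.read("config_file.ini")
# "DEFAULT" is an assumed section name; adjust it to match config_file.ini.
print("image_source_dir =", cp["DEFAULT"]["image_source_dir"])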

Run the training script with the following command:

(your_env_name)>> python3 training.py

When the training process finishes, you will find an "/output" folder containing the trained weights of the network.

Step 7 - Testing the network:

Now that you have your model trained, it is time to generate predictions on unseen data.

Run the testing script with the following command:

(your_env_name)>> python3 testing.py

When the testing is over, you will find the network predictions in the "/output" folder.

As an example, for fold 0, training with only male images and testing on the female set, you will find:

y_pred_run_0_train_0%_female_images_test_female.csv
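Each of these CSV files can then be scored offline. As an illustrative sketch (the column names below are assumptions; inspect the CSV header for the real ones), a per-file AUC can be computed with pandas and scikit-learn:

import pandas as pd
from sklearn.metrics import roc_auc_score

# Hypothetical column names: adjust "y_true" and "y_pred" to the real header.
df = pd.read_csv("output/y_pred_run_0_train_0%_female_images_test_female.csv")
print("AUC:", roc_auc_score(df["y_true"], df["y_pred"]))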

Results

In this section we include results for our analysis using three different CNN architectures and two datasets of X-ray images.

Experimental results for DenseNet, ResNet, and InceptionV3 classifiers trained with images from the NIH dataset and the CheXpert dataset. The boxplots aggregate the results for 20 folds, training with male (blue) and female (orange) patients. Each model is evaluated on male-only and female-only test folds. A consistent decrease in terms of area under the receiver operating characteristic curve (AUC) is observed when using male patients for training and female for testing (and vice versa). Statistical significance according to the Mann–Whitney U test is denoted by **** (p ≤ 0.00001), *** (0.00001 < p ≤ 0.0001), ** (0.0001 < p ≤ 0.001), * (0.001 < p ≤ 0.01), and ns (p > 0.01).
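For reference, the significance markers follow the thresholds stated in the caption. An illustrative sketch of the comparison (not the repository's analysis script), using scipy on two hypothetical sets of per-fold AUC values:

from scipy.stats import mannwhitneyu

def significance_marker(p):
    # Map a p-value to the markers used in the figures.
    if p <= 0.00001:
        return "****"
    if p <= 0.0001:
        return "***"
    if p <= 0.001:
        return "**"
    if p <= 0.01:
        return "*"
    return "ns"

# Hypothetical per-fold AUC values for the two training conditions.
auc_trained_on_male = [0.80, 0.81, 0.79, 0.82, 0.78]    # placeholder values
auc_trained_on_female = [0.84, 0.85, 0.83, 0.86, 0.84]  # placeholder values

stat, p = mannwhitneyu(auc_trained_on_male, auc_trained_on_female, alternative="two-sided")
print("U =", stat, "p =", p, significance_marker(p))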
