- Background: Diabetic retinopathy (DR) is an eye disease that can lead to blindness and vision loss in people who have diabetes. In its early stages, diabetic retinopathy often shows no symptoms. However, as it progresses, it can result in a gradual decline in visual sharpness, potentially leading to complete blindness. According to the World Health Organization (WHO), 4.8% of the 37 million cases of blindness worldwide are attributed to diabetic retinopathy. This percentage is constantly rising, underscoring the critical need for timely and accurate diagnosis of diabetic retinopathy. Early detection is essential in preventing the progression of the disease and reducing the risk of severe vision loss, making it a significant public health priority.
- Data Description: The dataset used in this project is the EyePACS dataset, sourced from the Diabetic Retinopathy Detection Challenge on Kaggle. It consists of 35,126 high-resolution retinal images.
- Data Preprocessing: We processed the original images in two different ways. Both crop each image to the retina and resize it to 512x512.
- Overview of Models: In this project, we employed several neural network architectures to address the challenge of DR grading. Their implementations can be found below:
- Simple CNN: Custom CNN for baseline comparisons. (View Notebook)
- EfficientNet: Pretrained EfficientNet model, weights from ImageNet. (View Notebook)
- Inception v3: Implementation of Inception v3, weights from ImageNet. (View Notebook)
- ResNet: Application of ResNet model, weights from ImageNet. (View Notebook)
- Vision Transformer (ViT): Utilizing ViT for image classification, weights from ImageNet. (View Notebook)
- BiraNet: Modified BiRA-Net with an EfficientNetB3 backbone. Source: (View Notebook)
- Siamese Network: Combine left and right eye information. (View Notebook)
- Visualization: Grad-CAM heatmaps of our final model for multiple examples of retinas with proliferative DR. (View Notebook)
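The preprocessing step listed above (cropping to the retina and resizing to 512x512) can be illustrated with a minimal NumPy sketch. This is not the code in `preprocessing_1.py` or `preprocessing_2.py`: the function names, the brightness threshold, the square padding used to preserve the aspect ratio, and the nearest-neighbour resize are all assumptions made for illustration.

```python
import numpy as np


def crop_retina(image: np.ndarray, threshold: int = 10) -> np.ndarray:
    """Crop an HxWx3 image to the bounding box of pixels brighter than `threshold`.

    Assumption: the background around the retina is close to black.
    """
    mask = image.max(axis=2) > threshold
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:  # fully dark image: nothing to crop
        return image
    return image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


def pad_to_square(image: np.ndarray) -> np.ndarray:
    """Pad with black so the later resize does not distort the aspect ratio."""
    h, w = image.shape[:2]
    side = max(h, w)
    top, left = (side - h) // 2, (side - w) // 2
    out = np.zeros((side, side, image.shape[2]), dtype=image.dtype)
    out[top:top + h, left:left + w] = image
    return out


def resize_nearest(image: np.ndarray, size: int = 512) -> np.ndarray:
    """Nearest-neighbour resize to size x size (a dependency-free stand-in)."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return image[rows[:, None], cols]


# Synthetic example: a bright 400x500 region on a black 600x900 canvas.
image = np.zeros((600, 900, 3), dtype=np.uint8)
image[100:500, 200:700] = 200

cropped = crop_retina(image)
squared = pad_to_square(cropped)
resized = resize_nearest(squared, size=512)
print(cropped.shape, squared.shape, resized.shape)
```

In practice an image library with proper interpolation would likely replace `resize_nearest`; the sketch only shows the order of operations.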
We propose a Siamese-like network for DR grading that processes pairs of images and combines information from the left and right eyes. The method rests on the idea that combining characteristics from both eyes can improve DR grading, since DR-related alterations frequently occur in both eyes, although to differing degrees. The network has two principal components: the feature extraction branches and the classification head. Feature extraction uses two branches of the best-performing model (EfficientNetB3). These branches are identical in structure but operate independently, processing the left and right eye images separately. The classification head is a custom-designed neural network built on top of the feature extraction branches. The Siamese network is trained to concatenate the features from both eyes using learnable weights. The architecture is shown in the figure below:
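As a rough illustration of the two-branch design described above, here is a minimal PyTorch sketch. It is not the implementation in `siamese_net.py`: the small convolutional `make_branch` stands in for the pretrained EfficientNetB3 backbone, and the per-eye scalar weights, layer sizes, and single 5-way output are assumptions chosen to keep the example short.

```python
import torch
from torch import nn


def make_branch(feature_dim: int) -> nn.Module:
    """Tiny stand-in for a pretrained backbone such as EfficientNetB3 (assumption)."""
    return nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(16, feature_dim),
    )


class SiameseLikeNet(nn.Module):
    def __init__(self, feature_dim: int = 64, num_classes: int = 5):
        super().__init__()
        # Two branches with identical structure but separate parameters.
        self.left_branch = make_branch(feature_dim)
        self.right_branch = make_branch(feature_dim)
        # Assumption: one learnable scalar per eye scales its features
        # before concatenation.
        self.eye_weights = nn.Parameter(torch.ones(2))
        # Classification head on top of the concatenated features.
        self.head = nn.Sequential(
            nn.Linear(2 * feature_dim, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, left: torch.Tensor, right: torch.Tensor) -> torch.Tensor:
        f_left = self.eye_weights[0] * self.left_branch(left)
        f_right = self.eye_weights[1] * self.right_branch(right)
        return self.head(torch.cat([f_left, f_right], dim=1))


model = SiameseLikeNet()
left = torch.randn(4, 3, 64, 64)   # batch of left-eye images
right = torch.randn(4, 3, 64, 64)  # matching right-eye images
logits = model(left, right)
print(logits.shape)
```

The sketch returns one set of grade logits per image pair; how the actual model assigns predictions to each eye is described in the project report rather than here.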
- Clone the repository:
git clone https://github.com/Stefanstud/CS502-diabetic-retinopathy-detection.git
cd CS502-diabetic-retinopathy-detection
- Create an environment using Python 3.8.18
conda create --name dr_grading python==3.8.18
- Activate the environment
conda activate dr_grading
- Install the required packages:
pip install -r requirements.txt
- Section 1 of each notebook contains code for downloading the data used in this project. There are two directories, images and images_keep_ar. Either can be used, but the second one, images_keep_ar, yielded better results. The pre-processed test data is also included in this folder.
- After downloading the data, the folder should be organized in the following way:
├── data
│ ├── images
│ ├── images_keep_ar
│ ├── labels
│ │ └── trainLabels.csv
│ └── test
├── notebooks
│ ├── bira_net.ipynb
│ ├── efficient_net.ipynb
│ ├── inception_v3.ipynb
│ ├── resnet.ipynb
│ ├── siamese_net.ipynb
│ ├── simple_cnn.ipynb
│ └── vit.ipynb
├── requirements.txt
├── results
│ ├── figures
│ └── models
├── src
│ ├── loading.py
│ ├── models
│ │ ├── bira_net.py
│ │ ├── siamese_net.py
│ │ └── simple_cnn.py
│ ├── preprocessing
│ │ ├── preprocessing_1.py
│ │ └── preprocessing_2.py
│ ├── train.py
│ └── utils.py
└── submission.csv
To reproduce the best performing model, follow the detailed steps outlined in the project report. We provide three model files for flexibility and further research:
- eff_net_400x400.pt: An EfficientNet model trained through multiple steps, achieving a 0.733 score on the private Kaggle test set.
- siamese_net_400x400_2.pt: Our best performing model in terms of Quadratic Weighted Kappa, with a score of 0.764 on the private Kaggle test set.
- siamese_net_400x400_3.pt: A model that utilizes penalty weights. While it has a slightly lower kappa score, it demonstrates a better confusion matrix.
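For readers unfamiliar with the Quadratic Weighted Kappa metric used to score the models above, the following is a minimal pure-Python sketch of the standard definition. The function name and the default of five grades (0 to 4, as in the Kaggle challenge labels) are assumptions for illustration; this is not the evaluation code from the repository.

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes=5):
    """Quadratic weighted kappa between two lists of integer grades.

    Note: undefined (division by zero) if all labels and all predictions
    fall into the same single class.
    """
    n = len(y_true)
    # Observed confusion matrix: rows are true grades, columns are predictions.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(num_classes))
                 for j in range(num_classes)]

    numerator = 0.0
    denominator = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Quadratic penalty grows with the distance between grades.
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            numerator += weight * observed[i][j]
            # Expected count if labels and predictions were independent.
            denominator += weight * hist_true[i] * hist_pred[j] / n
    return 1.0 - numerator / denominator


perfect = quadratic_weighted_kappa([0, 1, 2, 3, 4], [0, 1, 2, 3, 4])
imperfect = quadratic_weighted_kappa([0, 1, 2, 3, 4], [0, 1, 2, 4, 3])
print(perfect, imperfect)
```

Perfect agreement gives a score of 1, and disagreements between distant grades are penalized more heavily than disagreements between adjacent grades.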
For quick reproduction, you may use these pre-trained models. Alternatively, for potential customization, you can train the model from scratch following the guidelines in Table II of the report.
We also include checkpoints in the repository intended for those who wish to continue refining and improving the model. These checkpoints serve as a starting point for further training, allowing you to build upon the existing work without starting from the beginning.
To generate the best submission file, run the siamese_net.ipynb notebook provided in this repository, specifically sections 0, 1, 2 and 4, which use our best-performing model without retraining it. The notebook is set up to work with the provided model files.