EXTS Capstone Project

Eric Scuccimarra ([email protected])

2018-05-04


This project was far too complex and wide-ranging to fit into a single notebook. The files included in this repository are listed below:

Overview

  • Report.md - describes the steps taken during this project and summarizes the results

Exploratory Data Analysis

  • Wisconsin (UCI) EDA.ipynb - exploratory data analysis on the Wisconsin Breast Cancer data from the UCI Machine Learning Repository
  • SVM.ipynb, kNN.ipynb, Decision Trees.ipynb, Multilayer neural networks.ipynb - standard machine learning techniques applied to the Wisconsin Breast Cancer data
  • UCI Results.ipynb - consolidated results of the above notebooks
  • MIAS Exploratory Data Analysis.ipynb - exploratory data analysis of the MIAS data and images

DDSM Data Preprocessing

  • overview_of_image_processing.md - overview of the challenges posed by and steps taken to turn the CBIS-DDSM and DDSM data into usable images
  • /Decompressing-For-LJPEG-image/ - code from GitHub used to convert the DDSM LJPEG files into PNGs
  • crop_cbis_images_x.ipynb - code used to extract the ROIs from the CBIS-DDSM data to create dataset x
  • crop_normal_images_x.ipynb - code used to create training data from the normal DDSM images for dataset x
  • crops_mias_images_x.ipynb - code used to create a supplementary test dataset from the MIAS data for dataset x
  • review_images_x.ipynb - reviews a random sample of the training images for dataset x to identify potential problems
  • write_to_tfrecords_x.ipynb - combines the images from the CBIS-DDSM and DDSM datasets and writes the TFRecords files used for training
  • mammo_utils.py - various functions shared across notebooks

DDSM ConvNet Training

  • training_utils.py - various functions used in creating and training convnets
  • candidate_1.0.0.x.py - Python scripts used to create and train various candidate models. Only the models referenced in the report are included here.
  • vgg_16.3.py - a customized version of VGG that was evaluated
  • inception_v4.05.py - a model based on Inception, but significantly scaled down
  • inception_utils.py - functions used to create our Inception clone

DDSM Training Results

  • /logs/ - TensorBoard logs for selected training runs of selected models
  • model_notes.xlsx - notes on the training runs of the various models, including hyperparameters and results
  • ddsm_results.csv - consolidated results for selected training runs
  • ddsm_results.ipynb - ddsm_results.csv imported into a notebook and sorted
  • convnet_training_metrics.ipynb - training and validation metrics for selected training runs. The metrics used to generate this notebook are saved as .npy files in /data/results/
  • convnet_1.0.0.35b.ipynb - notebook containing the code used to create and train our best model. Includes evaluation of the trained model on the test and MIAS datasets. Note that the model checkpoints are too large to upload to GitHub, so they must be downloaded separately. URLs are provided below.
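
For readers who want to inspect a consolidated results file outside a notebook, the snippet below is a minimal sketch of reading a CSV and sorting it by a numeric column using only the Python standard library. The column names (`model`, `accuracy`, `recall`), the helper `load_sorted_results`, and the sample rows are hypothetical placeholders; the actual layout of ddsm_results.csv is not described in this file and may differ.

```python
import csv
import io


def load_sorted_results(csv_file, sort_key, descending=True):
    """Read rows from an open CSV file and sort them by a numeric column.

    `sort_key` is a hypothetical column name; substitute whichever
    column the real results file uses.
    """
    rows = list(csv.DictReader(csv_file))
    # csv reads every value as a string, so convert before comparing.
    return sorted(rows, key=lambda row: float(row[sort_key]), reverse=descending)


# Placeholder rows for illustration only; these are NOT results from this project.
sample_csv = io.StringIO(
    "model,accuracy,recall\n"
    "model_a,0.80,0.70\n"
    "model_b,0.90,0.60\n"
    "model_c,0.85,0.75\n"
)

for row in load_sorted_results(sample_csv, "accuracy"):
    print(row["model"], row["accuracy"], row["recall"])
```

To use this on a real file, pass an open file handle instead of the in-memory sample, for example `open("ddsm_results.csv", newline="")`, and choose a sort column that exists in that file.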

Additional Resources