Generating Brain Tumor MRI images for Data Augmentation using Generative Adversarial Networks
A brain tumor is a collection, or mass, of abnormal cells in your brain. Your skull, which encloses your brain, is very rigid. Any growth inside such a restricted space can cause problems. Brain tumors can be cancerous (malignant) or noncancerous (benign). When benign or malignant tumors grow, they can cause the pressure inside your skull to increase. This can cause brain damage, and it can be life-threatening. Brain tumors are categorized as primary or secondary:
- A primary brain tumor originates in your brain. Many primary brain tumors are benign.
- A secondary brain tumor, also known as a metastatic brain tumor, occurs when cancer cells spread to your brain from another organ, such as lung or breast.
- An MRI uses magnetic fields to produce detailed images of the body. MRI can be used to measure the tumor’s size.
- A special dye called a contrast medium is given before the scan to create a clearer picture. This dye can be injected into a patient’s vein or given as a pill or liquid to swallow.
- MRIs create more detailed pictures than CT scans and are the preferred way to diagnose a brain tumor.
- The MRI may be of the brain, spinal cord, or both, depending on the type of tumor suspected and the likelihood that it will spread in the central nervous system (CNS).
- There are different types of MRI. The results of a neurological examination, done by an internist or neurologist, help determine which type of MRI to use.
- In India, 40,000 to 50,000 patients are diagnosed with a brain tumor every year, and 20 percent of them are children.
- At the country's current population of 1.417 billion, this means only about 0.0035 percent of people are diagnosed with a brain tumor each year.
- Let's assume that all MRI scans produce 100% accurate results and that the people scanned are representative of the whole population. Then for every 1,000,000 MRI scans, we would get only about 35 samples showing a brain tumor, versus many more that don't.
- This, combined with other problems in accessing medical data, leads to machine learning problems such as class imbalance and bias.
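The prevalence arithmetic can be checked with a few lines of Python. This is only a rough sketch: it uses the upper end of the 40,000 to 50,000 range and the 1.417 billion population figure quoted above, and it assumes scans are drawn at random from the whole population, which real clinical data would not be. The function name `expected_positives` is made up for illustration.

```python
# Figures quoted in the text; the upper end of the 40,000-50,000 range is used.
diagnoses_per_year = 50_000
population = 1_417_000_000

rate = diagnoses_per_year / population   # fraction of the population diagnosed per year
percent = rate * 100                     # the same value expressed as a percentage


def expected_positives(n_scans, rate=rate):
    """Expected tumor-positive scans, ASSUMING scans are a random population sample."""
    return n_scans * rate


print(f"{percent:.4f} percent of the population")
print(f"{expected_positives(10_000):.2f} expected positives per 10,000 scans")
print(f"{expected_positives(1_000_000):.1f} expected positives per 1,000,000 scans")
```

Under these assumptions the positive class is a tiny fraction of the data, which is the class imbalance that synthetic tumor images are meant to reduce.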
Generative models, or deep generative models, are a class of deep learning models that learn the underlying data distribution from samples. These models can be used to reduce data to its fundamental properties, or to generate new samples of data with new and varied properties.
Generative adversarial networks (GANs) are implicit generative models: rather than writing down a likelihood, they learn to draw samples that follow the statistical distribution of the training data. They’re used to reproduce the variation within a dataset. They use a combination of two networks: a generator and a discriminator.
A generator network takes a random noise vector (z), sampled from a normal distribution, and outputs a generated sample that is meant to be close to the original data distribution.
A discriminator compares the generator's output with the original samples and outputs a value between 0 and 1. A value close to 0 means the discriminator judges the sample to be fake, and a value close to 1 means it judges the sample to be real.
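To make the two roles concrete, here is a deliberately tiny one-dimensional sketch in plain Python. It is not a real image model: an actual MRI GAN would use deep (typically convolutional) networks. The affine generator, the single logistic unit used as discriminator, and the parameter names `a`, `b`, `w`, `c` are all assumptions chosen for illustration.

```python
import math
import random


def generator(z, a=1.0, b=0.0):
    """Toy generator: maps a noise value z to a generated sample (hypothetical affine map)."""
    return a * z + b


def discriminator(x, w=1.0, c=0.0):
    """Toy discriminator: a logistic unit whose score lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-(w * x + c)))


random.seed(0)
z = random.gauss(0.0, 1.0)    # noise drawn from a standard normal distribution
fake = generator(z)           # generated sample
score = discriminator(fake)   # near 0 -> judged fake, near 1 -> judged real
print(f"z={z:.3f}  generated={fake:.3f}  score={score:.3f}")
```

With untrained parameters like these, the score carries no meaning yet. Training adjusts the generator's and discriminator's parameters so that the score becomes informative.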
Random noise drawn from a normal distribution is fed into the generator. At the start of training, the generator's output is essentially random, since it doesn’t yet have a reference point.
Meanwhile, an actual sample, or ground truth, is fed into the discriminator. The discriminator learns the distribution of the actual samples. When a generated sample from the generator is fed into the discriminator, the discriminator evaluates how well it fits that distribution.
If the distribution of the generated sample is close to that of the original sample, then the discriminator outputs a value close to ‘1’ (real). If the distributions don’t match, or aren’t even close to each other, then the discriminator outputs a value close to ‘0’ (fake).
How do the two networks improve? The answer lies in the loss function, or value function, which measures the distance between the distribution of the generated data and the distribution of the real data. Both the generator and the discriminator have their own loss functions. The generator tries to minimize the value function while the discriminator tries to maximize it.
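One common way to write this down is the original minimax GAN objective, V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], where D(x) is the discriminator's score for a real sample and D(G(z)) its score for a generated one. The text does not say which GAN variant is used, so treating this formulation as the objective is an assumption. The sketch below estimates V from hypothetical lists of discriminator scores, all of which must lie strictly between 0 and 1.

```python
import math


def value_function(d_real, d_fake):
    """Sample estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def discriminator_loss(d_real, d_fake):
    """The discriminator maximizes V, which is the same as minimizing -V."""
    return -value_function(d_real, d_fake)


def generator_loss(d_fake):
    """The generator minimizes the only term of V that depends on its output."""
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)


# Hypothetical discriminator scores, made up for illustration.
d_real = [0.9, 0.8, 0.95]             # scores on real samples
d_fake_spotted = [0.1, 0.2, 0.05]     # discriminator correctly flags the fakes
d_fake_fooled = [0.6, 0.7, 0.8]       # generator fools the discriminator

print("V when fakes are spotted:", round(value_function(d_real, d_fake_spotted), 3))
print("V when D is fooled:      ", round(value_function(d_real, d_fake_fooled), 3))
print("G loss when spotted:     ", round(generator_loss(d_fake_spotted), 3))
print("G loss when D is fooled: ", round(generator_loss(d_fake_fooled), 3))
```

The value function is higher when the discriminator separates real from fake, and the generator's loss is lower when its samples fool the discriminator, which is the tug-of-war described above.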