Jordan Micah Bennett, software engineer/creator of "RobotizeJa".
Note: The animation above shows a Drag&Drop version, separate from the version discussed on this page. The Drag&Drop version does the same thing as the non-Drag&Drop version, apart from the Drag&Drop feature itself. The Drag&Drop version is available here.
The aim was to develop a quick way to detect the nCov 2019 strain (Coronavirus 2019/2020; the disease is called "Covid-19" and the virus "SARS-CoV-2"). Artificial neural networks were used to build systems in line with that aim.
This project began on January 29, 2020, here: SMART-CORONA_VIRUS_DETECTOR. This Xray-scan version (also the first known global attempt/publication of image analysis/Artificial Intelligence based nCov/Covid19 diagnosis code) began on Feb 9, 2020.
As this is the first known attempt (commencing on January 29, 2020) at collaboratively constructing this type of program, please point me to any open source packages with similar goals. Please email [email protected].
-
This can also reasonably allow for less experienced medical personnel to make preliminary diagnoses, expanding the diagnosis efforts overall. This effort may contribute towards virus-control progress, together with other ai based endeavours being developed across the globe, such as use of ai for vaccine development.
-
This convolutional neural network architecture can reasonably also be trained on CT-Scan image data (which many Covid19 papers seem to concern). Training so far has used Xray data: initially the non-Covid19 pneumonia data from Kaggle, and most recently Covid19 data.
-
Countries with aggressive/thorough testing seem to face lower mortality rates (e.g. South Korea, <1% mortality rate) than countries with terrible/barely existent testing/screening (e.g. USA, >3.5% mortality rate, close to the global mortality rate of ~3.4%). This project serves to contribute to extensive testing efforts, to help minimize potentially exponential spread in newly affected regions, and otherwise aid in control even after widespread transmission.
-
On March 19, 2020, epidemiologist Larry Brilliant, who helped to stop smallpox, said that we can beat the novel coronavirus, but first we need lots more testing.
-
Update: April 7, 2020: Only 6% of actual covid19 infections have been detected by countries worldwide, according to a study cited in an April 7, 2020 Medical Xpress article.
- This means that more than 10 million people could actually be infected by Covid19, contrary to the reported number of roughly 1.4 million cases. This is yet another quite reasonable indication that far more testing is required.
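The arithmetic behind this estimate can be sketched as follows. The 6% detection rate and the roughly 1.4 million reported cases come from the text above; the function name is illustrative, and the calculation assumes the detection rate applies uniformly to the reported total, which is a simplification.

```python
def estimate_true_infections(reported_cases: float, detection_rate: float) -> float:
    """Scale a reported case count by an assumed detection rate."""
    if not 0 < detection_rate <= 1:
        raise ValueError("detection_rate must be in the interval (0, 1]")
    return reported_cases / detection_rate


# Figures quoted in the update above: ~1.4 million reported, ~6% detected.
estimated = estimate_true_infections(1_400_000, 0.06)
print(f"Estimated infections: ~{estimated / 1e6:.1f} million")
```

Under this simple scaling the estimate comes to roughly 23 million, which is consistent with the "more than 10 million" figure.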
-
Update: April 17, 2020: A US county recorded 1,000 COVID-19 cases earlier this month, but blood tests suggest that more than 50,000 people there have been infected. This is yet another quite reasonable indication that far more testing is required.
A reasonable optimal path is to use the (~70% accurate) CDC standard polymerase method and the (~75% to ~90% accurate) Artificial Intelligence based Xray method in concert.
Above is a snippet of my (March 30, 2020) paper submission to the 65th Annual Health Research Conference, organized by the Government of Jamaica and Caribbean Public Health Agency, based on my February 9, 2020 Covid19 Artificial Intelligence diagnostic model.
65th Annual Health Research Conference: http://conference.carpha.org/
See also my more detailed manuscript on research gate.
Based on suggestions by Andrei Marinescu, Jordan has updated this system such that it does both non-covid19 pneumonia detection and covid19 pneumonia detection, using separate convolutional neural network models, via two different droplist options seen below:
This seeks to increase the robustness of the predictions made by the system.
On the task of Covid19 detection, so far, with the very limited data available, Sensitivity/Specificity/Accuracy are ~85%/~70%/~77% respectively, as seen in this screenshot, (where the model has been trained on a covid19 dataset I organized).
For the task of non-Covid19 pneumonia detection, the new code base has: Sensitivity/Specificity/Accuracy of ~89%/~88%/~89% respectively, as seen in this screenshot.
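For readers unfamiliar with these three metrics, here is a minimal sketch of how they are derived from confusion-matrix counts. The function name is illustrative and the counts are hypothetical; they are not the project's actual test set and were chosen only so that the rates land near the ~85%/~70%/~77% figures quoted above.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return sensitivity, specificity and accuracy from confusion-matrix counts."""
    positives = tp + fn  # all truly positive cases
    negatives = tn + fp  # all truly negative cases
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / positives,  # share of true positives that are caught
        "specificity": tn / negatives,  # share of true negatives that are cleared
        "accuracy": (tp + tn) / (positives + negatives),
    }


# Hypothetical counts for illustration only (NOT the project's real data).
m = diagnostic_metrics(tp=17, fp=6, tn=14, fn=3)
print({name: round(value, 3) for name, value in m.items()})
```

Sensitivity matters most for screening, since a missed positive case (a false negative) is the costlier error when the goal is to limit spread.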
-
Feb 9, 2020: I discovered similarities between Covid19 and known forms of pneumonia, after which I found a few Xray images representing positive cases of Covid19 from Chinese authorities. I then decided to perform artificial intelligence based Xray Image Scan diagnostics, using the images as inputs to an artificial intelligence based pneumonia diagnosis method originally published on kaggle. This reasoning is seen in my research/discovery process in the Deep Learning Code section below.
- This is the first known global attempt/publication of image analysis/Artificial Intelligence based nCov/Covid19 diagnosis Open-Source code.
-
Feb 19, 2020: Scientists reveal a ~98% accuracy in human/radiology based CT Scan image based diagnostics, compared to the popular Dna polymerase chain reaction method by CDC: "In a series of 51 patients with chest CT and RT-PCR assay performed within 3 days, the sensitivity of CT for COVID-19 infection was ~98% compared to RT-PCR sensitivity of ~71% (p<.001)."
-
Feb 20, 2020: Great news - Feb 20 news report published, that Chinese are using Ai to help identify the virus with reported ~99% accuracy, via their own Ai based CT-scan method.
- Unfortunately, unlike this repository started by myself on Feb 9th, no Chinese publication of ai based algorithms seems to have been made to the public to help facilitate global control of covid19/SarsCov2.
-
Feb 26, 2020: Chinese researchers reveal free access to an artificial intelligence based online Covid19 Detection tool, although still, no code nor patient data revealed. As a result detection may be slow for users without good internet connection.
- I still call to have code/data released for enhanced covid19 spread control.
- One reason why China should reasonably release their code and data is that their trained algorithm and data, while providing a good basis, may be susceptible to race based computation issues, simply because most Covid19 patients/data so far are Chinese.
- My showcasing of this repository's code, and/or my suggested publication of China's ai code, may enable further training on data pertaining to the race distribution of the target nation where Covid19 screening is applicable/required, as seen in other work that stresses accounting for diversity.
-
March 13, 2020: Kaggle launches large global effort to combat Covid19, with a call to action including data collections of over 29,000 Covid19 associated papers.
-
March 16, 2020: Adrian Rosebrock produced a Covid19 detector with ~90% accuracy, and ~80% sensitivity, using keras machine learning library, from a recent covid19 xray dataset released 4 days ago.
-
March 22, 2020: Alexander et al. release the 3rd known Open-Source Ai based Covid19 detector, after Jordan on Feb 9th, then Adrian on March 16. Multi-class virus/bacterial/normal detection is performed, with 100%/80% Sensitivity/Accuracy respectively on the Covid19 detection task.
- Their March 22 method though multi-class-Deep Learning based, uses a similar method to my Feb 9, binary-class-Deep Learning method, that combined a kaggle-xray pneumonia dataset, with scarce Covid19 data.
-
April 6, 2020: Jeremy Kohn, a machine learning engineer who joined my project on March 18, 2020, posts a curated list of resources for diagnosing COVID-19 based on X-rays and CT scans: RID-COVID (Radiological Image Data for Clinical Open-source Viral Infection Diagnosis).
- Molecular and Serology Tests: Up to 2 days before testing is verified.
- Xray Image Scan + Artificial Intelligence Diagnosis: ~5 minutes (for scan) + A few milliseconds for Ai diagnosis = ~6 minutes total time for diagnosis result including possible image processing.
Coronavirus: Whole world 'must take action', warns WHO
Update Jan 31, 2020/WHO declares the new coronavirus outbreak a Public Health Emergency of International Concern
- WHO's warning should reasonably have come about a week earlier, as advised about a week ago via Chris Martenson, who I also refer to below regarding his 115 million nCov case prediction count.
- Update February 7, 2020: Artificial Intelligence Prediction: In 45 days, ~2.5 billion to be infected, ~52 million of which may die.. See also this detailed forbes report.
- The nCov 2019 (Coronavirus Strain 2019/2020) is spreading rapidly, with a mortality rate between 2% and 4%.
- By comparison, the common flu with a far lower mortality rate of .1%, kills 291,000 to 646,000 per year.
- Things get worse; nCov spreads at ~triple the transmission rate of the common flu.
- Common flu R0 = 1.28 (estimated transmission rate)
- nCov R0 = 2.5 to 3.8 (estimated transmission rate)
- Recent nCov R0 estimate ~4.08!
- A recent estimate puts the nCov 2019/Covid19 incubation period at 24 days, and a Chinese woman was recently struck down with symptoms after an incubation period of 15 days, according to The Sun newspaper!
- Current diagnosis methods may miss the presence of the virus due to faulty dna based comparison methods, where multiple negative test results may occur before positive results are gained. In addition, more doctors (or rather more automated diagnosis methods) can improve identification rates of the virus.
- This ai driven method will reasonably help to stop the exponential growth of the nCov strain.
- 1 more month of exponential nCov growth = ~ 115 million cases, (of which ~ 23 million are potentially life threatening ones) according to an epidemiologist/PhD pathologist.
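The gap between the transmission-rate estimates listed above compounds quickly. As a rough illustration, the sketch below uses a deliberately simple geometric model in which each case infects a fixed number of others per generation, with no immunity, interventions or population limit. The function name and the choice of 10 generations are assumptions made for illustration; this is not the method behind the predictions quoted above.

```python
def cases_in_generation(r0: float, generations: int, initial_cases: float = 1.0) -> float:
    """New cases in the n-th generation under unchecked geometric growth."""
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return initial_cases * r0 ** generations


# Transmission-rate estimates quoted in the list above.
for label, r0 in [("common flu", 1.28), ("nCov (low)", 2.5), ("nCov (high)", 3.8)]:
    cases = cases_in_generation(r0, 10)
    print(f"{label}: rate {r0}, generation 10 -> ~{cases:,.0f} new cases")
```

Even the lower nCov estimate yields thousands of new cases in the tenth generation from a single initial case, compared with about a dozen for the common flu figure.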
Code
-
Covid-19/Coronavirus2019/nCov share many similarities with pneumonia. In fact, the time course evolution of a specific strain of covid-19 pneumonia is studied here.
-
There are already existent pneumonia deep learning platforms, including kaggle contents rife with deep learning kernels/solutions, pertaining to pneumonia detection.
-
A pretrained neural network is chosen from google, pertaining to (2). Using a pretrained model is a way to avoid training on the 2-gigabyte pneumonia/non-pneumonia training set.
- I added a quick function "doOnlineInference" to the code. This is a convenient way to invoke diagnosis on an input image.
-
Covid-19 positive xray scans are taken from various covid19 papers, such as this scan of this recent covid-19 paper.
-
Preliminary Conclusion
- This will reasonably work on potential mild-covid-19 pneumonia patients, within ~0 to 4 days of infection, with "repeated pulmonary CTs", where positive findings of pneumonia associated abnormalities are discoverable.
- This will likely work better for patients after ~5 days of infection of covid-19, as abnormalities become distributed across the lungs, where initial CT scans could better discover the Covid-19 markers.
- See the paper's conclusion for the reasoning above.
- Download the entire repository, which contains my version of the original code from another code base.
- Download the saved weights: "best_weights.hdf5" from the output section of the base code repository on kaggle (easy to become a member using gmail etc), rename the .hdf5 file to "best_weights_kaggle_user_pneumonia2_0.hdf5", then ensure both the code and weights are in the same place.
- Alternatively, you could download the already renamed weights, from this typically easy to access google drive link of mine.
- Download the 2 gigabytes training/test data from kaggle.
- Download this x ray covid19 dataset that I've collated/organized from Dr. Cohen's collation. Ensure the extracted "xray_dataset_covid19" folder is in the same directory as the python files in this repository.
- Run doOnlineInference function from my version of the original code on any of the test data from the 2 gigabytes kaggle directory, or on the single positive covid-19 example seen in this repository, that was taken from figure 1a of this recent covid-19 paper.
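Before running inference, it can help to confirm that the files from the steps above are in place. The following is a minimal sketch, not part of the repository: the function name is hypothetical, and it assumes the weights and the extracted dataset folder sit directly in the repository directory, as the steps describe.

```python
from pathlib import Path

# File and folder names as given in the setup steps above.
KAGGLE_WEIGHTS_ORIGINAL = "best_weights.hdf5"
KAGGLE_WEIGHTS_RENAMED = "best_weights_kaggle_user_pneumonia2_0.hdf5"
COVID_DATASET_DIR = "xray_dataset_covid19"


def prepare_and_check(repo_dir: str) -> list:
    """Rename freshly downloaded weights if needed and return any missing items."""
    root = Path(repo_dir)
    original = root / KAGGLE_WEIGHTS_ORIGINAL
    renamed = root / KAGGLE_WEIGHTS_RENAMED
    # Only rename when the download is present and the target name is free.
    if original.is_file() and not renamed.exists():
        original.rename(renamed)
    missing = []
    if not renamed.is_file():
        missing.append(KAGGLE_WEIGHTS_RENAMED)
    if not (root / COVID_DATASET_DIR).is_dir():
        missing.append(COVID_DATASET_DIR)
    return missing


if __name__ == "__main__":
    missing = prepare_and_check(".")
    print("Setup looks complete." if not missing else f"Missing: {missing}")
```

Run it from the repository directory; an empty result means the renamed weights and the dataset folder were both found.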
Update: February 18, 2020
-
Except for item (5), follow all instructions from "Code setup (basic user interface)" section above.
-
Run my user interface, which works with my version of the original code from this repository. One can either double click the covid19_ai_diagnoser_ui.py file, or open the file with IDLE, and run there.
-
Notice the log with the results of the neural network's prediction in the text area below the image:
CT Scan Manual Diagnosis and Explosion in infection reports
- By extension, apart from human radiologist detection, perhaps an ai based image detection solution can speed up diagnosis, and help to replace the faulty dna based comparison phase. I've also requested more CT image data from a scientist involved with manual diagnosis using CT scan data.
-
Images from recent covid-19 study: "Emerging Coronavirus 2019-nCoV Pneumonia"
-
Images from recent covid-19 study: "Imaging Profile of the COVID-19 Infection: Radiologic Findings and Literature Review"
-
+21 axial lung images, +11 lateral view lung images, and about +118 coronal view lung images, re Covid19 positive cases, collated by Dr. Joseph Cohen.
- Train with caution, i.e. it is reasonable to select one type of view format, for pretrained model, training process, and inference/testing cycle.
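One way to follow this advice is to filter the image list down to a single view before training or testing. The sketch below assumes each image comes with a record carrying a "view" field; the records, file names and view labels shown are made up for illustration and do not reflect the actual layout of Dr. Cohen's collection.

```python
from collections import Counter


def select_view(records: list, view: str) -> list:
    """Keep only records whose 'view' field matches the chosen view (case-insensitive)."""
    wanted = view.strip().lower()
    return [r for r in records if str(r.get("view", "")).strip().lower() == wanted]


# Hypothetical records for illustration only.
records = [
    {"filename": "img_001.png", "view": "coronal"},
    {"filename": "img_002.png", "view": "axial"},
    {"filename": "img_003.png", "view": "Coronal"},
    {"filename": "img_004.png", "view": "lateral"},
]

# Inspect the mix of views first, then keep a single view for the whole pipeline.
print(Counter(r["view"].lower() for r in records))
coronal_only = select_view(records, "coronal")
print([r["filename"] for r in coronal_only])
```

The same filtered list should then feed the pretrained model, the training process and the inference/testing cycle, so that all three see one view format.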
By extension, the tool by researchers at Johns Hopkins University below is useful for real-time tracking of nCov:
https://gisanddata.maps.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6
Note that despite the ~900+ infection-case number reported via China on January 24, by stark contrast, a medical scientific paper estimated that ~105,000+ infections actually occurred at that time.
The "renderConfusionMetrics" instance in Section D (bottom of "covid19_ai_diagnoser_optimal_model_architecture.py" file) can facilitate training of new covid19 images placed in xray_dataset_covid19/train... and or xray_dataset_covid19/test....
This is done by simply placing your images in the directories above, then changing the "False" parameter to "True", and running the "covid19_ai_diagnoser_optimal_model_architecture.py" file.
- If the last .hdf5 weights parameter is changed, the model_covid19PneumoniaDetector.load_weights parameter will also require change in the same "Section D" only.
renderConfusionMetrics ( model_covid19PneumoniaDetector, test_data_d, test_labels_d, False, train_gen_d, test_gen_d, batch_size, 25, 'covid19_neural_network_weights_jordan.hdf5' )
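Before switching the "False" parameter to "True", it may be worth confirming that the new images landed where the script will look for them. The sketch below is a hypothetical helper, not part of the repository. It assumes that class subfolders sit directly under the train and test directories and that images use common extensions; adjust both assumptions to match your copy of the dataset.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}  # assumed; adjust to your files


def count_images(dataset_dir: str) -> dict:
    """Map 'split/class' to the number of image files found under it."""
    counts = {}
    root = Path(dataset_dir)
    for split in ("train", "test"):
        split_dir = root / split
        if not split_dir.is_dir():
            continue  # skip a split that has not been created yet
        for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            counts[f"{split}/{class_dir.name}"] = sum(
                1
                for f in class_dir.rglob("*")
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts


if __name__ == "__main__":
    for folder, n in count_images("xray_dataset_covid19").items():
        print(f"{folder}: {n} images")
```

A class folder reporting zero images, or a very lopsided split, is a cue to recheck the placement before starting a training run.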
I call on the Ministry of Health of Jamaica (as well as those of other countries) to use their administrative status to try to acquire more covid19 positive CT scan images (in a federated format that excludes patient identity) from China etc, for improving pneumonia based ai systems like the one that I have had prepared since February 9, 2020. I found that system to successfully detect covid19 presence in the small covid-19 positive Xray scan sample set found online so far, in a paper by Yuen et al etc.
- Alternatively, the Chinese artificial intelligence algorithm/solution together with the data could be attained using the same administrative method.
- In future scenarios, a "Division of Artificial Intelligence Based Health Development" or sector of artificial intelligence based research should reasonably exist in the Ministry of Health, that could enable Ai solutions to be rapidly researched/developed, to facilitate production of vaccines, and treatment, as seen in a recent example where MIT developed antibiotics based on Ai research/development.
My advice to Ministry of Health (February 17, 2020): https://drive.google.com/file/d/1BNXkKJPZuMx64XzwqFmQEpC5s9-C3tJH/view?usp=sharing
-
Jordan added a fix to the original author's repository, to enable correct validation. John Chang had inadvertently misdefined a "test_dataGen.flow_from_director" function parameter as a training dataset input, instead of a test dataset input.
-
Jordan updated his version of the original code to repair a compile issue, so that the accuracy of the saved model can be evaluated after loading it (in 2 minutes on gtx 1060/i7 cpu) without invoking the model-training function model.fit, which would take hours on the same machine.
-
Based on Andrei's suggestions, Jordan corrected labels that he had initially mis-labelled as CT, replacing them with X-Ray. This correction is very important, and could influence model architecture later on.
-
Code no longer runs on John Chang's base code. Jordan has written new diagnoser code, to accommodate a new code base.
- For the task of pneumonia detection, the new code base has far higher Sensitivity/Specificity/Accuracy of ~89%/~88%/~89% respectively, as seen in the new screenshot, compared to John Chang's code, which had: sensitivity/recall (~33%), specificity (~67%).
- For the task of Covid19 detection, this is the outcome: Sensitivity/Specificity/Accuracy of ~85%/~70%/~77% respectively; ...where the model has been trained on a covid19 dataset I organized.