The scientific literature mining project collates articles in a specific domain and extracts information of interest from them. It provides a web interface so that users can easily search the article repository and harvest the information they need.
- Taxonomic hierarchy
- Article summary (under development)
- Experiment summary (under development)
These instructions will guide you through setting up and running the project.
Install the following packages.
- ChemDataExtractor (https://github.com/cjcourt/cdesnowball) - chemical literature text preprocessor.
- Periodic Table (http://www.reflectometry.org/danse/docs/elements/guide/using.html) - periodic table of the elements.
- lxml (https://lxml.de/installation.html) - XML parser.
- PyMuPDF (https://github.com/pymupdf/PyMuPDF) - PDF parser.
- Scikit-learn (https://scikit-learn.org) - Python machine learning package.
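After installing, it can help to confirm that each package is importable before starting the project. The following is a minimal sketch, not part of the project itself. The import names in `REQUIRED_MODULES` are assumptions based on how these packages are commonly imported (for example, PyMuPDF as `fitz` and Scikit-learn as `sklearn`), and `find_missing` is a hypothetical helper.

```python
from importlib.util import find_spec

# Assumed import names for the packages listed above; adjust them if your
# installed versions expose different top-level module names.
REQUIRED_MODULES = {
    "ChemDataExtractor": "chemdataextractor",
    "Periodic Table": "periodictable",
    "lxml": "lxml",
    "PyMuPDF": "fitz",
    "Scikit-learn": "sklearn",
}


def find_missing(modules):
    """Return the package names whose top-level module cannot be located."""
    # find_spec looks a top-level module up without importing it.
    return [name for name, module in modules.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Running the script prints the packages that still need to be installed, if any.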
Get API keys from publishers.
- Elsevier (https://dev.elsevier.com/)
- NCBI (https://www.ncbi.nlm.nih.gov/pmc/tools/developers/) - National Center for Biotechnology Information
- Springer (https://dev.springernature.com/)
- Wiley (https://onlinelibrary.wiley.com/library-info/resources/text-and-datamining)
- Django - The web framework used
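One common way to keep the publisher API keys out of the source tree is to read them from environment variables. The sketch below assumes that approach. The variable names in `API_KEY_VARS` and the `load_api_keys` helper are hypothetical, and the project may expect the keys to be supplied differently.

```python
import os

# Hypothetical environment variable names, one per publisher listed above.
API_KEY_VARS = {
    "Elsevier": "ELSEVIER_API_KEY",
    "NCBI": "NCBI_API_KEY",
    "Springer": "SPRINGER_API_KEY",
    "Wiley": "WILEY_API_KEY",
}


def load_api_keys(env=None):
    """Return (keys, missing): the keys that are set and the publishers without one."""
    env = os.environ if env is None else env
    keys = {}
    missing = []
    for publisher, variable in API_KEY_VARS.items():
        # Treat unset or whitespace-only values as missing.
        value = env.get(variable, "").strip()
        if value:
            keys[publisher] = value
        else:
            missing.append(publisher)
    return keys, missing


if __name__ == "__main__":
    keys, missing = load_api_keys()
    print("Keys found for: " + (", ".join(keys) or "none"))
    if missing:
        print("No key set for: " + ", ".join(missing))
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real environment.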
This project is licensed under the MIT License. See the LICENSE.md file for details.
- This work has been authored by employees of Brookhaven Science Associates, LLC, operated under Contract No. DE-SC0012704. The authors gratefully acknowledge funding support from Brookhaven National Laboratory under Laboratory Directed Research and Development 18-05 FY 18-20.