A curated list of awesome Python frameworks, libraries, software and resources related to Chemistry.
Inspired by awesome-python.
Packages and tools for general chemistry.
- AQME - Ensemble of automated QM workflows that can be run through Jupyter notebooks, command lines and YAML files.
- aizynthfinder - A tool for retrosynthetic planning.
- batchcalculator - A GUI app based on wxPython for calculating the correct amount of reactants (batch) for a particular composition given by the molar ratio of its components.
- cctbx - The Computational Crystallography Toolbox.
- ChemFormula - ChemFormula provides a class for working with chemical formulas. It allows parsing chemical formulas, calculating formula weights, and generating formatted output strings (e.g. in HTML, LaTeX, or Unicode).
- chemlib - A robust and easy-to-use package that solves a variety of chemistry problems.
- chempy - ChemPy is a package useful for chemistry (mainly physical/inorganic/analytical chemistry).
- datamol - Molecular Manipulation Made Easy. A light wrapper built on top of RDKit.
- GoodVibes - A Python program to compute quasi-harmonic thermochemical data from Gaussian frequency calculations.
- hgraph2graph - Hierarchical Generation of Molecular Graphs using Structural Motifs.
- ionize - Calculates the properties of individual ionic species in aqueous solution, as well as aqueous solutions containing arbitrary sets of ions.
- LModeA-nano - Calculates the intrinsic chemical bond strength based on local vibrational mode theory in solids and molecules.
- mendeleev - A package that provides a Python API for accessing various properties of elements from the periodic table of elements.
- nmrglue - A package for working with nuclear magnetic resonance (NMR) data including functions for reading common binary file formats and processing NMR data.
- Open Babel - A chemical toolbox designed to speak the many languages of chemical data.
- periodictable - This package provides a periodic table of the elements with support for mass, density and X-ray/neutron scattering information.
- propka - Predicts the pKa values of ionizable groups in proteins and protein-ligand complexes based on the 3D structure.
- pybaselines - A package for fitting baselines of spectra for baseline correction.
- pybel - Pybel provides convenience functions and classes that make it simpler to use the Open Babel libraries from Python.
- pycroscopy - Scientific analysis of nanoscale materials imaging data.
- pyEQL - A set of tools for conventional calculations involving solutions (mixtures) and electrolytes.
- pyiron - An integrated development environment (IDE) for computational materials science.
- pymatgen - Python Materials Genomics is a robust, open-source library for materials analysis.
- pymatviz - A toolkit for visualizations in materials informatics.
- symfit - A curve-fitting library ideally suited to chemistry problems, including fitting experimental kinetics data.
- symmetry - Symmetry is a library for materials symmetry analysis.
- stk - A library for building, manipulating, analyzing and automatically designing molecules, including a genetic algorithm.
- spectrochempy - A library for processing, analyzing and modeling spectroscopic data.
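Several entries above (for example ChemFormula and batchcalculator) revolve around parsing chemical formulas and computing formula weights. As a rough illustration of the idea only, here is a minimal pure-Python sketch. It does not use any listed package's API; the names `parse_formula`, `formula_weight`, and `ATOMIC_WEIGHTS` are hypothetical, and the weight table is a small rounded subset chosen for the example.

```python
import re
from collections import Counter

# Abridged standard atomic weights in g/mol (illustrative subset, an assumption
# of this sketch; a real package ships the full periodic table).
ATOMIC_WEIGHTS = {
    "H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999,
    "Na": 22.990, "S": 32.06, "Cl": 35.45, "Ca": 40.078,
}

# One token per match: element symbol, "(", ")", or an integer count.
TOKEN = re.compile(r"([A-Z][a-z]?)|(\()|(\))|(\d+)")


def _take_count(tokens, i):
    """Read an optional integer at tokens[i]; return (count, next_index)."""
    if i < len(tokens) and tokens[i][3]:
        return int(tokens[i][3]), i + 1
    return 1, i


def parse_formula(formula):
    """Return {element: count} for a formula such as 'Ca(OH)2'."""
    tokens = TOKEN.findall(formula)
    # findall silently skips unmatched characters, so check nothing was lost.
    if "".join("".join(t) for t in tokens) != formula:
        raise ValueError(f"Cannot parse formula: {formula!r}")
    stack = [Counter()]  # one Counter per open parenthesis level
    i = 0
    while i < len(tokens):
        element, opening, closing, _number = tokens[i]
        i += 1
        if element:
            count, i = _take_count(tokens, i)
            stack[-1][element] += count
        elif opening:
            stack.append(Counter())
        elif closing:
            if len(stack) == 1:
                raise ValueError("Unbalanced ')' in formula")
            group = stack.pop()
            count, i = _take_count(tokens, i)
            for symbol, n in group.items():
                stack[-1][symbol] += n * count
        else:
            # A number that does not follow an element or a ')'.
            raise ValueError("Unexpected number in formula")
    if len(stack) != 1:
        raise ValueError("Unbalanced '(' in formula")
    return dict(stack[0])


def formula_weight(formula):
    """Sum atomic weights over the parsed composition (g/mol)."""
    composition = parse_formula(formula)
    try:
        return sum(ATOMIC_WEIGHTS[s] * n for s, n in composition.items())
    except KeyError as exc:
        raise ValueError(f"No atomic weight for element {exc.args[0]!r}") from None


if __name__ == "__main__":
    for f in ("H2O", "C6H12O6", "Ca(OH)2"):
        print(f, parse_formula(f), f"{formula_weight(f):.3f} g/mol")
```

For real work, the packages listed above are the better choice: they cover the full periodic table, charges, isotopes, hydrates, and formatted output, none of which this sketch attempts.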
Packages and tools for employing machine learning and data science in chemistry.
- amp - An open-source package designed to easily bring machine learning to atomistic calculations.
- atom3d - Enables machine learning on three-dimensional molecular structure.
- chainer-chemistry - A deep learning framework (based on Chainer) with applications in Biology and Chemistry.
- chemml - A machine learning and informatics program suite for the analysis, mining, and modeling of chemical and materials data.
- chemprop - Message Passing Neural Networks for Molecule Property Prediction.
- cgcnn - Crystal graph convolutional neural networks for predicting material properties.
- deepchem - Deep-learning models for Drug Discovery and Quantum Chemistry.
- DeepPurpose - A Deep Learning Library for Compound and Protein Modeling: DTI, Drug Property, PPI, DDI, Protein Function Prediction.
- DescriptaStorus - Descriptor computation (chemistry) and (optional) storage for machine learning.
- DScribe - Descriptor library containing a variety of fingerprinting techniques, including the Smooth Overlap of Atomic Positions (SOAP).
- graphein - Provides functionality for producing geometric representations of protein and RNA structures, and biological interaction networks.
- Matminer - Library of descriptors to aid in the data-mining of materials properties, created by the Lawrence Berkeley National Laboratory.
- MoleOOD - A robust molecular representation learning framework against distribution shifts.
- megnet - Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals.
- MAML - Aims to provide useful high-level interfaces that make ML for materials science as easy as possible.
- MORFEUS - Library for fast calculations of molecular features from 3D structures for machine learning with a focus on steric descriptors.
- olorenchemengine - Molecular property prediction with unified API for diverse models and representations, with integrated uncertainty quantification, interpretability, and hyperparameter/architecture tuning.
- ROBERT - Ensemble of automated machine learning protocols that can be run sequentially through a single command line. The program works for regression and classification problems.
- schnetpack - Deep Neural Networks for Atomistic Systems.
- selfies - Self-Referencing Embedded Strings (SELFIES): A 100% robust molecular string representation.
- Summit - Package for optimizing chemical reactions using machine learning (contains 10 algorithms + several benchmarks).
- TDC - Therapeutics Data Commons (TDC) is the first unifying framework to systematically access and evaluate machine learning across the entire range of therapeutics.
- XenonPy - Library with several compositional and structural material descriptors, along with a few pre-trained neural network models of material properties.
Packages and tools for generating molecular species.
- GraphINVENT - A platform for graph-based molecular generation using graph neural networks.
- GuacaMol - A package for benchmarking of models for de novo molecular design.
- moses - A benchmarking platform for molecular generation models.
- perses - Experiments with expanded ensembles to explore chemical space.
Packages for atomistic simulations and computational chemistry.
- alchemlyb - Makes alchemical free energy calculations easier by leveraging the full power and flexibility of the PyData stack.
- atomate2 - atomate2 is a library of computational materials science workflows.
- Atomic Simulation Environment (ASE) - A set of tools and modules for setting up, manipulating, running, visualizing and analyzing atomistic simulations.
- basis_set_exchange - A library containing basis sets for use in quantum chemistry calculations. In addition, this library has functionality for manipulation of basis set data.
- CACTVS - Cactvs is a universal, scriptable cheminformatics toolkit, with a large collection of modules for property computation, chemistry data file I/O and other tasks.
- CalcUS - Quantum chemistry web platform that brings all the necessary tools to perform quantum chemistry in a user-friendly web interface.
- cantera - A collection of object-oriented software tools for problems involving chemical kinetics, thermodynamics, and transport processes.
- CatKit - General purpose tools for high-throughput catalysis.
- ccinput - A tool and library for creating quantum chemistry input files.
- cclib - A library for parsing output files of various quantum chemical programs.
- cinfony - A common API to several cheminformatics toolkits (Open Babel, RDKit, the CDK, Indigo, JChem, OPSIN and cheminformatics webservices).
- chemlab - A library that can help the user with chemistry-relevant calculations.
- emmet - A package to 'build' collections of materials properties from the output of computational materials calculations.
- fromage - The "FRamewOrk for Molecular AGgregate Excitations" enables localised QM/QM' excited state calculations in a solid state environment.
- GPAW - A density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE).
- horton - Helpful Open-source Research TOol for N-fermion systems, a quantum-chemistry program that can perform computations involving model Hamiltonians.
- HTMD - High-Throughput Molecular Dynamics: Programming Environment for Molecular Discovery.
- Indigo - Universal cheminformatics libraries, utilities and database search tools.
- Jarvis-tools - An open-access software package for atomistic data-driven materials design.
- mathchem - A free open-source package for calculating topological indices and other invariants of molecular graphs.
- MDAnalysis - An object-oriented library to analyze trajectories from molecular dynamics (MD) simulations in many popular formats.
- MDTraj - Package for manipulating molecular dynamics trajectories with support for multiple formats.
- MEXPLORER - Analyze and visualize complex chemical reaction mechanisms.
- MMTK - The Molecular Modeling Toolkit is an Open Source program library for molecular simulation applications.
- MolMod - A library with many components that are useful to write molecular modeling programs.
- nmrsim - A library for simulating first- or second-order NMR spectra and dynamic NMR resonances.
- oddt - Open Drug Discovery Toolkit, a modular and comprehensive toolkit for use in cheminformatics, molecular modeling etc.
- OPEM - Open source PEM (Proton Exchange Membrane) fuel cell simulation tool.
- openmmtools - A batteries-included toolkit for the GPU-accelerated OpenMM molecular simulation engine.
- overreact - A library and command-line tool for building and analyzing complex homogeneous microkinetic models from quantum chemistry calculations, with support for quasi-harmonic thermochemistry, quantum tunnelling corrections, molecular symmetries and more.
- ParmEd - Parameter/topology editor and molecular simulator with visualization capability.
- pGrAdd - A library for estimating thermochemical properties of molecules and adsorbates using group additivity.
- phonopy - An open source package for phonon calculations at harmonic and quasi-harmonic levels.
- PLAMS - Python Library for Automating Molecular Simulation: input preparation, job execution, file management, output processing and building data workflows.
- pMuTT - A library for ab-initio thermodynamic and kinetic parameter estimation.
- PorePy - A Simulation Tool for Fractured and Deformable Porous Media.
- ProDy - An open source package for protein structural dynamics analysis with a flexible and responsive API.
- ProLIF - Interaction Fingerprints for protein-ligand complexes and more.
- Psi4 - A hybrid Python/C++ open-source package for quantum chemistry.
- Psi4NumPy - Psi4-based reference implementations and Jupyter notebook-based tutorials for foundational quantum chemistry methods.
- pyEMMA - Library for the estimation, validation and analysis of Markov models of molecular kinetics and other kinetic and thermodynamic models from molecular dynamics data.
- pygauss - An interactive tool for supporting the life cycle of a computational molecular chemistry investigation.
- PyQuante - An open-source suite of programs for developing quantum chemistry methods.
- pysic - A calculator incorporating various empirical pair and many-body potentials.
- PySCF - A quantum chemistry package written in Python.
- pyvib2 - A program for analyzing vibrational motion and vibrational spectra.
- RDKit - Open-Source Cheminformatics Software.
- ReNView - A program to visualize reaction networks.
- stk - A library for building, manipulating, analyzing and automatically designing molecules.
- QMsolve - A module for solving and visualizing the Schrödinger equation.
- QUIP - A collection of software tools to carry out molecular dynamics simulations.
- torchmd - End-To-End Molecular Dynamics (MD) Engine using PyTorch.
- tsase - A library that builds on ASE to tackle transition state calculations.
- yank - An open, extensible Python framework for GPU-accelerated alchemical free energy calculations.
Packages related to force fields.
- CHGNet - Pretrained universal neural network potential for charge-informed atomistic modeling.
- FitSNAP - A Package For Training SNAP Interatomic Potentials for use in the LAMMPS molecular dynamics package.
- fftool - Tool to build force field input files for molecular simulation.
- FLARE - A package for creating fast and accurate interatomic potentials.
- global-chem - A Chemical Knowledge Graph and Toolkit, written in IUPAC/SMILES/SMARTS, for common small molecules from diverse communities to aid users in selecting compounds for force field parameterization.
- matbench-discovery - A benchmark for ML-guided high-throughput materials discovery.
- NeuralForceField - Neural Network Force Field based on PyTorch.
- openff-toolkit - The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools.
Packages for viewing molecular structures.
- ase-gui - The graphical user-interface allows users to visualize, manipulate, and render molecular systems and atoms objects.
- chemiscope - An interactive structure/property explorer for materials and molecules.
- chemview - An interactive molecular viewer designed for the IPython notebook.
- imolecule - An embeddable webGL molecule viewer and file format converter.
- moleculekit - A molecule manipulation library.
- nglview - A Jupyter widget to interactively view molecular structures and trajectories.
- PyMOL - A user-sponsored molecular visualization system on an open-source foundation, maintained and distributed by Schrödinger.
- pymoldyn - A viewer for atomic clusters, crystalline and amorphous materials in a unit cell corresponding to one of the seven 3D Bravais lattices.
- rdeditor - Simple RDKit molecule editor GUI using PySide.
- sumo - A toolkit for plotting and analysis of ab initio solid-state calculation data.
- surfinpy - A library for the analysis, plotting and visualisation of ab initio surface calculation data.
- trident-chemwidgets - Jupyter Widgets to interact with molecular datasets.
Packages providing a Python layer for accessing chemical databases.
- ccdc - An API for the Cambridge Structural Database System.
- ChemSpiPy - A ChemSpider wrapper that allows chemical searches, chemical file downloads, depiction and retrieval of chemical properties.
- CIRpy - An interface for the Chemical Identifier Resolver (CIR) by the CADD Group at the NCI/NIH.
- pubchempy - PubChemPy provides a way to interact with PubChem in Python.
- chembl-downloader - Automate downloading and querying the latest (or a given) version of ChEMBL.
- drugbank-downloader - Automate downloading, opening, and parsing DrugBank.
Resources for learning to apply Python to chemistry.
- An Introduction to Applied Bioinformatics - A Jupyter book demonstrating working with biochemical data using the scikit-bio library for tasks such as sequence alignment and calculating Hamming distances.
- Computational Thermodynamics - This collection of Jupyter notebooks demonstrates solutions to a range of thermodynamic problems including solving chemical equilibria, comparing real versus ideal gas behavior, and calculating the temperature and composition of a combustion reaction.
- SciCompforChemists - Scientific Computing for Chemists with Python is a Jupyter book that teaches basic Python skills for chemistry, including relevant libraries, and applies them to solving chemical problems.
- Colorful Nuclide Chart - A beautiful, interactive visualization of nuclides that provides access to a variety of nuclear properties and allows saving high-quality images for publications, presentations and outreach.
- awesome-cheminformatics - Another list focused on cheminformatics, including tools not only in Python.
- awesome-small-molecule-ml - A collection of papers, datasets, and packages for small-molecule drug discovery. Most links to code are in Python.
- awesome-molecular-docking - A curated list of molecular docking software, datasets, and papers.
- jarvis - Joint Automated Repository for Various Integrated Simulations is a repository designed to automate materials discovery and optimization using classical force-field, density functional theory, machine learning calculations and experiments.
- polypharmacy-ddi-synergy-survey - A collection of research papers (with Python implementations) focusing on drug-drug interactions, synergy and polypharmacy.