Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #2 from learningmatter-mit/refactoring
Merge `refactoring` branch
Showing 106 changed files with 23,031 additions and 8,485 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,18 +1,93 @@ | ||
# Compiled source # | ||
################### | ||
*.com | ||
*.class | ||
*.dll | ||
*.exe | ||
*.o | ||
*.pyc | ||
*.so | ||
|
||
# Packages # | ||
############ | ||
# it's better to unpack these files and commit the raw source | ||
# git has its own built in compression methods | ||
*.7z | ||
*.dmg | ||
*.gz | ||
*.iso | ||
*.jar | ||
*.rar | ||
#*.tar | ||
*.zip | ||
|
||
# Logs and databases # | ||
###################### | ||
*.log | ||
*.sql | ||
*.sqlite | ||
|
||
# OS generated files # | ||
###################### | ||
.DS_Store | ||
.DS_Store? | ||
._* | ||
.Spotlight-V100 | ||
.Trashes | ||
ehthumbs.db | ||
Thumbs.db | ||
|
||
# python/jupyter generated files | ||
.ipynb_checkpoints | ||
__pycache__ | ||
*.egg-info | ||
.vscode | ||
.pytest_cache | ||
*runs* | ||
*__pycache__* | ||
*.pyc | ||
|
||
# data files # | ||
*.dat | ||
*.data | ||
*.xyz | ||
*.pdb | ||
*.csv | ||
*.pkl | ||
*.txt | ||
*.mpg | ||
*.traj | ||
*.pickle | ||
*.cif | ||
*.in | ||
*.out | ||
*.data | ||
*.png | ||
*.lammps | ||
__pycache__/ | ||
*tmp_files/* | ||
*.png | ||
|
||
# slurm output files | ||
slurm*.out | ||
|
||
# directory | ||
log/ | ||
debug/ | ||
sandbox*/ | ||
backup | ||
dist/ | ||
sandbox_excited/ | ||
build/ | ||
*runs* | ||
tmp_files/ | ||
|
||
# test files should still be committed | ||
!tests/data/*/*.pkl | ||
!tests/data/*/*.cif | ||
|
||
# tutorial files should still be committed | ||
!tutorials/data/*/*.pkl | ||
!tutorials/data/*/*.cif | ||
!tutorials/data/*/*.txt | ||
!tutorials/*/*.txt | ||
!tutorials/data/*/**/*.csv | ||
|
||
# static files should still be committed | ||
!site/static/**.png | ||
|
||
# test file should still be committed | ||
!tests/resources/*.pkl | ||
!tests/resources/*.cif | ||
# util files should still be committed | ||
!mcmc/utils/data/colors.txt |
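The re-include rules above (`!tests/data/*/*.pkl` and similar) rely on the last matching pattern winning. The following is a minimal Python sketch of that idea, not Git's actual matcher: `matches` and `is_ignored` are hypothetical helpers, `**` and trailing-slash directory rules are not handled, and real Git will not re-include a file whose parent directory is itself excluded.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, path: str) -> bool:
    """Simplified gitignore-style match (assumption: no `**`, no directory-only rules)."""
    if "/" not in pattern:
        # A pattern without a slash is compared against every path component.
        return any(fnmatchcase(part, pattern) for part in path.split("/"))
    pattern_parts = pattern.split("/")
    path_parts = path.split("/")
    # A pattern with slashes is matched segment by segment, so `*` never crosses a `/`.
    return len(pattern_parts) == len(path_parts) and all(
        fnmatchcase(segment, pat) for segment, pat in zip(path_parts, pattern_parts)
    )


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Apply patterns in order; the last pattern that matches decides the outcome."""
    ignored = False
    for raw in patterns:
        negated = raw.startswith("!")
        pattern = raw[1:] if negated else raw
        if matches(pattern, path):
            ignored = not negated
    return ignored


patterns = ["*.pkl", "*.cif", "!tests/data/*/*.pkl"]
print(is_ignored("results/slab.pkl", patterns))        # excluded by *.pkl
print(is_ignored("tests/data/Cu/slab.pkl", patterns))  # re-included by the negation
```

For a definitive answer on a real checkout, `git check-ignore -v <path>` reports which pattern decided the result.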
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,2 @@ | ||
[settings] | ||
known_third_party =ase,catkit,lammps,matplotlib,nff,numpy,pytest,scipy | ||
known_third_party =ase,catkit,lammps,matplotlib,monty,nff,numpy,pandas,pymatgen,pytest,scipy,seaborn,sklearn,torch,tqdm,typing_extensions |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,89 +1,148 @@ | ||
# Virtual Surface Site Relaxation-Monte Carlo (VSSR-MC) | ||
[![Tests](https://github.com/learningmatter-mit/surface-sampling/actions/workflows/tests.yml/badge.svg)](https://github.com/learningmatter-mit/surface-sampling/actions/workflows/tests.yml) | ||
[![arXiv](https://img.shields.io/badge/arXiv-2305.07251-blue?logo=arXiv&logoColor=white&logoSize=auto)](https://arxiv.org/abs/2305.07251) | ||
[![Zenodo](https://img.shields.io/badge/data-10.5281/zenodo.7758174-14b8a6?logo=zenodo&logoColor=white&logoSize=auto)](https://zenodo.org/doi/10.5281/zenodo.7758174) | ||
|
||
## Contents | ||
- [Overview](#overview) | ||
- [System requirements](#system-requirements) | ||
- [Setup](#setup) | ||
- [Demo](#demo) | ||
- [Scripts](#scripts) | ||
- [Citation](#citation) | ||
- [Development & Bugs](#development--bugs) | ||
|
||
|
||
# Overview | ||
This is the VSSR-MC algorithm for sampling surface reconstructions. VSSR-MC samples across both compositional and configurational spaces. It can interface with either a neural network potential (through [ASE](https://wiki.fysik.dtu.dk/ase/)) or a classical potential (through ASE or [LAMMPS](https://www.lammps.org/)). It is a key component of the Automatic Surface Reconstruction (AutoSurfRecon) pipeline described in the following work: [Machine-learning-accelerated simulations to enable automatic surface reconstruction](https://doi.org/10.1038/s43588-023-00571-7). | ||
|
||
This is the VSSR-MC algorithm for sampling surface reconstructions. VSSR-MC samples across both compositional and configurational spaces. It can interface with both a neural network potential (through ASE) or a classical potential (through ASE or LAMMPS). It is a key component of the Automatic Surface Reconstruction (AutoSurfRecon) pipeline described in the following work: | ||
|
||
"Machine-learning-accelerated simulations to enable automatic surface reconstruction", by X. Du, J.K. Damewood, J.R. Lunger, R. Millan, B. Yildiz, L. Li, and R. Gómez-Bombarelli. https://doi.org/10.1038/s43588-023-00571-7 | ||
|
||
Please cite us if you find this work useful. Let us know in `issues` if you encounter any problems or have any questions. | ||
|
||
To start, run `git clone git@github.com:learningmatter-mit/surface-sampling.git` to your local directory or a workstation. | ||
|
||
Read through the following in order before running our code. | ||
![Cover image](site/static/vssr_cover_image.png) | ||
|
||
# System requirements | ||
|
||
## Hardware requirements | ||
We recommend a computer with the following specs: | ||
|
||
- RAM: 16+ GB | ||
- CPU: 4+ cores, 3 GHz/core | ||
|
||
We tested out the code on machines with 6+ CPU cores @ 3.0+ GHz/core with 64+ GB of RAM. | ||
To run with a neural network force field, a GPU is recommended. We ran on a single NVIDIA GeForce RTX 2080 Ti 11 GB GPU. The code has been tested on *Linux* Ubuntu 20.04.6 LTS but we expect it to work on other *Linux* distributions. | ||
|
||
# Setup | ||
To start, run `git clone git@github.com:learningmatter-mit/surface-sampling.git` to your local directory or a workstation. | ||
|
||
## Conda environment | ||
We recommend creating a new [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html) environment. Following that, the Python dependencies for the code can be installed. In the `surface-sampling` directory, run the following commands: | ||
```bash | ||
conda create -n vssr-mc python=3.11 | ||
conda activate vssr-mc | ||
conda install -c conda-forge kimpy lammps openkim-models | ||
pip install -e . | ||
``` | ||
> If you're intending to contribute to the code, you can `pip install -e '.[dev]'` instead to also install the development dependencies. | ||
To run with a neural network force field, a GPU is recommended. We ran on a single NVIDIA GeForce RTX 2080 Ti 11 GB GPU. | ||
To run with LAMMPS, add the following to `~/.bashrc` or equivalent with appropriate paths and then `source ~/.bashrc`. `conda` would have installed LAMMPS as a dependency. | ||
```bash | ||
export LAMMPS_COMMAND="/path/to/lammps/src/lmp" | ||
export LAMMPS_POTENTIALS="/path/to/lammps/potentials/" | ||
export ASE_LAMMPSRUN_COMMAND="$LAMMPS_COMMAND" | ||
``` | ||
|
||
## Software requirements | ||
The code has been tested up to commit `02820d339eed6291b6af6ccb809f154ad6244110` on the `master` branch. | ||
The `LAMMPS_COMMAND` should point to the LAMMPS executable, which can be found here: `/path/to/[vssr-mc-env]/bin/lmp`. | ||
The `LAMMPS_POTENTIALS` directory should contain the LAMMPS potential files, which can be found here: `/path/to/[surface-sampling-repo]/mcmc/potentials/`. | ||
The `ASE_LAMMPSRUN_COMMAND` should point to the same LAMMPS executable. More information can be found here: [ASE LAMMPS](https://wiki.fysik.dtu.dk/ase/ase/calculators/lammpsrun.html). | ||
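
Before launching a LAMMPS-backed run, it can help to confirm that the three variables above are set and consistent. The sketch below is one way to do that in Python; `check_lammps_env` is a hypothetical helper and not part of this repository, and it only inspects the environment without starting LAMMPS.

```python
import os
import shutil

REQUIRED_VARS = ("LAMMPS_COMMAND", "LAMMPS_POTENTIALS", "ASE_LAMMPSRUN_COMMAND")


def check_lammps_env(env=None):
    """Return a list of human-readable problems; an empty list means the setup looks sane."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]

    command = env.get("LAMMPS_COMMAND")
    if command and shutil.which(command) is None:
        # shutil.which accepts a full path and checks that it is an executable file.
        problems.append(f"LAMMPS_COMMAND is not an executable file: {command}")

    potentials = env.get("LAMMPS_POTENTIALS")
    if potentials and not os.path.isdir(potentials):
        problems.append(f"LAMMPS_POTENTIALS is not a directory: {potentials}")

    ase_command = env.get("ASE_LAMMPSRUN_COMMAND")
    if command and ase_command and ase_command != command:
        # The README states that both variables should point to the same executable.
        problems.append("ASE_LAMMPSRUN_COMMAND differs from LAMMPS_COMMAND")

    return problems


if __name__ == "__main__":
    for problem in check_lammps_env():
        print("warning:", problem)
```

Running the file in a fresh shell after `source ~/.bashrc` prints one warning per detected problem and nothing when all three variables resolve.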
|
||
### Operating system | ||
This package has been tested on *Linux* Ubuntu 20.04.6 LTS but we expect it to be agnostic to the *Linux* system version. | ||
If the `conda` installed LAMMPS does not work, you might have to install LAMMPS from source. More information can be found here: [LAMMPS](https://lammps.sandia.gov/doc/Build.html). | ||
|
||
### Conda environment | ||
[Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html) is required. Either Miniconda or Anaconda should be installed. | ||
You might have to re-open/re-login to your terminal shell for the new settings to take effect. | ||
|
||
Following that, the Python dependencies for the code can be installed with the following command | ||
# Demo | ||
A toy demo and other examples can be found in the `tutorials/` folder. | ||
``` | ||
conda env create -f environment.yml | ||
tutorials/ | ||
├── example.ipynb | ||
├── GaN_0001.ipynb | ||
├── Si_111_5x5.ipynb | ||
├── SrTiO3_001.ipynb | ||
├── latent_space_clustering.ipynb | ||
└── prepare_surface.ipynb | ||
``` | ||
More data/examples can be found in our [Zenodo dataset](https://doi.org/10.5281/zenodo.7758174). | ||
|
||
Installation might take 10-20 minutes to resolve dependencies. | ||
## Toy example of Cu(100) | ||
A toy example illustrating the use of VSSR-MC. It should take only a few seconds to run. Refer to `tutorials/example.ipynb`. | ||
|
||
### Additional software | ||
1. [LAMMPS](https://docs.lammps.org/Install.html) for classical force field optimization | ||
2. [NFF](https://github.com/learningmatter-mit/NeuralForceField) for neural network force field | ||
## GaN(0001) surface sampling with Tersoff potential | ||
This example could take a few minutes to run. Refer to `tutorials/GaN_0001.ipynb`. | ||
|
||
# Setup | ||
Assuming you have cloned our `surface-sampling` repo to `/path/to/surface-sampling`. | ||
## Si(111) 5x5 surface sampling with modified Stillinger–Weber potential | ||
This example could take a few minutes to run. Refer to `tutorials/Si_111_5x5.ipynb`. | ||
|
||
Add the following to `~/.bashrc` or equivalent with appropriate paths and then `source ~/.bashrc`. | ||
``` | ||
export SURFSAMPLINGDIR="/path/to/surface-sampling" | ||
export PYTHONPATH="$SURFSAMPLINGDIR:$PYTHONPATH" | ||
## SrTiO3(001) surface sampling with machine learning potential | ||
Demonstrates the integration of VSSR-MC with a neural network force field. This example could take a few minutes to run. Refer to `tutorials/SrTiO3_001.ipynb`. | ||
|
||
export LAMMPS_COMMAND="/path/to/lammps/src/lmp_serial" | ||
export LAMMPS_POTENTIALS="/path/to/lammps/potentials/" | ||
export ASE_LAMMPSRUN_COMMAND="$LAMMPS_COMMAND" | ||
## Clustering MC-sampled surfaces in the latent space | ||
Retrieves the neural network embeddings of VSSR-MC structures and performs clustering. This example should only take a minute to run. Refer to `tutorials/latent_space_clustering.ipynb`. | ||
|
||
export NFFDIR="/path/to/NeuralForceField" | ||
export PYTHONPATH=$NFFDIR:$PYTHONPATH | ||
``` | ||
## Preparing surface from a bulk structure | ||
This example demonstrates how to cut a surface from a bulk structure. Refer to `tutorials/prepare_surface.ipynb`. | ||
|
||
You might have to re-open/re-login to your shell for the new settings to take effect. | ||
|
||
# Demo | ||
# Scripts | ||
Scripts can be found in the `scripts/` folder, including: | ||
``` | ||
scripts/ | ||
├── sample_surface.py | ||
└── clustering.py | ||
``` | ||
|
||
A toy demo and other examples can be found in the `tutorials/` folder. More data/examples can be found in our Zenodo dataset (https://doi.org/10.5281/zenodo.7758174). | ||
The arguments for the scripts can be found by running `python scripts/sample_surface.py -h` or `python scripts/clustering.py -h`. | ||
|
||
## Example usage: | ||
### Original VSSR-MC with PaiNN model trained on SrTiO3(001) surfaces | ||
```bash | ||
python scripts/sample_surface.py --run_name "SrTiO3_001_painn" \ | ||
--starting_structure_path "tutorials/data/SrTiO3_001/SrTiO3_001_2x2_pristine_slab.pkl" \ | ||
--model_type "PaiNN" --model_paths "tutorials/data/SrTiO3_001/nff/model01/best_model" \ | ||
"tutorials/data/SrTiO3_001/nff/model02/best_model" \ | ||
"tutorials/data/SrTiO3_001/nff/model03/best_model" \ | ||
--settings_path "scripts/configs/sample_config_painn.json" | ||
``` | ||
|
||
### Toy example of Cu(100) | ||
A toy example to illustrate the use of VSSR-MC. It should only take about a minute to run. Refer to `tutorials/example.ipynb`. | ||
### Pre-trained "foundational" CHGNet model on SrTiO3(001) surfaces | ||
```bash | ||
python scripts/sample_surface.py --run_name "SrTiO3_001_chgnet" \ | ||
--starting_structure_path "tutorials/data/SrTiO3_001/SrTiO3_001_2x2_pristine_slab.pkl" \ | ||
--model_type "CHGNetNFF" --settings_path "scripts/configs/sample_config_chgnet.json" | ||
``` | ||
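
When several sampling runs are launched from Python rather than typed by hand, the command lines above can be assembled programmatically. The sketch below uses only the flags that appear in the two `sample_surface.py` examples; `build_sample_command` is a hypothetical helper, and the script may accept further options (see `python scripts/sample_surface.py -h`).

```python
import shlex


def build_sample_command(run_name, starting_structure_path, model_type,
                         settings_path, model_paths=()):
    """Assemble an argument list for scripts/sample_surface.py (flags taken from the README examples)."""
    cmd = [
        "python", "scripts/sample_surface.py",
        "--run_name", run_name,
        "--starting_structure_path", starting_structure_path,
        "--model_type", model_type,
    ]
    if model_paths:
        # The PaiNN example passes three paths after a single --model_paths flag.
        cmd += ["--model_paths", *model_paths]
    cmd += ["--settings_path", settings_path]
    return cmd


cmd = build_sample_command(
    run_name="SrTiO3_001_chgnet",
    starting_structure_path="tutorials/data/SrTiO3_001/SrTiO3_001_2x2_pristine_slab.pkl",
    model_type="CHGNetNFF",
    settings_path="scripts/configs/sample_config_chgnet.json",
)
print(shlex.join(cmd))  # shell-quoted form, suitable for logging
```

Passing the list to `subprocess.run(cmd, check=True)` from the repository root avoids shell quoting issues; building a list rather than a string keeps paths with spaces intact.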
|
||
### GaN(0001) surface sampling with Tersoff potential | ||
We explicitly generate surface sites using `pymatgen`. This example could take 5 minutes or more to run. Refer to `tutorials/GaN_0001.ipynb`. | ||
### Latent space clustering | ||
```bash | ||
python scripts/clustering.py --file_paths "tutorials/data/SrTiO3_001/SrTiO3_001_2x2_mcmc_structures_100.pkl" \ | ||
--save_folder "SrTiO3_001/clustering" --nff_model_type "PaiNN" \ | ||
--nff_paths "tutorials/data/SrTiO3_001/nff/model01/best_model" \ | ||
"tutorials/data/SrTiO3_001/nff/model02/best_model" \ | ||
"tutorials/data/SrTiO3_001/nff/model03/best_model" \ | ||
--clustering_metric "force_std" --cutoff_criterion "distance" \ | ||
--clustering_cutoff 0.2 --nff_device "cuda" | ||
``` | ||
|
||
### Si(111) 5x5 surface sampling with modified Stillinger–Weber potential | ||
We explicitly generate surface sites using `pymatgen`. This example could take 5 minutes or more to run. Refer to `tutorials/Si_111_5x5.ipynb`. | ||
|
||
### SrTiO3(001) surface sampling with machine learning potential | ||
Demonstrates the integration of VSSR-MC with a neural network force field. This example could take 10 minutes or more to run. Refer to `tutorials/SrTiO3_001.ipynb`. | ||
# Citation | ||
```bib | ||
@article{duMachinelearningacceleratedSimulationsEnable2023, | ||
title = {Machine-Learning-Accelerated Simulations to Enable Automatic Surface Reconstruction}, | ||
author = {Du, Xiaochen and Damewood, James K. and Lunger, Jaclyn R. and Millan, Reisel and Yildiz, Bilge and Li, Lin and {G{\'o}mez-Bombarelli}, Rafael}, | ||
year = {2023}, | ||
month = dec, | ||
journal = {Nature Computational Science}, | ||
pages = {1--11}, | ||
publisher = {Nature Publishing Group}, | ||
issn = {2662-8457}, | ||
doi = {10.1038/s43588-023-00571-7}, | ||
urldate = {2023-12-07}, | ||
keywords = {Computational methods,Computational science,Software,Surface chemistry} | ||
} | ||
``` | ||
|
||
### Clustering MC-sampled surfaces in the latent space | ||
Retrieving the neural network embeddings of VSSR-MC structures and performing clustering. This example should only take a minute to run. Refer to `tutorials/latent_space_clustering.ipynb`. | ||
# Development & Bugs | ||
VSSR-MC is under active development. If you encounter any bugs during installation or usage, | ||
please open an [issue](https://github.com/learningmatter-mit/surface-sampling/issues). We appreciate your contributions! |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
cff-version: 1.2.0 | ||
message: If you use this software, please cite it as below. | ||
title: Machine-Learning-Accelerated Simulations to Enable Automatic Surface Reconstruction | ||
authors: | ||
- family-names: Du | ||
given-names: Xiaochen | ||
- family-names: Damewood | ||
given-names: James K. | ||
- family-names: Lunger | ||
given-names: Jaclyn R. | ||
- family-names: Millan | ||
given-names: Reisel | ||
- family-names: Yildiz | ||
given-names: Bilge | ||
- family-names: Li | ||
given-names: Lin | ||
- family-names: Gómez-Bombarelli | ||
given-names: Rafael | ||
date-released: 2023-11-08 | ||
repository-code: https://github.com/learningmatter-mit/surface-sampling | ||
arxiv: https://arxiv.org/abs/2305.07251 | ||
doi: 10.1038/s43588-023-00571-7 | ||
type: software | ||
keywords: | ||
[monte carlo, neural network, force field, active learning] | ||
version: 0.1.0 # replace with the version you use | ||
journal: Nature Computational Science |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,28 +1,8 @@ | ||
name: surface_sampling | ||
name: | ||
- vssr-mc | ||
channels: | ||
- conda-forge | ||
- pytorch | ||
- nvidia | ||
- defaults | ||
dependencies: | ||
- flake8 | ||
- python=3.8 | ||
- pytorch=2.0 | ||
- pytorch-cuda=11.7 | ||
- matplotlib | ||
- numpy>=1.21.6,<=1.22.4 | ||
- pandas | ||
- pre-commit | ||
- pylint | ||
- ipykernel | ||
- notebook | ||
- ase | ||
- pymatgen=2023.5.10 | ||
- rdkit | ||
- e3fp | ||
- scikit-learn | ||
- lammps | ||
- kimpy | ||
- openkim-models | ||
- pip | ||
- pip: | ||
- git+https://github.com/SUNCAT-Center/CatKit.git |