Merge refactoring branch #2

Merged
merged 161 commits into from
Jul 18, 2024
161 commits
586b0cf
new SurfaceSystem class
xiaochendu Feb 6, 2024
8ddf04e
add custom surface energy calculators and refactor energy calculation
xiaochendu Feb 9, 2024
bcdf043
implement LAMMMPSCalc and partial integration into energy.py
xiaochendu Feb 11, 2024
a5774ce
add TrajectoryObserver (not fully integrated yet)
xiaochendu Feb 11, 2024
7b6a83a
comment out redundant code and bug fixes
xiaochendu Feb 13, 2024
3443db9
tidy up code and settings dicts
xiaochendu Feb 14, 2024
8afda37
algorithm sketches for acceptance criterion and event
xiaochendu Mar 8, 2024
7ce7c39
docstrings and offset units for Calculators
xiaochendu Mar 12, 2024
b3eb386
save relaxation trajectories
xiaochendu Mar 12, 2024
20b310e
fix initialize with occupation/states
xiaochendu Mar 12, 2024
4a9c055
fix tutorials sampling with LAMMPS template and tests
xiaochendu Mar 12, 2024
0ead273
active learning scripts in testing
xiaochendu Mar 17, 2024
5643dfb
BLD: add clustering and related utils, move to new utils folder
xiaochendu Mar 20, 2024
2f7268b
BLD: add LAMMPSRunSurfCalc for lammpsrun way of calling LAMMPS
xiaochendu Mar 29, 2024
c83c399
MAINT: standardize tutorial formats and test
xiaochendu Mar 29, 2024
f36c52c
BLD: change `get_atoms_batch` helper function
xiaochendu Mar 29, 2024
d9fbb97
MAINT: fix imports
xiaochendu Mar 29, 2024
882f77c
STY: remove commented out lines
xiaochendu Mar 29, 2024
67782d7
ENH & MAINT: refactor out some utility functions, standardize save_fo…
xiaochendu Mar 29, 2024
c480e23
BLD: add sampling script example
xiaochendu Mar 29, 2024
654edb2
MAINT: change tests resources folder to data
xiaochendu Mar 29, 2024
0e11f0c
BUG: fix removed import (my bad)
xiaochendu Mar 29, 2024
c3d3af2
MAINT: standardize arguments and fix save path typo
xiaochendu Apr 9, 2024
1bc125f
DOC: write docstrings for SurfaceSystem
xiaochendu Apr 19, 2024
c6946fe
MAINT: remove `require_large_offsets`
xiaochendu Apr 19, 2024
8582aca
MAINT: move htvs perturbation method to project
xiaochendu Apr 19, 2024
ed83df4
ENH: print full path of saved pkl
xiaochendu Apr 19, 2024
6746519
BLD: add general surface sampling script with PaiNN NFF
xiaochendu Apr 19, 2024
d5cc884
BUG: Fixed save path for mcmc calculations
Apr 26, 2024
acd9ad6
ENH: improve clustering speed by not using dense_nbrs
xiaochendu Apr 30, 2024
fe7b57e
MAINT: add final energy as output after ase relaxations
xiaochendu Apr 30, 2024
ea8cd18
ENH: moving energy calculation out of slab_energy
xiaochendu May 20, 2024
5470512
REV: remove forces placeholder
xiaochendu May 21, 2024
117dcf9
ENH: remove `slab_energy`
xiaochendu May 21, 2024
668e8b2
BUG: fix `SurfaceSystem.get_surface_energy`
xiaochendu May 21, 2024
3ddf996
TST: update tests
xiaochendu May 22, 2024
04db037
DOC: update tutorials
xiaochendu May 22, 2024
d24165f
TST: remove reference to personal directories
xiaochendu May 22, 2024
da046c3
MAINT & DOC: refactor `change_site`
xiaochendu May 23, 2024
85fd623
TST & DOC: update tests and tutorials
xiaochendu May 23, 2024
c32e8e8
ENH: change method of getting surface potential energies
xiaochendu May 24, 2024
3669bd9
ENH: add staticmethod to save calculator results on call
xiaochendu May 24, 2024
ffc4a28
MAINT & ENH: move semigrand sampling to separate Event, Proposal, and…
xiaochendu May 24, 2024
36413fe
TST: wrote a lot more tests for various critical parts of the code
xiaochendu May 26, 2024
f9232a1
DOC & MAINT: improve docstrings and format, adding some checks
xiaochendu May 26, 2024
5a035c2
BUG: SurfaceSystem surface energy check and really allow no calc to b…
xiaochendu May 26, 2024
27e0b1a
BLD: initial Pourbaix sampling commit
xiaochendu May 26, 2024
45afc88
BUG: use pymatgen data for consistency and change species concentrations
xiaochendu Jun 5, 2024
1e72201
DOC & BUG: update description to be more precise
xiaochendu Jun 5, 2024
8f81acb
MAINT: minor changes in kwargs for calc_settings
xiaochendu Jun 6, 2024
b2fb97b
FEAT: Added Surface Depth to SurfaceSystem, merged with master
Jun 7, 2024
bc0a9f1
TEST: added tests for SurfaceSystem's surface_depth argument
Jun 14, 2024
f56c492
ENH & DOC: add SwitchProposal and update docstrings
xiaochendu Jun 15, 2024
80f24ad
ENH & MAINT: add Exchange Event and move acceptance method to base class
xiaochendu Jun 15, 2024
1fe9f91
ENH & MAINT: add backward method and minor docstring reformatting
xiaochendu Jun 15, 2024
6af7d12
DOC & MAINT: add docstring to SwitchProposal, minor debugging and var…
xiaochendu Jun 15, 2024
692efef
BUG & MAINT: fix missing imports and minor variable changes
xiaochendu Jun 15, 2024
2ed9e4e
DEP: deprecate rmsd_criterion
xiaochendu Jun 15, 2024
7bb3a29
ENH: refactor change_site_canonical to new SwitchProposal, Exchange E…
xiaochendu Jun 15, 2024
ccf7a9c
BUG: fix backward in Event
xiaochendu Jun 17, 2024
587b583
TST: add test_change_backward to test_event
xiaochendu Jun 17, 2024
52ca925
DOC & MAINT: added docstring and type hints for slab.py and minor ref…
xiaochendu Jun 17, 2024
fd24079
ENH & MAINT: Generalize Pourbaix sampling to different domains
xiaochendu Jun 19, 2024
c3222ce
TST: added tests for PourbaixAtom generation
xiaochendu Jun 19, 2024
e057719
TST: add tests for Exchange Event and SwitchProposal
xiaochendu Jun 19, 2024
676ea47
MAINT: improve save folder naming and run arguments
xiaochendu Jun 20, 2024
72168ec
MAINT: improvements to constraints and specifying/generating adsorpti…
xiaochendu Jun 20, 2024
83c3c44
MAINT: surface layer adsorbates are included in adsorption sites with…
xiaochendu Jun 20, 2024
69aff29
STY: minor style changes
xiaochendu Jun 20, 2024
445d9af
MAINT: update SurfaceSystem intialization arguments
xiaochendu Jun 20, 2024
c4dbf20
BUG: fix self.num_pristine_atoms initialization in SurfaceSystem
xiaochendu Jun 20, 2024
930af4c
MAINT: refactor get_complementary_idx in slab.py
xiaochendu Jun 20, 2024
eb3fbd1
MAINT: distance_weight_matrix moved into SurfaceSystem
xiaochendu Jun 20, 2024
702f47c
TST: add tests for test_get_complementary_idx and helper function for…
xiaochendu Jun 20, 2024
65243a0
PERF: Changed calculator behavior for SurfaceSystem test cases
Jun 21, 2024
4e4a2bb
Merge branch 'master' into edaloz_surfsamp
xiaochendu Jun 22, 2024
0968cd6
Merge pull request #2 from MLMat/edaloz_surfsamp
xiaochendu Jun 22, 2024
da6f3b1
DOC: add/correct docstrings for event.py and misc.py
xiaochendu Jun 22, 2024
a44fb9b
MAINT: per-atom energies are directly taken from calculator inside co…
xiaochendu Jun 22, 2024
8f73cda
MAINT: update Si 111 tutorial with distance decay and per atom energies
xiaochendu Jun 22, 2024
d8c28f0
Merge remote-tracking branch 'origin/master' into refactoring
xiaochendu Jun 22, 2024
d01bd47
TST: update tests and tutorials with new SurfaceSystem constraints
xiaochendu Jun 22, 2024
1fed72a
MAINT: remove large swaths of commented out sections in mcmc.py
xiaochendu Jun 22, 2024
dddd7cd
MAINT: move save structure function after each MC sweep and after ene…
xiaochendu Jun 23, 2024
95885d1
STY: ruff format test_system
xiaochendu Jun 23, 2024
fc578c6
TST: update SurfaceSystem save and restore with atomic positions
xiaochendu Jun 23, 2024
74aa62c
ENH: add ads_pos_type as a system_settings parameter
xiaochendu Jun 24, 2024
cd73143
DOC: reformatted system docstring from numpy to google style
xiaochendu Jun 24, 2024
38959c3
MAINT: added refactoring TODOs
xiaochendu Jun 24, 2024
db4a6e3
MAINT: create setup.py
xiaochendu Jun 25, 2024
3ea6ef3
Merge branch 'refactoring' into pourbaix
xiaochendu Jun 26, 2024
40cad7a
MAINT: temporarily put back self.trajectories in MCMC object
xiaochendu Jun 27, 2024
9b19369
BUG: fix test_system.py surface_system fixture after merging with ref…
xiaochendu Jun 27, 2024
47f7828
MAINT: update sample_pourbaix_surface.py
xiaochendu Jun 27, 2024
580b5fa
DEP: deprecate get_adsorption_coords and remove it
xiaochendu Jun 27, 2024
9f290c3
MAINT & DOC: move create_anneal_schedule to `utils/sampling.py`
xiaochendu Jun 28, 2024
5d6937f
TST: update tutorials and "tests"
xiaochendu Jun 28, 2024
9452507
ENH: add additional argument to `sample_pourbaix_surface.py`
xiaochendu Jun 28, 2024
55fb1ab
TST: update tests to follow new fix atoms Constraints
xiaochendu Jun 28, 2024
fee856f
BUG: update .gitignore to include missing test files and upload them
xiaochendu Jun 28, 2024
0140f46
Merge branch 'master' into refactoring
xiaochendu Jun 28, 2024
19b5d97
MAINT: consolidate MCMC and SurfaceSystem arguments
xiaochendu Jun 28, 2024
5467980
TST: improve MCMC sampling tests
xiaochendu Jun 28, 2024
659ae91
TST: remove GaN and Si "tests", update SrTiO3 tests to follow new Sur…
xiaochendu Jun 28, 2024
acf935b
MAINT: update gitignore to include .log and .csv files
xiaochendu Jun 28, 2024
db91c2f
MAINT: move temperature setup, prepare_canonical to initialize(), ren…
xiaochendu Jun 28, 2024
6f223af
MAINT: bring MCMC results outside of object and move plot to outside …
xiaochendu Jun 28, 2024
c9de4be
STY & BLD: update pre-commit-config and initial pyproject.toml
xiaochendu Jun 29, 2024
48f8ba3
MAINT & DOC: refactored SurfaceSystem and revamped copy and todict + …
xiaochendu Jun 29, 2024
4f85363
TST & DOC: update tests and tutorial docs based on the previous commit
xiaochendu Jun 29, 2024
6832170
TST & DOC: update tests and tutorial docs based on the previous commit
xiaochendu Jun 29, 2024
9b0b3f1
Merge branch 'refactoring' of github.mit.edu:MLMat/surface_sampling i…
xiaochendu Jun 29, 2024
47be1c8
MAINT: remove saving and evaluating curr_energy in MCMC
xiaochendu Jun 29, 2024
59b95c0
TST & DOC: update tests and tutorials since self.relax is no longer p…
xiaochendu Jun 29, 2024
5534bef
ENH & MAINT & STY: implement get_logger
xiaochendu Jun 30, 2024
28596e2
MAINT & DOC: renamed methods and moved functions/classes into more se…
xiaochendu Jun 30, 2024
99b0916
DEV: update gitignore
xiaochendu Jul 1, 2024
9da39c8
DEV: update gitignore with more comprehensive list
xiaochendu Jul 1, 2024
842d822
MAINT: add class loggers for remainining Calculators
xiaochendu Jul 1, 2024
1d805f1
MAINT: make method signature clearer
xiaochendu Jul 1, 2024
b48ace7
ENH: make setup_logger clear existing loggers
xiaochendu Jul 1, 2024
09d0e8c
ENH: generalize `sample_surface_PaiNN.py` to `sample_surface.py`
xiaochendu Jul 1, 2024
7c8712a
DEV: update gitignore with more comprehensive list
xiaochendu Jul 2, 2024
d282be3
MAINT: rename mcmc.mcmc_run to mcmc.run
xiaochendu Jul 2, 2024
42b695f
ENH: add save statistics and more elegant update default settings to …
xiaochendu Jul 2, 2024
9273044
ENH: add HydrogenPourbaixAtom to pourbaix atoms
xiaochendu Jul 3, 2024
233d8b0
Merge branch 'master' into pourbaix
xiaochendu Jul 3, 2024
6b0b408
Merge branch 'refactoring' into pourbaix
xiaochendu Jul 3, 2024
4888445
MAINT: update methods and classes with new `generate_pourbaix_atoms` …
xiaochendu Jul 3, 2024
e5a0968
ENH: further integration of script helper/plotting functions into utils
xiaochendu Jul 3, 2024
2198f71
MAINT: update `clustering.py` with code refactor
xiaochendu Jul 3, 2024
aaa43c0
ENH: add `create_surface_formation_entries.py` script
xiaochendu Jul 3, 2024
58fea50
MAINT: refactor data/resource paths and merge changes from `pourbaix`…
xiaochendu Jul 6, 2024
41f1402
MAINT & DOC: revamp scripts to follow new standard
xiaochendu Jul 6, 2024
d1eae4b
BLD: Update dependencies and environment configuration
xiaochendu Jul 6, 2024
6fe5edf
BUG: recommit gitignored tutorials data files
xiaochendu Jul 7, 2024
b4ac6f6
TST: update data path for test_filter_distance
xiaochendu Jul 7, 2024
f74eb63
MAINT: run tutorials notebooks (after commiting from offline)
xiaochendu Jul 7, 2024
b5ab922
MAINT & BUG: refactor util functions out of scripts and squash minor …
xiaochendu Jul 7, 2024
f61a389
ENH: add tutorial to cut surfaces
xiaochendu Jul 7, 2024
eaeb59a
DOC: update README and add cover image
xiaochendu Jul 7, 2024
cee8cbe
BUG: minor bugs fixed
xiaochendu Jul 7, 2024
a161dcf
BLD: update dependencies in `pyproject.toml`
xiaochendu Jul 7, 2024
17080da
DOC: update `README.md` for internal testing
xiaochendu Jul 7, 2024
68a9ba0
MAINT & BUG: update latent space clustering script and tutorial
xiaochendu Jul 7, 2024
98396b3
MAINT: remove legend for `plot_dendrogram`
xiaochendu Jul 7, 2024
784a3a6
MAINT: remove `sample_bulk.py` for publishing
xiaochendu Jul 8, 2024
b6eb939
DOC: correction to README.md
xiaochendu Jul 8, 2024
ac73171
MAINT: remove line from SrTiO3 tutorial
xiaochendu Jul 8, 2024
cec57ad
MAINT: pass logger to helper functions in `clustering.py`
xiaochendu Jul 9, 2024
1aa6119
DOC: update tutorials and files
xiaochendu Jul 9, 2024
cef5786
DOC: update latent space clustering in `README` with ensemble PaiNN NFF
xiaochendu Jul 9, 2024
b840097
DOC: Add `prepare_surface.ipynb` to `README`
xiaochendu Jul 9, 2024
ae45bdd
BLD: update environment files
xiaochendu Jul 10, 2024
677f6b5
MAINT: fix docs and minor changes
xiaochendu Jul 10, 2024
c30770f
MAINT: add correct LAMMPS potentials to repo and correct paths
xiaochendu Jul 10, 2024
0ee0dda
BUG: add `colors.txt` for `plot_settings.py`
xiaochendu Jul 18, 2024
ad50039
BLD: add `scipy` to dependency list
xiaochendu Jul 18, 2024
820ef89
MAINT: update `README` with correct badges and remove dev sections
xiaochendu Jul 18, 2024
1d7c049
BLD: add nff to dependencies
xiaochendu Jul 18, 2024
cdb65fa
MAINT: minor updates to tutorials
xiaochendu Jul 18, 2024
95 changes: 85 additions & 10 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,18 +1,93 @@
# Compiled source #
###################
*.com
*.class
*.dll
*.exe
*.o
*.pyc
*.so

# Packages #
############
# it's better to unpack these files and commit the raw source
# git has its own built in compression methods
*.7z
*.dmg
*.gz
*.iso
*.jar
*.rar
#*.tar
*.zip

# Logs and databases #
######################
*.log
*.sql
*.sqlite

# OS generated files #
######################
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db

# python/jupyter generated files
.ipynb_checkpoints
__pycache__
*.egg-info
.vscode
.pytest_cache
*runs*
*__pycache__*
*.pyc

# data files #
*.dat
*.data
*.xyz
*.pdb
*.csv
*.pkl
*.txt
*.mpg
*.traj
*.pickle
*.cif
*.in
*.out
*.data
*.png
*.lammps
__pycache__/
*tmp_files/*
*.png

# slurm output files
slurm*.out

# directory
log/
debug/
sandbox*/
backup
dist/
sandbox_excited/
build/
*runs*
tmp_files/

# test files should still be committed
!tests/data/*/*.pkl
!tests/data/*/*.cif

# tutorial files should still be committed
!tutorials/data/*/*.pkl
!tutorials/data/*/*.cif
!tutorials/data/*/*.txt
!tutorials/*/*.txt
!tutorials/data/*/**/*.csv

# static files should still be committed
!site/static/**.png

# test file should still be committed
!tests/resources/*.pkl
!tests/resources/*.cif
# util files should still be committed
!mcmc/utils/data/colors.txt
2 changes: 1 addition & 1 deletion .isort.cfg
Original file line number Diff line number Diff line change
@@ -1,2 +1,2 @@
[settings]
known_third_party =ase,catkit,lammps,matplotlib,nff,numpy,pytest,scipy
known_third_party =ase,catkit,lammps,matplotlib,monty,nff,numpy,pandas,pymatgen,pytest,scipy,seaborn,sklearn,torch,tqdm,typing_extensions
26 changes: 9 additions & 17 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -8,21 +8,13 @@ repos:
- id: end-of-file-fixer
- id: check-yaml
- id: check-added-large-files
- repo: https://github.com/asottile/seed-isort-config
rev: v2.2.0
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.3.2
hooks:
- id: seed-isort-config
- repo: https://github.com/pre-commit/mirrors-isort
rev: v5.10.1
hooks:
- id: isort
args: ["--profile", "black"]
- repo: https://github.com/ambv/black
rev: 22.3.0
hooks:
- id: black
language_version: python3.8
# - repo: https://github.com/PyCQA/flake8
# rev: 4.0.1
# hooks:
# - id: flake8
- id: ruff
types_or: [ python, pyi, jupyter ]
args: [ --fix ]
exclude: migrations/
- id: ruff-format
types_or: [ python, pyi, jupyter ]
exclude: migrations/
159 changes: 109 additions & 50 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,89 +1,148 @@
# Virtual Surface Site Relaxation-Monte Carlo (VSSR-MC)
[![Tests](https://github.com/learningmatter-mit/surface-sampling/actions/workflows/tests.yml/badge.svg)](https://github.com/learningmatter-mit/surface-sampling/actions/workflows/tests.yml)
[![arXiv](https://img.shields.io/badge/arXiv-2305.07251-blue?logo=arXiv&logoColor=white&logoSize=auto)](https://arxiv.org/abs/2305.07251)
[![Zenodo](https://img.shields.io/badge/data-10.5281/zenodo.7758174-14b8a6?logo=zenodo&logoColor=white&logoSize=auto)](https://zenodo.org/doi/10.5281/zenodo.7758174)

## Contents
- [Overview](#overview)
- [System requirements](#system-requirements)
- [Setup](#setup)
- [Demo](#demo)
- [Scripts](#scripts)
- [Citation](#citation)
- [Development & Bugs](#development--bugs)


# Overview
This is the VSSR-MC algorithm for sampling surface reconstructions. VSSR-MC samples across both compositional and configurational spaces. It can interface with either a neural network potential (through [ASE](https://wiki.fysik.dtu.dk/ase/)) or a classical potential (through ASE or [LAMMPS](https://www.lammps.org/)). It is a key component of the Automatic Surface Reconstruction (AutoSurfRecon) pipeline described in the following work: [Machine-learning-accelerated simulations to enable automatic surface reconstruction](https://doi.org/10.1038/s43588-023-00571-7).
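
As a rough illustration of what sampling across compositional space involves, the sketch below shows a generic Metropolis acceptance test for a move that changes both the energy and the atom counts. It is not taken from this repository: the function names, the semi-grand form of the potential (energy change minus chemical potential times the change in atom count), and the units (eV, K) are assumptions made for this example.

```python
import math
import random

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def acceptance_probability(delta_energy, delta_counts, chem_pots, temperature):
    """Metropolis acceptance probability for a composition-changing move.

    delta_energy: change in potential energy (eV) caused by the proposed move.
    delta_counts: mapping of element symbol -> change in number of atoms.
    chem_pots:    mapping of element symbol -> chemical potential (eV).
    temperature:  sampling temperature (K).

    All names here are hypothetical and chosen for illustration only.
    """
    # Change in the semi-grand potential: dE - sum_i mu_i * dN_i
    delta_omega = delta_energy - sum(
        chem_pots[element] * dn for element, dn in delta_counts.items()
    )
    if delta_omega <= 0:
        return 1.0  # downhill (or neutral) moves are always accepted
    return math.exp(-delta_omega / (BOLTZMANN_EV_PER_K * temperature))


def accept_move(delta_energy, delta_counts, chem_pots, temperature, rng=random):
    """Draw a uniform random number and apply the Metropolis test."""
    probability = acceptance_probability(
        delta_energy, delta_counts, chem_pots, temperature
    )
    return rng.random() < probability
```

A move that leaves the composition unchanged can pass an empty `delta_counts`, in which case the test reduces to the usual canonical Metropolis criterion.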

This is the VSSR-MC algorithm for sampling surface reconstructions. VSSR-MC samples across both compositional and configurational spaces. It can interface with either a neural network potential (through ASE) or a classical potential (through ASE or LAMMPS). It is a key component of the Automatic Surface Reconstruction (AutoSurfRecon) pipeline described in the following work:

"Machine-learning-accelerated simulations to enable automatic surface reconstruction", by X. Du, J.K. Damewood, J.R. Lunger, R. Millan, B. Yildiz, L. Li, and R. Gómez-Bombarelli. https://doi.org/10.1038/s43588-023-00571-7

Please cite us if you find this work useful. Let us know in `issues` if you encounter any problems or have any questions.

To start, run `git clone git@github.com:learningmatter-mit/surface-sampling.git` to your local directory or a workstation.

Read through the following in order before running our code.
![Cover image](site/static/vssr_cover_image.png)

# System requirements

## Hardware requirements
We recommend a computer with the following specs:

- RAM: 16+ GB
- CPU: 4+ cores, 3 GHz/core

We tested out the code on machines with 6+ CPU cores @ 3.0+ GHz/core with 64+ GB of RAM.
To run with a neural network force field, a GPU is recommended. We ran on a single NVIDIA GeForce RTX 2080 Ti 11 GB GPU. The code has been tested on *Linux* Ubuntu 20.04.6 LTS but we expect it to work on other *Linux* distributions.

# Setup
To start, run `git clone git@github.com:learningmatter-mit/surface-sampling.git` to your local directory or a workstation.

## Conda environment
We recommend creating a new [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html) environment. Following that, the Python dependencies for the code can be installed. In the `surface-sampling` directory, run the following commands:
```bash
conda create -n vssr-mc python=3.11
conda activate vssr-mc
conda install -c conda-forge kimpy lammps openkim-models
pip install -e .
```
> If you're intending to contribute to the code, you can `pip install -e '.[dev]'` instead to also install the development dependencies.

To run with a neural network force field, a GPU is recommended. We ran on a single NVIDIA GeForce RTX 2080 Ti 11 GB GPU.
To run with LAMMPS, add the following to `~/.bashrc` or equivalent with appropriate paths and then `source ~/.bashrc`. `conda` would have installed LAMMPS as a dependency.
```bash
export LAMMPS_COMMAND="/path/to/lammps/src/lmp"
export LAMMPS_POTENTIALS="/path/to/lammps/potentials/"
export ASE_LAMMPSRUN_COMMAND="$LAMMPS_COMMAND"
```

## Software requirements
The code has been tested up to commit `02820d339eed6291b6af6ccb809f154ad6244110` on the `master` branch.
The `LAMMPS_COMMAND` should point to the LAMMPS executable, which can be found here: `/path/to/[vssr-mc-env]/bin/lmp`.
The `LAMMPS_POTENTIALS` directory should contain the LAMMPS potential files, which can be found here: `/path/to/[surface-sampling-repo]/mcmc/potentials/`.
The `ASE_LAMMPSRUN_COMMAND` should point to the same LAMMPS executable. More information can be found here: [ASE LAMMPS](https://wiki.fysik.dtu.dk/ase/ase/calculators/lammpsrun.html).
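
The three settings above can be sanity-checked before launching a run. The following is a minimal sketch, not part of the repository: `check_lammps_env` is a hypothetical helper that only verifies that the variables are set, that `LAMMPS_COMMAND` resolves to an executable, that `LAMMPS_POTENTIALS` is a directory, and that both command variables agree.

```python
import os
import shutil
from pathlib import Path

REQUIRED_VARS = ("LAMMPS_COMMAND", "LAMMPS_POTENTIALS", "ASE_LAMMPSRUN_COMMAND")


def check_lammps_env(env=None):
    """Return a list of problems found in the LAMMPS-related settings.

    `env` defaults to the current process environment; passing a plain dict
    makes the function easy to try out without touching the real shell.
    """
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]

    command = env.get("LAMMPS_COMMAND")
    if command and shutil.which(command) is None:
        problems.append(f"LAMMPS_COMMAND is not an executable: {command}")

    potentials = env.get("LAMMPS_POTENTIALS")
    if potentials and not Path(potentials).is_dir():
        problems.append(f"LAMMPS_POTENTIALS is not a directory: {potentials}")

    ase_command = env.get("ASE_LAMMPSRUN_COMMAND")
    if command and ase_command and command != ase_command:
        problems.append("ASE_LAMMPSRUN_COMMAND differs from LAMMPS_COMMAND")

    return problems


if __name__ == "__main__":
    for problem in check_lammps_env():
        print(problem)
```

An empty list means the basic checks passed; it does not guarantee that LAMMPS itself was built with the packages a given potential needs.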

### Operating system
This package has been tested on *Linux* Ubuntu 20.04.6 LTS but we expect it to be agnostic to the *Linux* system version.
If the `conda` installed LAMMPS does not work, you might have to install LAMMPS from source. More information can be found here: [LAMMPS](https://lammps.sandia.gov/doc/Build.html).

### Conda environment
[Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html) is required. Either Miniconda or Anaconda should be installed.
You might have to re-open/re-login to your terminal shell for the new settings to take effect.

Following that, the Python dependencies for the code can be installed with the following command
# Demo
A toy demo and other examples can be found in the `tutorials/` folder.
```
conda env create -f environment.yml
tutorials/
├── example.ipynb
├── GaN_0001.ipynb
├── Si_111_5x5.ipynb
├── SrTiO3_001.ipynb
├── latent_space_clustering.ipynb
└── prepare_surface.ipynb
```
More data/examples can be found in our [Zenodo dataset](https://doi.org/10.5281/zenodo.7758174).

Installation might take 10-20 minutes to resolve dependencies.
## Toy example of Cu(100)
A toy example to illustrate the use of VSSR-MC. It should take only a few seconds to run. Refer to `tutorials/example.ipynb`.

### Additional software
1. [LAMMPS](https://docs.lammps.org/Install.html) for classical force field optimization
2. [NFF](https://github.com/learningmatter-mit/NeuralForceField) for neural network force field
## GaN(0001) surface sampling with Tersoff potential
This example could take a few minutes to run. Refer to `tutorials/GaN_0001.ipynb`.

# Setup
Assuming you have cloned our `surface-sampling` repo to `/path/to/surface-sampling`.
## Si(111) 5x5 surface sampling with modified Stillinger–Weber potential
This example could take a few minutes to run. Refer to `tutorials/Si_111_5x5.ipynb`.

Add the following to `~/.bashrc` or equivalent with appropriate paths and then `source ~/.bashrc`.
```
export SURFSAMPLINGDIR="/path/to/surface-sampling"
export PYTHONPATH="$SURFSAMPLINGDIR:$PYTHONPATH"
## SrTiO3(001) surface sampling with machine learning potential
Demonstrates the integration of VSSR-MC with a neural network force field. This example could take a few minutes to run. Refer to `tutorials/SrTiO3_001.ipynb`.

export LAMMPS_COMMAND="/path/to/lammps/src/lmp_serial"
export LAMMPS_POTENTIALS="/path/to/lammps/potentials/"
export ASE_LAMMPSRUN_COMMAND="$LAMMPS_COMMAND"
## Clustering MC-sampled surfaces in the latent space
Retrieves the neural network embeddings of VSSR-MC structures and performs clustering. This example should only take a minute to run. Refer to `tutorials/latent_space_clustering.ipynb`.

export NFFDIR="/path/to/NeuralForceField"
export PYTHONPATH=$NFFDIR:$PYTHONPATH
```
## Preparing surface from a bulk structure
This example demonstrates how to cut a surface from a bulk structure. Refer to `tutorials/prepare_surface.ipynb`.

You might have to re-open/re-login to your shell for the new settings to take effect.

# Demo
# Scripts
Scripts can be found in the `scripts/` folder, including:
```
scripts/
├── sample_surface.py
└── clustering.py
```

A toy demo and other examples can be found in the `tutorials/` folder. More data/examples can be found in our Zenodo dataset (https://doi.org/10.5281/zenodo.7758174).
The arguments for the scripts can be found by running `python scripts/sample_surface.py -h` or `python scripts/clustering.py -h`.

## Example usage:
### Original VSSR-MC with PaiNN model trained on SrTiO3(001) surfaces
```bash
python scripts/sample_surface.py --run_name "SrTiO3_001_painn" \
--starting_structure_path "tutorials/data/SrTiO3_001/SrTiO3_001_2x2_pristine_slab.pkl" \
--model_type "PaiNN" --model_paths "tutorials/data/SrTiO3_001/nff/model01/best_model" \
"tutorials/data/SrTiO3_001/nff/model02/best_model" \
"tutorials/data/SrTiO3_001/nff/model03/best_model" \
--settings_path "scripts/configs/sample_config_painn.json"
```
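
When the same command has to be launched for several runs, it can help to assemble it programmatically. The sketch below is only an illustration: `build_sample_surface_command` is a hypothetical helper, and it uses only the flags that appear in the example above (`--run_name`, `--starting_structure_path`, `--model_type`, `--model_paths`, `--settings_path`).

```python
import shlex


def build_sample_surface_command(
    run_name, starting_structure_path, model_type, settings_path, model_paths=()
):
    """Assemble the argument list for scripts/sample_surface.py.

    Only flags shown in the README example are used; `model_paths` may be
    empty for model types that do not need explicit checkpoint paths.
    """
    command = [
        "python",
        "scripts/sample_surface.py",
        "--run_name", run_name,
        "--starting_structure_path", starting_structure_path,
        "--model_type", model_type,
    ]
    if model_paths:
        # One flag followed by all paths, as in the README example
        command += ["--model_paths", *model_paths]
    command += ["--settings_path", settings_path]
    return command


cmd = build_sample_surface_command(
    run_name="SrTiO3_001_painn",
    starting_structure_path="tutorials/data/SrTiO3_001/SrTiO3_001_2x2_pristine_slab.pkl",
    model_type="PaiNN",
    settings_path="scripts/configs/sample_config_painn.json",
    model_paths=[
        f"tutorials/data/SrTiO3_001/nff/model0{i}/best_model" for i in (1, 2, 3)
    ],
)
print(shlex.join(cmd))  # shell-safe rendering of the command
```

The resulting list can be passed directly to `subprocess.run(cmd, check=True)` from the repository root.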

### Toy example of Cu(100)
A toy example to illustrate the use of VSSR-MC. It should only take about a minute to run. Refer to `tutorials/example.ipynb`.
### Pre-trained "foundational" CHGNet model on SrTiO3(001) surfaces
```bash
python scripts/sample_surface.py --run_name "SrTiO3_001_chgnet" \
--starting_structure_path "tutorials/data/SrTiO3_001/SrTiO3_001_2x2_pristine_slab.pkl" \
--model_type "CHGNetNFF" --settings_path "scripts/configs/sample_config_chgnet.json"
```

### GaN(0001) surface sampling with Tersoff potential
We explicitly generate surface sites using `pymatgen`. This example could take 5 minutes or more to run. Refer to `tutorials/GaN_0001.ipynb`.
### Latent space clustering
```bash
python scripts/clustering.py --file_paths "tutorials/data/SrTiO3_001/SrTiO3_001_2x2_mcmc_structures_100.pkl" \
--save_folder "SrTiO3_001/clustering" --nff_model_type "PaiNN" \
--nff_paths "tutorials/data/SrTiO3_001/nff/model01/best_model" \
"tutorials/data/SrTiO3_001/nff/model02/best_model" \
"tutorials/data/SrTiO3_001/nff/model03/best_model" \
--clustering_metric "force_std" --cutoff_criterion "distance" \
--clustering_cutoff 0.2 --nff_device "cuda"
```
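
To give a feel for what a `distance` cutoff criterion with a cutoff of 0.2 means, here is a small standard-library sketch of distance-cutoff clustering. It is an assumption-laden illustration, not the method used by `scripts/clustering.py`: it assumes single linkage and Euclidean distance, and `cluster_by_distance` is a hypothetical name. Under single linkage, two points share a cluster whenever they are connected by a chain of pairs no farther apart than the cutoff.

```python
import math


def cluster_by_distance(points, cutoff):
    """Single-linkage clustering with a distance cutoff.

    Returns one integer label per point; labels are assigned in order of
    first appearance, starting from 0.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of points that lies within the cutoff
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= cutoff:
                parent[find(i)] = find(j)

    # Map each union-find root to a compact cluster label
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


embeddings = [(0.0, 0.0), (0.1, 0.0), (0.25, 0.0), (1.0, 1.0)]
print(cluster_by_distance(embeddings, cutoff=0.2))
```

The first three points end up in one cluster even though the first and third are 0.25 apart, because each is within 0.2 of the middle point. Other linkage choices would split them differently.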

### Si(111) 5x5 surface sampling with modified Stillinger–Weber potential
We explicitly generate surface sites using `pymatgen`. This example could take 5 minutes or more to run. Refer to `tutorials/Si_111_5x5.ipynb`.

### SrTiO3(001) surface sampling with machine learning potential
Demonstrates the integration of VSSR-MC with a neural network force field. This example could take 10 minutes or more to run. Refer to `tutorials/SrTiO3_001.ipynb`.
# Citation
```bib
@article{duMachinelearningacceleratedSimulationsEnable2023,
title = {Machine-Learning-Accelerated Simulations to Enable Automatic Surface Reconstruction},
author = {Du, Xiaochen and Damewood, James K. and Lunger, Jaclyn R. and Millan, Reisel and Yildiz, Bilge and Li, Lin and {G{\'o}mez-Bombarelli}, Rafael},
year = {2023},
month = dec,
journal = {Nature Computational Science},
pages = {1--11},
publisher = {Nature Publishing Group},
issn = {2662-8457},
doi = {10.1038/s43588-023-00571-7},
urldate = {2023-12-07},
keywords = {Computational methods,Computational science,Software,Surface chemistry}
}
```

### Clustering MC-sampled surfaces in the latent space
Retrieving the neural network embeddings of VSSR-MC structures and performing clustering. This example should only take a minute to run. Refer to `tutorials/latent_space_clustering.ipynb`.
# Development & Bugs
VSSR-MC is under active development. If you encounter any bugs during installation or usage,
please open an [issue](https://github.com/learningmatter-mit/surface-sampling/issues). We appreciate your contributions!
27 changes: 27 additions & 0 deletions citation.cff
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
cff-version: 1.2.0
message: If you use this software, please cite it as below.
title: Machine-Learning-Accelerated Simulations to Enable Automatic Surface Reconstruction
authors:
  - family-names: Du
    given-names: Xiaochen
  - family-names: Damewood
    given-names: James K.
  - family-names: Lunger
    given-names: Jaclyn R.
  - family-names: Millan
    given-names: Reisel
  - family-names: Yildiz
    given-names: Bilge
  - family-names: Li
    given-names: Lin
  - family-names: "Gómez-Bombarelli"
    given-names: Rafael
date-released: 2023-11-08
repository-code: https://github.com/learningmatter-mit/surface-sampling
arxiv: https://arxiv.org/abs/2305.07251
doi: 10.1038/s43588-023-00571-7
type: software
keywords:
  [monte carlo, neural network, force field, active learning]
version: 0.1.0 # replace with the version you use
journal: Nature Computational Science
26 changes: 3 additions & 23 deletions environment.yml
Original file line number Diff line number Diff line change
@@ -1,28 +1,8 @@
name: surface_sampling
name:
- vssr-mc
channels:
- conda-forge
- pytorch
- nvidia
- defaults
dependencies:
- flake8
- python=3.8
- pytorch=2.0
- pytorch-cuda=11.7
- matplotlib
- numpy>=1.21.6,<=1.22.4
- pandas
- pre-commit
- pylint
- ipykernel
- notebook
- ase
- pymatgen=2023.5.10
- rdkit
- e3fp
- scikit-learn
- lammps
- kimpy
- openkim-models
- pip
- pip:
- git+https://github.com/SUNCAT-Center/CatKit.git