
Commit

deploy: 58f2726
lbluque committed May 16, 2024
1 parent c2cc719 commit 596d804
Showing 279 changed files with 5,688 additions and 2,840 deletions.
109 changes: 53 additions & 56 deletions _downloads/5fdddbed2260616231dbf7b0d94bb665/train.txt

Large diffs are not rendered by default.

53 changes: 25 additions & 28 deletions _downloads/819e10305ddd6839cd7da05935b17060/mass-inference.txt
@@ -1,17 +1,16 @@
2024-05-15 22:09:46 (INFO): Project root: /home/runner/work/fairchem/fairchem/src/fairchem
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch/cuda/amp/grad_scaler.py:126: UserWarning: torch.cuda.amp.GradScaler is enabled, but CUDA is not available. Disabling.
warnings.warn(
2024-05-15 22:09:47 (WARNING): Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.
2024-05-15 22:09:47 (INFO): amp: true
2024-05-16 01:16:48 (INFO): Project root: /home/runner/work/fairchem/fairchem/src/fairchem
2024-05-16 01:16:49 (WARNING): Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.
2024-05-16 01:16:49 (INFO): amp: true
cmd:
checkpoint_dir: ./checkpoints/2024-05-15-22-09-04
commit: d0f61fa
checkpoint_dir: ./checkpoints/2024-05-16-01-16-48
commit: 58f2726
identifier: ''
logs_dir: ./logs/tensorboard/2024-05-15-22-09-04
logs_dir: ./logs/tensorboard/2024-05-16-01-16-48
print_every: 10
results_dir: ./results/2024-05-15-22-09-04
results_dir: ./results/2024-05-16-01-16-48
seed: 0
timestamp_id: 2024-05-15-22-09-04
timestamp_id: 2024-05-16-01-16-48
version: 0.1.dev1+g58f2726
dataset:
a2g_args:
r_energy: false
@@ -122,25 +121,23 @@ test_dataset:
trainer: ocp
val_dataset: null

2024-05-15 22:09:47 (INFO): Loading dataset: ase_db
2024-05-15 22:09:47 (INFO): rank: 0: Sampler created...
2024-05-15 22:09:47 (INFO): Batch balancing is disabled for single GPU training.
2024-05-15 22:09:47 (INFO): rank: 0: Sampler created...
2024-05-15 22:09:47 (INFO): Batch balancing is disabled for single GPU training.
2024-05-15 22:09:47 (INFO): Loading model: gemnet_t
2024-05-15 22:09:48 (INFO): Loaded GemNetT with 31671825 parameters.
2024-05-15 22:09:48 (WARNING): Model gradient logging to tensorboard not yet supported.
2024-05-15 22:09:49 (INFO): Loading checkpoint from: /tmp/ocp_checkpoints/gndt_oc22_all_s2ef.pt
2024-05-15 22:09:49 (INFO): Overwriting scaling factors with those loaded from checkpoint. If you're generating predictions with a pretrained checkpoint, this is the correct behavior. To disable this, delete `scale_dict` from the checkpoint.
2024-05-15 22:09:49 (WARNING): Scale factor comment not found in model
2024-05-15 22:09:49 (INFO): Predicting on test.
2024-05-16 01:16:49 (INFO): Loading dataset: ase_db
2024-05-16 01:16:49 (INFO): rank: 0: Sampler created...
2024-05-16 01:16:49 (INFO): Batch balancing is disabled for single GPU training.
2024-05-16 01:16:49 (INFO): rank: 0: Sampler created...
2024-05-16 01:16:49 (INFO): Batch balancing is disabled for single GPU training.
2024-05-16 01:16:49 (INFO): Loading model: gemnet_t
2024-05-16 01:16:51 (INFO): Loaded GemNetT with 31671825 parameters.
2024-05-16 01:16:51 (WARNING): Model gradient logging to tensorboard not yet supported.
2024-05-16 01:16:51 (INFO): Loading checkpoint from: /tmp/ocp_checkpoints/gndt_oc22_all_s2ef.pt
2024-05-16 01:16:51 (INFO): Overwriting scaling factors with those loaded from checkpoint. If you're generating predictions with a pretrained checkpoint, this is the correct behavior. To disable this, delete `scale_dict` from the checkpoint.
2024-05-16 01:16:51 (WARNING): Scale factor comment not found in model
2024-05-16 01:16:51 (INFO): Predicting on test.
device 0: 0%| | 0/3 [00:00<?, ?it/s]/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch_geometric/data/collate.py:145: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch_geometric/data/collate.py:145: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch/amp/autocast_mode.py:250: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
warnings.warn(
device 0: 33%|███████████▋ | 1/3 [00:03<00:06, 3.39s/it]device 0: 67%|███████████████████████▎ | 2/3 [00:05<00:02, 2.75s/it]device 0: 100%|███████████████████████████████████| 3/3 [00:06<00:00, 1.97s/it]device 0: 100%|███████████████████████████████████| 3/3 [00:06<00:00, 2.25s/it]
2024-05-15 22:09:56 (INFO): Writing results to ./results/2024-05-15-22-09-04/ocp_predictions.npz
2024-05-15 22:09:56 (INFO): Total time taken: 6.8953797817230225
Elapsed time = 13.0 seconds
device 0: 33%|███████████▋ | 1/3 [00:03<00:06, 3.32s/it]device 0: 67%|███████████████████████▎ | 2/3 [00:06<00:03, 3.17s/it]device 0: 100%|███████████████████████████████████| 3/3 [00:06<00:00, 1.99s/it]device 0: 100%|███████████████████████████████████| 3/3 [00:07<00:00, 2.33s/it]
2024-05-16 01:16:58 (INFO): Writing results to ./results/2024-05-16-01-16-48/ocp_predictions.npz
2024-05-16 01:16:58 (INFO): Total time taken: 7.149884939193726
Elapsed time = 13.2 seconds
38 changes: 38 additions & 0 deletions _sources/core/datasets/oc20dense.md
@@ -0,0 +1,38 @@

# Open Catalyst 2020 Dense (OC20Dense)

## Overview
The OC20Dense dataset is a validation dataset that was used to assess model performance in [AdsorbML: A Leap in Efficiency for Adsorption Energy Calculations using Generalizable Machine Learning Potentials](https://arxiv.org/abs/2211.16486). OC20Dense contains a dense sampling of adsorbate configurations on ~1,000 randomly selected adsorbate+surface materials from the [OC20](https://arxiv.org/abs/2010.09990) dataset, comprising a total of 85,658 unique input configurations. This dataset, and the paper written for it, support the determination of global minimum adsorbate-surface energies (the adsorption energy), in contrast to OC20, which contains local adsorbate relaxations. Under low-coverage conditions, the global minimum energy site is the most likely to be occupied. In computational catalysis research, the adsorption energy is correlated with important figures of merit, so acquiring it is an important task.

## File Contents and Download
|Splits |Size of compressed version (in bytes) |Size of uncompressed version (in bytes) | MD5 checksum (download link) |
|--- |--- |--- |--- |
|LMDB |654M |9.8G | [0163b0e8c4df6d9c426b875a28d9178a](https://dl.fbaipublicfiles.com/opencatalystproject/data/adsorbml/oc20_dense_data.tar.gz) |
|ASE Trajectories |29G |112G | [ee937e5290f8f720c914dc9a56e0281f](https://dl.fbaipublicfiles.com/opencatalystproject/data/adsorbml/oc20_dense_trajectories.tar.gz) |

The following files are also provided to be used for evaluation and general information:
* `oc20dense_mapping.pkl` : Mapping of the LMDB `sid` to general metadata information:
    * `system_id`: Unique system identifier for an adsorbate, bulk, surface combination.
    * `config_id`: Unique configuration identifier, where `rand` and `heur` correspond to random and heuristic initial configurations, respectively.
    * `mpid`: Materials Project bulk identifier.
    * `miller_idx`: 3-tuple of integers indicating the Miller indices of the surface.
    * `shift`: Shift along the c-direction used to determine the cutoff for the surface (the c-direction follows the Pymatgen nomenclature).
    * `top`: Boolean indicating whether the chosen surface was at the top or bottom of the originally enumerated surface.
    * `adsorbate`: Chemical composition of the adsorbate.
    * `adsorption_site`: A tuple of 3-tuples containing the Cartesian coordinates of each binding adsorbate atom.
* `oc20dense_targets.pkl` : DFT adsorption energies across different system and placement ids.
* `oc20dense_compute.pkl` : DFT compute, as measured in the number of ionic and SCF steps for each evaluated relaxation.
* `oc20dense_ref_energies.pkl` : Reference energy used for a specified `system_id`. This energy includes the relaxed clean surface and the gas-phase adsorbate energy to ensure consistency across calculations.
* `oc20dense_tags.pkl` : Tag information used for a specified `system_id`, where 0 = subsurface, 1 = surface, and 2 = adsorbate.

All mappings can be obtained at the following downloadable link: https://dl.fbaipublicfiles.com/opencatalystproject/data/adsorbml/oc20_dense_mappings.tar.gz

MD5 checksums:
```
c18735c405ce6ce5761432b07287d8d9 oc20_dense_mappings.tar.gz
3e26c3bcef01ccfc9b001931065ea6e6 oc20dense_mapping.pkl
fd589b013b72e62e11a6b2a5bd1d323c oc20dense_targets.pkl
78d25997e0aaf754df526ab37276bb89 oc20dense_compute.pkl
b07c64158e4bfa5f7b9bf6263753ecc5 oc20dense_ref_energies.pkl
1ba0bc266130f186850f5faa547b6a02 oc20dense_tags.pkl
```
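
Once the mappings archive is downloaded and extracted, a minimal sketch of reading the `sid` mapping might look like the following. This assumes the file is a standard Python pickle keyed by the LMDB `sid`, with per-entry fields named as in the list above; verify against the actual file.

```python
import pickle

# Load the sid -> metadata mapping (assumed to be a plain Python pickle).
with open("oc20dense_mapping.pkl", "rb") as f:
    mapping = pickle.load(f)

# Illustrative lookup for a single LMDB sid; the field names follow the
# documentation above (system_id, config_id, mpid, miller_idx, ...).
example_sid = next(iter(mapping))
meta = mapping[example_sid]
print(example_sid, meta["system_id"], meta["config_id"], meta["adsorbate"])
```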
62 changes: 62 additions & 0 deletions _sources/core/datasets/oc20neb.md
@@ -0,0 +1,62 @@

# Open Catalyst 2020 Nudged Elastic Band (OC20NEB)

## Overview
This is a validation dataset which was used to assess model performance in [CatTSunami: Accelerating Transition State Energy Calculations with Pre-trained Graph Neural Networks](https://arxiv.org/abs/2405.02078). It comprises 932 NEB relaxation trajectories covering three types of reactions: desorptions, dissociations, and transfers. NEB calculations allow us to find transition states. Because the rate of a reaction is determined by its transition state energy, access to transition states is very important for catalysis research. For more information, see the paper.

## File Structure and Contents
The tar file contains 3 subdirectories: dissociations, desorptions, and transfers. As the names imply, these directories contain the converged DFT trajectories for each of the reaction classes. Within these directories, the trajectories are named to identify the contents of each file. Here is an example and the anatomy of the name (a sketch for parsing it programmatically follows the list):

```desorption_id_83_2409_9_111-4_neb1.0.traj```

1. `desorption` indicates the reaction type (dissociation and transfer are the other possibilities)
2. `id` identifies that the material belongs to the validation in-domain split (ood, out of domain, is the other possibility)
3. `83` is the task id. This does not provide relevant information
4. `2409` is the bulk index of the bulk used in the ocdata bulk pickle file
5. `9` is the reaction index. For each reaction type there is a reaction pickle file in the repository; in this case it is the 9th entry in that pickle file
6. `111-4`: the first 3 numbers are the Miller indices (i.e. the (1,1,1) surface), and the last number corresponds to the shift value. In this case the 4th shift enumerated was the one used
7. `neb1.0`: the number here indicates the k value used. For the full dataset, 1.0 was used, so this does not distinguish any of the trajectories from one another
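
A small helper like the one below can split such a name into its components. This is an illustrative sketch, not part of the fairchem package, and it assumes non-negative Miller indices encoded as a plain digit string:

```python
from pathlib import Path

def parse_oc20neb_name(filename: str) -> dict:
    """Split an OC20NEB trajectory filename into the fields described above."""
    stem = Path(filename).name.removesuffix(".traj")
    reaction_type, split, task_id, bulk_idx, rxn_idx, miller_shift, neb_k = stem.split("_")
    miller, shift = miller_shift.split("-")
    return {
        "reaction_type": reaction_type,  # desorption / dissociation / transfer
        "split": split,                  # id (in domain) or ood (out of domain)
        "task_id": int(task_id),
        "bulk_index": int(bulk_idx),     # index into the ocdata bulk pickle file
        "reaction_index": int(rxn_idx),  # entry in the per-reaction-type pickle file
        "miller_indices": tuple(int(c) for c in miller),  # e.g. "111" -> (1, 1, 1)
        "shift_index": int(shift),       # which enumerated shift was used
        "k": float(neb_k.removeprefix("neb")),  # spring constant; 1.0 for the full dataset
    }

print(parse_oc20neb_name("desorption_id_83_2409_9_111-4_neb1.0.traj"))
```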


The content of these trajectory files is a repeating set of frames. Although the initial and final frames are not optimized during the NEB, they are saved for every iteration in the trajectory. For this dataset, 10 frames were used per iteration (8 of which were optimized over the NEB), so the length of the trajectory is the number of iterations (N) * 10. If you want to look at the frame set prior to optimization and the optimized frame set, you can get them like this:

```python
from ase.io import read

traj = read("desorption_id_83_2409_9_111-4_neb1.0.traj", ":")  # ":" reads every frame
unrelaxed_frames = traj[0:10]  # first iteration: frames before optimization
relaxed_frames = traj[-10:]    # last iteration: the optimized frame set
```

## Download
|Splits |Size of compressed version (in bytes) |Size of uncompressed version (in bytes) | MD5 checksum (download link) |
|--- |--- |--- |--- |
|ASE Trajectories |1.5G |6.3G | [52af34a93758c82fae951e52af445089](https://dl.fbaipublicfiles.com/opencatalystproject/data/oc20neb/oc20neb_dft_trajectories_04_23_24.tar.gz) |



## Use
One more note: we have not prepared an LMDB for this dataset, because NEB calculations are not supported directly in ocp. You must use the ASE-native OCP class along with ASE infrastructure to run NEB calculations. Here is an example of its use:

```python
from ase.io import read
from ase.optimize import BFGS
from fairchem.applications.cattsunami.core import OCPNEB

traj = read("desorption_id_83_2409_9_111-4_neb1.0.traj", ":")
neb_frames = traj[0:10]  # the frame set from the first iteration
neb = OCPNEB(
    neb_frames,
    checkpoint_path=YOUR_CHECKPOINT_PATH,  # path to a pretrained OCP checkpoint
    k=1.0,  # NEB spring constant; 1.0 was used for the full dataset
    batch_size=8,
)
optimizer = BFGS(
    neb,
    trajectory="test_neb.traj",
)
# Relax loosely first, then turn on the climbing image and tighten fmax.
conv = optimizer.run(fmax=0.45, steps=200)
if conv:
    neb.climb = True
    conv = optimizer.run(fmax=0.05, steps=300)
```
49 changes: 27 additions & 22 deletions _sources/core/install.md
@@ -1,66 +1,71 @@
# Installation

## conda or better yet [mamba](https://mamba.readthedocs.io/en/latest/user_guide/mamba.html) - easy
### conda or better yet [mamba](https://mamba.readthedocs.io/en/latest/user_guide/mamba.html) - easy

We do not have official conda recipes (yet!), so to install with conda or mamba you will need to clone the
[fairchem](https://github.com/FAIR-Chem/fairchem) and run the following from inside the repo directory to create an environment with all the
necessary dependencies.
We do not have official conda recipes (yet!); in the meantime you can use the
following environment yaml files for CPU [env.cpu.yml](https://raw.githubusercontent.com/FAIR-Chem/fairchem/main/packages/env.cpu.yml)
and GPU [env.gpu.yml](https://raw.githubusercontent.com/FAIR-Chem/fairchem/main/packages/env.gpu.yml) to easily set up a
working environment and install `fairchem-core`.

1. Create a *fairchem* environment
1. Create an environment to install *fairchem*
1. **GPU**

The default environment uses CUDA 11.8; if you need a different version, you will have to edit the *pytorch-cuda* version
accordingly.
```bash
conda env create -f packages/env.gpu.yml
conda env create -f env.gpu.yml
```

2. **CPU**
```bash
conda env create -f packages/env.cpu.yml
conda env create -f env.cpu.yml
```

2. Activate the environment and install `fairchem-core`
2. Activate the environment and install `fairchem-core` from PyPi
```bash
conda activate fair-chem
pip install packages/fairchem-core
pip install fairchem-core
```

## PyPi - flexible
### PyPi - flexible
You can also install `pytorch` and `torch_geometric` dependencies from PyPI to select specific CPU or CUDA versions.

1. Install `pytorch` by selecting your installer, OS and CPU or CUDA version following the official
[Pytorch docs](https://pytorch.org/get-started/locally/)

2. Install `torch_geometric` and the `torch_scatter`, `torch_sparse`, and `torch_cluster` optional dependencies
similarly by selecting the appropriate versions in the official
[PyG docs](https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html)

3. Install `fairchem-core`
1. From test-PyPi (until we have our official release on PyPi soon!)
```bash
pip install -i https://test.pypi.org/simple/fairchem-core
```
2. Or by cloning the repo and then using pip
```bash
pip install packages/fairchem-core
```
3. Install `fairchem-core` from PyPi
```bash
pip install fairchem-core
```


## Additional packages

`fairchem` is a namespace package, meaning all packages are installed separately. If you need
to install other packages you can do so by:
```bash
pip install -e pip install packages/fairchem-{package-to-install}
pip install fairchem-{package-to-install}
```

## Dev install

If you plan to make contributions you will need to clone (for windows user please see next section) the repo and install `fairchem-core` in editable mode with dev
If you plan to make contributions you will need to clone the repo (Windows users, please see the section below) and install
`fairchem-core` in editable mode with dev
dependencies,
```bash
pip install -e packages/fairchem-core[dev]
```

## Cloning git repository on windows
And similarly for any other namespace package:
```bash
pip install packages/fairchem-{package-to-install}
```

### Cloning and installing the git repository on windows

Our build system requires the use of symlinks, which are not available by default on Windows. To properly build fairchem packages you must enable symlinks and clone the repository with them enabled.
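
As a rough sketch of what that can look like (this assumes Git for Windows with Developer Mode or administrator rights so that symlink creation is permitted; the project's documentation may describe additional steps):

```bash
# Allow git to create symlinks (requires Developer Mode or an elevated shell).
git config --global core.symlinks true

# Clone with symlinks enabled for this repository.
git clone -c core.symlinks=true https://github.com/FAIR-Chem/fairchem.git
```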

2 changes: 2 additions & 0 deletions _sources/index.md
@@ -30,6 +30,8 @@ tasks, data, and metrics, please read the documentations and respective papers:
- [OC20](core/datasets/oc20)
- [OC22](core/datasets/oc22)
- [ODAC23](core/datasets/odac)
- [OC20Dense](core/datasets/oc20dense)
- [OC20NEB](core/datasets/oc20neb)

### Projects and models built on `fairchem`:

5 changes: 1 addition & 4 deletions _sources/tutorials/cattsunami_walkthrough.md
@@ -113,10 +113,7 @@ for config in product2_configs:

## Enumerate NEBs
Here we use the class we created to handle automatic generation of NEB frames, creating frames from the structures we just relaxed as input.

```{code-cell} ipython3
Image(filename="dissociation_scheme.png")
```
![dissociation_scheme](https://github.com/FAIR-Chem/fairchem/blob/main/src/fairchem/applications/cattsunami/tutorial/dissociation_scheme.png)

```{code-cell} ipython3
af = AutoFrameDissociation(
Original file line number Diff line number Diff line change
@@ -210,6 +210,8 @@

<li class="toctree-l1"><a class="reference internal" href="../../../../core/datasets/oc22.html">Open Catalyst 2022 (OC22)</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../../core/datasets/odac.html">Open Direct Air Capture 2023 (ODAC23)</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../../core/datasets/oc20dense.html">Open Catalyst 2020 Dense (OC20Dense)</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../../core/datasets/oc20neb.html">Open Catalyst 2020 Nudged Elastic Band (OC20NEB)</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../../core/model_checkpoints.html">Pretrained FAIRChem models</a></li>


@@ -233,8 +235,6 @@
</ul>
<p aria-level="2" class="caption" role="heading"><span class="caption-text">Catalysis Case Studies &amp; Tutorials</span></p>
<ul class="nav bd-sidenav">
<li class="toctree-l1"><a class="reference internal" href="../../../../tutorials/cattsunami_walkthrough.html">CatTSunami tutorial</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../../tutorials/adsorbml_walkthrough.html">AdsorbML tutorial</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../../tutorials/intro.html">Intro and background on OCP and DFT</a></li>


@@ -246,6 +246,8 @@



<li class="toctree-l1"><a class="reference internal" href="../../../../tutorials/adsorbml_walkthrough.html">AdsorbML tutorial</a></li>
<li class="toctree-l1"><a class="reference internal" href="../../../../tutorials/cattsunami_walkthrough.html">CatTSunami tutorial</a></li>
<li class="toctree-l1 has-children"><a class="reference internal" href="../../../../tutorials/NRR/NRR_toc.html">Screening catalysts with OCP</a><input class="toctree-checkbox" id="toctree-checkbox-1" name="toctree-checkbox-1" type="checkbox"/><label class="toctree-toggle" for="toctree-checkbox-1"><i class="fa-solid fa-chevron-down"></i></label><ul>
<li class="toctree-l2"><a class="reference internal" href="../../../../tutorials/NRR/NRR_example.html">Using OCP to enumerate adsorbates on alloy catalyst surfaces</a></li>
