deploy: 962bffa

FAIR-Chem · Apr 14, 2024 · b0b4b9d
1 parent 9f4910e
commit b0b4b9d

Showing 178 changed files with 1,800 additions and 2,151 deletions.
diff --git a/_downloads/5fdddbed2260616231dbf7b0d94bb665/train.txt b/_downloads/5fdddbed2260616231dbf7b0d94bb665/train.txt
@@ -1,17 +1,17 @@
-2024-04-14 13:46:21 (INFO): Project root: /home/runner/work/ocp/ocp
+2024-04-14 18:46:02 (INFO): Project root: /home/runner/work/ocp/ocp
 /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch/cuda/amp/grad_scaler.py:126: UserWarning: torch.cuda.amp.GradScaler is enabled, but CUDA is not available.  Disabling.
   warnings.warn(
-2024-04-14 13:46:23 (WARNING): Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.
-2024-04-14 13:46:23 (INFO): amp: true
+2024-04-14 18:46:03 (WARNING): Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.
+2024-04-14 18:46:03 (INFO): amp: true
 cmd:
-  checkpoint_dir: fine-tuning/checkpoints/2024-04-14-13-45-36-ft-oxides
-  commit: b90c171
+  checkpoint_dir: fine-tuning/checkpoints/2024-04-14-18-46-24-ft-oxides
+  commit: 962bffa
   identifier: ft-oxides
-  logs_dir: fine-tuning/logs/wandb/2024-04-14-13-45-36-ft-oxides
+  logs_dir: fine-tuning/logs/wandb/2024-04-14-18-46-24-ft-oxides
   print_every: 10
-  results_dir: fine-tuning/results/2024-04-14-13-45-36-ft-oxides
+  results_dir: fine-tuning/results/2024-04-14-18-46-24-ft-oxides
   seed: 0
-  timestamp_id: 2024-04-14-13-45-36-ft-oxides
+  timestamp_id: 2024-04-14-18-46-24-ft-oxides
 dataset:
   a2g_args:
     r_energy: true

diff --git a/_images/039ed970e0b1e654d1c5696394765bdfbb15ef84d1d3062637148bd4f0927ff8.png b/_images/039ed970e0b1e654d1c5696394765bdfbb15ef84d1d3062637148bd4f0927ff8.png
diff --git a/_images/3c0ea376da00e3a4561845b7db6d783fe77423edb0c5a4883cf7d9dceff8d2cb.png b/_images/3c0ea376da00e3a4561845b7db6d783fe77423edb0c5a4883cf7d9dceff8d2cb.png
diff --git a/_images/42099d5cc866a7d1b4adba531db3563a2c14ea9bd734526a36fdc1963942d806.png b/_images/42099d5cc866a7d1b4adba531db3563a2c14ea9bd734526a36fdc1963942d806.png
diff --git a/_images/e1d30625e6e60fa80387151aa238daa23502da4d794234ef3a02ebfd05b700e1.png b/_images/e1d30625e6e60fa80387151aa238daa23502da4d794234ef3a02ebfd05b700e1.png
diff --git a/_sources/core/inference.md b/_sources/core/inference.md
@@ -11,7 +11,7 @@ kernelspec:
   name: python3
 ---
 
-Mass inference
+Fast batched inference
 ------------------
 
 The ASE calculator is not necessarily the most efficient way to run a lot of computations. It is better to do a "mass inference" using a command line utility. We illustrate how to do that here. 

diff --git a/_sources/videos/intro_series.md → _sources/core/intro_series.md b/_sources/videos/intro_series.md → _sources/core/intro_series.md
@@ -1,4 +1,4 @@
-# Open Catalyst Intro Series
+# Why model atoms for climate?
 
 New to chemistry but excited to know how ML can help? Larry Zitnick has made a few intro videos for audiences without a computational chemistry background!
 

diff --git a/_sources/tutorials/adsorbml_walkthrough.md b/_sources/tutorials/adsorbml_walkthrough.md
@@ -204,6 +204,9 @@ idxs_to_keep = deduplicate(configs_for_deduplication, adsorbate.binding_indices[
 ```
 
 ```{code-cell} ipython3
+---
+tags: ["skip-execution"]
+---
 # Flip through your configurations to check them out (and make sure deduplication looks good)
 print(idxs_to_keep)
 view_x3d_n(configs_for_deduplication[2].repeat((2,2,1)))
@@ -221,9 +224,12 @@ df
 
 ## Write VASP input files
 
-This assumes you have access to VASP pseudopotentials. The default VASP flags (which are equivalent to those used to make OC20) are located in `ocdata.utils.vasp`. Alternatively, you may pass your own vasp flags to the `write_vasp_input_files` function as `vasp_flags`
+This assumes you have access to VASP pseudopotentials and that the environment variables ASE needs are configured. The default VASP flags (which are equivalent to those used to make OC20) are located in `ocdata.utils.vasp`. Alternatively, you may pass your own VASP flags to the `write_vasp_input_files` function as `vasp_flags`.
 
 ```{code-cell} ipython3
+---
+tags: ["skip-execution"]
+---
 # Grab the 5 systems with the lowest energy
 configs_for_dft = df.sort_values(by = "relaxed_energy_ml").relaxed_atoms.tolist()[0:5]
 config_idxs = df.sort_values(by = "relaxed_energy_ml").relaxation_idx.tolist()[0:5]

diff --git a/_sources/tutorials/advanced/embeddings.md b/_sources/tutorials/advanced/embeddings.md
@@ -20,15 +20,25 @@ OCP works by computing an *embedding*, aka a high dimensional vector representat
 
 We used them to search for similar atomic structures.
 
-We can use them for diagnostic purposes, or clustering. For GemNet-OC, we provide 5 different kinds of embeddings:
+We can use them for diagnostic purposes, or clustering.
+
+In this example, we patch the GemNetOC model to save the embeddings so you can easily access them. This requires two changes. The first is in the GemNetOC model where the embeddings are saved, and the second is in the OCPCalculator to retrieve them.
+
+We provide 5 different kinds of embeddings:
 
 1. 'h' - This is an early block in the embedding calculation. You get the h-embedding for each atom
 2. 'h sum' - This is an early block in the embedding calculation. You get the h-embedding summed over each atom
 3. 'x_E' - The atomic energy is linear in this, returned for each atom
 4. 'x_E sum' - summed over atoms
 5. 'x_F sum' - This is related to the forces
 
-In principle other models could be adapted in a similar way.
+In principle other models could be adapted in a similar way. See [embedding-monkeypatch.py](./embedding-monkeypatch.py) for details on the patch. We simply run this notebook below to load it.
+
+The OCP project is still under active development, and it is not yet clear what the best way to access these embeddings is, so this code is not yet part of the main development branch. This code was adapted from a branch at https://github.com/Open-Catalyst-Project/ocp/blob/gnoc-embeddings.
+
+```{code-cell} ipython3
+import embedding_monkeypatch
+```
 
 # A diagnostic example
 
@@ -49,15 +59,14 @@ import os
 checkpoint_path = model_name_to_local_file('GemNet-OCOC20+OC22', local_cache='/tmp/ocp_checkpoints/')
 
 calc = OCPCalculator(checkpoint_path=checkpoint_path)
-calc.trainer._unwrapped_model.return_embedding = True
-
 ```
 
 ## Bulk Cu equation of state example
 
 Here we simply compute an equation of state by varying the lattice constant. You will see a small unphysical feature near 3.7 angstroms. We will investigate why that happens.
 
 ```{code-cell} ipython3
+#calc.trainer._unwrapped_model.return_embedding = False
 a0 = 3.63
 E = []
 
@@ -79,7 +88,7 @@ plt.xlabel('Lattice constant (A)')
 plt.ylabel('Energy (eV)');
 ```
 
-Something is a little off in this equation of state, there is an unphysical bump in it. We now rerun this and get the embeddings. You simply set `calc.trainer._unwrapped_model.return_embedding = True`  and the calculator will return useful embeddings if possible (e.g. for GemNet-OC). We need a reference configuration to compare too. We choose a lattice constant of 3.63 angstroms and compute three different embeddings.
+Something is a little off in this equation of state: there is an unphysical bump in it. We now rerun this and get the embeddings by calling the `calc.embed` method. We need a reference configuration to compare to, so we choose a lattice constant of 3.63 angstroms and compute three different embeddings.
 
 ```{code-cell} ipython3
 a0 = 3.63
@@ -92,9 +101,9 @@ atoms = atoms.repeat((2, 2, 2))
 atoms.set_tags(np.ones(len(atoms)))
 atoms.calc = calc
 
-atoms.get_potential_energy()
+out = calc.embed(atoms)
 
-x1, x2, x3 = atoms.calc.results['h sum'], atoms.calc.results['x_E sum'], atoms.calc.results['x_F sum']
+x1, x2, x3 = out['h sum'], out['x_E sum'], out['x_F sum']
 ```
 
 Next, we loop over a grid of lattice constants, and we compare the cosine similarity of the embeddings for each one to the reference embeddings above. A similarity of 1 means they are the same, and as the similarity decreases it means the embeddings are more and more different (and so is the energy).
@@ -112,12 +121,12 @@ for a in LC:
                  pbc=True)
     atoms = atoms.repeat((2, 2, 2))
     atoms.set_tags(np.ones(len(atoms)))
-    atoms.get_potential_energy()
 
+    out = calc.embed(atoms)
     
-    cossim1.append(torch.cosine_similarity(x1, atoms.calc.results["h sum"]).item())
-    cossim2.append(torch.cosine_similarity(x2, atoms.calc.results["x_E sum"]).item())
-    cossim3.append(torch.cosine_similarity(x3, atoms.calc.results["x_F sum"]).item())
+    cossim1.append(torch.cosine_similarity(x1, out["h sum"]).item())
+    cossim2.append(torch.cosine_similarity(x2, out["x_E sum"]).item())
+    cossim3.append(torch.cosine_similarity(x3, out["x_F sum"]).item())
     E += [out['energy']]
 ```
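
Note on the hunk above: the comparison it performs with `torch.cosine_similarity` can be illustrated without a model or checkpoint. The following is a minimal sketch in plain Python, assuming made-up three-component vectors in place of real embeddings; the helper name `cosine_similarity` and the sample vectors are hypothetical and are not part of the tutorial or of OCP.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Made-up stand-ins for a reference embedding and two embeddings to compare.
reference = [1.0, 2.0, 3.0]
scaled = [2.0, 4.0, 6.0]       # same direction as the reference, larger magnitude
orthogonal = [3.0, 0.0, -1.0]  # dot product with the reference is 3 + 0 - 3 = 0

sim_scaled = cosine_similarity(reference, scaled)          # close to 1
sim_orthogonal = cosine_similarity(reference, orthogonal)  # close to 0
```

Because the measure depends only on direction, a rescaled embedding still scores close to 1. This is why a drop in similarity along the lattice-constant scan points to a change in the representation rather than a change in its overall magnitude.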
 
@@ -153,29 +162,28 @@ We use this example to show that we can cluster structures by embedding similari
 from ase.build import bulk
 from ase.cluster import Octahedron
 
+calc.trainer._unwrapped_model.return_embedding = True
+
 embeddings = []
 labels = []
 
 oct = Octahedron('Cu', 2)
 oct.set_tags(np.ones(len(oct)))
-oct.calc = calc
 
 for i in range(20):
     oct.rattle(0.01)
-    oct.get_potential_energy()
-    embeddings += [oct.calc.results['x_E sum'][0].numpy()]
+    embeddings += [calc.embed(oct)['x_E sum'][0].numpy()]
     labels += [0]
 ```
 
 ```{code-cell} ipython3
 b = bulk('Cu')
 b = b.repeat((2, 2, 2))
 b.set_tags(np.ones(len(b)))
-b.calc = calc
 
 for i in range(40):
     b.rattle(0.01)
-    embeddings += [b.calc.results['x_E sum'][0].numpy()]
+    embeddings += [calc.embed(b)['x_E sum'][0].numpy()]
     labels += [1]
 ```
 
@@ -213,28 +221,26 @@ energies = []
 
 oct = Octahedron('Cu', 2)
 oct.set_tags(np.ones(len(oct)))
-oct.calc = calc
+
 for i in range(20):
     oct.rattle(0.01)
-    oct.get_potential_energy()
-
-    for a in oct.calc.results['h'][0]:
+    out = calc.embed(oct)
+    for a in out['h'][0]:
         embeddings += [a.numpy()]
         labels += [0]
-        energies += [oct.calc.results['energy']]
+        energies += [out['energy']]
 
 b = bulk('Cu')
 b = b.repeat((2, 2, 2))
 b.set_tags(np.ones(len(b)))
 
 for i in range(20):
     b.rattle(0.01)
-    b.get_potential_energy()
-
-    for a in b.calc.results['h'][0]:
+    out = calc.embed(b)
+    for a in out['h'][0]:
         embeddings += [a.numpy()]
         labels += [1]
-        energies += [b.calc.results['energy']]
+        energies += [out['energy']]
         
 embeddings = np.array(embeddings)
 
@@ -270,35 +276,30 @@ data = vdict(space='cosine')
 
 ethanol = molecule('CH3CH2OH')
 ethanol.set_tags(np.ones(len(ethanol)))
-ethanol.calc = calc
-ethanol.get_potential_energy()
-
-for i, atom in enumerate(ethanol):
-    data[ethanol.calc.results['x_E'][0][i].numpy()] = [i, ethanol]
-    
+ethanol_emb = calc.embed(ethanol)
 
 methane = molecule('C2H6')
 methane.set_tags(np.ones(len(methane)))
-methane.calc = calc
-methane.get_potential_energy()
+methane_emb = calc.embed(methane)
 
+for i, atom in enumerate(ethanol):
+    data[ethanol_emb['x_E'][0][i].numpy()] = [i, ethanol]
+    
 for i, atom in enumerate(methane):
-    data[methane.calc.results['x_E'][0][i].numpy()] = [i, methane]
+    data[methane_emb['x_E'][0][i].numpy()] = [i, methane]
 ```
 
 Now we construct our "query". We inspect the Atoms object, see that the C atom is the first one, and then extract the embedding for that atom and save it in a variable.
 
 ```{code-cell} ipython3
 methanol = molecule('CH3OH')
 methanol.set_tags(np.ones(len(methanol)))
-methanol.calc = calc
-methanol.get_potential_energy()
-
+methanol_emb = calc.embed(methanol)
 methanol
 ```
 
 ```{code-cell} ipython3
-query = methanol.calc.results['x_E'][0][0].numpy()
+query = methanol_emb['x_E'][0][0].numpy()
 ```
 
 We run our search using dictionary-like syntax. It returns the closest match found.
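
Note on the `embeddings.md` changes above: the text describes a two-part patch, one change in the model to save embeddings and one in the calculator to retrieve them through `calc.embed`. The general monkeypatching pattern can be sketched with toy classes. Everything below is an assumption made for illustration: `ToyModel`, `ToyCalculator`, `forward_with_embedding` and `last_embedding` are hypothetical names, and this is not the actual patch to GemNetOC or OCPCalculator.

```python
class ToyModel:
    """Hypothetical stand-in for a model; not the GemNetOC class."""

    def forward(self, x):
        hidden = [2.0 * v for v in x]  # pretend intermediate representation
        return sum(hidden)             # pretend energy


class ToyCalculator:
    """Hypothetical stand-in for a calculator; not the OCPCalculator class."""

    def __init__(self, model):
        self.model = model

    def calculate(self, x):
        return {"energy": self.model.forward(x)}


def forward_with_embedding(self, x):
    """Patch 1: same computation, but keep the intermediate values on the model."""
    hidden = [2.0 * v for v in x]
    self.last_embedding = hidden
    return sum(hidden)


def embed(self, x):
    """Patch 2: run the model and return the saved values next to the energy."""
    energy = self.model.forward(x)
    return {"energy": energy, "h": self.model.last_embedding}


# Apply both patches at class level, so existing and new instances pick them up.
ToyModel.forward = forward_with_embedding
ToyCalculator.embed = embed

calc = ToyCalculator(ToyModel())
out = calc.embed([1.0, 2.0, 3.0])
```

Returning the embeddings from a dedicated method instead of storing them in the calculator's results keeps the unpatched `calculate` path unchanged, which matches the way the diff replaces `atoms.calc.results[...]` lookups with the return value of `calc.embed`.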

diff --git a/autoapi/index.html b/autoapi/index.html
@@ -216,7 +216,7 @@
 <li class="toctree-l1"><a class="reference internal" href="../core/model_training.html">Training and evaluating custom models on OCP datasets</a></li>
 
 
-<li class="toctree-l1"><a class="reference internal" href="../core/inference.html">Mass inference</a></li>
+<li class="toctree-l1"><a class="reference internal" href="../core/inference.html">Fast batched inference</a></li>
 
 
 <li class="toctree-l1"><a class="reference internal" href="../core/fine-tuning/fine-tuning-oxides.html">Fine tuning a model</a></li>
@@ -225,7 +225,6 @@
 </ul>
 <p aria-level="2" class="caption" role="heading"><span class="caption-text">Videos and Talks</span></p>
 <ul class="nav bd-sidenav">
-<li class="toctree-l1"><a class="reference internal" href="../videos/intro_series.html">Open Catalyst Intro Series</a></li>
 <li class="toctree-l1"><a class="reference internal" href="../videos/technical_talks.html">Technical presentations</a></li>
 </ul>
 <p aria-level="2" class="caption" role="heading"><span class="caption-text">Case Studies &amp; Tutorials</span></p>

diff --git a/autoapi/ocpmodels/common/data_parallel/index.html b/autoapi/ocpmodels/common/data_parallel/index.html
@@ -216,7 +216,7 @@
 <li class="toctree-l1"><a class="reference internal" href="../../../../core/model_training.html">Training and evaluating custom models on OCP datasets</a></li>
 
 
-<li class="toctree-l1"><a class="reference internal" href="../../../../core/inference.html">Mass inference</a></li>
+<li class="toctree-l1"><a class="reference internal" href="../../../../core/inference.html">Fast batched inference</a></li>
 
 
 <li class="toctree-l1"><a class="reference internal" href="../../../../core/fine-tuning/fine-tuning-oxides.html">Fine tuning a model</a></li>
@@ -225,7 +225,6 @@
 </ul>
 <p aria-level="2" class="caption" role="heading"><span class="caption-text">Videos and Talks</span></p>
 <ul class="nav bd-sidenav">
-<li class="toctree-l1"><a class="reference internal" href="../../../../videos/intro_series.html">Open Catalyst Intro Series</a></li>
 <li class="toctree-l1"><a class="reference internal" href="../../../../videos/technical_talks.html">Technical presentations</a></li>
 </ul>
 <p aria-level="2" class="caption" role="heading"><span class="caption-text">Case Studies &amp; Tutorials</span></p>

diff --git a/autoapi/ocpmodels/common/distutils/index.html b/autoapi/ocpmodels/common/distutils/index.html
@@ -216,7 +216,7 @@
 <li class="toctree-l1"><a class="reference internal" href="../../../../core/model_training.html">Training and evaluating custom models on OCP datasets</a></li>
 
 
-<li class="toctree-l1"><a class="reference internal" href="../../../../core/inference.html">Mass inference</a></li>
+<li class="toctree-l1"><a class="reference internal" href="../../../../core/inference.html">Fast batched inference</a></li>
 
 
 <li class="toctree-l1"><a class="reference internal" href="../../../../core/fine-tuning/fine-tuning-oxides.html">Fine tuning a model</a></li>
@@ -225,7 +225,6 @@
 </ul>
 <p aria-level="2" class="caption" role="heading"><span class="caption-text">Videos and Talks</span></p>
 <ul class="nav bd-sidenav">
-<li class="toctree-l1"><a class="reference internal" href="../../../../videos/intro_series.html">Open Catalyst Intro Series</a></li>
 <li class="toctree-l1"><a class="reference internal" href="../../../../videos/technical_talks.html">Technical presentations</a></li>
 </ul>
 <p aria-level="2" class="caption" role="heading"><span class="caption-text">Case Studies &amp; Tutorials</span></p>

diff --git a/autoapi/ocpmodels/common/flags/index.html b/autoapi/ocpmodels/common/flags/index.html
@@ -216,7 +216,7 @@
 <li class="toctree-l1"><a class="reference internal" href="../../../../core/model_training.html">Training and evaluating custom models on OCP datasets</a></li>
 
 
-<li class="toctree-l1"><a class="reference internal" href="../../../../core/inference.html">Mass inference</a></li>
+<li class="toctree-l1"><a class="reference internal" href="../../../../core/inference.html">Fast batched inference</a></li>
 
 
 <li class="toctree-l1"><a class="reference internal" href="../../../../core/fine-tuning/fine-tuning-oxides.html">Fine tuning a model</a></li>
@@ -225,7 +225,6 @@
 </ul>
 <p aria-level="2" class="caption" role="heading"><span class="caption-text">Videos and Talks</span></p>
 <ul class="nav bd-sidenav">
-<li class="toctree-l1"><a class="reference internal" href="../../../../videos/intro_series.html">Open Catalyst Intro Series</a></li>
 <li class="toctree-l1"><a class="reference internal" href="../../../../videos/technical_talks.html">Technical presentations</a></li>
 </ul>
 <p aria-level="2" class="caption" role="heading"><span class="caption-text">Case Studies &amp; Tutorials</span></p>

diff --git a/autoapi/ocpmodels/common/gp_utils/index.html b/autoapi/ocpmodels/common/gp_utils/index.html
@@ -216,7 +216,7 @@
 <li class="toctree-l1"><a class="reference internal" href="../../../../core/model_training.html">Training and evaluating custom models on OCP datasets</a></li>
 
 
-<li class="toctree-l1"><a class="reference internal" href="../../../../core/inference.html">Mass inference</a></li>
+<li class="toctree-l1"><a class="reference internal" href="../../../../core/inference.html">Fast batched inference</a></li>
 
 
 <li class="toctree-l1"><a class="reference internal" href="../../../../core/fine-tuning/fine-tuning-oxides.html">Fine tuning a model</a></li>
@@ -225,7 +225,6 @@
 </ul>
 <p aria-level="2" class="caption" role="heading"><span class="caption-text">Videos and Talks</span></p>
 <ul class="nav bd-sidenav">
-<li class="toctree-l1"><a class="reference internal" href="../../../../videos/intro_series.html">Open Catalyst Intro Series</a></li>
 <li class="toctree-l1"><a class="reference internal" href="../../../../videos/technical_talks.html">Technical presentations</a></li>
 </ul>
 <p aria-level="2" class="caption" role="heading"><span class="caption-text">Case Studies &amp; Tutorials</span></p>

diff --git a/autoapi/ocpmodels/common/hpo_utils/index.html b/autoapi/ocpmodels/common/hpo_utils/index.html
@@ -216,7 +216,7 @@
 <li class="toctree-l1"><a class="reference internal" href="../../../../core/model_training.html">Training and evaluating custom models on OCP datasets</a></li>
 
 
-<li class="toctree-l1"><a class="reference internal" href="../../../../core/inference.html">Mass inference</a></li>
+<li class="toctree-l1"><a class="reference internal" href="../../../../core/inference.html">Fast batched inference</a></li>
 
 
 <li class="toctree-l1"><a class="reference internal" href="../../../../core/fine-tuning/fine-tuning-oxides.html">Fine tuning a model</a></li>
@@ -225,7 +225,6 @@
 </ul>
 <p aria-level="2" class="caption" role="heading"><span class="caption-text">Videos and Talks</span></p>
 <ul class="nav bd-sidenav">
-<li class="toctree-l1"><a class="reference internal" href="../../../../videos/intro_series.html">Open Catalyst Intro Series</a></li>
 <li class="toctree-l1"><a class="reference internal" href="../../../../videos/technical_talks.html">Technical presentations</a></li>
 </ul>
 <p aria-level="2" class="caption" role="heading"><span class="caption-text">Case Studies &amp; Tutorials</span></p>

diff --git a/autoapi/ocpmodels/common/index.html b/autoapi/ocpmodels/common/index.html
@@ -216,7 +216,7 @@
 <li class="toctree-l1"><a class="reference internal" href="../../../core/model_training.html">Training and evaluating custom models on OCP datasets</a></li>
 
 
-<li class="toctree-l1"><a class="reference internal" href="../../../core/inference.html">Mass inference</a></li>
+<li class="toctree-l1"><a class="reference internal" href="../../../core/inference.html">Fast batched inference</a></li>
 
 
 <li class="toctree-l1"><a class="reference internal" href="../../../core/fine-tuning/fine-tuning-oxides.html">Fine tuning a model</a></li>
@@ -225,7 +225,6 @@
 </ul>
 <p aria-level="2" class="caption" role="heading"><span class="caption-text">Videos and Talks</span></p>
 <ul class="nav bd-sidenav">
-<li class="toctree-l1"><a class="reference internal" href="../../../videos/intro_series.html">Open Catalyst Intro Series</a></li>
 <li class="toctree-l1"><a class="reference internal" href="../../../videos/technical_talks.html">Technical presentations</a></li>
 </ul>
 <p aria-level="2" class="caption" role="heading"><span class="caption-text">Case Studies &amp; Tutorials</span></p>

diff --git a/autoapi/ocpmodels/common/logger/index.html b/autoapi/ocpmodels/common/logger/index.html
@@ -216,7 +216,7 @@
 <li class="toctree-l1"><a class="reference internal" href="../../../../core/model_training.html">Training and evaluating custom models on OCP datasets</a></li>
 
 
-<li class="toctree-l1"><a class="reference internal" href="../../../../core/inference.html">Mass inference</a></li>
+<li class="toctree-l1"><a class="reference internal" href="../../../../core/inference.html">Fast batched inference</a></li>
 
 
 <li class="toctree-l1"><a class="reference internal" href="../../../../core/fine-tuning/fine-tuning-oxides.html">Fine tuning a model</a></li>
@@ -225,7 +225,6 @@
 </ul>
 <p aria-level="2" class="caption" role="heading"><span class="caption-text">Videos and Talks</span></p>
 <ul class="nav bd-sidenav">
-<li class="toctree-l1"><a class="reference internal" href="../../../../videos/intro_series.html">Open Catalyst Intro Series</a></li>
 <li class="toctree-l1"><a class="reference internal" href="../../../../videos/technical_talks.html">Technical presentations</a></li>
 </ul>
 <p aria-level="2" class="caption" role="heading"><span class="caption-text">Case Studies &amp; Tutorials</span></p>