Commit 8637ae6

deploy: 7dca78f

zulissimeta committed Apr 15, 2024
1 parent 4bd0010 commit 8637ae6
Showing 28 changed files with 1,501 additions and 2,319 deletions.
52 changes: 26 additions & 26 deletions _downloads/5fdddbed2260616231dbf7b0d94bb665/train.txt
@@ -1,17 +1,17 @@
-2024-04-15 00:36:16 (INFO): Project root: /home/runner/work/ocp/ocp
+2024-04-15 02:37:54 (INFO): Project root: /home/runner/work/ocp/ocp
 /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch/cuda/amp/grad_scaler.py:126: UserWarning: torch.cuda.amp.GradScaler is enabled, but CUDA is not available. Disabling.
   warnings.warn(
-2024-04-15 00:36:17 (WARNING): Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.
-2024-04-15 00:36:17 (INFO): amp: true
+2024-04-15 02:37:56 (WARNING): Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.
+2024-04-15 02:37:56 (INFO): amp: true
 cmd:
-  checkpoint_dir: fine-tuning/checkpoints/2024-04-15-00-36-16-ft-oxides
-  commit: 548b302
+  checkpoint_dir: fine-tuning/checkpoints/2024-04-15-02-37-52-ft-oxides
+  commit: 7dca78f
   identifier: ft-oxides
-  logs_dir: fine-tuning/logs/tensorboard/2024-04-15-00-36-16-ft-oxides
+  logs_dir: fine-tuning/logs/tensorboard/2024-04-15-02-37-52-ft-oxides
   print_every: 10
-  results_dir: fine-tuning/results/2024-04-15-00-36-16-ft-oxides
+  results_dir: fine-tuning/results/2024-04-15-02-37-52-ft-oxides
   seed: 0
-  timestamp_id: 2024-04-15-00-36-16-ft-oxides
+  timestamp_id: 2024-04-15-02-37-52-ft-oxides
 dataset:
   a2g_args:
     r_energy: true
@@ -111,7 +111,7 @@ optim:
   load_balancing: atoms
   loss_energy: mae
   lr_initial: 0.0005
-  max_epochs: 10
+  max_epochs: 50
   mode: min
   num_workers: 2
   optimizer: AdamW
@@ -142,33 +142,33 @@ val_dataset:
     r_forces: true
   src: val.db
 
-2024-04-15 00:36:17 (INFO): Loading dataset: ase_db
-2024-04-15 00:36:17 (INFO): rank: 0: Sampler created...
-2024-04-15 00:36:17 (INFO): Batch balancing is disabled for single GPU training.
-2024-04-15 00:36:17 (INFO): rank: 0: Sampler created...
-2024-04-15 00:36:17 (INFO): Batch balancing is disabled for single GPU training.
-2024-04-15 00:36:17 (INFO): rank: 0: Sampler created...
-2024-04-15 00:36:17 (INFO): Batch balancing is disabled for single GPU training.
-2024-04-15 00:36:17 (INFO): Loading model: gemnet_oc
-2024-04-15 00:36:17 (WARNING): Unrecognized arguments: ['symmetric_edge_symmetrization']
-2024-04-15 00:36:20 (INFO): Loaded GemNetOC with 38864438 parameters.
-2024-04-15 00:36:20 (WARNING): Model gradient logging to tensorboard not yet supported.
-2024-04-15 00:36:20 (WARNING): Using `weight_decay` from `optim` instead of `optim.optimizer_params`.Please update your config to use `optim.optimizer_params.weight_decay`.`optim.weight_decay` will soon be deprecated.
-2024-04-15 00:36:20 (INFO): Loading checkpoint from: /tmp/ocp_checkpoints/gnoc_oc22_oc20_all_s2ef.pt
-2024-04-15 00:36:20 (INFO): Overwriting scaling factors with those loaded from checkpoint. If you're generating predictions with a pretrained checkpoint, this is the correct behavior. To disable this, delete `scale_dict` from the checkpoint.
+2024-04-15 02:37:56 (INFO): Loading dataset: ase_db
+2024-04-15 02:37:56 (INFO): rank: 0: Sampler created...
+2024-04-15 02:37:56 (INFO): Batch balancing is disabled for single GPU training.
+2024-04-15 02:37:56 (INFO): rank: 0: Sampler created...
+2024-04-15 02:37:56 (INFO): Batch balancing is disabled for single GPU training.
+2024-04-15 02:37:56 (INFO): rank: 0: Sampler created...
+2024-04-15 02:37:56 (INFO): Batch balancing is disabled for single GPU training.
+2024-04-15 02:37:56 (INFO): Loading model: gemnet_oc
+2024-04-15 02:37:56 (WARNING): Unrecognized arguments: ['symmetric_edge_symmetrization']
+2024-04-15 02:37:58 (INFO): Loaded GemNetOC with 38864438 parameters.
+2024-04-15 02:37:58 (WARNING): Model gradient logging to tensorboard not yet supported.
+2024-04-15 02:37:58 (WARNING): Using `weight_decay` from `optim` instead of `optim.optimizer_params`.Please update your config to use `optim.optimizer_params.weight_decay`.`optim.weight_decay` will soon be deprecated.
+2024-04-15 02:37:59 (INFO): Loading checkpoint from: /tmp/ocp_checkpoints/gnoc_oc22_oc20_all_s2ef.pt
+2024-04-15 02:37:59 (INFO): Overwriting scaling factors with those loaded from checkpoint. If you're generating predictions with a pretrained checkpoint, this is the correct behavior. To disable this, delete `scale_dict` from the checkpoint.
 /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch_geometric/data/collate.py:145: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
   storage = elem.storage()._new_shared(numel)
 /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch_geometric/data/collate.py:145: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
   storage = elem.storage()._new_shared(numel)
 /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch/amp/autocast_mode.py:250: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
   warnings.warn(
-2024-04-15 00:36:32 (INFO): Evaluating on val.
+2024-04-15 02:38:11 (INFO): Evaluating on val.
 device 0:   0%|          | 0/2 [00:00<?, ?it/s]/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch_geometric/data/collate.py:145: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
   storage = elem.storage()._new_shared(numel)
 /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch_geometric/data/collate.py:145: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
   storage = elem.storage()._new_shared(numel)
-device 0:  50%|█████     | 1/2 [00:04<00:04,  4.08s/it]device 0: 100%|██████████| 2/2 [00:06<00:00,  3.21s/it]device 0: 100%|██████████| 2/2 [00:06<00:00,  3.39s/it]
-2024-04-15 00:36:39 (INFO): energy_forces_within_threshold: 0.0000, energy_mae: 2.8244, forcesx_mae: 0.0080, forcesy_mae: 0.0105, forcesz_mae: 0.0081, forces_mae: 0.0089, forces_cosine_similarity: 0.1907, forces_magnitude_error: 0.0127, loss: 2.8302, epoch: 0.0667
+device 0:  50%|█████     | 1/2 [00:04<00:04,  4.21s/it]device 0: 100%|██████████| 2/2 [00:06<00:00,  3.22s/it]device 0: 100%|██████████| 2/2 [00:06<00:00,  3.42s/it]
+2024-04-15 02:38:18 (INFO): energy_forces_within_threshold: 0.0000, energy_mae: 2.8244, forcesx_mae: 0.0080, forcesy_mae: 0.0105, forcesz_mae: 0.0081, forces_mae: 0.0089, forces_cosine_similarity: 0.1907, forces_magnitude_error: 0.0127, loss: 2.8302, epoch: 0.0667
 Traceback (most recent call last):
   File "/home/runner/work/ocp/ocp/main.py", line 89, in <module>
     Runner()(config)
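The `weight_decay` deprecation warning in this log suggests a small config migration. A minimal sketch of the recommended layout, reusing the `optim` values shown in the log above; the `weight_decay` value itself is a hypothetical placeholder, not taken from this commit:

```yaml
# Sketch only: move weight_decay under optim.optimizer_params,
# as the deprecation warning in the log recommends.
optim:
  optimizer: AdamW
  lr_initial: 0.0005
  max_epochs: 50
  optimizer_params:
    weight_decay: 0.001  # hypothetical value; set your own
```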
2 changes: 1 addition & 1 deletion _sources/core/fine-tuning/fine-tuning-oxides.md
@@ -209,7 +209,7 @@ yml = generate_yml_config(checkpoint_path, 'config.yml',
                update={'gpus': 1,
                        'task.dataset': 'ase_db',
                        'optim.eval_every': 1,
-                       'optim.max_epochs': 10,
+                       'optim.max_epochs': 50,
                        'logger':'tensorboard', # don't use wandb!
                        # Train data
                        'dataset.train.src': 'train.db',
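For context, the call this hunk modifies generates the fine-tuning YAML from a pretrained checkpoint. A sketch under stated assumptions: the import path for `generate_yml_config` is assumed from this repository's tutorial utilities (adjust it to your checkout), and the checkpoint path is the one logged in train.txt above:

```python
# Sketch of the config generation updated by this hunk.
# Assumption: generate_yml_config lives in the tutorial utilities;
# the call signature below matches the source shown in the diff.
from ocpmodels.common.tutorial_utils import generate_yml_config

checkpoint_path = '/tmp/ocp_checkpoints/gnoc_oc22_oc20_all_s2ef.pt'

yml = generate_yml_config(checkpoint_path, 'config.yml',
                          update={'gpus': 1,
                                  'task.dataset': 'ase_db',
                                  'optim.eval_every': 1,
                                  'optim.max_epochs': 50,  # raised from 10 by this commit
                                  'logger': 'tensorboard',  # don't use wandb!
                                  # Train data
                                  'dataset.train.src': 'train.db'})
```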
5 changes: 2 additions & 3 deletions _sources/core/inference.md
@@ -35,13 +35,12 @@ Comment or skip this block to use the whole dataset!
 ! mv data.db full_data.db
 import ase.db
-import random
+import numpy as np
 with ase.db.connect('full_data.db') as full_db:
     with ase.db.connect('data.db',append=False) as subset_db:
         # Select 50 random points for the subset!
-        for i in random.shuffled(range(1,len(full_db)))[:50]:
+        for i in np.random.choice(range(1,len(full_db)),size=50,replace=False):
             subset_db.write(full_db.get_atoms(i))
 ```
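The change above replaces `random.shuffled`, which does not exist in Python's `random` module (the old line would raise `AttributeError`), with `np.random.choice`. A self-contained sketch of the corrected logic, assuming `full_data.db` already exists:

```python
import ase.db
import numpy as np

# Select 50 distinct rows at random; ase.db row ids start at 1.
# This mirrors the committed code, which samples from
# range(1, len(full_db)) and therefore never picks the last row.
with ase.db.connect('full_data.db') as full_db:
    with ase.db.connect('data.db', append=False) as subset_db:
        ids = np.random.choice(range(1, len(full_db)), size=50, replace=False)
        for i in ids:
            subset_db.write(full_db.get_atoms(int(i)))  # cast numpy int for ase.db
```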
10 changes: 5 additions & 5 deletions core/fine-tuning/fine-tuning-oxides.html
@@ -769,7 +769,7 @@ <h1>Fine tuning a model<a class="headerlink" href="#fine-tuning-a-model" title="
 warnings.warn(
 </pre></div>
 </div>
-<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>Elapsed time 67.4 seconds.
+<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>Elapsed time 67.6 seconds.
 </pre></div>
 </div>
 <img alt="../../_images/92bd7f94dd548c8cfc2744eb5890cd23fada1ff98e8dc907657e2eb109af0402.png" src="../../_images/92bd7f94dd548c8cfc2744eb5890cd23fada1ff98e8dc907657e2eb109af0402.png" />
@@ -921,7 +921,7 @@ <h2>Setting up the configuration yaml file<a class="headerlink" href="#setting-u
 <span class="n">update</span><span class="o">=</span><span class="p">{</span><span class="s1">&#39;gpus&#39;</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span>
 <span class="s1">&#39;task.dataset&#39;</span><span class="p">:</span> <span class="s1">&#39;ase_db&#39;</span><span class="p">,</span>
 <span class="s1">&#39;optim.eval_every&#39;</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span>
-<span class="s1">&#39;optim.max_epochs&#39;</span><span class="p">:</span> <span class="mi">10</span><span class="p">,</span>
+<span class="s1">&#39;optim.max_epochs&#39;</span><span class="p">:</span> <span class="mi">50</span><span class="p">,</span>
 <span class="s1">&#39;logger&#39;</span><span class="p">:</span><span class="s1">&#39;tensorboard&#39;</span><span class="p">,</span> <span class="c1"># don&#39;t use wandb!</span>
 <span class="c1"># Train data</span>
 <span class="s1">&#39;dataset.train.src&#39;</span><span class="p">:</span> <span class="s1">&#39;train.db&#39;</span><span class="p">,</span>
@@ -1074,7 +1074,7 @@ <h2>Setting up the configuration yaml file<a class="headerlink" href="#setting-u
 load_balancing: atoms
 loss_energy: mae
 lr_initial: 0.0005
-max_epochs: 10
+max_epochs: 50
 mode: min
 num_workers: 2
 optimizer: AdamW
@@ -1130,7 +1130,7 @@ <h2>Running the training job<a class="headerlink" href="#running-the-training-jo
 <span class="expanded">Hide code cell output</span>
 </summary>
 <div class="cell_output docutils container">
-<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>Elapsed time = 26.5 seconds
+<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>Elapsed time = 27.2 seconds
 </pre></div>
 </div>
 </div>
@@ -1146,7 +1146,7 @@ <h2>Running the training job<a class="headerlink" href="#running-the-training-jo
 </div>
 </div>
 <div class="cell_output docutils container">
-<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>&#39;fine-tuning/checkpoints/2024-04-15-00-36-16-ft-oxides&#39;
+<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>&#39;fine-tuning/checkpoints/2024-04-15-02-37-52-ft-oxides&#39;
 </pre></div>
 </div>
 </div>
24 changes: 12 additions & 12 deletions core/gotchas.html
@@ -929,7 +929,7 @@ <h1>I get wildly different energies from the different models<a class="headerlin
 warnings.warn(
 </pre></div>
 </div>
-<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>1.6814193725585938
+<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>1.6757900714874268
 </pre></div>
 </div>
 </div>
@@ -1433,7 +1433,7 @@ <h1>To tag or not?<a class="headerlink" href="#to-tag-or-not" title="Link to thi
 warnings.warn(
 </pre></div>
 </div>
-<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>-0.42973700165748596
+<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>-0.42973676323890686
 </pre></div>
 </div>
 </div>
@@ -1483,17 +1483,17 @@ <h1>Stochastic simulation results<a class="headerlink" href="#stochastic-simulat
 warnings.warn(
 </pre></div>
 </div>
-<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>1.2139861106872558 1.912705528271826e-06
-1.2139849662780762
-1.213987112045288
+<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>1.2139864921569825 1.5713936856540758e-06
+1.2139854431152344
+1.2139854431152344
+1.2139859199523926
+1.2139880657196045
+1.2139849662780762
+1.2139861583709717
+1.213982343673706
+1.2139842510223389
+1.2139878273010254
+1.2139880657196045
+1.2139852046966553
+1.2139892578125
+1.213987112045288
+1.213986873626709
+1.2139854431152344
 </pre></div>
 </div>
 </div>
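The hunk above only refreshes the printed energies: repeated single-point calls on the same structure scatter by roughly 1e-6 eV. A hedged sketch of how such a repeatability check might be run with an ASE-style OCP calculator; the `OCPCalculator` import path and constructor keyword vary across versions, and the test structure is a hypothetical placeholder, none of it taken from this commit:

```python
import numpy as np
from ase.build import fcc111, add_adsorbate
# Assumption: import path and keyword names follow the ocp repo layout
# of this era; adjust to your installed version.
from ocpmodels.common.relaxation.ase_utils import OCPCalculator

calc = OCPCalculator(checkpoint='/tmp/ocp_checkpoints/gnoc_oc22_oc20_all_s2ef.pt', cpu=True)

# Hypothetical test structure; any slab + adsorbate works.
slab = fcc111('Cu', size=(2, 2, 3), vacuum=10.0)
add_adsorbate(slab, 'O', height=1.2, position='fcc')

energies = []
for _ in range(10):
    atoms = slab.copy()  # fresh copy so ASE does not return a cached energy
    atoms.calc = calc
    energies.append(atoms.get_potential_energy())

print(np.mean(energies), np.std(energies))  # expect a spread of roughly 1e-6 eV
```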
@@ -1536,7 +1536,7 @@ <h1>The forces don’t sum to zero<a class="headerlink" href="#the-forces-don-t-
 warnings.warn(
 </pre></div>
 </div>
-<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>array([ 0.00847746, 0.01409757, -0.05882728], dtype=float32)
+<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>array([ 0.00848202, 0.01409698, -0.05882537], dtype=float32)
 </pre></div>
 </div>
 </div>
@@ -1549,7 +1549,7 @@ <h1>The forces don’t sum to zero<a class="headerlink" href="#the-forces-don-t-
 </div>
 </div>
 <div class="cell_output docutils container">
-<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>array([ 1.1449447e-07, 2.2817403e-08, -5.9604645e-07], dtype=float32)
+<div class="output text_plain highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>array([ 8.5332431e-08, 9.1269612e-08, -2.3841858e-07], dtype=float32)
 </pre></div>
 </div>
 </div>
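The two outputs above show a net force of about 1e-2 eV/Å from the raw prediction and about 1e-7 after correction, consistent with removing the mean force per structure. A minimal sketch of that correction, assuming `forces` is the (N, 3) array returned by the calculator; the numbers here are hypothetical:

```python
import numpy as np

# Hypothetical (N, 3) predicted forces; direct-force models do not
# guarantee that the forces on a structure sum exactly to zero.
forces = np.array([[ 0.10, -0.02,  0.01],
                   [-0.05,  0.03,  0.02],
                   [-0.04, -0.01, -0.09]], dtype=np.float32)

print(forces.sum(axis=0))  # net force: not exactly zero

# Subtract the mean force so the net translational force vanishes.
corrected = forces - forces.mean(axis=0)
print(corrected.sum(axis=0))  # ~1e-7, limited by float32 precision
```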