Commit

deploy: dda32e7
zulissimeta committed Apr 14, 2024
1 parent 7f773ef commit 71b71e9
Showing 32 changed files with 1,771 additions and 2,723 deletions.
85 changes: 45 additions & 40 deletions _downloads/5fdddbed2260616231dbf7b0d94bb665/train.txt
@@ -1,17 +1,17 @@
2024-04-14 19:19:53 (INFO): Project root: /home/runner/work/ocp/ocp
2024-04-14 21:25:50 (INFO): Project root: /home/runner/work/ocp/ocp
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch/cuda/amp/grad_scaler.py:126: UserWarning: torch.cuda.amp.GradScaler is enabled, but CUDA is not available. Disabling.
warnings.warn(
2024-04-14 19:19:54 (WARNING): Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.
2024-04-14 19:19:54 (INFO): amp: true
2024-04-14 21:25:51 (WARNING): Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.
2024-04-14 21:25:51 (INFO): amp: true
cmd:
checkpoint_dir: fine-tuning/checkpoints/2024-04-14-19-20-32-ft-oxides
commit: 87e111f
checkpoint_dir: fine-tuning/checkpoints/2024-04-14-21-26-24-ft-oxides
commit: dda32e7
identifier: ft-oxides
logs_dir: fine-tuning/logs/wandb/2024-04-14-19-20-32-ft-oxides
logs_dir: fine-tuning/logs/tensorboard/2024-04-14-21-26-24-ft-oxides
print_every: 10
results_dir: fine-tuning/results/2024-04-14-19-20-32-ft-oxides
results_dir: fine-tuning/results/2024-04-14-21-26-24-ft-oxides
seed: 0
timestamp_id: 2024-04-14-19-20-32-ft-oxides
timestamp_id: 2024-04-14-21-26-24-ft-oxides
dataset:
a2g_args:
r_energy: true
@@ -35,7 +35,7 @@ eval_metrics:
misc:
- energy_forces_within_threshold
gpus: 0
logger: wandb
logger: tensorboard
loss_fns:
- energy:
coefficient: 1
@@ -142,37 +142,42 @@ val_dataset:
r_forces: true
src: val.db

wandb: ERROR api_key not configured (no-tty). call wandb.login(key=[your_api_key])
2024-04-14 21:25:51 (INFO): Loading dataset: ase_db
2024-04-14 21:25:51 (INFO): rank: 0: Sampler created...
2024-04-14 21:25:51 (INFO): Batch balancing is disabled for single GPU training.
2024-04-14 21:25:51 (INFO): rank: 0: Sampler created...
2024-04-14 21:25:51 (INFO): Batch balancing is disabled for single GPU training.
2024-04-14 21:25:51 (INFO): rank: 0: Sampler created...
2024-04-14 21:25:51 (INFO): Batch balancing is disabled for single GPU training.
2024-04-14 21:25:51 (INFO): Loading model: gemnet_oc
2024-04-14 21:25:51 (WARNING): Unrecognized arguments: ['symmetric_edge_symmetrization']
2024-04-14 21:25:54 (INFO): Loaded GemNetOC with 38864438 parameters.
2024-04-14 21:25:54 (WARNING): Model gradient logging to tensorboard not yet supported.
2024-04-14 21:25:54 (WARNING): Using `weight_decay` from `optim` instead of `optim.optimizer_params`.Please update your config to use `optim.optimizer_params.weight_decay`.`optim.weight_decay` will soon be deprecated.
2024-04-14 21:25:54 (INFO): Loading checkpoint from: /tmp/ocp_checkpoints/gnoc_oc22_oc20_all_s2ef.pt
2024-04-14 21:25:54 (INFO): Overwriting scaling factors with those loaded from checkpoint. If you're generating predictions with a pretrained checkpoint, this is the correct behavior. To disable this, delete `scale_dict` from the checkpoint.
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch_geometric/data/collate.py:145: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch_geometric/data/collate.py:145: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch/amp/autocast_mode.py:250: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
warnings.warn(
2024-04-14 21:26:06 (INFO): Evaluating on val.
device 0: 0%| | 0/2 [00:00<?, ?it/s]/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch_geometric/data/collate.py:145: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch_geometric/data/collate.py:145: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
device 0: 50%|█████ | 1/2 [00:04<00:04, 4.27s/it]device 0: 100%|██████████| 2/2 [00:06<00:00, 3.21s/it]device 0: 100%|██████████| 2/2 [00:06<00:00, 3.43s/it]
2024-04-14 21:26:13 (INFO): energy_forces_within_threshold: 0.0000, energy_mae: 2.8244, forcesx_mae: 0.0080, forcesy_mae: 0.0105, forcesz_mae: 0.0081, forces_mae: 0.0089, forces_cosine_similarity: 0.1907, forces_magnitude_error: 0.0127, loss: 2.8302, epoch: 0.0667
Traceback (most recent call last):
File "/home/runner/work/ocp/ocp/main.py", line 89, in <module>
Runner()(config)
File "/home/runner/work/ocp/ocp/main.py", line 34, in __call__
with new_trainer_context(args=args, config=config) as ctx:
File "/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/contextlib.py", line 137, in __enter__
return next(self.gen)
^^^^^^^^^^^^^^
File "/home/runner/work/ocp/ocp/ocpmodels/common/utils.py", line 977, in new_trainer_context
trainer = trainer_cls(
^^^^^^^^^^^^
File "/home/runner/work/ocp/ocp/ocpmodels/trainers/ocp_trainer.py", line 95, in __init__
super().__init__(
File "/home/runner/work/ocp/ocp/ocpmodels/trainers/base_trainer.py", line 176, in __init__
self.load()
File "/home/runner/work/ocp/ocp/ocpmodels/trainers/base_trainer.py", line 197, in load
self.load_logger()
File "/home/runner/work/ocp/ocp/ocpmodels/trainers/base_trainer.py", line 229, in load_logger
self.logger = registry.get_logger_class(logger_name)(self.config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/runner/work/ocp/ocp/ocpmodels/common/logger.py", line 65, in __init__
wandb.init(
File "/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/wandb/sdk/wandb_init.py", line 1200, in init
raise e
File "/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/wandb/sdk/wandb_init.py", line 1177, in init
wi.setup(kwargs)
File "/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/wandb/sdk/wandb_init.py", line 301, in setup
wandb_login._login(
File "/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/wandb/sdk/wandb_login.py", line 334, in _login
wlogin.prompt_api_key()
File "/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/wandb/sdk/wandb_login.py", line 263, in prompt_api_key
raise UsageError("api_key not configured (no-tty). call " + directive)
wandb.errors.UsageError: api_key not configured (no-tty). call wandb.login(key=[your_api_key])
File "/home/runner/work/ocp/ocp/main.py", line 40, in __call__
self.task.run()
File "/home/runner/work/ocp/ocp/ocpmodels/tasks/task.py", line 51, in run
self.trainer.train(
File "/home/runner/work/ocp/ocp/ocpmodels/trainers/ocp_trainer.py", line 200, in train
self.update_best(
File "/home/runner/work/ocp/ocp/ocpmodels/trainers/base_trainer.py", line 667, in update_best
"mae" in primary_metric
TypeError: argument of type 'NoneType' is not iterable
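
With the logger switched from wandb to tensorboard, this run clears the old log's `UsageError` but now stops in `update_best`, where `"mae" in primary_metric` raises because `primary_metric` is `None`. A minimal sketch of the general fix, written as a hypothetical helper (it is not part of ocpmodels, and the config key it reads is an assumption):

```python
def resolve_primary_metric(eval_metrics: dict, default: str = "energy_mae") -> str:
    """Return an explicit primary metric so that membership checks such as
    '"mae" in primary_metric' never operate on None. Hypothetical helper;
    the 'primary_metric' key name is assumed, not confirmed by this log."""
    metric = eval_metrics.get("primary_metric")
    return metric if isinstance(metric, str) else default


print(resolve_primary_metric({}))  # -> energy_mae
```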
148 changes: 148 additions & 0 deletions _downloads/819e10305ddd6839cd7da05935b17060/mass-inference.txt
@@ -0,0 +1,148 @@
2024-04-14 21:28:11 (INFO): Project root: /home/runner/work/ocp/ocp
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torch/cuda/amp/grad_scaler.py:126: UserWarning: torch.cuda.amp.GradScaler is enabled, but CUDA is not available. Disabling.
warnings.warn(
2024-04-14 21:28:13 (WARNING): Detected old config, converting to new format. Consider updating to avoid potential incompatibilities.
2024-04-14 21:28:13 (INFO): amp: true
cmd:
checkpoint_dir: ./checkpoints/2024-04-14-21-28-32
commit: dda32e7
identifier: ''
logs_dir: ./logs/tensorboard/2024-04-14-21-28-32
print_every: 10
results_dir: ./results/2024-04-14-21-28-32
seed: 0
timestamp_id: 2024-04-14-21-28-32
dataset:
a2g_args:
r_energy: false
r_forces: false
format: ase_db
key_mapping:
force: forces
y: energy
select_args:
selection: natoms>5,xc=PBE
src: data.db
eval_metrics:
metrics:
energy:
- mae
forces:
- forcesx_mae
- forcesy_mae
- forcesz_mae
- mae
- cosine_similarity
- magnitude_error
misc:
- energy_forces_within_threshold
gpus: 0
logger: tensorboard
loss_fns:
- energy:
coefficient: 1
fn: mae
- forces:
coefficient: 1
fn: l2mae
model: gemnet_t
model_attributes:
activation: silu
cbf:
name: spherical_harmonics
cutoff: 6.0
direct_forces: true
emb_size_atom: 512
emb_size_bil_trip: 64
emb_size_cbf: 16
emb_size_edge: 512
emb_size_rbf: 16
emb_size_trip: 64
envelope:
exponent: 5
name: polynomial
extensive: true
max_neighbors: 50
num_after_skip: 2
num_atom: 3
num_before_skip: 1
num_blocks: 3
num_concat: 1
num_radial: 128
num_spherical: 7
otf_graph: true
output_init: HeOrthogonal
rbf:
name: gaussian
regress_forces: true
noddp: false
optim:
batch_size: 16
clip_grad_norm: 10
ema_decay: 0.999
energy_coefficient: 1
eval_batch_size: 16
eval_every: 5000
force_coefficient: 1
loss_energy: mae
loss_force: atomwisel2
lr_gamma: 0.8
lr_initial: 0.0005
lr_milestones:
- 64000
- 96000
- 128000
- 160000
- 192000
max_epochs: 80
num_workers: 2
optimizer: AdamW
optimizer_params:
amsgrad: true
warmup_steps: -1
outputs:
energy:
level: system
forces:
eval_on_free_atoms: true
level: atom
train_on_free_atoms: false
slurm: {}
task:
dataset: ase_db
prediction_dtype: float32
test_dataset:
a2g_args:
r_energy: false
r_forces: false
select_args:
selection: natoms>5,xc=PBE
src: data.db
trainer: ocp
val_dataset: null

2024-04-14 21:28:13 (INFO): Loading dataset: ase_db
Traceback (most recent call last):
File "/home/runner/work/ocp/ocp/main.py", line 89, in <module>
Runner()(config)
File "/home/runner/work/ocp/ocp/main.py", line 34, in __call__
with new_trainer_context(args=args, config=config) as ctx:
File "/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/contextlib.py", line 137, in __enter__
return next(self.gen)
^^^^^^^^^^^^^^
File "/home/runner/work/ocp/ocp/ocpmodels/common/utils.py", line 977, in new_trainer_context
trainer = trainer_cls(
^^^^^^^^^^^^
File "/home/runner/work/ocp/ocp/ocpmodels/trainers/ocp_trainer.py", line 95, in __init__
super().__init__(
File "/home/runner/work/ocp/ocp/ocpmodels/trainers/base_trainer.py", line 176, in __init__
self.load()
File "/home/runner/work/ocp/ocp/ocpmodels/trainers/base_trainer.py", line 198, in load
self.load_datasets()
File "/home/runner/work/ocp/ocp/ocpmodels/trainers/base_trainer.py", line 281, in load_datasets
self.train_dataset = registry.get_dataset_class(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/runner/work/ocp/ocp/ocpmodels/datasets/ase_datasets.py", line 114, in __init__
raise ValueError(
ValueError: No valid ase data found!Double check that the src path and/or glob search pattern gives ASE compatible data: data.db
Elapsed time = 3.8 seconds
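
The failure above is a data problem, not a model problem: the `ase_db` dataset loader found nothing usable at `data.db`. A small pre-flight check along these lines (a sketch using the same `ase.db` API the notebooks already rely on) surfaces that before the trainer is even constructed:

```python
from pathlib import Path
import ase.db

db_path = Path("data.db")
if not db_path.exists():
    raise FileNotFoundError("data.db is missing - re-run the download/subsetting cells first")

with ase.db.connect(str(db_path)) as db:
    n_rows = db.count()
    print(f"{n_rows} rows in {db_path}")

if n_rows == 0:
    raise ValueError("data.db exists but holds no rows, so the ase_db dataset cannot load it")
```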
1 change: 1 addition & 0 deletions _sources/core/fine-tuning/fine-tuning-oxides.md
@@ -210,6 +210,7 @@ yml = generate_yml_config(checkpoint_path, 'config.yml',
'task.dataset': 'ase_db',
'optim.eval_every': 1,
'optim.max_epochs': 10,
'logger':'tensorboard', # don't use wandb!
# Train data
'dataset.train.src': 'train.db',
'dataset.train.a2g_args.r_energy': True,
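
For context, a sketch of how the logger override fits into the surrounding call (the two positional arguments come from the hunk header above; the import path and the `update=` keyword are assumptions based on how this tutorial uses `generate_yml_config`, and `checkpoint_path` is defined earlier in the notebook):

```python
from ocpmodels.common.tutorial_utils import generate_yml_config  # import path assumed

yml = generate_yml_config(
    checkpoint_path, 'config.yml',
    update={
        'task.dataset': 'ase_db',
        'optim.eval_every': 1,
        'optim.max_epochs': 10,
        'logger': 'tensorboard',  # avoid wandb: CI runners have no API key or TTY
        'dataset.train.src': 'train.db',
        'dataset.train.a2g_args.r_energy': True,
    },
)
```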
13 changes: 8 additions & 5 deletions _sources/core/inference.md
@@ -26,25 +26,28 @@ You can retrieve the dataset below. In this notebook we learn how to do "mass inference
! [ ! -f data.db ] && wget https://figshare.com/ndownloader/files/11948267 -O data.db
```

```{code-cell} ipython3
! ase db data.db
```


Inference on this file will be fast if we have a GPU, but could take a while without one. To keep things fast for the automated builds, we select just the first few structures so the example is still approachable with only a CPU. A quick sanity check on the resulting subset follows the `ase db` cell below.
Comment out or skip this block to use the whole dataset!

```{code-cell} ipython3
! cp data.db full_data.db
! mv data.db full_data.db
import ase.db
import numpy as np
with ase.db.connect('full_data.db') as full_db:
    with ase.db.connect('data.db') as subset_db:
    with ase.db.connect('data.db', append=False) as subset_db:
        for i in range(1, 10):
            subset_db.write(full_db.get_atoms(i))
```

```{code-cell} ipython3
! ase db data.db
```
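
One thing worth checking after subsetting (a hedged aside, not a diagnosis confirmed by this commit): the mass-inference config filters the database with `select_args: natoms>5,xc=PBE`, so a very small subset can end up with zero matching rows and later trigger the "No valid ase data found" error seen in mass-inference.txt above. A quick preview:

```python
import ase.db

# Preview how many subset rows survive the selection the trainer will apply.
with ase.db.connect('data.db') as db:
    n_match = db.count('natoms>5,xc=PBE')
    print(f"{n_match} of {db.count()} rows match natoms>5,xc=PBE")
```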

You have to choose a checkpoint to start with. The newer checkpoints may require too much memory for this environment.

```{code-cell} ipython3
11 changes: 6 additions & 5 deletions _sources/tutorials/NRR/NRR_example.md
@@ -172,21 +172,22 @@ These steps are embarrassingly parallel, and can be launched that way to speed t

The goal here is to relax each candidate adsorption geometry and save the results in a trajectory file we will analyze later. Each trajectory file will have the geometry and final energy of the relaxed structure.

It is somewhat time consuming to run this, so in this cell we only run one example.
It is somewhat time-consuming to run this, so in this cell we run only one example, using just the first 4 configurations for each adsorbate.

```{code-cell} ipython3
import time
from tqdm import tqdm
tinit = time.time()
for bulk_src_id in tqdm(bulk_ids[1:2]):
# Note we're just doing the first bulk_id!
for bulk_src_id in tqdm(bulk_ids[:1]):
    # Enumerate slabs and establish adsorbates
    bulk = Bulk(bulk_src_id_from_db=bulk_src_id, bulk_db_path="NRR_example_bulks.pkl")
    slab = Slab.from_bulk_get_specific_millers(bulk=bulk, specific_millers=(1, 1, 1))
    # Perform heuristic placements
    heuristic_adslabs_H = AdsorbateSlabConfig(slab[0], adsorbate_H, mode="heuristic")
    heuristic_adslabs_NNH = AdsorbateSlabConfig(slab[0], adsorbate_NNH, mode="heuristic")
    # Perform heuristic placements, note just 4 configs!
    heuristic_adslabs_H = AdsorbateSlabConfig(slab[0], adsorbate_H, mode="heuristic")[:4]
    heuristic_adslabs_NNH = AdsorbateSlabConfig(slab[0], adsorbate_NNH, mode="heuristic")[:4]
    # Run relaxations
    os.makedirs(f"data/{bulk_src_id}_H", exist_ok=True)
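    # Hedged aside: slicing AdsorbateSlabConfig(...)[:4] above assumes the config
    # object supports indexing. If your installed ocp/fairchem version does not,
    # an alternative (also an assumption worth checking) is to slice the generated
    # structures, which are typically exposed as an .atoms_list attribute:
    #   heuristic_adslabs_H = AdsorbateSlabConfig(slab[0], adsorbate_H, mode="heuristic").atoms_list[:4]
    #   heuristic_adslabs_NNH = AdsorbateSlabConfig(slab[0], adsorbate_NNH, mode="heuristic").atoms_list[:4]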
2 changes: 1 addition & 1 deletion _sources/tutorials/advanced/embeddings.md
@@ -320,5 +320,5 @@ found.get_distance(0, 2), found.get_distance(1, 2)

```{code-cell} ipython3
from ase.visualize.plot import plot_atoms
plot_atoms(found);
plot_atoms(found)
```
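
As a follow-up to the cell above (a sketch: it assumes the `found` Atoms object from the earlier cells and that matplotlib is installed, which `plot_atoms` requires), the call returns a Matplotlib Axes, so it can be captured to tweak or save the rendering instead of relying on inline display.

```python
import matplotlib.pyplot as plt
from ase.visualize.plot import plot_atoms

ax = plot_atoms(found)                   # returns the Matplotlib Axes it drew on
ax.set_title("matched structure")        # illustrative only
ax.figure.savefig("found.png", dpi=150)  # save the rendering to a file
plt.close(ax.figure)
```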