[r3.0] release v3.0.1 #4487

Merged: 13 commits, Dec 23, 2024
32 changes: 16 additions & 16 deletions CITATIONS.bib
@@ -128,26 +128,26 @@ @article{Zhang_NpjComputMater_2024_v10_p94
doi = {10.1038/s41524-024-01278-7},
}

- @misc{Zhang_2023_DPA2,
+ @article{Zhang_npjComputMater_2024_v10_p293,
annote = {DPA-2},
author = {
Duo Zhang and Xinzijian Liu and Xiangyu Zhang and Chengqian Zhang and Chun
- Cai and Hangrui Bi and Yiming Du and Xuejian Qin and Jiameng Huang and
- Bowen Li and Yifan Shan and Jinzhe Zeng and Yuzhi Zhang and Siyuan Liu and
- Yifan Li and Junhan Chang and Xinyan Wang and Shuo Zhou and Jianchuan Liu
- and Xiaoshan Luo and Zhenyu Wang and Wanrun Jiang and Jing Wu and Yudi Yang
- and Jiyuan Yang and Manyi Yang and Fu-Qiang Gong and Linshuang Zhang and
- Mengchao Shi and Fu-Zhi Dai and Darrin M. York and Shi Liu and Tong Zhu and
- Zhicheng Zhong and Jian Lv and Jun Cheng and Weile Jia and Mohan Chen and
- Guolin Ke and Weinan E and Linfeng Zhang and Han Wang
+ Cai and Hangrui Bi and Yiming Du and Xuejian Qin and Anyang Peng and
+ Jiameng Huang and Bowen Li and Yifan Shan and Jinzhe Zeng and Yuzhi Zhang
+ and Siyuan Liu and Yifan Li and Junhan Chang and Xinyan Wang and Shuo Zhou
+ and Jianchuan Liu and Xiaoshan Luo and Zhenyu Wang and Wanrun Jiang and
+ Jing Wu and Yudi Yang and Jiyuan Yang and Manyi Yang and Fu-Qiang Gong and
+ Linshuang Zhang and Mengchao Shi and Fu-Zhi Dai and Darrin M. York and Shi
+ Liu and Tong Zhu and Zhicheng Zhong and Jian Lv and Jun Cheng and Weile Jia
+ and Mohan Chen and Guolin Ke and Weinan E and Linfeng Zhang and Han Wang
},
- title = {
- {DPA-2: Towards a universal large atomic model for molecular and material
- simulation}
- },
- publisher = {arXiv},
- year = 2023,
- doi = {10.48550/arXiv.2312.15492},
+ title = {{DPA-2: a large atomic model as a multi-task learner}},
+ journal = {npj Comput. Mater},
+ year = 2024,
+ volume = 10,
+ number = 1,
+ pages = 293,
+ doi = {10.1038/s41524-024-01493-2},
}

@article{Zhang_PhysPlasmas_2020_v27_p122704,
11 changes: 11 additions & 0 deletions deepmd/dpmodel/atomic_model/pairtab_atomic_model.py
@@ -425,3 +425,14 @@
If False, the shape is (nframes, nloc, ndim).
"""
return False

def enable_compression(
self,
min_nbor_dist: float,
table_extrapolate: float = 5,
table_stride_1: float = 0.01,
table_stride_2: float = 0.1,
check_frequency: int = -1,
) -> None:
"""Pairtab model does not support compression."""
pass

Codecov / codecov/patch warning: added line deepmd/dpmodel/atomic_model/pairtab_atomic_model.py#L438 was not covered by tests.
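The `enable_compression` override added above (and its PyTorch counterpart below) makes the tabulated pair-style model a deliberate no-op under model compression, so callers can invoke the method uniformly on any atomic model. A minimal sketch of that calling pattern, with `compress_all` and `models` as illustrative names rather than code from this PR:

```python
def compress_all(models, min_nbor_dist: float) -> None:
    """Ask every sub-model to compress itself if it supports compression."""
    for model in models:
        # Pairtab models now expose enable_compression() as a harmless no-op,
        # so no isinstance() special-casing is needed at the call site.
        model.enable_compression(min_nbor_dist)
```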
7 changes: 6 additions & 1 deletion deepmd/dpmodel/descriptor/dpa2.py
@@ -387,7 +387,7 @@ def __init__(
use_tebd_bias: bool = False,
type_map: Optional[list[str]] = None,
) -> None:
r"""The DPA-2 descriptor. see https://arxiv.org/abs/2312.15492.
r"""The DPA-2 descriptor[1]_.
Parameters
----------
@@ -434,6 +434,11 @@ def __init__(
sw: torch.Tensor
The switch function for decaying inverse distance.
References
----------
.. [1] Zhang, D., Liu, X., Zhang, X. et al. DPA-2: a
large atomic model as a multi-task learner. npj
Comput Mater 10, 293 (2024). https://doi.org/10.1038/s41524-024-01493-2
"""

def init_subclass_params(sub_data, sub_class):
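The docstring now cites the published DPA-2 paper through a numpydoc-style reference rather than a bare arXiv link. A generic sketch of how the `[1]_` marker in the summary line pairs with the `References` section (a toy function, not the descriptor class itself):

```python
def example_descriptor() -> None:
    r"""Build a descriptor following the published method [1]_.

    References
    ----------
    .. [1] Zhang, D., Liu, X., Zhang, X. et al. DPA-2: a large atomic model
       as a multi-task learner. npj Comput Mater 10, 293 (2024).
       https://doi.org/10.1038/s41524-024-01493-2
    """
```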
11 changes: 11 additions & 0 deletions deepmd/pt/model/atomic_model/pairtab_atomic_model.py
@@ -484,3 +484,14 @@
If False, the shape is (nframes, nloc, ndim).
"""
return False

def enable_compression(
self,
min_nbor_dist: float,
table_extrapolate: float = 5,
table_stride_1: float = 0.01,
table_stride_2: float = 0.1,
check_frequency: int = -1,
) -> None:
"""Pairtab model does not support compression."""
pass

Codecov / codecov/patch warning: added line deepmd/pt/model/atomic_model/pairtab_atomic_model.py#L497 was not covered by tests.
7 changes: 6 additions & 1 deletion deepmd/pt/model/descriptor/dpa2.py
@@ -100,7 +100,7 @@ def __init__(
use_tebd_bias: bool = False,
type_map: Optional[list[str]] = None,
) -> None:
r"""The DPA-2 descriptor. see https://arxiv.org/abs/2312.15492.
r"""The DPA-2 descriptor[1]_.

Parameters
----------
@@ -147,6 +147,11 @@ def __init__(
sw: torch.Tensor
The switch function for decaying inverse distance.

References
----------
.. [1] Zhang, D., Liu, X., Zhang, X. et al. DPA-2: a
large atomic model as a multi-task learner. npj
Comput Mater 10, 293 (2024). https://doi.org/10.1038/s41524-024-01493-2
"""
super().__init__()

4 changes: 3 additions & 1 deletion deepmd/pt/model/model/__init__.py
@@ -196,7 +196,7 @@ def get_zbl_model(model_params):
rmax = model_params["sw_rmax"]
atom_exclude_types = model_params.get("atom_exclude_types", [])
pair_exclude_types = model_params.get("pair_exclude_types", [])
- return DPZBLModel(
+ model = DPZBLModel(
dp_model,
pt_model,
rmin,
@@ -205,6 +205,8 @@
atom_exclude_types=atom_exclude_types,
pair_exclude_types=pair_exclude_types,
)
model.model_def_script = json.dumps(model_params)
return model


def _can_be_converted_to_float(value) -> Optional[bool]:
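Storing `model_def_script` on the ZBL model keeps the JSON form of the input parameters attached to the model object, so it can be recovered later (for example when the model is frozen or re-loaded). A rough sketch of the idea, with `DummyModel` and `build_model` as stand-ins for the real classes:

```python
import json


class DummyModel:
    """Stand-in for a model object that remembers its defining JSON."""

    model_def_script: str = ""


def build_model(model_params: dict) -> DummyModel:
    model = DummyModel()
    # Keep the full parameter dict with the model so downstream tools can
    # reconstruct exactly how it was defined.
    model.model_def_script = json.dumps(model_params)
    return model


model = build_model({"type": "zbl", "sw_rmin": 0.5, "sw_rmax": 1.0})
print(json.loads(model.model_def_script)["sw_rmax"])  # 1.0
```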
42 changes: 16 additions & 26 deletions deepmd/pt/train/training.py
@@ -579,7 +579,7 @@ def warm_up_linear(step, warmup_steps):
# author: iProzd
if self.opt_type == "Adam":
self.optimizer = torch.optim.Adam(
- self.wrapper.parameters(), lr=self.lr_exp.start_lr
+ self.wrapper.parameters(), lr=self.lr_exp.start_lr, fused=True
)
if optimizer_state_dict is not None and self.restart_training:
self.optimizer.load_state_dict(optimizer_state_dict)
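`fused=True` requests PyTorch's fused Adam kernel, which updates all parameters in one kernel launch instead of looping tensor by tensor. A standalone sketch of the option (toy model; the fused implementation is generally available for floating-point parameters on CUDA, so the flag is gated on device here):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(4, 4).to(device)
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-3,
    fused=(device == "cuda"),  # fall back to the default implementation on CPU
)
loss = model(torch.randn(2, 4, device=device)).sum()
loss.backward()
optimizer.step()
```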
@@ -653,10 +653,15 @@ def run(self) -> None:
prof.start()

def step(_step_id, task_key="Default") -> None:
+ if self.multi_task:
+ model_index = dp_random.choice(
+ np.arange(self.num_model, dtype=np.int_),
+ p=self.model_prob,
+ )
+ task_key = self.model_keys[model_index]
# PyTorch Profiler
if self.enable_profiler or self.profiling:
prof.step()
- self.wrapper.train()
if isinstance(self.lr_exp, dict):
_lr = self.lr_exp[task_key]
else:
@@ -682,12 +687,11 @@ def step(_step_id, task_key="Default") -> None:
)
loss.backward()
if self.gradient_max_norm > 0.0:
- grad_norm = torch.nn.utils.clip_grad_norm_(
- self.wrapper.parameters(), self.gradient_max_norm
+ torch.nn.utils.clip_grad_norm_(
+ self.wrapper.parameters(),
+ self.gradient_max_norm,
+ error_if_nonfinite=True,
)
- if not torch.isfinite(grad_norm).all():
- # check local gradnorm single GPU case, trigger NanDetector
- raise FloatingPointError("gradients are Nan/Inf")
with torch.device("cpu"):
self.optimizer.step()
self.scheduler.step()
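The manual `torch.isfinite` check on the returned norm is replaced by `error_if_nonfinite=True`, which makes `clip_grad_norm_` itself raise a `RuntimeError` when the total gradient norm is NaN or Inf. A small self-contained illustration of that behavior with a toy parameter:

```python
import torch

p = torch.nn.Parameter(torch.ones(3))
p.grad = torch.tensor([1.0, float("nan"), 2.0])  # deliberately non-finite

try:
    torch.nn.utils.clip_grad_norm_([p], max_norm=1.0, error_if_nonfinite=True)
except RuntimeError as err:
    print("caught non-finite gradients:", err)
```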
@@ -766,7 +770,7 @@ def fake_model():
if self.display_in_training and (
display_step_id % self.disp_freq == 0 or display_step_id == 1
):
- self.wrapper.eval()
+ self.wrapper.eval()  # Will set to train mode before finishing validation

def log_loss_train(_loss, _more_loss, _task_key="Default"):
results = {}
@@ -872,6 +876,7 @@ def log_loss_valid(_task_key="Default"):
learning_rate=None,
)
)
self.wrapper.train()

current_time = time.time()
train_time = current_time - self.t0
@@ -927,26 +932,11 @@
f"{task_key}/{item}", more_loss[item], display_step_id
)

- self.wrapper.train()
self.t0 = time.time()
self.total_train_time = 0.0
- for step_id in range(self.num_steps):
- if step_id < self.start_step:
- continue
- if self.multi_task:
- chosen_index_list = dp_random.choice(
- np.arange(
- self.num_model, dtype=np.int32
- ), # int32 should be enough for # models...
- p=np.array(self.model_prob),
- size=self.world_size,
- replace=True,
- )
- assert chosen_index_list.size == self.world_size
- model_index = chosen_index_list[self.rank]
- model_key = self.model_keys[model_index]
- else:
- model_key = "Default"
- step(step_id, model_key)
+ for step_id in range(self.start_step, self.num_steps):
+ step(step_id)
if JIT:
break
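Together with the change inside `step()` above, the per-step multi-task selection now lives in one place and the restart logic collapses into `range(self.start_step, self.num_steps)`. A simplified sketch of the resulting control flow, with `choose_task`, `model_keys`, and the step counts as illustrative values:

```python
import numpy as np

rng = np.random.default_rng(0)
model_keys = ["water", "metal"]
model_prob = [0.7, 0.3]


def choose_task() -> str:
    # One task is drawn per training step according to model_prob.
    return model_keys[rng.choice(len(model_keys), p=model_prob)]


def step(step_id: int) -> None:
    print(f"step {step_id}: training task {choose_task()!r}")


start_step, num_steps = 3, 6  # resuming from a checkpoint saved at step 3
for step_id in range(start_step, num_steps):
    step(step_id)
```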

30 changes: 17 additions & 13 deletions deepmd/pt/utils/dataloader.py
@@ -28,6 +28,7 @@
)

from deepmd.pt.utils import (
dp_random,
env,
)
from deepmd.pt.utils.dataset import (
@@ -50,6 +51,7 @@ def setup_seed(seed) -> None:
torch.manual_seed(seed)
torch.cuda.manual_seed_all(seed)
torch.backends.cudnn.deterministic = True
dp_random.seed(seed)
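`setup_seed` previously seeded only the PyTorch generators; seeding `dp_random` as well keeps DeePMD-kit's NumPy-backed sampling reproducible under the same seed. A generic sketch of seeding everything a run touches (plain `numpy.random.seed` stands in for `dp_random.seed`):

```python
import numpy as np
import torch


def setup_seed_sketch(seed: int) -> None:
    """Seed every RNG a training run may touch."""
    torch.manual_seed(seed)            # CPU and default CUDA generators
    torch.cuda.manual_seed_all(seed)   # all CUDA devices (no-op without CUDA)
    torch.backends.cudnn.deterministic = True
    np.random.seed(seed)               # stand-in for dp_random.seed(seed)


setup_seed_sketch(42)
```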


class DpLoaderSet(Dataset):
@@ -185,19 +187,21 @@ def print_summary(
name: str,
prob: list[float],
) -> None:
- print_summary(
- name,
- len(self.systems),
- [ss.system for ss in self.systems],
- [ss._natoms for ss in self.systems],
- self.batch_sizes,
- [
- ss._data_system.get_sys_numb_batch(self.batch_sizes[ii])
- for ii, ss in enumerate(self.systems)
- ],
- prob,
- [ss._data_system.pbc for ss in self.systems],
- )
+ rank = dist.get_rank() if dist.is_initialized() else 0
+ if rank == 0:
+ print_summary(
+ name,
+ len(self.systems),
+ [ss.system for ss in self.systems],
+ [ss._natoms for ss in self.systems],
+ self.batch_sizes,
+ [
+ ss._data_system.get_sys_numb_batch(self.batch_sizes[ii])
+ for ii, ss in enumerate(self.systems)
+ ],
+ prob,
+ [ss._data_system.pbc for ss in self.systems],
+ )


_sentinel = object()
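Guarding `print_summary` behind rank 0 keeps distributed runs from printing the data summary once per process. The pattern in isolation (a hedged sketch; `log_once` is an illustrative name):

```python
import torch.distributed as dist


def log_once(message: str) -> None:
    # dist.get_rank() is only valid after init_process_group(), hence the
    # is_initialized() check; single-process runs fall back to rank 0.
    rank = dist.get_rank() if dist.is_initialized() else 0
    if rank == 0:
        print(message)


log_once("summary printed exactly once across all ranks")
```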
2 changes: 2 additions & 0 deletions deepmd/tf/descriptor/se_r.py
@@ -356,6 +356,8 @@ def enable_compression(
self.filter_neuron,
graph,
graph_def,
type_one_side=self.type_one_side,
exclude_types=self.exclude_types,
activation_fn=self.filter_activation_fn,
suffix=suffix,
)
2 changes: 2 additions & 0 deletions deepmd/utils/data.py
@@ -60,6 +60,8 @@
) -> None:
"""Constructor."""
root = DPPath(sys_path)
if not root.is_dir():
raise FileNotFoundError(f"System {sys_path} is not found!")

Codecov / codecov/patch warning: added line deepmd/utils/data.py#L64 was not covered by tests.
self.dirs = root.glob(set_prefix + ".*")
if not len(self.dirs):
raise FileNotFoundError(f"No {set_prefix}.* is found in {sys_path}")
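With this check, constructing `DeepmdData` on a path that does not exist fails immediately with a clear message instead of failing later when the `set.*` directories are globbed. A sketch of the intended fail-fast behavior (the path is obviously hypothetical):

```python
from deepmd.utils.data import DeepmdData

try:
    DeepmdData("/no/such/system")  # hypothetical missing directory
except FileNotFoundError as err:
    print(err)  # a FileNotFoundError pointing at the missing system
```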
22 changes: 2 additions & 20 deletions deepmd/utils/data_system.py
@@ -28,9 +28,6 @@
from deepmd.utils.out_stat import (
compute_stats_from_redu,
)
- from deepmd.utils.path import (
- DPPath,
- )

log = logging.getLogger(__name__)

@@ -103,6 +100,8 @@
del rcut
self.system_dirs = systems
self.nsystems = len(self.system_dirs)
if self.nsystems <= 0:
raise ValueError("No systems provided")

Codecov / codecov/patch warning: added line deepmd/utils/data_system.py#L104 was not covered by tests.
self.data_systems = []
for ii in self.system_dirs:
self.data_systems.append(
@@ -755,23 +754,6 @@
systems = expand_sys_str(systems)
elif isinstance(systems, list):
systems = systems.copy()
- help_msg = "Please check your setting for data systems"
- # check length of systems
- if len(systems) == 0:
- msg = "cannot find valid a data system"
- log.fatal(msg)
- raise OSError(msg, help_msg)
- # roughly check all items in systems are valid
- for ii in systems:
- ii = DPPath(ii)
- if not ii.is_dir():
- msg = f"dir {ii} is not a valid dir"
- log.fatal(msg)
- raise OSError(msg, help_msg)
- if not (ii / "type.raw").is_file():
- msg = f"dir {ii} is not a valid data system dir"
- log.fatal(msg)
- raise OSError(msg, help_msg)
return systems


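The per-directory validation removed from the helper above is now redundant: `DeepmdData` itself raises `FileNotFoundError` for a missing system (see the `deepmd/utils/data.py` change), so the list-level code only needs to reject an empty selection. A toy sketch of the same division of responsibility, with illustrative names:

```python
from pathlib import Path


def load_system(path: str) -> Path:
    root = Path(path)
    if not root.is_dir():
        # per-system validation lives with the loader itself
        raise FileNotFoundError(f"System {path} is not found!")
    return root


def load_systems(paths: list[str]) -> list[Path]:
    if len(paths) <= 0:
        # list-level code only checks that something was provided
        raise ValueError("No systems provided")
    return [load_system(p) for p in paths]


print(load_systems(["."]))
```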
12 changes: 6 additions & 6 deletions deepmd/utils/path.py
@@ -14,6 +14,7 @@
from typing import (
ClassVar,
Optional,
Union,
)

import h5py
@@ -151,25 +152,22 @@
If true, no error will be raised if the target directory already exists.
"""


Code scanning / CodeQL warning: the class 'DPOSPath' does not override '__eq__', but adds the new attributes 'mode' and 'path'.
class DPOSPath(DPPath):
"""The OS path class to data system (DeepmdData) for real directories.

Parameters
----------
- path : str
+ path : Union[str, Path]
path
mode : str, optional
mode, by default "r"
"""

- def __init__(self, path: str, mode: str = "r") -> None:
+ def __init__(self, path: Union[str, Path], mode: str = "r") -> None:
super().__init__()
self.mode = mode
- if isinstance(path, Path):
- self.path = path
- else:
- self.path = Path(path)
+ self.path = Path(path)

def load_numpy(self) -> np.ndarray:
"""Load NumPy array.
@@ -300,6 +298,8 @@
# so we do not support file names containing #...
s = path.split("#")
self.root_path = s[0]
if not os.path.isfile(self.root_path):
raise FileNotFoundError(f"{self.root_path} not found")

Codecov / codecov/patch warning: added line deepmd/utils/path.py#L302 was not covered by tests.
self.root = self._load_h5py(s[0], mode)
# h5 path: default is the root path
self._name = s[1] if len(s) > 1 else "/"
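Two behaviors change in this file: `DPOSPath` now accepts either a string or a `pathlib.Path` (so the `isinstance` branch disappears, since `Path()` handles both), and `DPH5Path` fails fast when the `.h5` file before the `#` separator does not exist. A minimal sketch of the constructor change using plain `pathlib`, not the DeePMD classes:

```python
from pathlib import Path
from typing import Union


class OSPathSketch:
    """Toy version of the DPOSPath constructor after this change."""

    def __init__(self, path: Union[str, Path], mode: str = "r") -> None:
        self.mode = mode
        # Path() accepts both str and Path, so no isinstance() branch is needed.
        self.path = Path(path)


print(OSPathSketch("data/system1").path)
print(OSPathSketch(Path("data") / "system1").path)
```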
2 changes: 1 addition & 1 deletion doc/credits.rst
@@ -54,7 +54,7 @@ Cite DeePMD-kit and methods
.. bibliography::
:filter: False

- Zhang_2023_DPA2
+ Zhang_npjComputMater_2024_v10_p293

- If frame-specific parameters (`fparam`, e.g. electronic temperature) is used,

2 changes: 1 addition & 1 deletion doc/development/create-a-model-pt.md
@@ -180,7 +180,7 @@ The arguments here should be consistent with the class arguments of your new com
## Package new codes

You may package new codes into a new Python package if you don't want to contribute it to the main DeePMD-kit repository.
- A good example is [DeePMD-GNN](https://github.com/njzjz/deepmd-gnn).
+ A good example is [DeePMD-GNN](https://gitlab.com/RutgersLBSR/deepmd-gnn).
It's crucial to add your new component to `project.entry-points."deepmd.pt"` in `pyproject.toml`:

```toml
2 changes: 1 addition & 1 deletion doc/install/install-from-c-library.md
@@ -1,4 +1,4 @@
- # Install from pre-compiled C library {{ tensorflow_icon }}, JAX {{ jax_icon }}
+ # Install from pre-compiled C library {{ tensorflow_icon }} {{ jax_icon }}

:::{note}
**Supported backends**: TensorFlow {{ tensorflow_icon }}, JAX {{ jax_icon }}