

docs: update DPA-2 citation (#4483)
## Summary by CodeRabbit

- **New Features**
  - Updated references in the bibliography for the DPA-2 model to include a new article entry for 2024.
  - Added a new reference for an attention-based descriptor.

- **Bug Fixes**
  - Corrected reference links in the documentation to point to the updated DOI instead of arXiv.

- **Documentation**
  - Revised entries in the credits and model documentation to reflect the latest citations and details.
  - Enhanced clarity and detail in the fine-tuning documentation for the TensorFlow and PyTorch implementations.

---------

Signed-off-by: Jinzhe Zeng <[email protected]>
(cherry picked from commit deaeec9)
njzjz committed Dec 23, 2024
1 parent 3c2db6a commit 89127c9
Showing 7 changed files with 32 additions and 22 deletions.
32 changes: 16 additions & 16 deletions CITATIONS.bib
@@ -128,26 +128,26 @@ @article{Zhang_NpjComputMater_2024_v10_p94
doi = {10.1038/s41524-024-01278-7},
}

- @misc{Zhang_2023_DPA2,
+ @article{Zhang_npjComputMater_2024_v10_p293,
annote = {DPA-2},
author = {
Duo Zhang and Xinzijian Liu and Xiangyu Zhang and Chengqian Zhang and Chun
- Cai and Hangrui Bi and Yiming Du and Xuejian Qin and Jiameng Huang and
- Bowen Li and Yifan Shan and Jinzhe Zeng and Yuzhi Zhang and Siyuan Liu and
- Yifan Li and Junhan Chang and Xinyan Wang and Shuo Zhou and Jianchuan Liu
- and Xiaoshan Luo and Zhenyu Wang and Wanrun Jiang and Jing Wu and Yudi Yang
- and Jiyuan Yang and Manyi Yang and Fu-Qiang Gong and Linshuang Zhang and
- Mengchao Shi and Fu-Zhi Dai and Darrin M. York and Shi Liu and Tong Zhu and
- Zhicheng Zhong and Jian Lv and Jun Cheng and Weile Jia and Mohan Chen and
- Guolin Ke and Weinan E and Linfeng Zhang and Han Wang
+ Cai and Hangrui Bi and Yiming Du and Xuejian Qin and Anyang Peng and
+ Jiameng Huang and Bowen Li and Yifan Shan and Jinzhe Zeng and Yuzhi Zhang
+ and Siyuan Liu and Yifan Li and Junhan Chang and Xinyan Wang and Shuo Zhou
+ and Jianchuan Liu and Xiaoshan Luo and Zhenyu Wang and Wanrun Jiang and
+ Jing Wu and Yudi Yang and Jiyuan Yang and Manyi Yang and Fu-Qiang Gong and
+ Linshuang Zhang and Mengchao Shi and Fu-Zhi Dai and Darrin M. York and Shi
+ Liu and Tong Zhu and Zhicheng Zhong and Jian Lv and Jun Cheng and Weile Jia
+ and Mohan Chen and Guolin Ke and Weinan E and Linfeng Zhang and Han Wang
},
- title = {
-     {DPA-2: Towards a universal large atomic model for molecular and material
-     simulation}
- },
- publisher = {arXiv},
- year = 2023,
- doi = {10.48550/arXiv.2312.15492},
+ title = {{DPA-2: a large atomic model as a multi-task learner}},
+ journal = {npj Comput. Mater},
+ year = 2024,
+ volume = 10,
+ number = 1,
+ pages = 293,
+ doi = {10.1038/s41524-024-01493-2},
}

@article{Zhang_PhysPlasmas_2020_v27_p122704,
7 changes: 6 additions & 1 deletion deepmd/dpmodel/descriptor/dpa2.py
@@ -387,7 +387,7 @@ def __init__(
use_tebd_bias: bool = False,
type_map: Optional[list[str]] = None,
) -> None:
r"""The DPA-2 descriptor. see https://arxiv.org/abs/2312.15492.
r"""The DPA-2 descriptor[1]_.
Parameters
----------
@@ -434,6 +434,11 @@ def __init__(
sw: torch.Tensor
The switch function for decaying inverse distance.
+ References
+ ----------
+ .. [1] Zhang, D., Liu, X., Zhang, X. et al. DPA-2: a
+    large atomic model as a multi-task learner. npj
+    Comput Mater 10, 293 (2024). https://doi.org/10.1038/s41524-024-01493-2
"""

def init_subclass_params(sub_data, sub_class):
7 changes: 6 additions & 1 deletion deepmd/pt/model/descriptor/dpa2.py
@@ -100,7 +100,7 @@ def __init__(
use_tebd_bias: bool = False,
type_map: Optional[list[str]] = None,
) -> None:
r"""The DPA-2 descriptor. see https://arxiv.org/abs/2312.15492.
r"""The DPA-2 descriptor[1]_.
Parameters
----------
@@ -147,6 +147,11 @@ def __init__(
sw: torch.Tensor
The switch function for decaying inverse distance.
+ References
+ ----------
+ .. [1] Zhang, D., Liu, X., Zhang, X. et al. DPA-2: a
+    large atomic model as a multi-task learner. npj
+    Comput Mater 10, 293 (2024). https://doi.org/10.1038/s41524-024-01493-2
"""
super().__init__()

2 changes: 1 addition & 1 deletion doc/credits.rst
@@ -54,7 +54,7 @@ Cite DeePMD-kit and methods
.. bibliography::
:filter: False

- Zhang_2023_DPA2
+ Zhang_npjComputMater_2024_v10_p293

- If frame-specific parameters (`fparam`, e.g. electronic temperature) is used,

2 changes: 1 addition & 1 deletion doc/model/dpa2.md
@@ -4,7 +4,7 @@
**Supported backends**: PyTorch {{ pytorch_icon }}, JAX {{ jax_icon }}, DP {{ dpmodel_icon }}
:::

- The DPA-2 model implementation. See https://arxiv.org/abs/2312.15492 for more details.
+ The DPA-2 model implementation. See https://doi.org/10.1038/s41524-024-01493-2 for more details.

Training example: `examples/water/dpa2/input_torch_medium.json`, see [README](../../examples/water/dpa2/README.md) for inputs in different levels.

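For orientation, the example file referenced in the hunk above configures the DPA-2 descriptor through its `repinit` and `repformer` sub-sections. The sketch below shows only that overall shape; the values are illustrative placeholders and are not taken from `input_torch_medium.json`:

```json
"descriptor": {
  "type": "dpa2",
  "repinit":   { "rcut": 6.0, "rcut_smth": 0.5, "nsel": 120, "...": "..." },
  "repformer": { "rcut": 4.0, "rcut_smth": 0.5, "nsel": 40,  "...": "..." }
}
```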
2 changes: 1 addition & 1 deletion doc/train/finetuning.md
@@ -94,7 +94,7 @@ The model section will be overwritten (except the `type_map` subsection) by that

#### Fine-tuning from a multi-task pre-trained model

- Additionally, within the PyTorch implementation and leveraging the flexibility offered by the framework and the multi-task training process proposed in DPA2 [paper](https://arxiv.org/abs/2312.15492),
+ Additionally, within the PyTorch implementation and leveraging the flexibility offered by the framework and the multi-task training process proposed in DPA2 [paper](https://doi.org/10.1038/s41524-024-01493-2),
we also support more general multitask pre-trained models, which includes multiple datasets for pre-training. These pre-training datasets share a common descriptor while maintaining their individual fitting nets,
as detailed in the paper above.

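As context for the hunk above: a multi-task pre-trained model of this kind is declared with a shared descriptor plus one fitting net per pre-training dataset. A minimal, hypothetical sketch of such an input section follows; the branch names `water_ab` and `water_dft` are invented for illustration:

```json
"model": {
  "shared_dict": {
    "dpa2_descriptor": { "type": "dpa2", "...": "..." }
  },
  "model_dict": {
    "water_ab":  { "descriptor": "dpa2_descriptor", "fitting_net": { "...": "..." } },
    "water_dft": { "descriptor": "dpa2_descriptor", "fitting_net": { "...": "..." } }
  }
}
```

Each entry in `model_dict` points back to the descriptor in `shared_dict` by name, which is what lets the datasets share a descriptor while keeping individual fitting nets.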
2 changes: 1 addition & 1 deletion doc/train/multi-task-training.md
@@ -26,7 +26,7 @@ and the Adam optimizer is executed to minimize $L^{(t)}$ for one step to update
In the case of multi-GPU parallel training, different GPUs will independently select their tasks.
In the DPA-2 model, this multi-task training framework is adopted.[^1]

- [^1]: Duo Zhang, Xinzijian Liu, Xiangyu Zhang, Chengqian Zhang, Chun Cai, Hangrui Bi, Yiming Du, Xuejian Qin, Jiameng Huang, Bowen Li, Yifan Shan, Jinzhe Zeng, Yuzhi Zhang, Siyuan Liu, Yifan Li, Junhan Chang, Xinyan Wang, Shuo Zhou, Jianchuan Liu, Xiaoshan Luo, Zhenyu Wang, Wanrun Jiang, Jing Wu, Yudi Yang, Jiyuan Yang, Manyi Yang, Fu-Qiang Gong, Linshuang Zhang, Mengchao Shi, Fu-Zhi Dai, Darrin M. York, Shi Liu, Tong Zhu, Zhicheng Zhong, Jian Lv, Jun Cheng, Weile Jia, Mohan Chen, Guolin Ke, Weinan E, Linfeng Zhang, Han Wang, [arXiv preprint arXiv:2312.15492 (2023)](https://arxiv.org/abs/2312.15492) licensed under a [Creative Commons Attribution (CC BY) license](http://creativecommons.org/licenses/by/4.0/).
+ [^1]: Duo Zhang, Xinzijian Liu, Xiangyu Zhang, Chengqian Zhang, Chun Cai, Hangrui Bi, Yiming Du, Xuejian Qin, Anyang Peng, Jiameng Huang, Bowen Li, Yifan Shan, Jinzhe Zeng, Yuzhi Zhang, Siyuan Liu, Yifan Li, Junhan Chang, Xinyan Wang, Shuo Zhou, Jianchuan Liu, Xiaoshan Luo, Zhenyu Wang, Wanrun Jiang, Jing Wu, Yudi Yang, Jiyuan Yang, Manyi Yang, Fu-Qiang Gong, Linshuang Zhang, Mengchao Shi, Fu-Zhi Dai, Darrin M. York, Shi Liu, Tong Zhu, Zhicheng Zhong, Jian Lv, Jun Cheng, Weile Jia, Mohan Chen, Guolin Ke, Weinan E, Linfeng Zhang, Han Wang, DPA-2: a large atomic model as a multi-task learner. npj Comput Mater 10, 293 (2024). [DOI: 10.1038/s41524-024-01493-2](https://doi.org/10.1038/s41524-024-01493-2) licensed under a [Creative Commons Attribution (CC BY) license](http://creativecommons.org/licenses/by/4.0/).

Compared with the previous TensorFlow implementation, the new support in PyTorch is more flexible and efficient.
In particular, it makes multi-GPU parallel training and even tasks beyond DFT possible,
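The scheme described in the hunk above (each step picks one task and runs one Adam update on that task's loss $L^{(t)}$) can be sketched as follows. This is an illustration only, not the DeePMD-kit implementation; `models`, `losses`, and `loaders` are hypothetical stand-ins, and the shared descriptor's parameters are assumed to be reachable through every task's model:

```python
import random

import torch


def multi_task_train(models, losses, loaders, weights, n_steps, lr=1e-3):
    """Sketch of single-process multi-task training.

    models[t], losses[t], loaders[t]: hypothetical per-task model, loss,
    and batch iterator; weights[t]: sampling weight, e.g. dataset size.
    """
    # Deduplicate parameters so the shared descriptor is updated only once.
    params = {p for model in models for p in model.parameters()}
    opt = torch.optim.Adam(list(params), lr=lr)
    tasks = list(range(len(models)))
    for _ in range(n_steps):
        # Draw task t with probability proportional to its weight.
        t = random.choices(tasks, weights=weights, k=1)[0]
        batch = next(loaders[t])
        opt.zero_grad()
        loss = losses[t](models[t](batch), batch)  # L^(t) on one batch
        loss.backward()
        opt.step()
```

In multi-GPU training, each worker would run this sampling loop independently, which matches the statement above that different GPUs select their tasks independently.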
