From c4c847495dc4fdd03c88f107cfe6edc24980a4ad Mon Sep 17 00:00:00 2001
From: "Sterling G. Baird"
Date: Sat, 25 May 2024 08:57:49 -0600
Subject: [PATCH] Update paper.md with jarvis

---
 reports/paper.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reports/paper.md b/reports/paper.md
index 27cfba2..0cd5642 100644
--- a/reports/paper.md
+++ b/reports/paper.md
@@ -54,7 +54,7 @@ The progress of a machine learning field is both tracked and propelled through t
 
 In the field of materials informatics, where materials science intersects with machine learning, benchmarks play a crucial role in assessing model performance and enabling fair comparisons among various tools and models. Typically, these benchmarks focus on evaluating the accuracy of predictive models for materials properties, utilizing well-established metrics such as mean absolute error and root-mean-square error to measure performance against actual measurements. A standard practice involves splitting the data into two parts, with one serving as training data for model development and the other as test data for assessing performance [@dunn_benchmarking_2020].
 
-However, benchmarking generative models, which aim to create entirely new data rather than focusing solely on predictive accuracy, presents unique challenges. While significant progress has been made in standardizing benchmarks for tasks like image generation and molecule synthesis, the field of crystal structure generative modeling lacks this level of standardization (this is separate from machine learning interatomic potentials, which have the robust and comprehensive [`matbench-discovery`](https://matbench-discovery.materialsproject.org/) [@riebesell_matbench_2024] and [Jarvis Leaderboard](https://pages.nist.gov/jarvis_leaderboard/) benchmarking frameworks [@choudhary_large_2023]). Molecular generative modeling benefits from widely adopted benchmark platforms such as Guacamol [@brown_guacamol_2019] and Moses [@polykovskiy_molecular_2020], which offer easy installation, usage guidelines, and leaderboards for tracking progress. In contrast, existing evaluations in crystal structure generative modeling, as seen in CDVAE [@xie_crystal_2022], FTCP [@ren_invertible_2022], PGCGM [@zhao_physics_2023], CubicGAN [@zhao_high-throughput_2021], and CrysTens [@alverson_generative_2024], lack standardization, pose challenges in terms of installation and application to new models and datasets, and lack publicly accessible leaderboards. While these evaluations are valuable within their respective scopes, there is a clear need for a dedicated benchmarking platform to promote standardization and facilitate robust comparisons.
+However, benchmarking generative models, which aim to create entirely new data rather than focusing solely on predictive accuracy, presents unique challenges. While significant progress has been made in standardizing benchmarks for tasks like image generation and molecule synthesis, the field of crystal structure generative modeling lacks this level of standardization (this is separate from machine learning interatomic potentials, which have the robust and comprehensive [`matbench-discovery`](https://matbench-discovery.materialsproject.org/) [@riebesell_matbench_2024] and [Jarvis Leaderboard](https://pages.nist.gov/jarvis_leaderboard/) benchmarking frameworks [@choudhary_jarvis-leaderboard_2024]). Molecular generative modeling benefits from widely adopted benchmark platforms such as Guacamol [@brown_guacamol_2019] and Moses [@polykovskiy_molecular_2020], which offer easy installation, usage guidelines, and leaderboards for tracking progress. In contrast, existing evaluations in crystal structure generative modeling, as seen in CDVAE [@xie_crystal_2022], FTCP [@ren_invertible_2022], PGCGM [@zhao_physics_2023], CubicGAN [@zhao_high-throughput_2021], and CrysTens [@alverson_generative_2024], lack standardization, pose challenges in terms of installation and application to new models and datasets, and lack publicly accessible leaderboards. While these evaluations are valuable within their respective scopes, there is a clear need for a dedicated benchmarking platform to promote standardization and facilitate robust comparisons.
 
 In this work, we introduce `matbench-genmetrics`, a materials benchmarking platform for crystal structure generative models. We use concepts from molecular generative modeling benchmarking to create a set of evaluation metrics---validity, coverage, novelty, and uniqueness---which are broadly defined as follows: