Update paper.md
sgbaird committed Mar 19, 2024
1 parent 7a4c709 commit a4a8454
Showing 1 changed file (reports/paper.md) with 1 addition and 1 deletion.
@@ -54,7 +54,7 @@ The progress of a machine learning field is both tracked and propelled through t

In the field of materials informatics, where materials science intersects with machine learning, benchmarks play a crucial role in assessing model performance and enabling fair comparisons among various tools and models. Typically, these benchmarks focus on evaluating the accuracy of predictive models for materials properties, utilizing well-established metrics such as mean absolute error (MAE) and root-mean-square error (RMSE) to measure performance against actual measurements. A standard practice involves splitting the data into two parts, with one serving as training data for model development and the other as test data for assessing performance [@dunn_benchmarking_2020].
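
As a point of reference, the following is a minimal sketch of that conventional predict-and-score workflow, assuming scikit-learn with a synthetic placeholder dataset and a placeholder random-forest model (neither taken from the paper):

```python
# Minimal sketch of the conventional property-prediction benchmark loop:
# split the data, fit a model on the training portion, and score MAE/RMSE
# on the held-out test portion. The features, targets, and model below are
# synthetic placeholders, not part of any benchmark discussed here.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((200, 5))                                # stand-in descriptors
y = X @ rng.random(5) + 0.1 * rng.standard_normal(200)  # stand-in property values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_train, y_train)
y_pred = model.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))  # RMSE derived from MSE for broad version compatibility
print(f"MAE: {mae:.3f}  RMSE: {rmse:.3f}")
```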

However, benchmarking generative models, which aim to create entirely new data rather than focusing solely on predictive accuracy, presents unique challenges. While significant progress has been made in standardizing benchmarks for tasks like image generation and molecule synthesis, the field of crystal structure generative modeling lacks this level of standardization (this is separate from machine learning interatomic potentials, which have the robust [`matbench-discovery`](https://matbench-discovery.materialsproject.org/) [@riebesell_matbench_2024] and [Jarvis Leaderboard](https://pages.nist.gov/jarvis_leaderboard/) benchmarking frameworks [@choudhary_large_2023]). Molecular generative modeling benefits from widely adopted benchmark platforms such as Guacamol [@brown_guacamol_2019] and Moses [@polykovskiy_molecular_2020], which offer easy installation, usage guidelines, and leaderboards for tracking progress. In contrast, existing evaluations in crystal structure generative modeling, as seen in CDVAE [@xie_crystal_2022], FTCP [@ren_invertible_2022], PGCGM [@zhao_physics_2022], CubicGAN [@zhao_high-throughput_2021], and CrysTens [@alverson_generative_2022], lack standardization, pose challenges in terms of installation and application to new models and datasets, and lack publicly accessible leaderboards. While these evaluations are valuable within their respective scopes, there is a clear need for a dedicated benchmarking platform to promote standardization and facilitate robust comparisons.
However, benchmarking generative models, which aim to create entirely new data rather than focusing solely on predictive accuracy, presents unique challenges. While significant progress has been made in standardizing benchmarks for tasks like image generation and molecule synthesis, the field of crystal structure generative modeling lacks this level of standardization (this is separate from machine learning interatomic potentials, which have the robust and comprehensive [`matbench-discovery`](https://matbench-discovery.materialsproject.org/) [@riebesell_matbench_2024] and [Jarvis Leaderboard](https://pages.nist.gov/jarvis_leaderboard/) benchmarking frameworks [@choudhary_large_2023]). Molecular generative modeling benefits from widely adopted benchmark platforms such as Guacamol [@brown_guacamol_2019] and Moses [@polykovskiy_molecular_2020], which offer easy installation, usage guidelines, and leaderboards for tracking progress. In contrast, existing evaluations in crystal structure generative modeling, as seen in CDVAE [@xie_crystal_2022], FTCP [@ren_invertible_2022], PGCGM [@zhao_physics_2022], CubicGAN [@zhao_high-throughput_2021], and CrysTens [@alverson_generative_2022], lack standardization, pose challenges in terms of installation and application to new models and datasets, and lack publicly accessible leaderboards. While these evaluations are valuable within their respective scopes, there is a clear need for a dedicated benchmarking platform to promote standardization and facilitate robust comparisons.

In this work, we introduce `matbench-genmetrics`, a materials benchmarking platform for crystal structure generative models. We use concepts from molecular generative modeling benchmarking to create a set of evaluation metrics---validity, coverage, novelty, and uniqueness---which are broadly defined as follows:
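
To make two of these concrete, the sketch below illustrates how uniqueness and novelty can be framed as fractions of matched structure pairs. It uses pymatgen's `StructureMatcher` directly and hypothetical `generated`/`training` lists of pymatgen `Structure` objects; it is an illustration only, not the matbench-genmetrics implementation or API:

```python
# Illustrative sketch of uniqueness and novelty via pairwise structure matching.
# `generated` and `training` are placeholder lists of pymatgen Structure objects
# supplied by the user; this is not the matbench-genmetrics API.
from itertools import combinations

from pymatgen.analysis.structure_matcher import StructureMatcher

matcher = StructureMatcher()

def uniqueness(generated):
    """Fraction of generated-structure pairs that are not duplicates of each other."""
    pairs = list(combinations(generated, 2))
    if not pairs:
        return 1.0
    duplicates = sum(matcher.fit(a, b) for a, b in pairs)
    return 1.0 - duplicates / len(pairs)

def novelty(generated, training):
    """Fraction of generated structures that match no structure in the training set."""
    if not generated:
        return 0.0
    novel = sum(not any(matcher.fit(g, t) for t in training) for g in generated)
    return novel / len(generated)
```

Pairwise matching grows quadratically with the number of structures, so in practice one would typically cap the size of the comparison sets or parallelize the matching step.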

