[Dataset request] Foundry: DFT Estimates of Solvation Energy in Multiple Solvents #52

Open
1 task
gpwolfe opened this issue Mar 20, 2024 · 0 comments

Comments

Collaborator

gpwolfe commented Mar 20, 2024

Name

Gregory Wolfe

Email

gw2338@nyu.edu

Dataset name

Foundry: DFT Estimates of Solvation Energy in Multiple Solvents

Authors

Ward, Logan; Dandu, Naveen; Blaiszik, Ben; Narayanan, Badri; Assary, Rajeev S.; Redfern, Paul C.; Foster, Ian; Curtiss, Larry A.

Publication link

https://doi.org/10.1021/acs.jpca.1c01960

Data link

https://doi.org/10.18126/jos5-wj65

Additional links

No response

Dataset description

The solvation properties of molecules, often estimated using quantum chemical simulations, are important in the synthesis of energy storage materials, drugs, and industrial chemicals. Here, we develop machine learning models of solvation energies to replace expensive quantum chemistry calculations with inexpensive-to-compute message-passing neural network models that require only the molecular graph as input. Our models are trained on a new database of solvation energies for 130,258 molecules taken from the QM9 dataset, each computed in five solvents (acetone, ethanol, acetonitrile, dimethyl sulfoxide, and water) via an implicit solvent model. Our best model achieves a mean absolute error of 0.5 kcal/mol for molecules with nine or fewer non-hydrogen atoms and 1 kcal/mol for molecules with between 10 and 14 non-hydrogen atoms. We make the entire dataset of 651,290 computed entries openly available and provide simple web and programmatic interfaces to enable others to run our solvation energy model on new molecules. This model calculates the solvation energies for molecules using only the SMILES string and also provides an estimate of whether each molecule is within the domain of applicability of our model. We envision that the dataset and models will provide the functionality needed for the rapid screening of large chemical spaces to discover improved molecules for many applications.
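
A quick consistency check on the counts quoted in the description, as a minimal sketch. The solvent list and both totals come from the text above; the variable names are illustrative only, and the one-entry-per-molecule-per-solvent layout is an assumption rather than something the dataset documentation confirms here.

```python
# Counts quoted in the dataset description.
N_MOLECULES = 130_258
SOLVENTS = ["acetone", "ethanol", "acetonitrile", "dimethyl sulfoxide", "water"]
N_ENTRIES_REPORTED = 651_290

# Assumption: exactly one computed entry per (molecule, solvent) pair.
n_entries_expected = N_MOLECULES * len(SOLVENTS)
counts_agree = n_entries_expected == N_ENTRIES_REPORTED

print(n_entries_expected, counts_agree)
```

If that assumption holds, the product matches the reported total, which would suggest that the 130258 given under "File details" counts molecules rather than per-solvent entries.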

File details

Number of configurations: 130258

Method

DFT

Method (other)

No response

Software

None

Software (other)

No response

Software version(s)

No response

Additional details

No response

Property types

Band gap, Free energy, Potential energy

Other/additional property

No response

Property details

No response

Elements

No response

Number of Configurations

No response

Naming convention

No response

Configuration sets

No response

Configuration labels

No response

Distribution license

No response

Permissions

  • I confirm that I have the necessary permissions to submit this dataset