Published July 20, 2021 | Version v1
Dataset Open

Differentiable sampling of molecular geometries with uncertainty-based adversarial attacks

  • 1. Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States

Description

Neural network (NN) force fields can predict potential energy surfaces with high accuracy and at much higher speed than the electronic structure methods typically used to generate their training data. However, NN predictions are reliable only for points close to the training domain and may be poor in extrapolation. Uncertainty quantification methods can detect geometries for which predicted errors are high, but sampling regions of high uncertainty requires a thorough exploration of the phase space, often with expensive simulations. Our work uses automatic differentiation to sample atomistic configurations by balancing thermodynamic accessibility against model uncertainty, without running molecular dynamics simulations. This dataset provides the atomistic data used to train the NN potentials for the ammonia, alanine dipeptide, and zeolite-molecule systems. For each system, geometries, energies, and forces are provided. The ammonia and zeolite data were computed with density functional theory, while the alanine dipeptide data were generated from molecular dynamics simulations with the OPLS force field.
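As a rough illustration of what "balancing thermodynamic accessibility and uncertainty" can look like, the sketch below maximizes a toy objective: a Boltzmann weight of the mean predicted energy multiplied by the variance of the forces across an ensemble. Everything in it is an assumption made for illustration and is not taken from this record: the one-dimensional harmonic "models" in `ENSEMBLE`, the value of `KT`, the exact form of `objective`, and the step size. Central finite differences stand in for automatic differentiation so that the example needs only the standard library.

```python
import math

# Hypothetical toy ensemble: each "model" is a 1D harmonic potential
# E(x) = 0.5 * k * (x - x0)**2, given as (k, x0). Not from the dataset.
ENSEMBLE = [(1.0, 0.00), (1.2, 0.05), (0.8, -0.05)]
KT = 0.5  # assumed thermal energy, arbitrary units


def energies(x):
    """Energy predicted by each ensemble member at position x."""
    return [0.5 * k * (x - x0) ** 2 for k, x0 in ENSEMBLE]


def forces(x):
    """Force (negative energy gradient) predicted by each member at x."""
    return [-k * (x - x0) for k, x0 in ENSEMBLE]


def objective(x):
    """Boltzmann weight of the mean energy times the ensemble force variance.

    Large values mean the geometry is both low in energy (accessible)
    and a point where the ensemble members disagree (uncertain).
    """
    e = energies(x)
    f = forces(x)
    mean_e = sum(e) / len(e)
    mean_f = sum(f) / len(f)
    var_f = sum((fi - mean_f) ** 2 for fi in f) / len(f)
    return math.exp(-mean_e / KT) * var_f


def grad(fn, x, h=1e-5):
    """Central finite difference; a stand-in for automatic differentiation."""
    return (fn(x + h) - fn(x - h)) / (2 * h)


def adversarial_displacement(x_start, delta0=0.1, lr=5.0, steps=2000):
    """Gradient ascent on the displacement delta applied to x_start."""
    delta = delta0
    for _ in range(steps):
        delta += lr * grad(objective, x_start + delta)
    return delta


if __name__ == "__main__":
    delta = adversarial_displacement(0.0)
    print("displacement:", delta, "objective:", objective(delta))
```

In a real setting the position would be a full set of atomic coordinates, the ensemble would consist of trained NN potentials, and the gradient with respect to the displacement would come from the framework's automatic differentiation rather than finite differences.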

Files

files_description.md

Files (268.9 MiB)

Checksum                                 Size
md5:68c3070831b16f2578b6438b61ff7b40     260 Bytes
md5:ee2369b639dd57e0351a4ae7aff4f9db     268.9 MiB
md5:0c54d4f52f68bc84eb48880ec7dbf34a     3.6 KiB
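The MD5 checksums in the file listing can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module. Because the listing does not pair checksums with file names, the helper `is_listed` (a name introduced here for illustration) only checks whether a file's digest matches any of the three published values; the path passed to it is whatever local file you downloaded.

```python
import hashlib

# Checksums copied from the file listing (without the "md5:" prefix).
EXPECTED_MD5 = {
    "68c3070831b16f2578b6438b61ff7b40",
    "ee2369b639dd57e0351a4ae7aff4f9db",
    "0c54d4f52f68bc84eb48880ec7dbf34a",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_listed(path):
    """True if the file's MD5 digest matches one of the published checksums."""
    return md5_of_file(path) in EXPECTED_MD5
```

A call such as `is_listed("path/to/downloaded_file")` returning `False` suggests the download is incomplete or the file was modified.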

References

Journal reference (Paper in which the data is discussed)
D. Schwalbe-Koda, A.R. Tan, R. Gómez-Bombarelli. Nat. Commun. 12, 5104 (2021), doi: 10.1038/s41467-021-25342-8

Preprint (Preprint where the data is discussed)
D. Schwalbe-Koda, A.R. Tan, R. Gómez-Bombarelli. arXiv:2101.11588 (2021)