Published April 2, 2025 | Version v1
Dataset Open

Database of scalable training of neural network potentials for complex interfaces through data augmentation

  • 1. Department of Chemical Engineering, Columbia University, New York, NY, USA
  • 2. Columbia Center for Computational Electrochemistry, Columbia University, New York, NY, USA
  • 3. Columbia Electrochemical Energy Center, Columbia University, New York, NY, USA
  • 4. Debye Institute for Nanomaterials Science, Utrecht University, 3584 CS Utrecht, The Netherlands


Description

This database contains the reference data used for direct force training of artificial neural network (ANN) interatomic potentials with the atomic energy network (ænet) and ænet-PyTorch packages (https://github.com/atomisticnet/aenet-PyTorch). It also includes the GPR-augmented data used for indirect force training via Gaussian process regression (GPR) surrogate models using the ænet-GPR package (https://github.com/atomisticnet/aenet-gpr). Each data file contains atomic structures, energies, and atomic forces in the XCrySDen Structure Format (XSF). The dataset covers all reference training and test data, together with the corresponding GPR-augmented data, for the four benchmark examples presented in the reference paper, "Scalable Training of Neural Network Potentials for Complex Interfaces Through Data Augmentation". The directory hierarchy is described in README.txt, and an overview of the dataset is given in Supplementary Table S1 of the reference paper.
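As a rough illustration of how a single structure file with energies and forces might be read, the sketch below parses one periodic XSF entry in Python. It assumes a layout that is common for ænet-style XSF files but is not confirmed by this record: a leading comment line of the form "# total energy = <value> eV", a PRIMVEC block with three lattice vectors, and a PRIMCOORD block whose atom rows hold the element symbol, three Cartesian coordinates, and three force components. The names Structure, parse_xsf, and SAMPLE are hypothetical; consult README.txt for the actual file layout.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed header convention: "# total energy = <number> eV"
_ENERGY = re.compile(r"total energy\s*=\s*([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)")


@dataclass
class Structure:
    """Hypothetical container for one XSF entry."""
    energy: Optional[float] = None
    lattice: List[List[float]] = field(default_factory=list)
    symbols: List[str] = field(default_factory=list)
    positions: List[List[float]] = field(default_factory=list)
    forces: List[List[float]] = field(default_factory=list)


def parse_xsf(text: str) -> Structure:
    """Parse a periodic XSF entry under the layout assumptions stated above."""
    s = Structure()
    lines = [ln.strip() for ln in text.splitlines()]
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith("#"):
            match = _ENERGY.search(line)
            if match:
                s.energy = float(match.group(1))
        elif line == "PRIMVEC":
            # The next three lines are the lattice vectors.
            s.lattice = [[float(x) for x in lines[i + k].split()] for k in (1, 2, 3)]
            i += 3
        elif line == "PRIMCOORD":
            # The next line starts with the atom count; atom rows follow.
            n_atoms = int(lines[i + 1].split()[0])
            for row in lines[i + 2 : i + 2 + n_atoms]:
                cols = row.split()
                s.symbols.append(cols[0])
                s.positions.append([float(x) for x in cols[1:4]])
                s.forces.append([float(x) for x in cols[4:7]])
            i += 1 + n_atoms
        i += 1
    return s


# Made-up two-atom entry used only to exercise the parser.
SAMPLE = """# total energy = -10.5 eV

CRYSTAL
PRIMVEC
4.0 0.0 0.0
0.0 4.0 0.0
0.0 0.0 4.0
PRIMCOORD
2 1
H 0.0 0.0 0.0 0.1 0.0 -0.1
O 1.0 1.0 1.0 -0.1 0.0 0.1
"""

structure = parse_xsf(SAMPLE)
print(structure.energy, structure.symbols, structure.forces)
```

The parser handles only the periodic PRIMVEC/PRIMCOORD case; non-periodic entries or files with several structures would need additional handling.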

Files (443.7 MiB)

Checksum                               Size
md5:136b90bd17564c3cafffaefeed528c58   629 Bytes
md5:0aa065a05906aabb4a8d5c1fd992e50a   12.4 KiB
md5:12c48fdeabada0f3483e296397c811ba   152.8 MiB
md5:6a1ce0f80ef1bf7899eb2a831656d8b4   128.3 MiB
md5:05f3542df4fa980c62dad03f01456a23   147.9 MiB
md5:02fd7bde8f16958144845121ba163d86   14.7 MiB
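The MD5 checksums listed for the files can be used to confirm that a download is intact. The following is a minimal Python sketch using only the standard library; the helper names md5sum and verify are hypothetical, and the local file name in the usage comment is a placeholder because this listing does not state the file names.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Compare a file's MD5 digest with a listed value such as 'md5:136b...'."""
    # Accept the value with or without the 'md5:' prefix used in the listing.
    expected_hex = expected.strip().lower().split(":", 1)[-1]
    return md5sum(path) == expected_hex


# Usage (placeholder path; substitute the name of the downloaded file):
# verify("downloaded_file", "md5:136b90bd17564c3cafffaefeed528c58")
```

Reading in fixed-size chunks keeps memory use low for the archives larger than 100 MiB.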

References

Preprint (Preprint where the data is discussed)
Yeu, I.W., Stuke, A., López-Zorrilla, J., Stevenson, J.M., Reichman, D.R., Friesner, R.A., Urban, A., Artrith, N. arXiv:2412.05773, doi: 10.48550/arXiv.2412.05773

Journal reference
Yeu, I.W., Stuke, A., López-Zorrilla, J. et al. Scalable training of neural network potentials for complex interfaces through data augmentation. npj Comput Mater 11, 156 (2025)

Software
https://github.com/atomisticnet/aenet-gpr