Database of scalable training of neural network potentials for complex interfaces through data augmentation

In Won Yeu1,2*, Annika Stuke1,2*, Alexander Urban1,2,3*, Nongnuch Artrith2,4*

1 Department of Chemical Engineering, Columbia University, New York, NY, USA

2 Columbia Center for Computational Electrochemistry, Columbia University, New York, NY, USA

3 Columbia Electrochemical Energy Center, Columbia University, New York, NY, USA

4 Debye Institute for Nanomaterials Science, Utrecht University, 3584 CS Utrecht, The Netherlands

* Corresponding authors' emails: iy2185@columbia.edu, as6394@columbia.edu, au2229@columbia.edu, n.artrith@uu.nl
DOI: 10.24435/materialscloud:w6-9a [version v1]

Publication date: Apr 02, 2025

How to cite this record

In Won Yeu, Annika Stuke, Alexander Urban, Nongnuch Artrith, Database of scalable training of neural network potentials for complex interfaces through data augmentation, Materials Cloud Archive 2025.51 (2025), https://doi.org/10.24435/materialscloud:w6-9a

Description

This database contains the reference data used for direct force training of artificial neural network (ANN) interatomic potentials with the atomic energy network (ænet) and ænet-PyTorch packages (https://github.com/atomisticnet/aenet-PyTorch). It also includes the GPR-augmented data used for indirect force training via Gaussian process regression (GPR) surrogate models using the ænet-GPR package (https://github.com/atomisticnet/aenet-gpr). Each data file contains atomic structures, energies, and atomic forces in XCrySDen Structure Format (XSF). The dataset covers all reference training and test data, together with the corresponding GPR-augmented data, used in the four benchmark examples presented in the reference paper, “Scalable Training of Neural Network Potentials for Complex Interfaces Through Data Augmentation”. The dataset hierarchy is described in the README.txt file, and an overview is given in Supplementary Table S1 of the reference paper.
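Since every data file stores a structure, its energy, and the atomic forces in XSF, a small reader can help when inspecting the data outside of ænet. The following is a minimal sketch, not the ænet implementation. It assumes the ænet-style XSF layout: a leading comment of the form "# total energy = ... eV", an ATOMS block for isolated molecules or PRIMVEC/PRIMCOORD blocks for periodic cells, and atom lines with seven columns (symbol, x, y, z, Fx, Fy, Fz). The names parse_xsf and Structure are hypothetical, and the sample values are made up for illustration; the README.txt file remains the authoritative description of the files.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

Vec3 = Tuple[float, float, float]


class Structure(NamedTuple):
    energy: Optional[float]   # total energy in eV, None if no comment line is found
    cell: List[Vec3]          # lattice vectors; empty for isolated molecules
    symbols: List[str]        # chemical symbols, one per atom
    positions: List[Vec3]     # Cartesian coordinates
    forces: List[Vec3]        # atomic forces; empty if the columns are absent


# Assumed comment format: "# total energy = <number> eV"
ENERGY_RE = re.compile(
    r"total energy\s*=\s*([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)"
)


def _vec(fields: List[str]) -> Vec3:
    """Convert the first three string fields to a tuple of floats."""
    return (float(fields[0]), float(fields[1]), float(fields[2]))


def parse_xsf(text: str) -> Structure:
    """Parse one XSF structure (assumed ænet-style layout) from a string."""
    energy = None
    cell, symbols, positions, forces = [], [], [], []
    lines = [ln.strip() for ln in text.splitlines()]
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith("#"):
            match = ENERGY_RE.search(line)
            if match:
                energy = float(match.group(1))
            i += 1
        elif line == "PRIMVEC":
            # The three lattice vectors follow on the next three lines.
            cell = [_vec(lines[i + k].split()) for k in (1, 2, 3)]
            i += 4
        elif line in ("PRIMCOORD", "ATOMS"):
            i += 1
            if line == "PRIMCOORD":
                i += 1  # skip the "<n_atoms> 1" count line
            # Read atom lines until a blank or shorter line is reached.
            while i < len(lines) and len(lines[i].split()) >= 4:
                fields = lines[i].split()
                symbols.append(fields[0])
                positions.append(_vec(fields[1:4]))
                if len(fields) >= 7:
                    forces.append(_vec(fields[4:7]))
                i += 1
        else:
            i += 1
    return Structure(energy, cell, symbols, positions, forces)


# Made-up H2 example, for illustration only (not taken from the dataset).
SAMPLE_XSF = """# total energy = -31.5 eV

ATOMS
H 0.0 0.0 0.0  0.0 0.0 -0.1
H 0.0 0.0 0.74 0.0 0.0 0.1
"""

structure = parse_xsf(SAMPLE_XSF)
print(structure.energy, structure.symbols, structure.forces)
```

The parser keeps forces optional so that files without force columns still load; whether such files occur in this dataset is not stated in the record.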

Materials Cloud sections using this data

No Explore or Discover sections associated with this archive record.

Files

File name | MD5 | Size | Description
README.txt | 02fd7bde8f16958144845121ba163d86 | 14.7 MiB | Description of the dataset
1_H2_xsf.tar.bz2 | 0aa065a05906aabb4a8d5c1fd992e50a | 12.4 KiB | ANN training data in XSF format used for H₂ molecule example (Total number: 403)
2_EC-EC_xsf.tar.bz2 | 12c48fdeabada0f3483e296397c811ba | 152.8 MiB | ANN training data in XSF format used for EC dimer example (Total number: 181,000)
3_Li-EC-surface_xsf.tar.bz2 | 6a1ce0f80ef1bf7899eb2a831656d8b4 | 128.3 MiB | ANN training data in XSF format used for EC on Li surface example (Total number: 68,000)
4_Li-EC-interfaces_xsf.tar.bz2 | 05f3542df4fa980c62dad03f01456a23 | 147.9 MiB | ANN training data in XSF format used for Li-EC interface example (Total number: 80,768)
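The MD5 checksums listed for each file can be used to confirm that a download is intact before unpacking it. The sketch below uses only the Python standard library and assumes the files were saved under their listed names in one directory. The function names (md5_of, verify_downloads, count_xsf_members) are hypothetical. The member count additionally assumes that each structure is stored as a regular file ending in ".xsf" and that "Total number" refers to such files; neither is confirmed by this record, so check README.txt before relying on the count.

```python
import hashlib
import tarfile
from pathlib import Path

# MD5 checksums copied from the file list of this record.
EXPECTED_MD5 = {
    "README.txt": "02fd7bde8f16958144845121ba163d86",
    "1_H2_xsf.tar.bz2": "0aa065a05906aabb4a8d5c1fd992e50a",
    "2_EC-EC_xsf.tar.bz2": "12c48fdeabada0f3483e296397c811ba",
    "3_Li-EC-surface_xsf.tar.bz2": "6a1ce0f80ef1bf7899eb2a831656d8b4",
    "4_Li-EC-interfaces_xsf.tar.bz2": "05f3542df4fa980c62dad03f01456a23",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of(path) == expected
    return results


def count_xsf_members(archive_path):
    """Count regular files ending in '.xsf' inside a .tar.bz2 archive.

    Assumption: one structure per '.xsf' file (not confirmed by the record).
    """
    with tarfile.open(archive_path, "r:bz2") as archive:
        return sum(
            1
            for member in archive
            if member.isfile() and member.name.lower().endswith(".xsf")
        )


if __name__ == "__main__":
    for name, ok in verify_downloads(".").items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

For example, count_xsf_members("1_H2_xsf.tar.bz2") could be compared with the total of 403 given for the H₂ example, provided the assumptions above hold.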

License

Files and data are licensed under the terms of the following license: Creative Commons Attribution 4.0 International.
Metadata, except for email addresses, are licensed under the Creative Commons Attribution Share-Alike 4.0 International license.

Keywords

first principles, machine learning, Li metal battery, potential energy surface, aenet, data augmentation

Version history:

2025.51 (version v1) [This version] Apr 02, 2025 DOI: 10.24435/materialscloud:w6-9a