CA-9, a dataset of carbon allotropes for training and testing of neural network potentials
JSON Export
{
"created": "2020-11-10T04:33:32.471097+00:00",
"revision": 9,
"metadata": {
"doi": "10.24435/materialscloud:6h-yj",
"references": [
{
"doi": "10.1016/j.cartre.2021.100027",
"type": "Journal reference",
"url": "https://doi.org/10.1016/j.cartre.2021.100027",
"citation": "D. Hedman, T. Rothe, G. Johansson, F. Sandin, J. A. Larsson and Y. Miyamoto, Carbon Trends 3, 100027 (2021)"
}
],
"_oai": {
"id": "oai:materialscloud.org:637"
},
"keywords": [
"CA-9",
"Dataset",
"Machine learning",
"Interatomic potential",
"Carbon",
"Neural network potential"
],
"is_last": true,
"publication_date": "Nov 11, 2020, 13:39:58",
"owner": 245,
"license_addendum": null,
"contributors": [
{
"givennames": "Daniel",
"email": "daniel.hedman@ltu.se",
"familyname": "Hedman",
"affiliations": [
"Research Center for Computational Design of Advanced Functional Materials, National Institute of Advanced Industrial Science and Technology (AIST), Central 2, 1-1-1 Umezono, Tsukuba, Ibaraki, 305-8568, Japan",
"Applied Physics, Division of Materials Science, Department of Engineering Sciences and Mathematics, Lule\u00e5 University of Technology, SE-971 87 Lule\u00e5, Sweden"
]
},
{
"givennames": "Tom",
"email": "tom.rothe@s2015.tu-chemnitz.de",
"familyname": "Rothe",
"affiliations": [
"Institute of Physics, Faculty of Natural Sciences, Chemnitz University of Technology, 09126 Chemnitz, Germany"
]
},
{
"givennames": "Gustav",
"email": "gustav.johansson@ltu.se",
"familyname": "Johansson",
"affiliations": [
"Applied Physics, Division of Materials Science, Department of Engineering Sciences and Mathematics, Lule\u00e5 University of Technology, SE-971 87 Lule\u00e5, Sweden"
]
},
{
"givennames": "Fredrik",
"email": "fredrik.sandin@ltu.se",
"familyname": "Sandin",
"affiliations": [
"Machine Learning, Embedded Intelligent Systems Lab, Department of Computer Science, Electrical and Space Engineering, Lule\u00e5 University of Technology, SE-971 87 Lule\u00e5, Sweden"
]
},
{
"givennames": "J. Andreas",
"email": "andreas.1.larsson@ltu.se",
"familyname": "Larsson",
"affiliations": [
"Applied Physics, Division of Materials Science, Department of Engineering Sciences and Mathematics, Lule\u00e5 University of Technology, SE-971 87 Lule\u00e5, Sweden"
]
},
{
"givennames": "Yoshiyuki",
"email": "yoshi-miyamoto@aist.go.jp",
"familyname": "Miyamoto",
"affiliations": [
"Research Center for Computational Design of Advanced Functional Materials, National Institute of Advanced Industrial Science and Technology (AIST), Central 2, 1-1-1 Umezono, Tsukuba, Ibaraki, 305-8568, Japan"
]
}
],
"description": "The use of machine learning to accelerate computer simulations is on the rise. In atomistic simulations, machine learning interatomic potentials (ML-IAPs) can significantly reduce computational costs while maintaining accuracy close to that of ab initio methods. To achieve this, ML-IAPs are trained on large datasets of images, meaning atomistic configurations labeled with data from ab initio calculations. Focusing on carbon, we have created a dataset, CA-9, consisting of 48000 images labeled with energies, forces and stress tensors obtained via ab initio molecular dynamics (AIMD). We use deep learning to train state-of-the-art neural network potentials (NNPs), a form of ML-IAP, on the CA-9 dataset and investigate how the choice of training and validation data affects the performance of the NNPs. Our results show that image generation with AIMD causes a high degree of similarity between the generated images, which has a detrimental effect on the NNPs. However, this effect can be mitigated by carefully choosing which images from the dataset are included in the training and validation data. We end by benchmarking our trained NNPs in real-world applications and show that they reproduce results from ab initio calculations with higher accuracy than previously published ML-IAPs or classical IAPs.",
"title": "CA-9, a dataset of carbon allotropes for training and testing of neural network potentials",
"edited_by": 245,
"license": "Creative Commons Attribution 4.0 International",
"id": "637",
"_files": [
{
"key": "Readme.txt",
"description": "Readme file",
"size": 2428,
"checksum": "md5:ef25015cf52d749a74d34aa4580c61be"
},
{
"key": "scripts.zip",
"description": "Python scripts used to read data from VASP and train neural network potentials",
"size": 3500,
"checksum": "md5:47504db1933a414ff4277da31c05e29e"
},
{
"key": "datasets.zip",
"description": "Datasets for training and testing of neural network potentials",
"size": 475429780,
"checksum": "md5:a3739f0fd8d107195afb9d975a804f65"
},
{
"key": "NNPs.zip",
"description": "The best trained neural network potentials for each dataset",
"size": 13734349,
"checksum": "md5:efb81a67ccd0171c9009e32c86986238"
}
],
"mcid": "2020.144",
"version": 1,
"status": "published",
"conceptrecid": "636"
},
"updated": "2021-02-12T03:41:52.545078+00:00",
"id": "637"
}
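
Each entry in the "_files" list of the record above carries a "key", a "size" in bytes and a "checksum" of the form "md5:<hex digest>", which is enough to check a downloaded copy of the archive. The following is a minimal sketch of such a check, not part of the published scripts. It assumes the JSON export has been loaded into a Python dict and that the files sit in a single local directory under their "key" names; the function names (parse_checksum, file_digest, verify_files) and the download_dir parameter are hypothetical and chosen only for illustration.

```python
import hashlib
from pathlib import Path


def parse_checksum(value):
    """Split a string such as 'md5:ef25...' into (algorithm, hex digest)."""
    algo, _, digest = value.partition(":")
    return algo, digest


def file_digest(path, algo="md5", chunk_size=1 << 20):
    """Hash a file in chunks so large archives are never read into memory at once."""
    hasher = hashlib.new(algo)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_files(record, download_dir):
    """Compare local files with the size and checksum listed in the record.

    `record` is the parsed JSON export; `download_dir` is an assumed local
    directory holding the downloaded files under their "key" names.
    Returns a dict mapping each key to a short status string.
    """
    results = {}
    for entry in record["metadata"]["_files"]:
        path = Path(download_dir) / entry["key"]
        if not path.is_file():
            results[entry["key"]] = "missing"
            continue
        algo, expected = parse_checksum(entry["checksum"])
        if path.stat().st_size != entry["size"]:
            # Cheap check first: a wrong size usually means a truncated download.
            results[entry["key"]] = "size mismatch"
        elif file_digest(path, algo) != expected:
            results[entry["key"]] = "checksum mismatch"
        else:
            results[entry["key"]] = "ok"
    return results
```

With the export saved to a local file (for example a hypothetical record.json), the record can be loaded with json.load and passed to verify_files together with the download directory. Files reported as anything other than "ok" should be downloaded again before use.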