Large-scale machine-learning-assisted exploration of the whole materials space

Jonathan Schmidt¹, Noah Hoffmann¹, Hai-Chen Wang¹, Pedro Borlido², Pedro J. M.A. Carriço², Tiago F. T. Cerqueira², Silvana Botti³*, Miguel A. L. Marques¹*

¹ Institut für Physik, Martin-Luther-Universität Halle-Wittenberg, 06120 Halle (Saale), Germany

² CFisUC, Department of Physics, University of Coimbra, Rua Larga, 3004-516 Coimbra, Portugal

³ Institut für Festkörpertheorie und -optik and European Theoretical Spectroscopy Facility, Friedrich-Schiller-Universität Jena, D-07743 Jena, Germany

* Corresponding authors' emails: silvana.botti@uni-jena.de, miguel.marques@physik.uni-halle.de
DOI: 10.24435/materialscloud:m7-50 [version v1]

Publication date: Oct 04, 2022

How to cite this record

Jonathan Schmidt, Noah Hoffmann, Hai-Chen Wang, Pedro Borlido, Pedro J. M.A. Carriço, Tiago F. T. Cerqueira, Silvana Botti, Miguel A. L. Marques, Large-scale machine-learning-assisted exploration of the whole materials space, Materials Cloud Archive 2022.126 (2022), doi: 10.24435/materialscloud:m7-50.

Description

Crystal-graph attention networks have emerged recently as remarkable tools for the prediction of thermodynamic stability and materials properties from unrelaxed crystal structures. Previous networks trained on two million materials exhibited, however, strong biases originating from underrepresented chemical elements and structural prototypes in the available data. We tackled this issue by computing additional data to provide better balance across both chemical and crystal-symmetry space. Crystal-graph networks trained with this new data show unprecedented generalization accuracy and allow for reliable, accelerated exploration of the whole space of inorganic compounds. We applied this universal network to perform machine-learning-assisted high-throughput materials searches including 2500 binary and ternary prototypes and spanning about 1 billion compounds. After validation using density-functional theory, we uncover in total 19512 additional materials on the convex hull of thermodynamic stability and around 150000 compounds with a distance of less than 50 meV/atom from the hull. Here we include the DCGAT-1, DCGAT-2, and DCGAT-3 datasets used in this work.
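The two stability categories mentioned in the description (on the convex hull, and within 50 meV/atom of it) can be illustrated with a minimal Python sketch. The record layout and the field names `formula` and `e_above_hull` are hypothetical and chosen only for illustration; they are not taken from the DCGAT files, whose actual layout is documented in README.txt. The sketch also assumes energies are stored in eV/atom, so 50 meV/atom becomes 0.050.

```python
# Hypothetical records: the keys "formula" and "e_above_hull" are
# illustrative assumptions, not the schema of the DCGAT files.
HULL_THRESHOLD_EV = 0.050  # 50 meV/atom expressed in eV/atom


def split_by_hull_distance(records, threshold=HULL_THRESHOLD_EV, tol=1e-9):
    """Separate records on the hull from those strictly below the threshold."""
    on_hull = [r for r in records if r["e_above_hull"] <= tol]
    near_hull = [r for r in records if tol < r["e_above_hull"] < threshold]
    return on_hull, near_hull


# Made-up example values, used only to exercise the function.
records = [
    {"formula": "A2B", "e_above_hull": 0.0},
    {"formula": "AB", "e_above_hull": 0.012},
    {"formula": "AB3", "e_above_hull": 0.120},
]

on_hull, near_hull = split_by_hull_distance(records)
print([r["formula"] for r in on_hull])
print([r["formula"] for r in near_hull])
```

The small tolerance `tol` is a design choice to guard against floating-point noise when testing for a distance of exactly zero; it is not a value stated in the record.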

Materials Cloud sections using this data

No Explore or Discover sections associated with this archive record.

Files

File name             MD5                               Size       Description
test_json.py          544902c43b476ed5c7e0c8d3ce338365  400 Bytes  Example program to load the data
README.txt            0691ad62b6b20382a64cdaf78b702e14  4.4 KiB    Detailed description
dcgat_1_000.json.bz2  4800e5353f3a663cae973bd3c0397c76  30.4 MiB   DGCAT-1-000
dcgat_1_001.json.bz2  342110c4d7194f1a7b4bd5589cbb790a  30.0 MiB   DGCAT-1-001
dcgat_1_002.json.bz2  f6161fdbcaed5d949bc41310002882f9  31.0 MiB   DGCAT-1-002
dcgat_1_003.json.bz2  5281d8c75f2b780eae25a8df51ba80a6  33.4 MiB   DGCAT-1-003
dcgat_1_004.json.bz2  5b6cb04653a7b55b868f34fb237717e9  29.7 MiB   DGCAT-1-004
dcgat_1_005.json.bz2  db570fc7f685059f71288a0c99e2018a  31.5 MiB   DGCAT-1-005
dcgat_1_006.json.bz2  3360ad70980a88ca9ffd05b4d9a3d6c4  34.9 MiB   DGCAT-1-006
dcgat_1_007.json.bz2  1d338c01c7a8cfc53f40259869f3d382  31.5 MiB   DGCAT-1-007
dcgat_1_008.json.bz2  3f4a55ca617741e5a4a32d861037c2b4  31.8 MiB   DGCAT-1-008
dcgat_1_009.json.bz2  59961ff20b1b2199e8c63badc12f95b3  32.5 MiB   DGCAT-1-009
dcgat_1_010.json.bz2  536ea5b36fca25278464ffa30d90f225  29.2 MiB   DGCAT-1-010
dcgat_1_011.json.bz2  19e1755fa8fd4bef9e241dfee2ddb6dd  31.3 MiB   DGCAT-1-011
dcgat_1_012.json.bz2  3a05fc9bbb90b20ea1a144ac158c7cf4  31.0 MiB   DGCAT-1-012
dcgat_1_013.json.bz2  0357c6e0a16ab1fd56075aa013d5afe5  31.3 MiB   DGCAT-1-013
dcgat_1_014.json.bz2  1eb85737cab59cfff23539e5dc1f426b  14.2 MiB   DGCAT-1-014
dcgat_2_000.json.bz2  3473538c5ae6b43b82577eec5cd6a522  32.1 MiB   DGCAT-2-000
dcgat_2_001.json.bz2  33da8855afabeba547c11401bbb1a07f  31.2 MiB   DGCAT-2-001
dcgat_2_002.json.bz2  f3c6e3894ebe3464413f60472d3e7dfb  30.0 MiB   DGCAT-2-002
dcgat_2_003.json.bz2  49ccc949574a5900c4e611b993b20b0d  29.3 MiB   DGCAT-2-003
dcgat_3_000.json.bz2  b8203c162da342f69733e96dfd5f19b6  62.7 MiB   DGCAT-3-000
dcgat_3_001.json.bz2  b3b564f3868962d3e70158c9e8f70a2e  63.5 MiB   DGCAT-3-001
dcgat_3_002.json.bz2  4e4f25d1a0fad143b5c3b40873d476b7  63.2 MiB   DGCAT-3-002
dcgat_3_003.json.bz2  618559c17c20717f2dae514ca953be8f  62.8 MiB   DGCAT-3-003
dcgat_3_004.json.bz2  bff5cb86fe7dfbff5941ccf70eb0be2b  63.1 MiB   DGCAT-3-004
dcgat_3_005.json.bz2  6060a9f68aa97b1b31a9b0054a349795  9.6 MiB    DGCAT-3-005
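Since each file is listed with an MD5 checksum and the data files are bzip2-compressed JSON, a download can be checked and opened with the Python standard library alone. The following is a minimal sketch, not the contents of test_json.py. The helper names `md5_of_file`, `load_json_bz2`, and `describe` are introduced here for illustration, and no assumption is made about the JSON layout (README.txt describes it); the sketch only reports the top-level structure.

```python
import bz2
import hashlib
import json


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_json_bz2(path):
    """Decompress a .json.bz2 file and parse it as JSON."""
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


def describe(data):
    """Summarize the top-level JSON structure without assuming a schema."""
    if isinstance(data, dict):
        return "dict with keys: " + ", ".join(sorted(data))
    if isinstance(data, list):
        return f"list with {len(data)} items"
    return type(data).__name__


# Example usage, assuming dcgat_1_000.json.bz2 is in the working directory;
# the expected digest is the one listed for that file in this record:
#
#   expected = "4800e5353f3a663cae973bd3c0397c76"
#   if md5_of_file("dcgat_1_000.json.bz2") != expected:
#       raise ValueError("checksum mismatch, download may be corrupted")
#   print(describe(load_json_bz2("dcgat_1_000.json.bz2")))
```

Reading the file in chunks keeps memory use flat during the checksum step. Parsing the decompressed JSON still loads the whole file into memory, which may be substantial for the larger archives.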

License

Files and data are licensed under the terms of the following license: Creative Commons Attribution 4.0 International.
Metadata, except for email addresses, are licensed under the Creative Commons Attribution Share-Alike 4.0 International license.

Keywords

density-functional theory, high-throughput, crystal-graph attention networks

Version history:

2022.126 (version v1) [This version] Oct 04, 2022 DOI: 10.24435/materialscloud:m7-50