Unified theory of atom-centered representations and message-passing machine-learning schemes

Jigyasa Nigam1,2*, Sergey Pozdnyakov1*, Guillaume Fraux1*, Michele Ceriotti1,2*

1 Laboratory of Computational Science and Modelling, Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland

2 National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland

* Corresponding authors emails: jigyasa.nigam@epfl.ch, sergey.pozdnyakov@epfl.ch, guillaume.fraux@epfl.ch, michele.ceriotti@epfl.ch
DOI: 10.24435/materialscloud:3f-g3 [version v1]

Publication date: Mar 24, 2022

How to cite this record

Jigyasa Nigam, Sergey Pozdnyakov, Guillaume Fraux, Michele Ceriotti, Unified theory of atom-centered representations and message-passing machine-learning schemes, Materials Cloud Archive 2022.44 (2022), doi: 10.24435/materialscloud:3f-g3.

Description

Data-driven schemes that associate molecular and crystal structures with their microscopic properties share the need for a concise, effective description of the arrangement of their atomic constituents. Many types of models rely on descriptions of atom-centered environments that are associated with an atomic property or with an atomic contribution to an extensive macroscopic quantity. Frameworks in this class can be understood in terms of atom-centered density correlations (ACDC), which are used as a basis for a body-ordered, symmetry-adapted expansion of the targets. Several other schemes that gather information on the relationship between neighboring atoms using "message-passing" ideas cannot be directly mapped to correlations centered around a single atom. We generalize the ACDC framework to include multi-centered information, generating representations that provide a complete linear basis to regress symmetric functions of atomic coordinates and a coherent foundation to systematize our understanding of both atom-centered and message-passing, invariant and equivariant machine-learning schemes.

This record contains the data and code required to reproduce the results from the corresponding paper, computing message-passing-inspired machine-learning features built on top of density correlations.

The data used in this article are subsets of existing datasets, which can be found online:
- methane dataset: https://archive.materialscloud.org/record/2020.105
- NaCl dataset: https://github.com/dilkins/TENSOAP/tree/ea671154b3642b4ec879a4292a4dd4399ddbdea6/example/random_nacl
- QM7 and QM9 with dipole moments: https://archive.materialscloud.org/record/2020.56

Materials Cloud sections using this data

No Explore or Discover sections associated with this archive record.

Files

| File name | MD5 | Size | Description |
|---|---|---|---|
| methane-40k.xyz | 70d7f45c49b725a8f95a3e7c763a291c | 13.0 MiB | Extended XYZ file containing the methane structures and energies |
| methane.zip | f8a03c12c20114ef88237f61c725560d | 6.5 KiB | Scripts used to compute message passing features and use them to train models for methane |
| random-nacl.xyz | 9e76b4a8972a2e829335163fe31c7d7c | 6.8 MiB | Extended XYZ file containing the random NaCl structures and energies |
| nacl.zip | e03a06c9b6c4657590280d78797cc14d | 7.5 KiB | Scripts used to compute message passing features and use them to train models for NaCl |
| qm7_shuffled_chno.xyz | 6b437f8c3ad640ac3676c8e43d98e965 | 8.3 MiB | Extended XYZ file containing the subset of QM7 used in this study |
| qm7_chno_dipole.npy | a44c0931c24788625daa60ce1afea85b | 161.2 KiB | Dipole moments of QM7 structures in numpy format |
| qm9_1000_test.xyz | 97406dd69fd0ec945fcadf432a340051 | 1.1 MiB | Extended XYZ file containing the subset of QM9 used in this study |
| qm9_dipole.npy | c90dc2576831316f43df392bfcf91af7 | 23.5 KiB | Dipole moments of QM9 structures in numpy format |
| qm7_dipole.zip | 2966a6260def1b073e2a3c90b6e354f1 | 6.4 KiB | Scripts used to compute message passing features and use them to train models for QM7/QM9 |
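
Since each file in this record is listed with an MD5 checksum, a download can be checked against the listing before running the scripts. The following is a minimal sketch, not part of the archived code: the helper names `md5_of_file` and `check_downloads` are hypothetical, and it assumes the files were saved under their listed names in a single directory.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "methane-40k.xyz": "70d7f45c49b725a8f95a3e7c763a291c",
    "methane.zip": "f8a03c12c20114ef88237f61c725560d",
    "random-nacl.xyz": "9e76b4a8972a2e829335163fe31c7d7c",
    "nacl.zip": "e03a06c9b6c4657590280d78797cc14d",
    "qm7_shuffled_chno.xyz": "6b437f8c3ad640ac3676c8e43d98e965",
    "qm7_chno_dipole.npy": "a44c0931c24788625daa60ce1afea85b",
    "qm9_1000_test.xyz": "97406dd69fd0ec945fcadf432a340051",
    "qm9_dipole.npy": "c90dc2576831316f43df392bfcf91af7",
    "qm7_dipole.zip": "2966a6260def1b073e2a3c90b6e354f1",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_downloads(directory, expected=EXPECTED_MD5):
    """Map each expected file name to 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, md5 in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of_file(path) == md5:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    # Assumes the downloads sit in the current working directory.
    for name, status in check_downloads(".").items():
        print(f"{status:9s} {name}")
```

Reading in chunks keeps memory use flat for the larger XYZ files. A status of `mismatch` most likely indicates an incomplete download.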

License

Files and data are licensed under the terms of the following license: Creative Commons Attribution 4.0 International.
Metadata, except for email addresses, are licensed under the Creative Commons Attribution Share-Alike 4.0 International license.

External references

Preprint (Preprint where this data is used as examples for the message passing density correlation features)

Keywords

machine learning, message passing, reproducibility, MARVEL/DD2, PASC

Version history:

2022.44 (version v1) [This version], Mar 24, 2022, DOI: 10.24435/materialscloud:3f-g3