Unified theory of atom-centered representations and message-passing machine-learning schemes


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator>Nigam, Jigyasa</dc:creator>
  <dc:creator>Pozdnyakov, Sergey</dc:creator>
  <dc:creator>Fraux, Guillaume</dc:creator>
  <dc:creator>Ceriotti, Michele</dc:creator>
  <dc:date>2022-03-24</dc:date>
  <dc:description>Data-driven schemes that associate molecular and crystal structures with their microscopic properties share the need for a concise, effective description of the arrangement of their atomic constituents. Many types of models rely on descriptions of atom-centered environments that are associated with an atomic property or with an atomic contribution to an extensive macroscopic quantity. Frameworks in this class can be understood in terms of atom-centered density correlations (ACDC), which are used as a basis for a body-ordered, symmetry-adapted expansion of the targets. Several other schemes, which gather information on the relationship between neighboring atoms using "message-passing" ideas, cannot be directly mapped to correlations centered around a single atom. We generalize the ACDC framework to include multi-centered information, generating representations that provide a complete linear basis to regress symmetric functions of atomic coordinates and providing a coherent foundation to systematize our understanding of both atom-centered and message-passing, invariant and equivariant machine-learning schemes.

This record contains the data and code required to reproduce the results of the corresponding paper, which computes message-passing-inspired machine-learning features built on top of density correlations. The data used in the article are subsets of existing datasets, which can be found online:

- methane dataset: https://archive.materialscloud.org/record/2020.105
- NaCl dataset: https://github.com/dilkins/TENSOAP/tree/ea671154b3642b4ec879a4292a4dd4399ddbdea6/example/random_nacl
- QM7 and QM9 with dipole moments: https://archive.materialscloud.org/record/2020.56</dc:description>
  <dc:identifier>https://archive.materialscloud.org/record/2022.44</dc:identifier>
  <dc:identifier>doi:10.24435/materialscloud:3f-g3</dc:identifier>
  <dc:identifier>mcid:2022.44</dc:identifier>
  <dc:identifier>oai:materialscloud.org:1294</dc:identifier>
  <dc:language>en</dc:language>
  <dc:publisher>Materials Cloud</dc:publisher>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:rights>Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode</dc:rights>
  <dc:subject>machine learning</dc:subject>
  <dc:subject>message passing</dc:subject>
  <dc:subject>reproducibility</dc:subject>
  <dc:subject>MARVEL/DD2</dc:subject>
  <dc:subject>PASC</dc:subject>
  <dc:title>Unified theory of atom-centered representations and message-passing machine-learning schemes</dc:title>
  <dc:type>Dataset</dc:type>
</oai_dc:dc>