Equivariant representations for molecular Hamiltonians

Jigyasa Nigam1,2*, Michael J. Willatt1, Michele Ceriotti1,2*

1 Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland

2 National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland

* Corresponding authors' emails: jigyasa.nigam@epfl.ch, michele.ceriotti@epfl.ch
DOI: 10.24435/materialscloud:2d-3e [version v1]

Publication date: Dec 09, 2021

How to cite this record

Jigyasa Nigam, Michael J. Willatt, Michele Ceriotti, Equivariant representations for molecular Hamiltonians, Materials Cloud Archive 2021.217 (2021), doi: 10.24435/materialscloud:2d-3e.


Machine learning has proven extremely successful in accelerating the understanding, design, and characterization of materials and molecules. A major factor in this success has been the development of representations of atomic structures that reflect the physics-based symmetries of the underlying interactions. Most models of atomic properties, and even of global observables, rely on a decomposition into atomic contributions that are then learnt in an atom-centered framework. However, many quantities arising in quantum mechanical calculations, such as single-particle Hamiltonian matrices written in an atomic-orbital basis, are associated with multiple atomic centers. The reference below introduces equivariant N-center structural descriptors that generalize the very successful atom-centered density correlation features to the problem of learning properties indexed by N atoms. Building on that construction, we present benchmarks showing how it can be applied to efficiently learn the matrix elements of the (effective) single-particle Hamiltonian in an atom-centered orbital basis. This record includes the dataset, comprising the Fock and overlap matrices in the def2-SVP basis for 1000 distorted water molecules, up to 4500 ethanol structures, and a subset of QM7-CHNO molecules. We also provide scripts to generate the two-center representations and to fit linear and sparse kernel models for the Hamiltonians.
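
As a rough illustration of how a Fock and overlap matrix pair of the kind described above is commonly used, the sketch below solves the generalized eigenvalue problem F C = S C ε by symmetric (Löwdin) orthogonalization. It is not taken from the scripts in this record: the file layout of the archive is not described here, so the matrices are small synthetic placeholders, and the function names `lowdin_orthogonalize` and `orbital_energies` are hypothetical.

```python
import numpy as np


def lowdin_orthogonalize(fock, overlap):
    """Return S^{-1/2} F S^{-1/2} for a symmetric F and a positive-definite S."""
    s_vals, s_vecs = np.linalg.eigh(overlap)
    # Build S^{-1/2} = V diag(1/sqrt(s)) V^T from the eigendecomposition of S.
    s_inv_sqrt = (s_vecs / np.sqrt(s_vals)) @ s_vecs.T
    return s_inv_sqrt @ fock @ s_inv_sqrt


def orbital_energies(fock, overlap):
    """Eigenvalues of F C = S C eps, in ascending order."""
    return np.linalg.eigvalsh(lowdin_orthogonalize(fock, overlap))


# Synthetic 2x2 stand-ins (assumption: NOT values from the dataset).
fock = np.array([[-1.0, -0.5],
                 [-0.5, -0.5]])
overlap = np.array([[1.0, 0.25],
                    [0.25, 1.0]])

energies = orbital_energies(fock, overlap)
print(energies)
```

The same two functions apply unchanged to larger symmetric matrices, provided the overlap matrix is positive definite.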



File name Size Description
7.4 GiB Folder containing machine learning benchmarks for the water, ethanol and QM7-CHNO datasets, together with commented scripts to reproduce them.
889 Bytes Description of data source and availability of code


Files and data are licensed under the terms of the following license: Creative Commons Attribution 4.0 International.
Metadata, except for email addresses, are licensed under the Creative Commons Attribution Share-Alike 4.0 International license.

External references

Preprint (Preprint where the data and methods are described)


Keywords: ERC, MARVEL, EPFL, H2020, N-center-representations, machine learning, hamiltonians, equivariant representations, hamiltonian learning

Version history:

2021.217 (version v1) [This version] Dec 09, 2021 DOI: 10.24435/materialscloud:2d-3e