Published December 8, 2023 | Version v1
Dataset Open

SPAᴴM(a,b): encoding the density information from guess Hamiltonian in quantum machine learning representations

  • 1. Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland

* Contact person

Description

Recently, we introduced a class of molecular representations for kernel-based regression methods, the spectrum of approximated Hamiltonian matrices (SPAᴴM), that takes advantage of lightweight one-electron Hamiltonians traditionally used as an SCF initial guess. The original SPAᴴM variant is built from occupied-orbital energies (i.e., eigenvalues) and naturally contains all the information about nuclear charges, atomic positions, and symmetry requirements. Its advantages were demonstrated on datasets featuring a wide variation of charge and spin, for which traditional structure-based representations commonly fail. SPAᴴM(a,b), as introduced here, expands eigenvalue SPAᴴM into local and transferable representations. It relies upon one-electron density matrices to build fingerprints from atomic or bond density overlap contributions inspired by preceding state-of-the-art representations. The performance and efficiency of SPAᴴM(a,b) are assessed on predictions for datasets of prototypical organic molecules (QM7) of different charges and of azoheteroarene dyes in an excited state. Overall, both SPAᴴM(a) and SPAᴴM(b) outperform state-of-the-art representations on difficult prediction tasks such as the atomic properties of charged open-shell species and of π-conjugated systems.
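To make the two ingredients named in the description concrete, the following is a minimal sketch, not the authors' implementation. It assumes a closed-shell system, an orthonormal basis, and a toy Hückel-like 4-site matrix standing in for a real guess Hamiltonian. The function names `eigenvalue_fingerprint` and `density_matrix` and the zero-padding to a fixed length are illustrative assumptions. It shows how occupied eigenvalues can serve as a global fingerprint and how the same occupied orbitals give the one-electron density matrix that local variants start from.

```python
import numpy as np


def eigenvalue_fingerprint(h, n_electrons, size):
    """Occupied eigenvalues of a symmetric one-electron matrix, zero-padded.

    Assumptions (illustrative only): closed shell, orthonormal basis,
    and padding with zeros so that all molecules share one vector length.
    """
    eigvals, eigvecs = np.linalg.eigh(h)  # eigenvalues in ascending order
    n_occ = n_electrons // 2              # doubly occupied orbitals
    fingerprint = np.zeros(size)
    fingerprint[:n_occ] = eigvals[:n_occ]
    return fingerprint, eigvecs[:, :n_occ]


def density_matrix(c_occ):
    """Closed-shell one-electron density matrix D = 2 C_occ C_occ^T."""
    return 2.0 * c_occ @ c_occ.T


# Toy stand-in for a guess Hamiltonian: a 4-site chain with
# on-site energy 0 and nearest-neighbour coupling -1 (hypothetical values).
h = np.array(
    [
        [0.0, -1.0, 0.0, 0.0],
        [-1.0, 0.0, -1.0, 0.0],
        [0.0, -1.0, 0.0, -1.0],
        [0.0, 0.0, -1.0, 0.0],
    ]
)

fingerprint, c_occ = eigenvalue_fingerprint(h, n_electrons=4, size=6)
d = density_matrix(c_occ)

print(fingerprint)   # two occupied eigenvalues followed by zeros
print(np.trace(d))   # equals the number of electrons in an orthonormal basis
```

In an actual application the matrix would come from a quantum-chemistry guess Hamiltonian in a non-orthogonal atomic-orbital basis, which requires a generalized eigenvalue problem with the overlap matrix; that step is omitted here.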

Files

files_description.md

Files (1.6 GiB)

Checksum Size
md5:55b8c33acd7427d02a1dd47def96e20e 435 Bytes
md5:5d0fc2baddea0c092274eb63e18204f9 3.0 KiB
md5:fc76968932247b7f4473696b45c6be86 1.6 GiB
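
A downloaded file can be checked against the MD5 checksums listed above. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `matches_checksum` are illustrative, and the path passed to them is a placeholder, since the file names are not given in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so that large archives fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum(local_path, "md5:fc76968932247b7f4473696b45c6be86")` should return `True` for an intact copy of the 1.6 GiB file, where `local_path` is wherever the download was saved.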

References

Journal reference
K. R. Briling, Y. Calvino Alonso, A. Fabrizio, and C. Corminboeuf, J. Chem. Theory Comput. 20, 1108–1117 (2024), doi: 10.1021/acs.jctc.3c01040