Structure-property maps with kernel principal covariates regression

Benjamin A. Helfrecht¹*, Rose K. Cersonsky¹*, Guillaume Fraux¹*, Michele Ceriotti¹*

¹ Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland

* Corresponding authors' emails: benjamin.helfrecht@epfl.ch, rose.cersonsky@epfl.ch, guillaume.fraux@epfl.ch, michele.ceriotti@epfl.ch
DOI: 10.24435/materialscloud:ay-eq [version v1]

Publication date: Jul 16, 2020

How to cite this record

Benjamin A. Helfrecht, Rose K. Cersonsky, Guillaume Fraux, Michele Ceriotti, Structure-property maps with kernel principal covariates regression, Materials Cloud Archive 2020.80 (2020), doi: 10.24435/materialscloud:ay-eq.

Description

Data analyses based on linear methods constitute the simplest, most robust, and transparent approaches to the automatic processing of large amounts of data for building supervised or unsupervised machine learning models. Principal covariates regression (PCovR) is an underappreciated method that interpolates between principal component analysis and linear regression, and can be used to conveniently reveal structure-property relations in terms of simple-to-interpret, low-dimensional maps. Here we introduce a kernelized version of PCovR and a sparsified extension, and demonstrate the performance of this approach in revealing and predicting structure-property relations in chemistry and materials science, showing a variety of examples including elemental carbon, porous silicate frameworks, organic molecules, amino acid conformers, and molecular materials.
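
To make the interpolation between principal component analysis and linear regression concrete, the following is a minimal NumPy sketch of linear (not kernelized) PCovR in its sample-space form. It is an illustration under stated assumptions, not the code used to produce the files in this record: the function name `pcovr_projection`, the Frobenius-norm scaling of both data blocks, the use of an ordinary least-squares fit, and the convention that alpha=1 recovers PCA while alpha=0 recovers the regression are all choices made here for illustration.

```python
import numpy as np


def pcovr_projection(X, Y, alpha, n_components=2):
    """Hypothetical helper: low-dimensional PCovR projection of the samples in X.

    Assumed convention: alpha=1 gives PCA scores, alpha=0 gives the
    directions spanned by the least-squares prediction of Y.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]

    # Center both blocks and scale them to unit Frobenius norm so that
    # alpha weighs two comparable quantities (an assumption of this sketch).
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    X = X / np.linalg.norm(X)
    Y = Y / np.linalg.norm(Y)

    # Least-squares approximation of the properties from the features.
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    Y_hat = X @ W

    # Modified Gram matrix mixing structural and property information.
    K = alpha * (X @ X.T) + (1.0 - alpha) * (Y_hat @ Y_hat.T)

    # Leading eigenpairs of the symmetric matrix K define the projection.
    evals, evecs = np.linalg.eigh(K)
    order = np.argsort(evals)[::-1][:n_components]
    evals = np.clip(evals[order], 0.0, None)  # guard against tiny negatives
    return evecs[:, order] * np.sqrt(evals)


# Synthetic data only; none of these numbers come from the archived datasets.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 5))
Y_demo = X_demo @ rng.normal(size=5) + 0.1 * rng.normal(size=50)
for alpha in (0.0, 0.5, 1.0):
    T_demo = pcovr_projection(X_demo, Y_demo, alpha)
    print(alpha, T_demo.shape)
```

In this sketch the two columns of the returned array play the role of the map coordinates; intermediate values of alpha trade variance retained in the features against how well the projection tracks the target property.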

Files

File name: datasets.tgz
MD5: fd7f42bcd62917a994115b7dac03dbf9
Size: 102.7 MiB
Description: Gzipped TAR archive containing all the datasets used in XYZ format

File name: arginine-kpcovr-0.55.json.gz
MD5: 4901d18f01498450fddf70d4f1bd0d9e
Size: 1.1 MiB
Description: Map created with KPCovR for the Arginine-Dipeptide dataset at alpha=0.55 using the chemiscope.org visualizer JSON format

File name: azaphenacenes-kpcovr-0.65.json.gz
MD5: 8a5d0f6f04c6c26a7c3ce9b3c0668d80
Size: 280.9 KiB
Description: Map created with KPCovR for the Azaphenacenes dataset at alpha=0.65 using the chemiscope.org visualizer JSON format

File name: C-VII-kpcovr-0.0.json.gz
MD5: 500809d4a4a62b864c1dd42f2c01732c
Size: 1.6 MiB
Description: Map created with KPCovR for the AIRSS carbon dataset at alpha=0.0 using the chemiscope.org visualizer JSON format

File name: C-VII-kpcovr-0.5.json.gz
MD5: 1af1bd5df08b2cb21aecdd9885829a51
Size: 1.6 MiB
Description: Map created with KPCovR for the AIRSS carbon dataset at alpha=0.5 using the chemiscope.org visualizer JSON format

File name: C-VII-kpcovr-1.0.json.gz
MD5: 427aa3e3a939fee3a976fa475092e4bb
Size: 1.6 MiB
Description: Map created with KPCovR for the AIRSS carbon dataset at alpha=1.0 using the chemiscope.org visualizer JSON format

File name: CSD-1000R-kpcovr-0.5.json.gz
MD5: 709118a8c4ec0460efda82059b0b57a0
Size: 1.0 MiB
Description: Map created with KPCovR for the NMR Chemical shielding dataset at alpha=0.5 using the chemiscope.org visualizer JSON format

File name: DEEM-global-kpcovr-0.5.json.gz
MD5: 433087121bd75a693da1c51bdd91a519
Size: 3.0 MiB
Description: Map created with KPCovR for global properties of DEEM zeolites at alpha=0.5 using the chemiscope.org visualizer JSON format

File name: DEEM-local-kpcovr-0.5.json.gz
MD5: 196238f8be2815f22257fe791eaa2199
Size: 753.9 KiB
Description: Map created with KPCovR for local properties of DEEM zeolites at alpha=0.5 using the chemiscope.org visualizer JSON format

File name: qm9-12PC-kpcovr-0.5.json.gz
MD5: 9000a600226b8bb361eccf89b88e1613
Size: 3.3 MiB
Description: Map created with KPCovR for the QM9 dataset at alpha=0.5 using the chemiscope.org visualizer JSON format

File name: qm9-12PC-kpcovr-1.0.json.gz
MD5: 604a61a5e9993b2ba7b59b04a1f6306f
Size: 3.3 MiB
Description: Map created with KPCovR for the QM9 dataset at alpha=1.0 using the chemiscope.org visualizer JSON format
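
The MD5 checksums listed above can be used to confirm that a download is intact before the map is opened. The snippet below is a minimal sketch using only the Python standard library; the helper names `md5_of_file`, `load_gzipped_json`, and `verify_and_load` are hypothetical, the local file path is an assumption about where the file was saved, and nothing is assumed about the keys inside the JSON document.

```python
import gzip
import hashlib
import json


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_gzipped_json(path):
    """Decompress a .json.gz file and parse its contents."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


def verify_and_load(path, expected_md5):
    """Check the checksum first, then parse; raise if the digests differ."""
    actual_md5 = md5_of_file(path)
    if actual_md5 != expected_md5.lower():
        raise ValueError(
            f"MD5 mismatch for {path}: expected {expected_md5}, got {actual_md5}"
        )
    return load_gzipped_json(path)


# Example call, assuming the file was saved in the current directory:
# data = verify_and_load(
#     "C-VII-kpcovr-0.5.json.gz",
#     "1af1bd5df08b2cb21aecdd9885829a51",
# )
```

Checking the digest on the compressed file, rather than on its decompressed contents, matches how the checksums are listed per archive file in this record.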

License

Files and data are licensed under the terms of the following license: Creative Commons Attribution 4.0 International.
Metadata, except for email addresses, are licensed under the Creative Commons Attribution Share-Alike 4.0 International license.

Keywords

machine learning, materials science, dimensionality reduction, kernel methods, MaX, SNSF, ERC, EPFL, MARVEL/DD1

Version history:

2020.80 (version v1) [This version], Jul 16, 2020, DOI: 10.24435/materialscloud:ay-eq