There is a newer version of the record available.

Published July 16, 2020 | Version v1

Structure-property maps with kernel principal covariates regression

  • 1. Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland

Description

Data analyses based on linear methods constitute the simplest, most robust, and transparent approaches to the automatic processing of large amounts of data for building supervised or unsupervised machine learning models. Principal covariates regression (PCovR) is an underappreciated method that interpolates between principal component analysis and linear regression, and can be used to conveniently reveal structure-property relations in terms of simple-to-interpret, low-dimensional maps. Here we introduce a kernelized version of PCovR and a sparsified extension, and demonstrate the performance of this approach in revealing and predicting structure-property relations in chemistry and materials science, showing a variety of examples including elemental carbon, porous silicate frameworks, organic molecules, amino acid conformers, and molecular materials.
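The interpolation between principal component analysis and linear regression described above can be illustrated with a small sketch of linear PCovR. This is not the code distributed with this record: the function name `pcovr_scores`, the mixing parameter name `alpha`, the normalization of the inputs, and the synthetic data are all assumptions made for illustration. The sketch builds a modified Gram matrix that mixes the similarity between samples in feature space with the similarity between their regressed property values, and takes its leading eigenvectors as the low-dimensional map.

```python
import numpy as np


def pcovr_scores(X, Y, alpha, n_components=2):
    """Illustrative linear PCovR projection (sample-space form).

    alpha = 1 recovers PCA scores; alpha = 0 keeps only the directions
    spanned by the least-squares prediction of Y. Hypothetical helper,
    not the API of any published package.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float).reshape(X.shape[0], -1)

    # Center both blocks and scale each to unit Frobenius norm so that
    # the two terms of the mixed matrix are comparable (an assumption).
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    X = X / np.linalg.norm(X)
    Y = Y / np.linalg.norm(Y)

    # Ordinary least-squares approximation of the properties.
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    Y_hat = X @ W

    # Modified Gram matrix mixing structure (X) and property (Y_hat) terms.
    K = alpha * (X @ X.T) + (1.0 - alpha) * (Y_hat @ Y_hat.T)

    # Leading eigenpairs of the symmetric matrix give the latent scores.
    evals, evecs = np.linalg.eigh(K)
    order = np.argsort(evals)[::-1][:n_components]
    top = np.clip(evals[order], 0.0, None)  # guard against tiny negatives
    return evecs[:, order] * np.sqrt(top)


# Synthetic example data (not taken from the dataset files).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 5))
y_demo = X_demo @ rng.normal(size=5) + 0.1 * rng.normal(size=50)

T_demo = pcovr_scores(X_demo, y_demo, alpha=0.5)
print(T_demo.shape)  # (50, 2)
```

Sweeping `alpha` from 1 to 0 moves the map from one that preserves the variance of the features to one that is ordered by the target property. A kernelized variant would, roughly speaking, replace `X @ X.T` with a centered kernel matrix and the least-squares fit with kernel ridge regression; that step is omitted here.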

Files

File preview: files_description.md

All files (120.0 MiB)

MD5 checksum                        Size
f5247d026a989f3bb04713c02de573c3    1.7 KiB
4901d18f01498450fddf70d4f1bd0d9e    1.1 MiB
8a5d0f6f04c6c26a7c3ce9b3c0668d80    280.9 KiB
500809d4a4a62b864c1dd42f2c01732c    1.6 MiB
1af1bd5df08b2cb21aecdd9885829a51    1.6 MiB
427aa3e3a939fee3a976fa475092e4bb    1.6 MiB
709118a8c4ec0460efda82059b0b57a0    1.0 MiB
fd7f42bcd62917a994115b7dac03dbf9    102.7 MiB
433087121bd75a693da1c51bdd91a519    3.0 MiB
196238f8be2815f22257fe791eaa2199    753.9 KiB
9000a600226b8bb361eccf89b88e1613    3.3 MiB
604a61a5e9993b2ba7b59b04a1f6306f    3.3 MiB

References

Preprint
B. A. Helfrecht, R. K. Cersonsky, G. Fraux, M. Ceriotti, arXiv:2002.05076 (2020)