Machine learning meets volcano plots: Computational discovery of cross-coupling catalysts

Benjamin Meyer1*, Boodsarin Sawatlon1*, Stefan Niklaus Heinen2*, O. Anatole von Lilienfeld2*, Clémence Corminboeuf1*

1 Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland;

2 Institute of Physical Chemistry, Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland

* Corresponding authors
DOI: 10.24435/materialscloud:2018.0014/v1 [version v1]

Publication date: Aug 01, 2018

How to cite this record

Benjamin Meyer, Boodsarin Sawatlon, Stefan Niklaus Heinen, O. Anatole von Lilienfeld, Clémence Corminboeuf, Machine learning meets volcano plots: Computational discovery of cross-coupling catalysts, Materials Cloud Archive 2018.0014/v1 (2018), doi: 10.24435/materialscloud:2018.0014/v1.


The application of modern machine learning to challenges in atomistic simulation is gaining traction. We present new machine learning models that predict the energy of the oxidative addition step between a transition metal complex and a substrate in C-C cross-coupling reactions. In turn, this quantity can be used as a descriptor to estimate the activity of homogeneous catalysts using molecular volcano plots. The versatility of this approach is illustrated for vast libraries of organometallic catalysts based on Pt, Pd, Ni, Cu, Ag, and Au combined with 91 ligands. Out-of-sample machine learning predictions were made on a total of 18,062 compounds, leading to 557 catalyst candidates that fall into the ideal thermodynamic window. This number was further refined by searching for candidates with an estimated price lower than 10 US$/mmol. The 37 catalyst finalists are dominated by palladium-phosphine combinations but also include an earth-abundant transition metal (Cu) with less common ligands. Our results indicate that modern statistical learning techniques can be applied to the computational discovery of readily available and promising catalyst candidates.
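The two-stage screening described in the abstract (first keep candidates whose predicted descriptor falls inside the thermodynamic window, then keep those cheaper than 10 US$/mmol) can be sketched as follows. This is a minimal illustration, not the authors' workflow: the `Candidate` class, the `screen` function, the ligand labels, the energies, the prices, and in particular the window bounds are all hypothetical placeholders, since the record does not state the actual window or the units used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one metal/ligand combination."""
    metal: str
    ligands: str
    predicted_energy: float  # ML-predicted descriptor value (units assumed)
    price_per_mmol: float    # estimated price in US$/mmol


def screen(candidates, window, max_price=10.0):
    """Return (candidates inside the window, affordable subset of those)."""
    low, high = window
    in_window = [c for c in candidates if low <= c.predicted_energy <= high]
    # "Lower than 10 US$/mmol" is read here as a strict inequality.
    affordable = [c for c in in_window if c.price_per_mmol < max_price]
    return in_window, affordable


# Placeholder bounds: the real thermodynamic window is not given in this record.
WINDOW = (-35.0, -25.0)

# Made-up library entries for illustration only.
library = [
    Candidate("Pd", "L1-L2", -30.0, 4.5),
    Candidate("Cu", "L3-L4", -28.0, 0.8),
    Candidate("Pt", "L5-L6", -31.0, 55.0),
    Candidate("Au", "L7-L8", -5.0, 2.0),
]

in_window, finalists = screen(library, WINDOW)
print(len(in_window), [c.metal for c in finalists])
```

Applied to the real data set, the same two filters would correspond to the reduction from 18,062 predictions to 557 candidates and then to 37 finalists reported above.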



File name Size Description
30.9 MiB All 25,116 generated structures of the catalytic intermediates.
10.6 MiB All 7,054 geometries of the catalytic intermediates optimized at the B3LYP-D3/3-21G level.
782.4 KiB The single-point energies computed at the B3LYP-D3/def2-TZVP level, the corresponding binding energies, and the 18,062 out-of-sample machine learning predicted binding energies obtained with the Coulomb Matrix, Bag of Bonds, and SLATM representations.
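For readers unfamiliar with the representations named in the file descriptions, the simplest of them, the Coulomb Matrix, can be sketched as below. The code follows the commonly used definition (diagonal entries 0.5·Z^2.4, off-diagonal entries Z_i·Z_j divided by the interatomic distance). Whether the archived predictions used this plain form or a sorted or otherwise modified variant is not stated in this record, and the function name and the two-atom example geometry are illustrative assumptions.

```python
import math


def coulomb_matrix(charges, coords):
    """Build the plain (unsorted) Coulomb matrix.

    charges: nuclear charges Z_i
    coords:  Cartesian positions, one (x, y, z) tuple per atom
    """
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term: 0.5 * Z_i ** 2.4
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal term: nuclear repulsion Z_i * Z_j / |R_i - R_j|
                distance = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


# Illustrative two-atom example (H2 with an assumed separation of 1.4 length units).
cm = coulomb_matrix([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)])
for row in cm:
    print(row)
```

The resulting matrix is symmetric, and for molecules of different sizes it is usually padded with zeros to a common dimension before being passed to a regression model.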


Files and data are licensed under the terms of the following license: Creative Commons Attribution 4.0 International.

External references

Journal reference
B. Meyer, B. Sawatlon, S. N. Heinen, O. A. von Lilienfeld and C. Corminboeuf. Machine learning meets volcano plots: Computational discovery of cross-coupling catalysts, Accepted. doi:10.1039/C8SC01949E


Keywords: machine learning, homogeneous catalysis, volcano plot, transition metal complexes

Version history:

2018.0014/v1 (version v1) [This version] Aug 01, 2018 DOI: 10.24435/materialscloud:2018.0014/v1