Mining the C-C Cross-Coupling Genome using Machine Learning

Boodsarin Sawatlon¹, Alberto Fabrizio¹, Benjamin Meyer¹, Stefan N. Heinen², Matthew D. Wodrich³, O. Anatole von Lilienfeld², Clémence Corminboeuf¹*

1 Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering (ISIC), École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, (Switzerland) and National Center for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, (Switzerland)

2 Institute of Physical Chemistry, Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, (Switzerland) and National Center for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, (Switzerland)

3 Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering (ISIC), École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, (Switzerland)

* Corresponding author's email: clemence.corminboeuf@epfl.ch

DOI: 10.24435/materialscloud:2019.0007/v1 [version v1]

Publication date: Feb 06, 2019

How to cite this record

Boodsarin Sawatlon, Alberto Fabrizio, Benjamin Meyer, Stefan N. Heinen, Matthew D. Wodrich, O. Anatole von Lilienfeld, Clémence Corminboeuf, Mining the C-C Cross-Coupling Genome using Machine Learning, Materials Cloud Archive 2019.0007/v1 (2019), https://doi.org/10.24435/materialscloud:2019.0007/v1

Description

Applications of machine-learning (ML) techniques to the study of catalytic processes have begun to appear in the literature with increasing frequency. The computational speedup provided by ML allows the properties and energetics of thousands of prospective catalysts to be rapidly assessed. These results, once compiled into a database containing different properties, can be mined with the goal of establishing relationships between the intrinsic chemical properties of different catalysts and their overall catalytic performance. Previously, we applied ML models to predict the performance of 18,000 prospective catalysts for a Suzuki coupling reaction using molecular volcano plots. Here, we expand on our earlier work by examining a larger section of the C-C cross-coupling genome using a dimensionality-reducing data-clustering algorithm (i.e., sketch-map) to, first, identify the compatibility of each catalyst with different C-C cross-coupling variants (e.g., Suzuki, Kumada, Negishi, Stille, and/or Hiyama) and, second, uncover links between the chemical properties of a catalyst and its catalytic activity. Our findings, based on the analysis of 18,000 catalysts, reveal strong correlations between a catalyst's HOMO energy and the suitability of its thermodynamic profile. The HOMO energy can, in turn, be tuned to optimize the thermodynamics of the catalytic cycle through the judicious choice of metal center and of the π-accepting/σ-donating nature of the flanking ligands. Overall, group 10 metals (Ni, Pd, Pt) are best paired with strong π-acceptor ligands and group 11 metals (Cu, Ag, Au) with weak π-acceptors; these combinations maximize the thermodynamic drive of the catalytic cycle.

Materials Cloud sections using this data

No Explore or Discover sections associated with this archive record.

Files

File name	MD5	Size	Description
structures_all.tar.gz	030cd6a0e4fc77b0974e9ceb33fe8ce8	30.9 MiB	The 25,116 generated structures of all catalytic intermediates.
properties.tar.gz	027f9ef8184c5d8cc94c0b4a3d64319b	992.8 KiB	Properties of all structures in CSV format.
StructureofLigands_0-90.pdf	882ec89f96f17a275ce56485a1419990	420.8 KiB	Chemical structures of the 91 ligands in the database.
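
The MD5 digests listed for the files above can be used to check that a download is intact. The following is a minimal sketch, assuming the three files have been saved under their listed names in a single local directory; the helper names `md5_of_file` and `verify_downloads` are hypothetical and not part of this record.

```python
import hashlib
from pathlib import Path

# MD5 digests as listed in the Files section of this record.
EXPECTED_MD5 = {
    "structures_all.tar.gz": "030cd6a0e4fc77b0974e9ceb33fe8ce8",
    "properties.tar.gz": "027f9ef8184c5d8cc94c0b4a3d64319b",
    "StructureofLigands_0-90.pdf": "882ec89f96f17a275ce56485a1419990",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file name to True/False (digest matches or not),
    or to None when the file is not present in `directory` (assumed layout)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of_file(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, ok in verify_downloads().items():
        print(f"{name}: {labels[ok]}")
```

Running the script in the download directory prints one status line per file. A mismatch usually indicates an incomplete or corrupted download, in which case the file should be fetched again before unpacking.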

License

Files and data are licensed under the terms of the following license: Creative Commons Attribution 4.0 International.
Metadata, except for email addresses, are licensed under the Creative Commons Attribution Share-Alike 4.0 International license.

External references

Journal reference

B. Sawatlon, A. Fabrizio, B. Meyer, S. N. Heinen, M. D. Wodrich, O. A. von Lilienfeld and C. Corminboeuf. Mining the C-C Cross-Coupling Genome using Machine Learning, Submitted

Keywords

machine learning, homogeneous catalysis, volcano plot, transition metal complexes, sketch-map

Version history:

2019.0007/v3 (version v3)	Feb 23, 2019	DOI: 10.24435/materialscloud:2019.0007/v3
2019.0007/v2 (version v2)	Feb 19, 2019	DOI: 10.24435/materialscloud:2019.0007/v2
2019.0007/v1 (version v1) [This version]	Feb 06, 2019	DOI: 10.24435/materialscloud:2019.0007/v1
