Reaction-based machine learning representations for predicting the enantioselectivity of organocatalysts

doi:10.24435/materialscloud:vp-h5

materialscloud:2021.40

Published March 5, 2021 | Version v1

Dataset Open

Gallarati, Simone¹

Fabregat, Raimon¹

Laplaza, Rubén^{1, 2}

Bhattacharjee, Sinjini^{1, 3}

Wodrich, Matthew¹

Corminboeuf, Clemence^{1, 2, 4}*

1. Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
2. National Center for Competence in Research - Catalysis (NCCR-Catalysis), Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
3. Indian Institute of Science Education and Research, Dr. Homi Bhabha Rd, Ward No. 8, NCL Colony, Pashan, Pune, Maharashtra 411008, India
4. National Center for Computational Design and Discovery of Novel Materials (MARVEL), Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland

* Contact person

Hundreds of catalytic methods are developed each year to meet the demand for high-purity chiral compounds. The computational design of enantioselective organocatalysts remains a significant challenge, as catalysts are typically discovered through experimental screening. Recent advances in combining quantum chemical computations and machine learning (ML) hold great potential to propel the next leap forward in asymmetric catalysis. Within the context of quantum chemical machine learning (QML, or atomistic ML), the ML representations used to encode the structure of molecules and evaluate their similarity cannot easily capture the subtle energy differences that govern enantioselectivity. Here, we present a general strategy for improving molecular representations within an atomistic machine learning model to predict the enantiomeric excess of asymmetric propargylation organocatalysts solely from the structure of catalytic cycle intermediates. Mean absolute errors as low as 0.25 kcal mol⁻¹ were achieved in predictions of the activation energy. This strategy opens the door for quickly and accurately predicting higher-selectivity catalysts for any reaction from available structural information.
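To give a sense of why sub-kcal accuracy in activation energies matters, the sketch below uses the standard transition-state-theory relation ee = tanh(ΔΔG‡ / 2RT), where ΔΔG‡ is the free-energy gap between the two competing diastereomeric transition states. This is an illustration under stated assumptions, not the authors' workflow: the function name `ee_from_ddg`, the default temperature of 298.15 K, and the example energies are hypothetical choices, and the relation assumes kinetic control with a single pair of competing transition states.

```python
import math

# Gas constant in kcal mol^-1 K^-1 (CODATA value, converted from J to kcal).
R_KCAL = 1.987204e-3


def ee_from_ddg(ddg_kcal, temperature_k=298.15):
    """Return the enantiomeric excess as a fraction between -1 and 1.

    ddg_kcal      : free-energy gap between the competing transition
                    states, in kcal/mol (hypothetical input).
    temperature_k : temperature in kelvin; 298.15 K is an assumption,
                    not a value taken from the dataset record.
    """
    return math.tanh(ddg_kcal / (2.0 * R_KCAL * temperature_k))


if __name__ == "__main__":
    # Illustrative energies only: shift each gap by +/- 0.25 kcal/mol
    # to see how an error of that size propagates into the ee.
    for ddg in (0.5, 1.0, 2.0):
        low = ee_from_ddg(ddg - 0.25)
        mid = ee_from_ddg(ddg)
        high = ee_from_ddg(ddg + 0.25)
        print(f"ddG = {ddg:.2f} kcal/mol: ee = {100 * mid:.1f}% "
              f"(range {100 * low:.1f}% to {100 * high:.1f}%)")
```

Because the hyperbolic tangent saturates, the same energy error produces a much larger spread in ee for weakly selective catalysts than for highly selective ones.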

Files

Files (2.2 MiB)

Name	MD5	Size
files_description.md	5702919e6040eca7d8c95bc949297f8c	167 Bytes
Propargylation_ML_data.zip	e9de7acb257e7b092ac8e4ab9dc5ca00	2.2 MiB
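The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `verify_downloads` are hypothetical, and it assumes the two files were saved under their listed names in a single local directory.

```python
import hashlib
from pathlib import Path

# File names and checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "files_description.md": "5702919e6040eca7d8c95bc949297f8c",
    "Propargylation_ML_data.zip": "e9de7acb257e7b092ac8e4ab9dc5ca00",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected file name to True if present with a matching MD5."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify_downloads(".").items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

A file that is missing and a file whose checksum differs are both reported as failures, so a `False` entry should be followed by a fresh download.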

References

Journal reference (Manuscript under consideration for publication.)
S. Gallarati, R. Fabregat, R. Laplaza, S. Bhattacharjee, M. D. Wodrich, C. Corminboeuf, Chem. Sci., under consideration.

	All versions	This version
Views	363	363
Downloads	24	24
Data volume	15.5 MiB	15.5 MiB
