Reaction-agnostic featurization of bidentate ligands for Bayesian ridge regression of enantioselectivity

Alexandre A. Schoepfer1,2,3, Ruben Laplaza1,3, Matthew D. Wodrich1,3, Jerome Waser2,3*, Clemence Corminboeuf1,3*

1 Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland

2 Laboratory of Catalysis and Organic Synthesis, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland

3 National Center for Competence in Research – Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland

* Corresponding authors' emails: jerome.waser@epfl.ch, clemence.corminboeuf@epfl.ch
DOI: 10.24435/materialscloud:1m-gv [version v1]

Publication date: Jul 07, 2023

How to cite this record

Alexandre A. Schoepfer, Ruben Laplaza, Matthew D. Wodrich, Jerome Waser, Clemence Corminboeuf, Reaction-agnostic featurization of bidentate ligands for Bayesian ridge regression of enantioselectivity, Materials Cloud Archive 2023.107 (2023), https://doi.org/10.24435/materialscloud:1m-gv

Description

Chiral ligands are important components in asymmetric homogeneous catalysis, but their synthesis and screening can be both time-consuming and resource-intensive. Data-driven approaches, in contrast to screening procedures based on intuition, have the potential to reduce the time and resources needed for reaction optimization by more rapidly identifying an ideal catalyst. These approaches, however, are often non-transferable and cannot be applied across different reactions. To overcome this drawback, we introduce a general featurization strategy for bidentate ligands that is coupled with an automated feature selection pipeline and Bayesian ridge regression to perform multivariate linear regression modeling. This approach, which is applicable to any reaction, incorporates electronic, steric, and topological features (rigidity/flexibility, branching, geometry, constitution) and is well-suited for early-stage ligand optimization. Using only a limited number of points per dataset, our workflow capably predicts the enantioselectivity of four metal-catalyzed asymmetric reactions. Uncertainty estimates provided by Bayesian ridge regression permit the use of Bayesian optimization to efficiently explore pools of prospective new ligands. Using this procedure, a new library of 312 chiral bidentate ligands was screened to identify promising ligand candidates for a challenging asymmetric oxy-alkynylation reaction.
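The role of the uncertainty estimates can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses synthetic data in place of the ligand features, fixes the noise and weight precisions by hand (a full Bayesian ridge regression estimates them from the data), omits the intercept and the feature selection step, and ranks candidates with a simple upper-confidence-bound score, which is an assumed acquisition rule that the record does not specify. All function and variable names are hypothetical.

```python
import numpy as np

def fit_bayesian_linear(X, y, noise_precision=25.0, weight_precision=1.0):
    """Gaussian posterior over weights for y = X w + noise.

    Both precisions are fixed by hand here (an assumption of this sketch).
    """
    n_features = X.shape[1]
    # Posterior covariance: S = (lambda * I + alpha * X^T X)^-1
    S = np.linalg.inv(weight_precision * np.eye(n_features)
                      + noise_precision * X.T @ X)
    # Posterior mean: m = alpha * S X^T y
    m = noise_precision * S @ X.T @ y
    return m, S

def predict_with_std(X_new, m, S, noise_precision=25.0):
    """Predictive mean and standard deviation for each row of X_new."""
    mean = X_new @ m
    # Predictive variance = noise variance + x^T S x for each row x
    var = 1.0 / noise_precision + np.einsum("ij,jk,ik->i", X_new, S, X_new)
    return mean, np.sqrt(var)

def rank_by_ucb(mean, std, kappa=1.0):
    """Indices of candidates sorted by mean + kappa * std, best first."""
    return np.argsort(-(mean + kappa * std))

# Synthetic stand-in for a small training set and a pool of untested ligands.
rng = np.random.default_rng(0)
true_w = np.array([1.5, -0.7, 0.3])
X_train = rng.normal(size=(20, 3))
y_train = X_train @ true_w + rng.normal(scale=0.2, size=20)
X_pool = rng.normal(size=(50, 3))

m, S = fit_bayesian_linear(X_train, y_train)
mean, std = predict_with_std(X_pool, m, S)
ranking = rank_by_ucb(mean, std)
print("top candidates:", ranking[:5])
```

Candidates with a high predicted value or a large predictive uncertainty rise to the top of the ranking, which is what lets a Bayesian optimization loop balance exploiting known good regions against exploring unfamiliar ligands.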

Materials Cloud sections using this data

No Explore or Discover sections associated with this archive record.

Files

File name | MD5 | Size | Description
chemiscopify.ipynb | bc80ba6e3a0c6dc7bc163641436be045 | 19.5 KiB | Notebook to generate Chemiscope files
lit_xyz.tar.gz | 3049e2c0f2d6ada0f6deb7a94cdfcd60 | 93.3 KiB | xyz literature ligand structures
csd_xyz.tar.gz | 6256d527c7b0a5d379cc93b27e51c07e | 217.3 KiB | xyz CSD ligand structures
mc_lit.csv | 61a8c13d3b8b713ab2ffd39d26ac7285 | 402.0 KiB | literature ligand features
mc_csd.csv | 60e4a156d4248d2cd446ac72ee483324 | 1.2 MiB | CSD ligand features
lit_ligs-chemiscope.json.gz | 60b33dab6d34ab68a6ca8cb54d6c6dce | 262.0 KiB | literature ligand chemiscope JSON
csd_ligs-chemiscope.json.gz | 303a4c57f0ff537bc1088866733c72a7 | 731.7 KiB | CSD ligand chemiscope JSON
README.md | 86b1ace8149bb6a6b0ded8a7d936223c | 594 Bytes | Read me
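Downloaded files can be checked against the MD5 checksums listed for this record. The following is a minimal sketch using only the Python standard library; the helper names and the assumption that the files sit in the current directory are illustrative, not part of the record.

```python
import hashlib
from pathlib import Path

# File names and MD5 checksums as listed in this record.
EXPECTED_MD5 = {
    "chemiscopify.ipynb": "bc80ba6e3a0c6dc7bc163641436be045",
    "lit_xyz.tar.gz": "3049e2c0f2d6ada0f6deb7a94cdfcd60",
    "csd_xyz.tar.gz": "6256d527c7b0a5d379cc93b27e51c07e",
    "mc_lit.csv": "61a8c13d3b8b713ab2ffd39d26ac7285",
    "mc_csd.csv": "60e4a156d4248d2cd446ac72ee483324",
    "lit_ligs-chemiscope.json.gz": "60b33dab6d34ab68a6ca8cb54d6c6dce",
    "csd_ligs-chemiscope.json.gz": "303a4c57f0ff537bc1088866733c72a7",
    "README.md": "86b1ace8149bb6a6b0ded8a7d936223c",
}

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def check_downloads(directory="."):
    """Map each file name to True (match), False (mismatch), or None (missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = md5_of_file(path) == expected if path.exists() else None
    return results

# Assumes the files were downloaded into the current working directory.
for name, ok in check_downloads().items():
    status = "missing" if ok is None else ("ok" if ok else "MISMATCH")
    print(f"{name}: {status}")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.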

License

Files and data are licensed under the terms of the following license: Creative Commons Attribution 4.0 International.
Metadata, except for email addresses, are licensed under the Creative Commons Attribution Share-Alike 4.0 International license.

Keywords

catalysis, homogeneous catalysis, ligands, bidentate ligands, NCCR Catalysis, EPFL

Version history:

2023.107 (version v1) [This version] Jul 07, 2023 DOI: 10.24435/materialscloud:1m-gv