From organic fragments to photoswitchable catalysts: the off-on structural repository for transferable kernel-based potentials

Frédéric Célerse1, Matthew D. Wodrich1,2, Sergi Vela1, Simone Gallarati1, Raimon Fabregat1, Veronika Juraskova1, Clémence Corminboeuf1,2,3*

1 Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, 1015, Switzerland.

2 National Center for Competence in Research-Catalysis (NCCR-Catalysis), Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland

3 National Centre for Computational Design and Discovery of Novel Materials (MARVEL), Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland

* Corresponding author email: clemence.corminboeuf@epfl.ch
DOI: 10.24435/materialscloud:pz-2y [version v1]

Publication date: Dec 08, 2023

How to cite this record

Frédéric Célerse, Matthew D. Wodrich, Sergi Vela, Simone Gallarati, Raimon Fabregat, Veronika Juraskova, Clémence Corminboeuf, From organic fragments to photoswitchable catalysts: the off-on structural repository for transferable kernel-based potentials, Materials Cloud Archive 2023.189 (2023), https://doi.org/10.24435/materialscloud:pz-2y

Description

Structurally and conformationally diverse databases are needed to train accurate neural-network or kernel-based potentials capable of exploring the complex free energy landscape of flexible functional organic molecules. Curating such databases for species beyond “simple” drug-like compounds or molecules composed of well-defined building blocks (e.g., peptides) is challenging, as it requires thorough chemical space mapping and evaluation of both chemical and conformational diversity. Here, we introduce the OFF–ON (Organic Fragments From Organocatalysts that are Non-modular) database, a repository of 7,869 equilibrium and 67,457 non-equilibrium geometries of organic compounds and dimers aimed at describing conformationally flexible functional organic molecules, with an emphasis on photoswitchable organocatalysts. The relevance of this database is then demonstrated by training a Local Kernel Regression model on a low-cost semiempirical baseline and comparing it with a PBE0-D3 reference for several known catalysts, notably for the free energy surfaces of exemplary photoswitchable organocatalysts. Our results demonstrate that the OFF–ON dataset offers reliable predictions for simulating the conformational behavior of virtually any (photoswitchable) organocatalyst or organic compound composed of H, C, N, O, F, and S atoms, thereby opening a computationally feasible route to explore complex free energy surfaces in order to rationalize and predict catalytic behavior.
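The baseline-plus-correction idea in the description can be illustrated with a minimal sketch. It assumes that the model is trained on the difference between reference (PBE0-D3) and baseline (semiempirical) energies, that both energy lists are in the same units, and that normalization is a simple mean-centring of each list. The function names `delta_targets` and `corrected_energy` are hypothetical, and this is not the authors' script.

```python
from statistics import fmean


def delta_targets(e_ref, e_base):
    """Return mean-centred (reference - baseline) energy differences.

    ASSUMPTION: mean-centring stands in for the normalization step;
    the procedure actually used for the database may differ.
    """
    if len(e_ref) != len(e_base):
        raise ValueError("energy lists must have the same length")
    if not e_ref:
        raise ValueError("energy lists must not be empty")
    ref_mean = fmean(e_ref)
    base_mean = fmean(e_base)
    # Centre each level of theory on its own mean, then take the difference.
    return [(r - ref_mean) - (b - base_mean) for r, b in zip(e_ref, e_base)]


def corrected_energy(e_base_centred, predicted_delta):
    """Add a predicted correction to a mean-centred baseline energy."""
    return e_base_centred + predicted_delta
```

A regression model would be fitted to the output of `delta_targets`; at prediction time the learned correction is added back to the inexpensive baseline energy, so only the baseline calculation has to be run for each new geometry.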

Materials Cloud sections using this data

No Explore or Discover sections associated with this archive record.

Files

| File name | MD5 | Size | Description |
| --- | --- | --- | --- |
| README.txt | bf61d416e2e4a81c2307c39660297e50 | 979 Bytes | README file detailing the contents of this record. |
| dftb_in.hsd | 4c5c4ce9b2b0593ace85ba951368db59 | 945 Bytes | Input for the DFTB+ software to obtain similar DFTB3/3ob energies. |
| input_for_terachem.inp | 05fbdafe549b8ca6649172466edc61b7 | 166 Bytes | Input for the TeraChem software to obtain similar PBE0-D3/def2-SVP energies. |
| database_PSC_eqm.xyz.zip | 0e752283047729ea60da432925a78b17 | 1.7 MiB | Zip archive with all equilibrium XYZ structures (7,869). |
| database_PSC_out_of_eqm.xyz.zip | de1dec75b35bc20bd685f049f95207a6 | 26.4 MiB | Zip archive with all out-of-equilibrium XYZ structures (67,457). |
| Chemical_diversity_Chemiscope.json.gz | c9ba5af5ff63c665a45327f67f6ec66c | 1.7 MiB | Chemiscope file containing the DFT/DFTB energies and structures of the database_PSC_eqm.xyz file. |
| Conformational_sampling_Chemiscope.json.gz | 8b316906e01efbde8ca0caebb9ba2179 | 24.7 MiB | Chemiscope file containing the DFT/DFTB energies and structures of the database_PSC_out_of_eqm.xyz file. |
| Normalization_for_LKR.py | d7da11242f653087b172795dd697b921 | 1.2 KiB | Python script that normalizes the DFT and DFTB energies and computes the delta energies used to train LKR-OMP. |
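The MD5 checksums listed above can be used to check downloaded files for corruption. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `verify_downloads` are hypothetical, and it assumes the files were saved under their original names in a single directory.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "README.txt": "bf61d416e2e4a81c2307c39660297e50",
    "dftb_in.hsd": "4c5c4ce9b2b0593ace85ba951368db59",
    "input_for_terachem.inp": "05fbdafe549b8ca6649172466edc61b7",
    "database_PSC_eqm.xyz.zip": "0e752283047729ea60da432925a78b17",
    "database_PSC_out_of_eqm.xyz.zip": "de1dec75b35bc20bd685f049f95207a6",
    "Chemical_diversity_Chemiscope.json.gz": "c9ba5af5ff63c665a45327f67f6ec66c",
    "Conformational_sampling_Chemiscope.json.gz": "8b316906e01efbde8ca0caebb9ba2179",
    "Normalization_for_LKR.py": "d7da11242f653087b172795dd697b921",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Map each expected file name to True if it exists and its MD5 matches."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == checksum
    return results
```

Calling `verify_downloads("path/to/downloads")` returns a dictionary of file names and booleans; any `False` entry indicates a missing or altered file that should be downloaded again.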

License

Files and data are licensed under the terms of the following license: Creative Commons Attribution 4.0 International.
Metadata, except for email addresses, are licensed under the Creative Commons Attribution Share-Alike 4.0 International license.

External references

Preprint (To be submitted, will be updated soon)
F. Célerse, M.D. Wodrich, S. Vela, S. Gallarati, R. Fabregat, V. Juraskova, C. Corminboeuf, 2023

Keywords

Photoswitchable organocatalyst, machine learning, free energy

Version history:

2023.189 (version v1) [This version] Dec 08, 2023 DOI: 10.24435/materialscloud:pz-2y