Published December 8, 2023 | Version v1
Dataset Open

From organic fragments to photoswitchable catalysts: the off-on structural repository for transferable kernel-based potentials

  • 1. Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, 1015, Switzerland.
  • 2. National Center for Competence in Research-Catalysis (NCCR-Catalysis), Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  • 3. National Centre for Computational Design and Discovery of Novel Materials (MARVEL), Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland

* Contact person

Description

Structurally and conformationally diverse databases are needed to train accurate neural network or kernel-based potentials capable of exploring the complex free energy landscape of flexible functional organic molecules. Curating such databases for species beyond "simple" drug-like compounds or molecules composed of well-defined building blocks (e.g., peptides) is challenging, as it requires thorough chemical space mapping and evaluation of both chemical and conformational diversity. Here, we introduce the OFF–ON (Organic Fragments From Organocatalysts that are Non-modular) database, a repository of 7,869 equilibrium and 67,457 non-equilibrium geometries of organic compounds and dimers aimed at describing conformationally flexible functional organic molecules, with an emphasis on photoswitchable organocatalysts. The relevance of this database is then demonstrated by training a Local Kernel Regression model on a low-cost semiempirical baseline and comparing it with a PBE0-D3 reference for several known catalysts, notably for the free energy surfaces of exemplary photoswitchable organocatalysts. Our results demonstrate that models trained on the OFF–ON dataset offer reliable predictions for simulating the conformational behavior of virtually any (photoswitchable) organocatalyst or organic compound composed of H, C, N, O, F, and S atoms, thereby opening a computationally feasible route to explore complex free energy surfaces in order to rationalize and predict catalytic behavior.

Files

files_description.md

Files (54.4 MiB)

Checksum Size
md5:fc85f2b414441a84de95aaea98028242 1.0 KiB
md5:c9ba5af5ff63c665a45327f67f6ec66c 1.7 MiB
md5:8b316906e01efbde8ca0caebb9ba2179 24.7 MiB
md5:0e752283047729ea60da432925a78b17 1.7 MiB
md5:de1dec75b35bc20bd685f049f95207a6 26.4 MiB
md5:4c5c4ce9b2b0593ace85ba951368db59 945 Bytes
md5:05fbdafe549b8ca6649172466edc61b7 166 Bytes
md5:d7da11242f653087b172795dd697b921 1.2 KiB
md5:bf61d416e2e4a81c2307c39660297e50 979 Bytes
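
The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The following is a minimal sketch, not part of the dataset itself: the helper names `md5_of_file` and `matches_listing` and the file name in the usage comment are hypothetical, since the listing does not preserve the file names.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed entry such as 'md5:fc85f2b4...'."""
    # Drop the optional 'md5:' prefix used in the file listing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Usage (hypothetical local file name; substitute the file you downloaded):
# ok = matches_listing("downloaded_file", "md5:fc85f2b414441a84de95aaea98028242")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.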

References

Preprint (To be submitted, will be updated soon)
F. Célerse, M.D. Wodrich, S. Vela, S. Gallarati, R. Fabregat, V. Juraskova, C. Corminboeuf, 2023