Publication date: Dec 08, 2023
Structurally and conformationally diverse databases are needed to train accurate neural networks or kernel-based potentials capable of exploring the complex free energy landscape of flexible functional organic molecules. Curating such databases for species beyond “simple” drug-like compounds or molecules composed of well-defined building blocks (e.g., peptides) is challenging, as it requires thorough chemical space mapping and evaluation of both chemical and conformational diversity. Here, we introduce the OFF–ON (Organic Fragments From Organocatalysts that are Non-modular) database, a repository of 7,869 equilibrium and 67,457 non-equilibrium geometries of organic compounds and dimers aimed at describing conformationally flexible functional organic molecules, with an emphasis on photoswitchable organocatalysts. The relevance of this database is then demonstrated by training a Local Kernel Regression model on top of a low-cost semiempirical baseline and comparing its predictions with a PBE0-D3 reference for several known catalysts, notably for the free energy surfaces of exemplary photoswitchable organocatalysts. Our results demonstrate that the OFF–ON dataset offers reliable predictions for simulating the conformational behavior of virtually any (photoswitchable) organocatalyst or organic compound composed of H, C, N, O, F, and S atoms, thereby opening a computationally feasible route to explore complex free energy surfaces in order to rationalize and predict catalytic behavior.
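The baseline-plus-correction idea mentioned above can be illustrated with a minimal Python sketch: the model is trained on the difference between reference (PBE0-D3) and baseline (DFTB3) energies, and a prediction is the baseline energy plus the learned correction. The mean-shift normalization used here is an assumption chosen for illustration; the `Normalization_for_LKR.py` script distributed with this record may normalize differently, and the function names `delta_targets` and `corrected_energy` are hypothetical.

```python
from statistics import fmean


def delta_targets(e_ref, e_base):
    """Build delta-learning targets from paired reference and baseline energies.

    Assumption (illustrative only): each method is shifted by its own mean
    before the difference is taken. Both lists must share one energy unit.
    """
    if not e_ref or len(e_ref) != len(e_base):
        raise ValueError("need two non-empty energy lists of equal length")
    ref_shift = fmean(e_ref)
    base_shift = fmean(e_base)
    deltas = [(r - ref_shift) - (b - base_shift) for r, b in zip(e_ref, e_base)]
    return deltas, ref_shift, base_shift


def corrected_energy(e_base, delta, ref_shift, base_shift):
    """Undo the normalization: baseline energy plus (predicted) correction."""
    return (e_base - base_shift) + delta + ref_shift
```

In practice, `delta` passed to `corrected_energy` would come from the trained regression model rather than from `delta_targets`; feeding the exact training deltas back in simply recovers the reference energies.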
File name | Size | Description |
---|---|---|
README.txt (MD5: bf61d416e2e4a81c2307c39660297e50) | 979 Bytes | README file detailing the contents of this record. |
dftb_in.hsd (MD5: 4c5c4ce9b2b0593ace85ba951368db59) | 945 Bytes | Input for the DFTB+ software to obtain comparable DFTB3/3ob energies. |
input_for_terachem.inp (MD5: 05fbdafe549b8ca6649172466edc61b7) | 166 Bytes | Input for the TeraChem software to obtain comparable PBE0-D3/def2-SVP energies. |
database_PSC_eqm.xyz.zip (MD5: 0e752283047729ea60da432925a78b17) | 1.7 MiB | Zip archive with all equilibrium XYZ structures (7,869). |
database_PSC_out_of_eqm.xyz.zip (MD5: de1dec75b35bc20bd685f049f95207a6) | 26.4 MiB | Zip archive with all non-equilibrium XYZ structures (67,457). |
Chemical_diversity_Chemiscope.json.gz (MD5: c9ba5af5ff63c665a45327f67f6ec66c) | 1.7 MiB | Chemiscope file containing the DFT/DFTB energies and structures of the database_PSC_eqm.xyz file. |
Conformational_sampling_Chemiscope.json.gz (MD5: 8b316906e01efbde8ca0caebb9ba2179) | 24.7 MiB | Chemiscope file containing the DFT/DFTB energies and structures of the database_PSC_out_of_eqm.xyz file. |
Normalization_for_LKR.py (MD5: d7da11242f653087b172795dd697b921) | 1.2 KiB | Python script that normalizes the DFT and DFTB energies and then computes the delta energies used to train LKR-OMP. |
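
The MD5 digests listed in the table can be used to check that the downloaded files are intact. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `verify_downloads` are hypothetical, and the digests are copied from the table above.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file table of this record.
EXPECTED_MD5 = {
    "README.txt": "bf61d416e2e4a81c2307c39660297e50",
    "dftb_in.hsd": "4c5c4ce9b2b0593ace85ba951368db59",
    "input_for_terachem.inp": "05fbdafe549b8ca6649172466edc61b7",
    "database_PSC_eqm.xyz.zip": "0e752283047729ea60da432925a78b17",
    "database_PSC_out_of_eqm.xyz.zip": "de1dec75b35bc20bd685f049f95207a6",
    "Chemical_diversity_Chemiscope.json.gz": "c9ba5af5ff63c665a45327f67f6ec66c",
    "Conformational_sampling_Chemiscope.json.gz": "8b316906e01efbde8ca0caebb9ba2179",
    "Normalization_for_LKR.py": "d7da11242f653087b172795dd697b921",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Map each file name to True (match), False (mismatch), or None (missing)."""
    results = {}
    for name, expected_md5 in expected.items():
        path = Path(directory) / name
        if path.is_file():
            results[name] = md5_of_file(path) == expected_md5
        else:
            results[name] = None
    return results
```

Calling `verify_downloads(".")` in the download directory returns a dictionary in which any `False` entry flags a corrupted or altered file and any `None` entry flags a file that has not been downloaded.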
2023.189 (version v1) [This version] | Dec 08, 2023 | DOI: 10.24435/materialscloud:pz-2y |