From organic fragments to photoswitchable catalysts: the off-on structural repository for transferable kernel-based potentials


JSON Export

{
  "id": "2011", 
  "updated": "2023-12-08T16:10:43.013362+00:00", 
  "metadata": {
    "version": 1, 
    "contributors": [
      {
        "givennames": "Fr\u00e9d\u00e9ric", 
        "affiliations": [
          "Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique F\u00e9d\u00e9rale de Lausanne (EPFL), Lausanne, 1015, Switzerland."
        ], 
        "familyname": "C\u00e9lerse"
      }, 
      {
        "givennames": "Matthew D.", 
        "affiliations": [
          "Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique F\u00e9d\u00e9rale de Lausanne (EPFL), Lausanne, 1015, Switzerland.", 
          "National Center for Competence in Research-Catalysis (NCCR-Catalysis), Ecole Polytechnique F\u00e9d\u00e9rale de Lausanne, 1015 Lausanne, Switzerland"
        ], 
        "familyname": "Wodrich"
      }, 
      {
        "givennames": "Sergi", 
        "affiliations": [
          "Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique F\u00e9d\u00e9rale de Lausanne (EPFL), Lausanne, 1015, Switzerland."
        ], 
        "familyname": "Vela"
      }, 
      {
        "givennames": "Simone", 
        "affiliations": [
          "Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique F\u00e9d\u00e9rale de Lausanne (EPFL), Lausanne, 1015, Switzerland."
        ], 
        "familyname": "Gallarati"
      }, 
      {
        "givennames": "Raimon", 
        "affiliations": [
          "Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique F\u00e9d\u00e9rale de Lausanne (EPFL), Lausanne, 1015, Switzerland."
        ], 
        "familyname": "Fabregat"
      }, 
      {
        "givennames": "Veronika", 
        "affiliations": [
          "Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique F\u00e9d\u00e9rale de Lausanne (EPFL), Lausanne, 1015, Switzerland."
        ], 
        "familyname": "Juraskova"
      }, 
      {
        "givennames": "Cl\u00e9mence", 
        "affiliations": [
          "Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique F\u00e9d\u00e9rale de Lausanne (EPFL), Lausanne, 1015, Switzerland.", 
          "National Center for Competence in Research-Catalysis (NCCR-Catalysis), Ecole Polytechnique F\u00e9d\u00e9rale de Lausanne, 1015 Lausanne, Switzerland", 
          "National Centre for Computational Design and Discovery of Novel Materials\n(MARVEL), Ecole Polytechnique F\u00e9d\u00e9rale de Lausanne, 1015 Lausanne, Switzerland"
        ], 
        "email": "clemence.corminboeuf@epfl.ch", 
        "familyname": "Corminboeuf"
      }
    ], 
    "title": "From organic fragments to photoswitchable catalysts: the off-on structural repository for transferable kernel-based potentials", 
    "_oai": {
      "id": "oai:materialscloud.org:2011"
    }, 
    "keywords": [
      "Photoswitchable organocatalyst", 
      "machine learning", 
      "free energy"
    ], 
    "publication_date": "Dec 08, 2023, 17:10:42", 
    "_files": [
      {
        "key": "README.txt", 
        "description": "README file detailing the contents of this record.", 
        "checksum": "md5:bf61d416e2e4a81c2307c39660297e50", 
        "size": 979
      }, 
      {
        "key": "dftb_in.hsd", 
        "description": "Input to be used with the DFTB+ software to obtain similar DFTB3/3ob energies", 
        "checksum": "md5:4c5c4ce9b2b0593ace85ba951368db59", 
        "size": 945
      }, 
      {
        "key": "input_for_terachem.inp", 
        "description": "Input to be used with the Terachem software to obtain similar PBE0-D3/def2-svp energies", 
        "checksum": "md5:05fbdafe549b8ca6649172466edc61b7", 
        "size": 166
      }, 
      {
        "key": "database_PSC_eqm.xyz.zip", 
        "description": "Compressed zip file with all the XYZ structures at the equilibrium (7 869)", 
        "checksum": "md5:0e752283047729ea60da432925a78b17", 
        "size": 1731059
      }, 
      {
        "key": "database_PSC_out_of_eqm.xyz.zip", 
        "description": "Compressed zip file with all the XYZ structures out of the equilibrium (67 457)", 
        "checksum": "md5:de1dec75b35bc20bd685f049f95207a6", 
        "size": 27673749
      }, 
      {
        "key": "Chemical_diversity_Chemiscope.json.gz", 
        "description": "Chemiscope file containing the DFT/DFTB energies and structures of the database_PSC_eqm.xyz file.", 
        "checksum": "md5:c9ba5af5ff63c665a45327f67f6ec66c", 
        "size": 1738049
      }, 
      {
        "key": "Conformational_sampling_Chemiscope.json.gz", 
        "description": "Chemiscope file containing the DFT/DFTB energies and structures of the database_PSC_out_of_eqm.xyz file.", 
        "checksum": "md5:8b316906e01efbde8ca0caebb9ba2179", 
        "size": 25865697
      }, 
      {
        "key": "Normalization_for_LKR.py", 
        "description": "Python script to normalize DFT and DFTB energies and then to obtain delta energies to be trained with LKR-OMP", 
        "checksum": "md5:d7da11242f653087b172795dd697b921", 
        "size": 1178
      }
    ], 
    "references": [
      {
        "comment": "To be submitted, will be updated soon", 
        "citation": "F. C\u00e9lerse, M.D. Wodrich, S. Vela, S. Gallarati, R. Fabregat., V. Juraskova, C. Corminboeuf, 2023", 
        "type": "Preprint"
      }
    ], 
    "description": "Structurally and conformationally diverse databases are needed to train accurate neural networks or kernel-based potentials capable of exploring the complex free energy landscape of flexible functional organic molecules. Curating such databases for species beyond \u201csimple\u201d drug-like compounds or molecules comprised of well-defined building blocks (e.g., peptides) is challenging, as it requires thorough chemical space mapping and evaluation of both chemical and conformational diversity. Here, we introduce the OFF\u2013ON (Organic Fragments From Organocatalysts that are Non-modular) database, a repository of 7,869 equilibrium and 67,457 non--equilibrium geometries of organic compounds and dimers aimed at describing conformationally flexible functional organic molecules, with an emphasis on photoswitchable organocatalysts. The relevance of this database is then demonstrated by training a Local Kernel Regression model on a low-cost semiempirical baseline and comparing it with a PBE0-D3 reference for several known catalysts, notably the free energy surfaces of exemplary photoswitchable organocatalysts. Our results demonstrate that the OFF\u2013ON dataset offers reliable predictions for simulating the conformational behavior of virtually any (photoswitchable) organocatalyst or organic compound comprised of H, C, N, O, F, and S atoms, thereby opening a computationally feasible route to explore complex free energy surfaces in order to rationalize and predict catalytic behavior.", 
    "status": "published", 
    "license": "Creative Commons Attribution 4.0 International", 
    "conceptrecid": "2010", 
    "is_last": true, 
    "mcid": "2023.189", 
    "edited_by": 576, 
    "id": "2011", 
    "owner": 1211, 
    "license_addendum": null, 
    "doi": "10.24435/materialscloud:pz-2y"
  }, 
  "revision": 3, 
  "created": "2023-12-05T14:11:50.165605+00:00"
}
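
Each entry under "_files" in the record above carries a "checksum" of the form "md5:<hex digest>" and a "size" in bytes, so downloaded files can be checked against the record. The following is a minimal Python sketch of such a check, not a tool shipped with this record. It assumes the JSON above has been loaded into a dictionary and that the files sit in a single local directory under their "key" names; the function names (`parse_checksum`, `file_digest`, `verify_record_files`) are hypothetical.

```python
import hashlib
from pathlib import Path


def parse_checksum(checksum: str) -> tuple[str, str]:
    """Split a record checksum such as 'md5:abc...' into (algorithm, hex digest)."""
    algorithm, _, digest = checksum.partition(":")
    return algorithm, digest.lower()


def file_digest(path: Path, algorithm: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    hasher = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_record_files(record: dict, download_dir: Path) -> dict[str, str]:
    """Compare local files with the record's '_files' entries.

    Returns a mapping from file key to one of:
    'ok', 'missing', 'size mismatch', 'checksum mismatch'.
    """
    results = {}
    for entry in record["metadata"]["_files"]:
        key = entry["key"]
        path = download_dir / key
        if not path.is_file():
            results[key] = "missing"
            continue
        # The cheap size comparison runs first; hashing only happens if it passes.
        if path.stat().st_size != entry["size"]:
            results[key] = "size mismatch"
            continue
        algorithm, expected = parse_checksum(entry["checksum"])
        actual = file_digest(path, algorithm)
        results[key] = "ok" if actual == expected else "checksum mismatch"
    return results
```

To use it, save the JSON export to a local file, load it with `json.load`, and call `verify_record_files(record, Path("downloads"))`, where `downloads` stands for whatever directory holds the files. Any key not reported as 'ok' points to a file that is absent, truncated, or altered.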