Semi-local and hybrid functional DFT data for thermalised snapshots of polymorphs of benzene, succinic acid, and glycine


JSON Export

{
  "metadata": {
    "is_last": true, 
    "version": 1, 
    "title": "Semi-local and hybrid functional DFT data for thermalised snapshots of polymorphs of benzene, succinic acid, and glycine", 
    "keywords": [
      "SNSF", 
      "MARVEL", 
      "first principles", 
      "hybrid-functional DFT", 
      "molecular crystals"
    ], 
    "description": "Structure prediction for molecular crystals is a longstanding challenge, as often minuscule free energy differences between polymorphs are sensitively affected by the description of electronic structure, the statistical mechanics of the nuclei and the cell, and thermal expansion. The importance of these effects has been individually established, but rigorous free energy calculations, which simultaneously account for all terms, have not been computationally viable.\nHere we reproduce the experimental stabilities of polymorphs of prototypical compounds -- benzene, glycine, and succinic acid -- by computing rigorous first-principles Gibbs free energies, at a fraction of the cost of conventional methods. This is achieved by a bottom-up approach, which involves generating machine-learning potentials to calculate surrogate free energies and subsequently calculating true first-principles free energies using inexpensive free energy perturbations.\nAccounting for all relevant physical effects is no longer a daunting task and provides the foundation for structure predictions for more complex systems of industrial importance.\n\nThis Materials Cloud archive contains first-principles training, validation, and test data for polymorphs of benzene, succinic acid, and glycine underlying the above-mentioned machine-learning potentials.\nFor each compound the archive provides two sets of data: the first based on DFT calculations with the semi-local PBE functional and the Tkatchenko-Scheffler dispersion correction, and the second based on DFT calculations with the hybrid PBE0 functional and the many-body dispersion correction of Tkatchenko et al. For each compound and both levels of electronic-structure theory, structure datasets in libAtoms extended XYZ format provide representative, thermalised configurations and the associated configurational energies, atomic forces, and stresses on the simulation cell.\nThe configurations are extracted from a combination of classical temperature replica exchange molecular dynamics simulations and path-integral molecular dynamics for a representative set of perturbed unit cells based on the experimental structures of: forms I, II, Ihp and V' of benzene, alpha- and beta-succinic acid, and alpha-, beta-, gamma-, and delta-glycine. The detailed data provenance and dataset architecture are described in the attached README file.", 
    "license": "Creative Commons Attribution 4.0 International", 
    "references": [
      {
        "url": "https://arxiv.org/abs/2102.13598", 
        "comment": "Preprint where the data is discussed and used to construct machine-learning models for highly accurate Gibbs free energy calculations", 
        "citation": "V. Kapil, E. A. Engel, arXiv preprint, arXiv:2102.13598 [cond-mat.mtrl-sci] (2021)", 
        "type": "Preprint"
      }, 
      {
        "url": "https://arxiv.org/abs/2012.12253", 
        "comment": "Preprint where a subset of the benzene data is discussed and used to test the impact of feature selection techniques on machine-learning models", 
        "citation": "R. K. Cersonsky, B. A. Helfrecht, E. A. Engel, M. Ceriotti, arXiv preprint, arXiv:2012.12253 [physics.chem-ph] (2020)", 
        "type": "Preprint"
      }
    ], 
    "doi": "10.24435/materialscloud:vp-jf", 
    "conceptrecid": "790", 
    "publication_date": "Mar 26, 2021, 18:29:34", 
    "edited_by": 100, 
    "_oai": {
      "id": "oai:materialscloud.org:791"
    }, 
    "contributors": [
      {
        "affiliations": [
          "Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, 19 JJ Thomson Avenue, Cambridge, CB3 0HE, UK"
        ], 
        "email": "eae32@cam.ac.uk", 
        "familyname": "Engel", 
        "givennames": "Edgar A."
      }, 
      {
        "affiliations": [
          "Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK", 
          "Laboratory of Computational Science and Modeling, Institut des Mat\u00e9riaux, \u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne, CH-1015 Lausanne, Vaud, Switzerland"
        ], 
        "email": "vk380@cam.ac.uk", 
        "familyname": "Kapil", 
        "givennames": "Venkat"
      }
    ], 
    "owner": 1, 
    "license_addendum": null, 
    "mcid": "2021.51", 
    "_files": [
      {
        "size": 302941, 
        "checksum": "md5:78d51ded6154a64096d339833d9d7c16", 
        "description": "README file detailing the exact provenance of the first-principles data constituting the data files.", 
        "key": "README.pdf"
      }, 
      {
        "size": 135355233, 
        "checksum": "md5:4299d8d227510bd4abcd3be3059e38fc", 
        "description": "Zip archive containing the first-principles data detailed in the README pdf-file.", 
        "key": "dataset.zip"
      }
    ], 
    "id": "791", 
    "status": "published"
  }, 
  "revision": 11, 
  "updated": "2021-12-06T12:32:54.679935+00:00", 
  "created": "2021-03-24T16:58:03.242571+00:00", 
  "id": "791"
}
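
The `_files` entries in the record above pair each file `key` with a `size` in bytes and a `checksum` of the form `md5:<hex digest>`. As a minimal sketch, assuming the files have already been downloaded into a local directory under their `key` names, the following Python checks local copies against those values. The function names `md5_of` and `verify_files` are illustrative and not part of any Materials Cloud tooling.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_files(record, directory):
    """Check local files against record["metadata"]["_files"].

    Returns a dict mapping each file key to True (size and MD5 match),
    False (mismatch), or None (file not found in `directory`).
    """
    results = {}
    for entry in record["metadata"]["_files"]:
        path = Path(directory) / entry["key"]
        if not path.is_file():
            results[entry["key"]] = None
            continue
        # The checksum field is "<algorithm>:<hex digest>"; only md5 is handled.
        algorithm, _, expected = entry["checksum"].partition(":")
        if algorithm != "md5":
            raise ValueError(f"unsupported checksum algorithm: {algorithm!r}")
        size_ok = path.stat().st_size == entry["size"]
        results[entry["key"]] = size_ok and md5_of(path) == expected
    return results
```

With the record saved to a local file (for instance a hypothetical `record.json`), it could be loaded with `json.load` and passed to `verify_files(record, "downloads")`; a result of `True` for both `README.pdf` and `dataset.zip` would indicate that the local copies match the published sizes and checksums.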