Semi-local and hybrid functional DFT data for thermalised snapshots of polymorphs of benzene, succinic acid, and glycine

Edgar A. Engel1*, Venkat Kapil2,3*

1 Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, 19 JJ Thomson Avenue, Cambridge, CB3 0HE UK

2 Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK

3 Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Vaud, Switzerland

* Corresponding authors' emails: eae32@cam.ac.uk, vk380@cam.ac.uk
DOI: 10.24435/materialscloud:vp-jf [version v1]

Publication date: Mar 26, 2021

How to cite this record

Edgar A. Engel, Venkat Kapil, Semi-local and hybrid functional DFT data for thermalised snapshots of polymorphs of benzene, succinic acid, and glycine, Materials Cloud Archive 2021.51 (2021), doi: 10.24435/materialscloud:vp-jf.


Structure prediction for molecular crystals is a longstanding challenge, because the often minuscule free energy differences between polymorphs are sensitive to the description of the electronic structure, the statistical mechanics of the nuclei and the cell, and thermal expansion. The importance of these effects has been individually established, but rigorous free energy calculations, which simultaneously account for all terms, have not been computationally viable. Here we reproduce the experimental stabilities of polymorphs of prototypical compounds -- benzene, glycine, and succinic acid -- by computing rigorous first-principles Gibbs free energies, at a fraction of the cost of conventional methods. This is achieved by a bottom-up approach, which involves generating machine-learning potentials to calculate surrogate free energies and subsequently calculating true first-principles free energies using inexpensive free energy perturbations. Accounting for all relevant physical effects is no longer a daunting task and provides the foundation for structure predictions for more complex systems of industrial importance.

This Materials Cloud archive contains first-principles training, validation, and test data for polymorphs of benzene, succinic acid, and glycine underlying the above-mentioned machine-learning potentials. For each compound the archive provides two sets of data: the first based on DFT calculations with the semi-local PBE functional and the Tkatchenko-Scheffler dispersion correction, and the second based on DFT calculations with the hybrid PBE0 functional and the many-body dispersion correction of Tkatchenko et al. For each compound and both levels of electronic-structure theory, structure datasets in libAtoms extended-xyz format provide representative, thermalised configurations and the associated configurational energies, atomic forces, and stresses on the simulation cell.

The configurations are extracted from a combination of classical temperature replica exchange molecular dynamics simulations and path-integral molecular dynamics for a representative set of perturbed unit cells based on the experimental structures of: forms I, II, Ihp and V' of benzene, alpha- and beta-succinic acid, and alpha-, beta-, gamma-, and delta-glycine. The detailed data provenance and dataset architecture are described in the attached README file.
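
The free energy perturbation step mentioned in the description can be illustrated with a minimal sketch of the standard one-sided (Zwanzig) estimator. This is not the authors' implementation: the function name `fep_correction`, its arguments, and the choice of eV units are assumptions made for illustration only. Given energy differences between a target potential (e.g. DFT) and a surrogate potential (e.g. a machine-learning potential), evaluated on configurations sampled with the surrogate, it estimates the free energy difference between the two.

```python
import math

# Boltzmann constant in eV/K (CODATA 2018 value).
K_B_EV_PER_K = 8.617333262e-5


def fep_correction(delta_u, temperature):
    """Estimate F_target - F_surrogate with the one-sided Zwanzig formula.

    delta_u: hypothetical list of U_target - U_surrogate values (eV),
             one per configuration sampled in the surrogate ensemble.
    temperature: sampling temperature in kelvin.
    """
    if not delta_u:
        raise ValueError("need at least one sample")
    beta = 1.0 / (K_B_EV_PER_K * temperature)
    exponents = [-beta * du for du in delta_u]
    # Shift by the largest exponent (log-sum-exp) to avoid overflow.
    shift = max(exponents)
    mean_exp = sum(math.exp(x - shift) for x in exponents) / len(exponents)
    log_mean = shift + math.log(mean_exp)
    # Delta F = -(1/beta) * ln < exp(-beta * delta_u) >_surrogate
    return -log_mean / beta


# Usage with made-up numbers: energy differences in eV at 300 K.
correction = fep_correction([0.012, 0.009, 0.011, 0.010], 300.0)
```

The estimate is only reliable when the surrogate ensemble overlaps well with the target ensemble, that is, when the spread of the energy differences is small compared with the thermal energy.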

Materials Cloud sections using this data

No Explore or Discover sections associated with this archive record.


File name Size Description
295.8 KiB README file detailing the exact provenance of the first-principles data constituting the data files.
129.1 MiB Zip archive containing the first-principles data detailed in the README pdf-file.


Files and data are licensed under the terms of the following license: Creative Commons Attribution 4.0 International.
Metadata, except for email addresses, are licensed under the Creative Commons Attribution Share-Alike 4.0 International license.

External references

Preprint (Preprint where the data is discussed and used to construct machine-learning models for highly accurate Gibbs free energy calculations)
Preprint (Preprint where a subset of the benzene data is discussed and used to test the impact of feature selection techniques on machine-learning models)


Keywords: SNSF, MARVEL, first principles, hybrid-functional DFT, molecular crystals

Version history:

2021.51 (version v1) [This version] Mar 26, 2021 DOI: 10.24435/materialscloud:vp-jf