Published September 18, 2020 | Version v2
Dataset Open

Randomly-displaced methane configurations

  • 1. Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  • 2. National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland


Description

Most datasets used to benchmark machine-learning models contain minimum-energy structures, or small fluctuations around stable geometries, and focus on the diversity of chemical compositions or the presence of different phases. This dataset provides a large number (7,732,488) of configurations for a simple CH4 composition, generated in an almost completely unbiased fashion. Hydrogen atoms are randomly distributed in a sphere of radius 3 Å centered on the carbon atom, and the only structures discarded are those in which two atoms are closer than 0.5 Å, or for which the reference DFT calculation does not converge. This dataset is well suited to benchmarking structural representations and regression algorithms, and to verifying whether they can reach arbitrary accuracy in the data-rich regime.
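The sampling procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the script used to build the dataset: it assumes the hydrogen positions are drawn uniformly in volume inside the sphere (the description only says "randomly distributed"), it applies the 0.5 Å cutoff to every atom pair including C-H, and it does not model the DFT convergence filter. The names `random_point_in_sphere` and `random_methane` are hypothetical.

```python
import itertools
import math
import random

R_MAX = 3.0  # sphere radius in Å (from the description)
D_MIN = 0.5  # minimum allowed interatomic distance in Å (from the description)
ORIGIN = (0.0, 0.0, 0.0)


def random_point_in_sphere(rng, radius):
    """Rejection-sample a point inside a sphere centered on the origin.

    Assumption: points are uniform in volume; the source does not say so.
    """
    while True:
        p = tuple(rng.uniform(-radius, radius) for _ in range(3))
        if math.dist(p, ORIGIN) <= radius:
            return p


def random_methane(rng, radius=R_MAX, d_min=D_MIN):
    """Return positions [C, H, H, H, H], redrawing until no pair is closer than d_min."""
    while True:
        atoms = [ORIGIN] + [random_point_in_sphere(rng, radius) for _ in range(4)]
        pairs = itertools.combinations(atoms, 2)
        if all(math.dist(a, b) >= d_min for a, b in pairs):
            return atoms


rng = random.Random(0)  # fixed seed so the sketch is reproducible
config = random_methane(rng)
for symbol, (x, y, z) in zip("CHHHH", config):
    print(f"{symbol} {x:10.6f} {y:10.6f} {z:10.6f}")
```

Rejection sampling from the enclosing cube keeps the sketch short and avoids the bias toward the center that a naive uniform draw of the radius would introduce.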

Files

File preview: files_description.md

Files (1.1 GiB)

Checksum                               Size
md5:928d16c74edbc58df67d611a012da787   269 Bytes
md5:11cf7303d8c0fa6ef753103f5439d6e1   1.1 GiB
md5:182e0287b35d7385e03660b36ca91432   1.4 KiB
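After downloading, a file can be checked against its listed md5 checksum. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_checksum` are hypothetical, and the local file path is whatever name the download was saved under, since the file names are not shown in the listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or bare hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum(local_path, "md5:11cf7303d8c0fa6ef753103f5439d6e1")` would return `True` if `local_path` points to an intact copy of the 1.1 GiB file. Reading in chunks avoids loading the whole archive into memory.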

References

Preprint
S. N. Pozdnyakov, M. J. Willatt, A. P. Bartók, C. Ortner, G. Csányi, M. Ceriotti, arXiv:2001.11696 (2020)

Preprint
J. Nigam, S. Pozdnyakov, M. Ceriotti, arXiv:2007.03407 (2020)