The rule of four: anomalous stoichiometries of inorganic compounds

These files contain the two databases analysed in the paper "The rule of four: anomalous stoichiometries of inorganic compounds".

More detailed instructions on how to re-run the code are available on GitHub: https://github.com/epfl-theos/r4-project.

The Materials Cloud 3-dimensional crystal structures "source" database

The MC3D.tar.xz archive holds the data for the Materials Cloud 3-dimensional crystal structures "source" database (MC3D-source), which contains 79854 inorganic structures. The archive contains:

  • the calculated SOAP vectors (soap.npz)
  • the outcomes of the classification algorithms (classif.npz)
  • the file containing all the calculated geometric descriptors (geo.npz)
  • the .json file that can be imported into the Chemiscope app (https://chemiscope.org/) to visualize the data (chem.json.gz)
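
The archive can be inspected before unpacking it in full. The following is a minimal sketch using only the Python standard library. The helper names list_archive_members and extract_member are hypothetical, and the internal directory layout of the archive is not documented here, so files are matched by the end of their name rather than by a full path.

```python
import shutil
import tarfile
from pathlib import Path


def list_archive_members(archive_path):
    """Return the sorted names of the regular files stored in a .tar.xz archive."""
    # "r:xz" selects LZMA decompression explicitly; nothing is written to disk.
    with tarfile.open(archive_path, mode="r:xz") as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())


def extract_member(archive_path, member_suffix, destination="."):
    """Copy the first file whose name ends with member_suffix into destination.

    Only the base name of the member is kept, so any directory structure
    inside the archive (an unknown here) is flattened.
    """
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:xz") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.endswith(member_suffix):
                target = destination / Path(member.name).name
                # extractfile() returns a readable file object for regular files.
                with archive.extractfile(member) as source, open(target, "wb") as out:
                    shutil.copyfileobj(source, out)
                return target
    raise FileNotFoundError(
        f"no member ending in {member_suffix!r} in {archive_path}"
    )
```

For example, list_archive_members("MC3D.tar.xz") should report the files listed above, and extract_member("MC3D.tar.xz", "geo.npz", "mc3d") would copy only the geometric descriptors into a local mc3d directory.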

The full .xyz structure file cannot be released due to licensing constraints. However, the MC3D_ids.yaml file provides the database version and ID of every structure taken from the three databases (MPDS, ICSD, COD) that make up the MC3D-source database.

The Materials Project Database

The MP.tar.xz archive holds the data for the Materials Project crystal structures database (MP), which contains 83989 inorganic structures. The archive contains:

  • the publicly available structures from 2018 (structures.xyz)
  • the calculated SOAP vectors (soap.npz)
  • the outcomes of the classification algorithms (classif.npz)
  • the file containing all the calculated geometric descriptors (geo.npz)
  • the .json file that can be imported into the Chemiscope app (https://chemiscope.org/) to visualize the data (chem.json.gz)
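
Once extracted, the individual files can be checked without extra dependencies. The sketch below relies on two general facts: a .npz file is a zip archive of .npy arrays, and a .json.gz file is gzip-compressed JSON. The helper names are hypothetical, and the array names and JSON keys actually stored in these files are not documented here, so the code only reports what it finds. It also assumes the top level of the JSON document is an object.

```python
import gzip
import json
import zipfile


def npz_array_names(path):
    """Return the names of the arrays stored in a .npz file, without loading them."""
    with zipfile.ZipFile(path) as archive:
        # Each array is stored as a zip member called "<name>.npy".
        return sorted(
            name[: -len(".npy")]
            for name in archive.namelist()
            if name.endswith(".npy")
        )


def load_gzipped_json(path):
    """Decompress and parse a .json.gz file."""
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        return json.load(handle)


def summarize_top_level(data):
    """Map each top-level key to (type name, length or None).

    Assumes data is a dict, i.e. the JSON document is an object.
    """
    summary = {}
    for key, value in data.items():
        size = len(value) if isinstance(value, (dict, list, str)) else None
        summary[key] = (type(value).__name__, size)
    return summary
```

For example, npz_array_names("soap.npz") lists the array names to pass to numpy.load afterwards, and summarize_top_level(load_gzipped_json("chem.json.gz")) gives a quick overview of the Chemiscope input before uploading it.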