The uploaded files contain structures and IDs for ~2M compounds from AFLOW-LIB (~900k), the Materials Project (~100k) and our own group (~1M). The data was obtained by collecting all calculations with compatible parameters and removing duplicates (details in the paper). For the AFLOW-LIB and Materials Project data, the structures and energies can be downloaded using the provided IDs. For our data we directly provide the relaxed structures and energies. computed_entries.tar.gz contains all data from our group except the predicted mixed perovskites, which can be found in predicted_mixed_perovskites_computed_entries.tar.gz.

------------------------------------
To read computed_entries.tar.gz in Python, uncompress it and run the following (the same applies to predicted_mixed_perovskites_computed_entries.tar.gz):
------------------------------------
#!/usr/bin/env python
import json
from pymatgen.entries.computed_entries import ComputedStructureEntry

# Load the list of serialized entries from the uncompressed archive
with open('computed_structure_entries.json', 'r') as f:
    data = json.load(f)
# Rebuild a pymatgen ComputedStructureEntry from each dictionary
entries = [ComputedStructureEntry.from_dict(i) for i in data]
------------------------------------

The ID files can be loaded in the same manner in Python:
------------------------------------
import json

with open('..', 'r') as f:
    data = json.load(f)
------------------------------------
Unlike the entry files, each ID file directly provides a Python dictionary with the four keys 'id', 'd_e_hull', 'e-form', 'spg'. Each key maps to a list of the corresponding values.
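
Because each ID file stores its values as four lists rather than one record per compound, a short sketch of regrouping them may be useful. The example below assumes the four lists are parallel (position n in each list refers to the same compound) and reads 'd_e_hull' as the distance to the convex hull; both are assumptions, not statements from the paper. The sample values, the IDs, the threshold and the helper names to_records and near_hull are hypothetical and only serve the illustration.

```python
# Hypothetical stand-in for one loaded ID file. The real files hold the same
# four keys, but every value below is invented for illustration.
data = {
    "id": ["example-1", "example-2", "example-3"],
    "d_e_hull": [0.0, 0.12, 0.03],
    "e-form": [-1.5, -0.4, -0.9],
    "spg": [225, 62, 221],
}

KEYS = ("id", "d_e_hull", "e-form", "spg")


def to_records(columns):
    """Turn the dictionary of lists into one dictionary per compound.

    Assumes the lists are parallel, i.e. index n refers to the same compound
    in every list.
    """
    lengths = {len(columns[key]) for key in KEYS}
    if len(lengths) != 1:
        raise ValueError("the four lists must have the same length")
    rows = zip(*(columns[key] for key in KEYS))
    return [dict(zip(KEYS, row)) for row in rows]


def near_hull(records, threshold):
    """Keep records whose 'd_e_hull' value is at most `threshold`."""
    return [r for r in records if r["d_e_hull"] <= threshold]


records = to_records(data)
# The threshold is arbitrary and uses whatever units the ID files use.
selected = near_hull(records, threshold=0.05)
print([r["id"] for r in selected])
```

With a real ID file, the hand-written dictionary would be replaced by the result of json.load, and the selected IDs could then be used to fetch structures and energies from the corresponding database.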