The uploaded files contain structures and IDs for ~2M compounds from AFLOW-LIB (~900k), the Materials Project (~100k) and our own group (~1M). The data was obtained by collecting all calculations with compatible parameters and removing duplicates (details in the paper). For the AFLOW-LIB and Materials Project data, the structures and energies can be downloaded using the provided IDs. For our own data, we directly provide the relaxed structures and energies. The file computed_entries.tar.gz contains all data from our group except the predicted mixed perovskites, which can be found in predicted_mixed_perovskites_computed_entries.tar.gz.

The AFLOW systems that were identified as outliers during the calculation of the distance to the convex hull, and that were consequently ignored for the hull calculation in the paper, were recalculated to ensure their correctness. They are now used in the hull calculation for all compounds uploaded here (except the predicted mixed perovskites). Furthermore, a few new or corrected systems from the Materials Project were added or changed. Consequently, this dataset has a higher average distance to the convex hull than reported in the publication, as the convex hull is more complete.
------------------------------------
To read computed_entries.tar.gz in Python, uncompress it and run the following (the same applies to predicted_mixed_perovskites_computed_entries.tar.gz):
------------------------------------
#!/usr/bin/env python
import json

from pymatgen.entries.computed_entries import ComputedStructureEntry

# Load the list of serialized entries from the uncompressed archive
with open('computed_structure_entries.json', 'r') as f:
    data = json.load(f)

# Rebuild pymatgen ComputedStructureEntry objects from their dict form
entries = [ComputedStructureEntry.from_dict(i) for i in data]
------------------------------------
The ID files can be loaded in Python in the same manner:

import json
data = json.load(open('..','r'))

but they directly provide a Python dictionary with the four keys 'id', 'd_e_hull', 'e-form', 'spg'. Each key maps to a list of the corresponding values.
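Because the ID files store one list per key, it can be convenient to regroup them into one record per compound before filtering. The following is only a minimal sketch: it assumes the four lists are index-aligned (entry i of each list describes the same compound), the helper names to_records and near_hull are hypothetical, and the toy dictionary uses made-up values in place of a real ID file. The threshold is in whatever units the file uses for 'd_e_hull'.

```python
# Toy stand-in for a loaded ID file; all values here are made up for illustration.
data = {
    "id": ["entry-a", "entry-b", "entry-c"],
    "d_e_hull": [0.0, 0.12, 0.03],
    "e-form": [-1.5, -0.4, -0.9],
    "spg": [225, 62, 221],
}


def to_records(columns):
    """Turn a dict of equal-length lists into a list of per-compound dicts.

    Assumes the lists are index-aligned; raises if their lengths differ.
    """
    keys = list(columns)
    lengths = {len(columns[k]) for k in keys}
    if len(lengths) > 1:
        raise ValueError("columns have different lengths")
    return [dict(zip(keys, row)) for row in zip(*(columns[k] for k in keys))]


def near_hull(records, threshold):
    """Keep records whose distance to the convex hull is at most threshold."""
    return [r for r in records if r["d_e_hull"] <= threshold]


records = to_records(data)
stable = near_hull(records, 0.05)
print([r["id"] for r in stable])
```

With a real ID file, the toy dictionary would be replaced by the result of json.load, and the selected 'id' values could then be used to fetch the matching structures from AFLOW-LIB or the Materials Project.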