################################
### INTRODUCTION AND CONTENT ###
################################

This entry contains the data and workflows needed to fully reproduce the
results published in the corresponding scientific paper on the automated,
robust Wannierisation and band-structure interpolation of materials
using the SCDM method and our automation protocol:

  Valerio Vitale, Giovanni Pizzi, Antimo Marrazzo,
  Jonathan R. Yates, Nicola Marzari, Arash A. Mostofi,
  npj Computational Materials 6, 66 (2020)
  doi:10.1038/s41524-020-0312-y

In particular, this entry contains the AiiDA database with the full
provenance of the simulations performed in the paper, as well as a
Virtual Machine ("The Wannierising Machine", version 19.07) that
contains AiiDA, Quantum ESPRESSO, Wannier90, and the workflows and
software needed to re-run the simulations of the paper or to run
new simulations with the same protocol.

##################################################################
### FURTHER INSTRUCTIONS AND READMES FOR THE DIFFERENT CONTENT ###
##################################################################

Various README files and data are provided:

- **AiiDA provenance**: In order to inspect the data generated during
  the simulations, together with their full provenance as stored by AiiDA,
  follow the instructions in the "README-AiiDA.txt" file.
  The AiiDA database, ready to be imported, with the provenance
  of all calculations run in the project, is in the export file
  "automatic_wannier_provenance.aiida" (it requires AiiDA 1.0 or later).

- **Run new simulations with AiiDA in a Virtual Machine**: In order to
  install the virtual machine and run a Wannierisation, use the Virtual 
  Machine image "wannierising_machine_19.07.ova", following the 
  instructions in the "README-virtual-machine.txt" file for the installation
  and in "tutorial-with-screenshots-VM.pdf" to run the Wannierisation.

- **Automatically recreate the Virtual Machine from scratch**: In order to
  recreate the virtual machine from scratch, use the Ansible
  scripts that are provided in the file
  "wannierising_machine_19.07_ansible_scripts.tar.gz".

- **Input crystal structures**: You can find the data of the crystal
  structures used in this work in the two files "xsf.tar.gz"
  (200 metals and insulators when considering also conduction bands) and
  "xsf_insulators.tar.gz" (81 insulators when considering only valence
  bands). These data are also stored inside the virtual machine.
  The crystal structures are stored in XSF format (whose
  specifications can be found here: http://www.xcrysden.org/doc/XSF.html).

- **Data on spreads and bands distance**: the JSON file 
  `automated_wannier_discover_data.json` contains information on all the
  systems simulated in the project, including in particular the UUID
  of the relevant crystal-structure and band-structure nodes, as well as
  the spread of the Wannier functions and the bands distance.

  More specifically, the schema of the JSON is the following:
  - Top level dictionary: keys are chemical formulas, values are dictionaries 
    with the following schema:
    - "structure_uuid": UUID of the input crystal structure
    - "bands": dictionary with the following schema:
      - "DFT_uuid": UUID of the DFT bands calculation
      - "Wannier": dictionary where the key indicates the dataset (either
        "with_conduction" or "valence_only"; often there is only one of these
        two keys) and the value is a dictionary with the following schema:
        - the key indicates the method ("SCDM_only", "SCDM+MLWF" or 
          "random+MLWF", the latter existing only in the "valence_only" 
          dataset), and the value is a dictionary with the following schema:
          - the key is the k-points mesh target linear density in angstrom^-1,
            as described in the paper, as a string (valid values: "0.15",
            "0.2", "0.3", "0.4"), and the value is a dictionary with the 
            following schema:
            - "bands_node_uuid": UUID of the Wannier bands calculation
            - "total_spread": total spread computed by Wannier90 in angstrom^2
            - "gauge_invariant_spread": gauge-invariant component of the spread
              computed by Wannier90 in angstrom^2
            - "eta": average bands distance (see definition in the paper) in eV
            - "eta_max": max bands distance (see definition in the paper) in eV        
  
- **Band structures**: For easy inspection, all band structures studied in
  the paper are collected in the PDF file `Vitale-2020-all-bands.pdf`.
  The first page of the PDF describes in more detail the content of the file.

- **Fermi energies**: The file `fermi_energies.json` contains the Fermi energy
  for each material, as returned by Quantum ESPRESSO. The JSON contains
  a dictionary. Each key of the dictionary is the chemical formula of the material,
  and the value is the Fermi energy of the material in eV.
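
As an illustration only, the two JSON files described above can be read
with the Python standard library. The sketch below assumes the schema
documented above. The helper names (`load_json`, `iter_wannier_records`),
the formula "XY2", and all UUIDs and numerical values in the sample are
hypothetical placeholders, not data from the paper.

```python
import json
from pathlib import Path


def load_json(path):
    """Load a JSON file, e.g. `automated_wannier_discover_data.json`."""
    with Path(path).open() as handle:
        return json.load(handle)


def iter_wannier_records(data):
    """Yield one flat record per (formula, dataset, method, k-density)."""
    for formula, entry in data.items():
        wannier = entry["bands"]["Wannier"]
        for dataset, methods in wannier.items():
            for method, densities in methods.items():
                for density, values in densities.items():
                    yield {
                        "formula": formula,
                        "dataset": dataset,
                        "method": method,
                        # The density key is stored as a string, e.g. "0.2"
                        "kpoints_density": float(density),
                        **values,
                    }


# Hypothetical sample that mimics the documented schema (NOT real data).
SAMPLE_DATA = {
    "XY2": {
        "structure_uuid": "00000000-0000-0000-0000-000000000001",
        "bands": {
            "DFT_uuid": "00000000-0000-0000-0000-000000000002",
            "Wannier": {
                "with_conduction": {
                    "SCDM+MLWF": {
                        "0.2": {
                            "bands_node_uuid": "00000000-0000-0000-0000-000000000003",
                            "total_spread": 10.0,
                            "gauge_invariant_spread": 8.0,
                            "eta": 0.02,
                            "eta_max": 0.1,
                        },
                        "0.4": {
                            "bands_node_uuid": "00000000-0000-0000-0000-000000000004",
                            "total_spread": 11.0,
                            "gauge_invariant_spread": 8.5,
                            "eta": 0.05,
                            "eta_max": 0.3,
                        },
                    }
                }
            },
        },
    }
}
# Hypothetical Fermi energy in eV, same layout as `fermi_energies.json`.
SAMPLE_FERMI = {"XY2": 5.0}

# With the real files, one would instead use:
#   data = load_json("automated_wannier_discover_data.json")
#   fermi = load_json("fermi_energies.json")
records = list(iter_wannier_records(SAMPLE_DATA))

# Pick the entry with the smallest average bands distance (eta, in eV).
best = min(records, key=lambda record: record["eta"])
print(best["formula"], best["method"], best["kpoints_density"], best["eta"])
print("Fermi energy (eV):", SAMPLE_FERMI[best["formula"]])
```

Flattening the nested dictionaries into one record per calculation makes
it straightforward to filter by dataset, method or k-points density, and
to join the result with the Fermi energies through the chemical formula.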

########################################
### HOW TO CITE AND ACKNOWLEDGEMENTS ###
########################################

When using the data or the virtual machine, we kindly ask you to
also cite the corresponding scientific paper:

  Valerio Vitale, Giovanni Pizzi, Antimo Marrazzo,
  Jonathan R. Yates, Nicola Marzari, Arash A. Mostofi,
  npj Computational Materials 6, 66 (2020)
  doi:10.1038/s41524-020-0312-y