# Importing and inspecting the provenance of the simulations
We provide an AiiDA export file (`automatic_wannier_provenance.aiida`) containing the full
data and logic provenance of all simulations performed for the related scientific
article. In particular, it includes all calculations (with their inputs and outputs),
from the relaxation of the initial structure to the calculation
of the final interpolated band structure.

**Note**: the simulations were run with AiiDA v0.10. Before
exporting, we migrated the database to AiiDA v1.0, so that the data
and calculations comply with the provenance model of AiiDA v1.0.
You therefore need AiiDA v1.0 or later to import this export file.

## How to import
Install AiiDA v1.0 and then run the command below (in AiiDA v2.0 and later, the
equivalent command is `verdi archive import`):
```
verdi import automatic_wannier_provenance.aiida
```
The following sections describe how to access the data.

## Groups
A number of groups are present in the export file and will be
created in your database upon import. You can list them by running:
```
verdi group list -A
```

We list below their names and their content.

**Note**: you need to replace the string `{KPOINTS_MESH}` with the
k-points mesh you are interested in. The valid values, i.e. those
that we have computed and described in the paper and that can be
found in the export file, are: `0.15`, `0.2`, `0.3`, `0.4`.

Each group contains the last Wannier90 calculations of the workflow, i.e. those
that generated the final band structures. In particular:

- `AutoWannier-valence_only-SCDM_only-{KPOINTS_MESH}`: Simulations
  for the set of insulating materials, considering only valence bands,
  using the initial SCDM projections only, without a further MLWF minimisation procedure.
- `AutoWannier-valence_only-SCDM+MLWF-{KPOINTS_MESH}`: Simulations
  for the set of insulating materials, considering only valence bands,
  using the initial SCDM projections followed by an MLWF minimisation procedure.
- `AutoWannier-valence_only-random+MLWF-{KPOINTS_MESH}`: Simulations
  for the set of insulating materials, considering only valence bands,
  starting from random projections followed by an MLWF minimisation procedure.
- `AutoWannier-with_conduction-SCDM_only-{KPOINTS_MESH}`: Simulations
  for the set of insulating and metallic materials, considering also
  the lowest-energy conduction bands, using the initial SCDM projections only,
  without a further MLWF minimisation procedure.
- `AutoWannier-with_conduction-SCDM+MLWF-{KPOINTS_MESH}`: Simulations
  for the set of insulating and metallic materials, considering also
  the lowest-energy conduction bands, using the initial SCDM projections
  followed by an MLWF minimisation procedure.
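
The naming scheme above can be expanded programmatically, for instance to loop over all groups. The following is a minimal sketch in plain Python (no AiiDA needed); the names `KPOINTS_MESHES`, `GROUP_VARIANTS`, `group_label` and `all_group_labels` are illustrative helpers that are not part of the export file, and the sketch assumes that every listed variant exists for every listed mesh value.

```python
from itertools import product

# k-points mesh values listed in the note above
KPOINTS_MESHES = ("0.15", "0.2", "0.3", "0.4")

# (bands, projections) pairs taken from the list of groups above
GROUP_VARIANTS = (
    ("valence_only", "SCDM_only"),
    ("valence_only", "SCDM+MLWF"),
    ("valence_only", "random+MLWF"),
    ("with_conduction", "SCDM_only"),
    ("with_conduction", "SCDM+MLWF"),
)


def group_label(bands, projections, kpoints_mesh):
    """Build a group label following the naming scheme described above."""
    return "AutoWannier-{}-{}-{}".format(bands, projections, kpoints_mesh)


def all_group_labels():
    """Return the labels for every (variant, mesh) combination."""
    return [
        group_label(bands, projections, mesh)
        for (bands, projections), mesh in product(GROUP_VARIANTS, KPOINTS_MESHES)
    ]


if __name__ == "__main__":
    for label in all_group_labels():
        print(label)
```

Each label can then be compared with the output of `verdi group list -A` to check that the group is actually present in your database.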

## Inspecting the data
You can open a `verdi shell` and load the nodes in the group using the usual AiiDA commands.

The following example shows how to find and export the crystal structure and
the interpolated band structure of CaO,
for the case of a k-points mesh with spacing 0.2 and using SCDM+MLWF:
```
group = Group.get(label='AutoWannier-with_conduction-SCDM+MLWF-0.2')
# Create a dictionary like {'formula': w90_calculation, ...}
all_w90_calculations = {w90calc.inputs.structure.get_formula(): w90calc for w90calc in group.nodes}
# Get the calculation for calcium oxide
w90calc_CaO = all_w90_calculations['CaO']
# Show on screen the identifier of this calculation
print("CaO calculation UUID: {}".format(w90calc_CaO.uuid))

# Export the input structure in XSF format in this folder
w90calc_CaO.inputs.structure.export('CaO.xsf')

# Export the output interpolated band structure in xmgrace format in this folder
w90calc_CaO.outputs.interpolated_bands.export('CaO-w90.agr')
```
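
Note that a dictionary keyed by formula silently keeps only one calculation if two structures in a group happen to share the same formula. As a hedged alternative, the sketch below collects the calculations in lists and raises an error on ambiguous lookups. The helpers `index_by_formula` and `get_unique` are hypothetical names introduced here; they only assume that each node exposes `inputs.structure.get_formula()`, as in the example above.

```python
from collections import defaultdict


def index_by_formula(calculations):
    """Return a dictionary {formula: [calculation, ...]}.

    Unlike a plain dict comprehension, calculations sharing the same
    formula are all kept instead of overwriting each other.
    """
    by_formula = defaultdict(list)
    for calc in calculations:
        by_formula[calc.inputs.structure.get_formula()].append(calc)
    return dict(by_formula)


def get_unique(by_formula, formula):
    """Return the only calculation for `formula`, or raise ValueError."""
    matches = by_formula.get(formula, [])
    if len(matches) != 1:
        raise ValueError(
            "Expected exactly one calculation for {}, found {}".format(
                formula, len(matches)))
    return matches[0]
```

In the `verdi shell` session above, this could be used as `get_unique(index_by_formula(group.nodes), 'CaO')`.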

If you also want to get the corresponding DFT band structure to compare, you can
use the following utility function:
```
def get_corresponding_pw_bands(w90calc):
    """
    Given a W90 calculation, find the CustomPwBandStructureWorkChain that has
    a link into the structure used as input by the W90 calculation, and
    return its output band structure.

    Return None if no matching work chain is found, or if the work chain
    has no `band_structure` output.

    Raise ValueError if more than one matching work chain is found.
    """
    from aiida.common import exceptions as exc
    from aiida.orm import WorkChainNode

    # Get all work chains linked to the W90 input structure, keeping only
    # those with the expected process label
    matching_pw_wfs = [
        link.node for link in w90calc.inputs.structure.get_incoming(node_class=WorkChainNode).all()
        if link.node.get_attribute('process_label') == 'CustomPwBandStructureWorkChain']

    if not matching_pw_wfs:
        return None
    if len(matching_pw_wfs) > 1:
        # pk is an integer, so convert it to a string before joining
        raise ValueError("More than one workflow found: {}".format(
            ", ".join(str(wf.pk) for wf in matching_pw_wfs)))

    try:
        pw_bands = matching_pw_wfs[0].outputs.band_structure
    except exc.NotExistent:
        return None

    return pw_bands
```

and then you can run the following code in the same interpreter:
```
get_corresponding_pw_bands(w90calc_CaO).export('CaO-DFT.agr')
```

You can now compare the two band structures by showing them at the same
time using xmgrace, e.g. by running on the command line:
```
xmgrace CaO-DFT.agr CaO-w90.agr
```

## Visualizing the graph
You can inspect the provenance of a given calculation
by generating a PDF with the corresponding graph.

For the CaO example above, you can generate the graph
for the calculations associated with its band structure
(in the SCDM+MLWF case, with a k-points mesh with spacing 0.2) using:
```
verdi node graph generate -i -o e620ba6e
```
(where `e620ba6e` is the first part of the UUID of the
Wannier90 calculation, which was printed in the previous step).
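
If you want to generate graphs for several calculations, the command line can be assembled from the UUID of each calculation. The snippet below is a small sketch that only builds the command; `graph_command` is a hypothetical helper, the UUID shown is a placeholder rather than one taken from the export file, and the sketch assumes that the first 8 characters of the UUID identify the node uniquely in your database.

```python
import shlex


def graph_command(uuid, prefix_length=8):
    """Return the `verdi node graph generate` command as a list of arguments.

    Only the first `prefix_length` characters of the UUID are passed,
    mirroring the shortened identifier used in the example above.
    """
    return ["verdi", "node", "graph", "generate", "-i", "-o", uuid[:prefix_length]]


# Placeholder UUID, for illustration only
cmd = graph_command("01234567-89ab-cdef-0123-456789abcdef")
print(" ".join(shlex.quote(arg) for arg in cmd))
# prints: verdi node graph generate -i -o 01234567
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from a regular Python script, or the printed line can be pasted into a terminal.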