This entry contains the files needed to reproduce all results presented in
the corresponding paper.

In particular:
- A file named `ACWF-verification-data-and-scripts.zip` that contains the
  following subfolders:
  
  - `reference-dataset`: a folder containing the reference dataset (from the
    average of the two all-electron codes) discussed in the paper. 
    There are two JSON files:

    - `results-oxides-verification-PBE-v1-AE-average.json`
    - `results-unaries-verification-PBE-v1-AE-average.json`

    that contain the parameters of the Birch-Murnaghan EOS for each of the 960
    structures of the paper. For example, the parameters of Ac in the BCC
    structure can be obtained from the "unaries" file via
    
      data["BM_fit_data"]["Ac-X/BCC"]

    where the relevant parameters are:

      - `min_volume`: equilibrium volume V_0 of the simulation cell, in angstrom^3
      - `bulk_modulus_ev_ang3`: bulk modulus B_0, in eV/angstrom^3
      - `bulk_deriv`: dimensionless derivative B_1 of the bulk modulus with
        respect to pressure

    Note that these parameters refer to the E vs. V curve of the whole
    simulation cell (i.e., they are not per atom). To know how many atoms the
    simulation cell contains, check the value in:

      data["num_atoms_in_sim_cell"]["Ac-X/BCC"]
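
    For example, a minimal Python sketch to read these values from the
    "unaries" file (assuming the JSON file is in the current working
    directory):

      import json

      with open('results-unaries-verification-PBE-v1-AE-average.json') as fhandle:
          data = json.load(fhandle)

      bm_params = data['BM_fit_data']['Ac-X/BCC']
      num_atoms = data['num_atoms_in_sim_cell']['Ac-X/BCC']
      # The fit parameters refer to the whole simulation cell (num_atoms atoms)
      print(f"V0 = {bm_params['min_volume']} ang^3 ({num_atoms} atoms in the cell)")
      print(f"B0 = {bm_params['bulk_modulus_ev_ang3']} eV/ang^3")
      print(f"B1 = {bm_params['bulk_deriv']} (dimensionless)")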

  - `acwf-verification-scripts`: a complete set of scripts and data needed
    both to run additional simulations and extract their data, and to
    regenerate all figures (main text and supplementary material) of the
    paper.

    The files are extracted from the git repository hosted on GitHub at:

      https://github.com/aiidateam/acwf-verification-scripts/

    (commit 83837daf4532b3bfdc03bceaa28a649d698f62d2).

    In particular, a few subfolders are relevant:

    - `acwf_paper_plots`: scripts to regenerate all figures and tables in the
      paper, organized into plots and tables, and into main text and
      supplementary material. Before running the scripts, first run
      `pip install -e .` in the main folder.
    - `acwf_paper_plots/code-data`: this subfolder contains JSON files with
      the data for each code (the same data that is also shown interactively
      at https://acwf-verification.materialscloud.org). Each dataset (unaries
      or oxides, for each code/numerical approach) has its own JSON file,
      generated by the scripts in the folder `3-analyze` once the .aiida
      files are imported.
      The file format is the same as described above for the reference
      dataset; the reference dataset itself is also included here for
      convenience.
      Note that the stress information might be missing or, in some cases,
      incorrect, and should not be used; it is never used in the paper.
    - `0-preliminary-do-not-run/oxides/xsfs-oxides-verification-PBE-v1/` and
      `0-preliminary-do-not-run/unaries/xsfs-unaries-verification-PBE-v1/`:
      initial (central) structures for each of the 960 structures considered
      here, in the XSF format (defined by XCrySDen; the specification is
      available at http://www.xcrysden.org/doc/XSF.html). A short example of
      how to read these files is sketched after this list.
    - the other folders named `1-...`, `2-...` contain the scripts to rerun
      the whole dataset with any supported code; instructions are provided
      in the main README.md file of the `acwf-verification-scripts` folder.
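
    As an illustration, these XSF files can be read with, e.g., the ASE
    library (a minimal sketch, assuming ASE is installed; the filename below
    is a hypothetical placeholder):

      from ase.io import read

      # Read one of the XSF structure files (hypothetical filename) and
      # print the chemical formula and the cell volume in angstrom^3
      atoms = read('example-structure.xsf', format='xsf')
      print(atoms.get_chemical_formula(), atoms.get_volume())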

  - `README-data-reuse.txt`: a summary of Recommendation Box #3 in the main
    text, with the most important recommendations on how to reuse the dataset
    for further analysis, including in particular which parameters should be
    kept fixed (k-points, smearing, etc.).

  - `all-electron-setups`: detailed information on the calculation setups used
    by the two all-electron codes involved in this study (e.g., muffin-tin
    radii) for each of the structures. A file named `all-electron-data.md`
    in the folder documents the format; the information itself is stored in
    JSON files.
  
  - `VASP-pseudopotential-information`: information to unambiguously define
    the PAW potentials used in the dataset generated with the VASP code.
    A README.txt file documents the format, and the information is stored in
    a YAML file.


- We also provide .aiida archive files, generated with AiiDA 1.6, with the full
  provenance of all calculations performed in this study. For each code and
  numerical approach, we provide two files: one for the unaries dataset and one
  for the oxides dataset. The filenames have the form:

    acwf-verification_<DATASET>-verification-PBE-v1_results_<CODENAME>.aiida

  where <DATASET> is either `oxides` or `unaries`, and <CODENAME> is a string
  containing the code name and possibly an internal suffix. The correspondence
  between the labels of the numerical approaches used in the paper and the
  <CODENAME> strings used here is the following.

  For the all-electron codes:

  - `FLEUR@LAPW+LO`: `fleur_testPrecise_22`
  - `WIEN2k@(L)APW+lo+LO`: `wien2k`

  For the pseudopotential codes:

  - `ABINIT@PW|PseudoDojo-v0.5`: `abinit_PseudoDojo_0.5b1_PBE_SR_standard_psp8`
  - `CASTEP@PW|C19MK2`: `castep`
  - `CP2K/Quickstep@TZV2P|GTH`: `cp2k_TZV2P`
  - `GPAW@PW|PAW-v0.9.20000`: `gpaw`
  - `Quantum ESPRESSO@PW|SSSP-prec-v1.3`: `quantum_espresso-SSSP-1.3-PBE-precision`
  - `SIESTA@AtOrOptDiamond|PseudoDojo-v0.4`: `siesta`
  - `SIRIUS/CP2K@PW|SSSP-prec-v1.2`: `cp2k_SIRIUS`
  - `VASP@PW|GW-PAW54`: `vasp`
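
  For instance, following this convention, the WIEN2k oxides archive is named
  `acwf-verification_oxides-verification-PBE-v1_results_wien2k.aiida`.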

  The only exception is `BigDFT@DW|HGH-K(Valence)`, for which a single file
  named `BigDFT_acwf_chunked.tar` is provided: it contains a number of .aiida
  files that, once all imported, provide the whole dataset (the export was
  split into chunks for technical reasons, to avoid memory issues with a
  single very large file).


## How to use the AiiDA archives

To import a file into AiiDA, after having installed AiiDA, run
`verdi archive import <FILENAME>` (if you want to keep the data in a separate
profile, create a new profile first; refer to the AiiDA documentation for more
details).
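
For example, to import the GPAW unaries dataset into the current profile:

  verdi archive import acwf-verification_unaries-verification-PBE-v1_results_gpaw.aiida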

Once imported, there will be a number of groups named, for instance,
`acwf-verification/<DATASET>-verification-PBE-v1/workflows/<CODENAME>`,
where <DATASET> can be `oxides` or `unaries`, and <CODENAME> is typically the
same string used in the file name, as discussed above (in case of doubt, run
`verdi group list -A` to see all group names).
Each group contains all relevant WorkChains that were run to obtain the EOS
data points.

The snippet below is an example of how to explore the data (here we use GPAW
as an example; feel free to modify the script, or see more advanced scripts
in the `acwf-verification-scripts` folder inside the
`ACWF-verification-data-and-scripts.zip` file).

The script below loops over all unary structures; for each, it prints a
header with the element name and the configuration (SC, BCC, FCC or diamond),
followed by one line per (V, E) data point (V in angstrom^3, E in eV).

You can save this script to a file and run it with `verdi run`, make it
executable and run it directly (thanks to the `runaiida` shebang), or copy
and paste it into a `verdi shell`.

```
#!/usr/bin/env runaiida
from aiida.orm import load_group
from aiida.common import LinkType

WORKFLOWS_GROUP_LABEL = 'acwf-verification/unaries-verification-PBE-v1/workflows/gpaw'

group = load_group(WORKFLOWS_GROUP_LABEL)

for node in group.nodes:
    structure = node.inputs.structure
    print(f"# {structure.extras['element']} {structure.extras['configuration']}")

    # Collect all volumes and energies for this system
    volumes = []
    energies = []
    # Filter successful workflows
    if node.process_state.value == 'finished' and node.exit_status == 0:
        # Get all output links of the workflow, of type return
        outputs = node.get_outgoing(link_type=LinkType.RETURN).nested()
        # Loop over all output structures, get the volume and the corresponding
        # energy
        for index, sub_structure in sorted(outputs['structures'].items()):
            volumes.append(sub_structure.get_cell_volume())
            energies.append(outputs['total_energies'][index].value)

    # Sort (V, E) pairs
    energies = [e for _, e in sorted(zip(volumes, energies))]
    volumes = sorted(volumes)
    # print volume and energy
    for V, E in zip(volumes, energies):
        print(f"{V} {E}")
    print()
```