HEA25S Dataset

This README provides an overview of the dataset included in this submission, which contains a collection of xyz files, the VASP inputs used for the calculations, and the final models trained in the study. The dataset describes FCC, BCC, and HCP bulk and surface slab configurations of high-entropy alloys (HEAs) containing up to 25 d-block elements (all elements of the 3d, 4d, and 5d rows except Tc, Re, Os, Hg, and Cd), as published in the article https://arxiv.org/abs/2310.07604.

Dataset content

  • README.md (this file)
  • data.zip
  • models.zip
  • vasp_settings.zip
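The archives can be inspected without unpacking them. The following is a minimal sketch using only the Python standard library; the helper name `list_archive` is hypothetical, and the archives are assumed to sit in the current working directory. Nothing is assumed about the file names inside each archive.

```python
import zipfile
from pathlib import Path


def list_archive(path):
    """Return (member name, uncompressed size in bytes) pairs for a zip archive."""
    with zipfile.ZipFile(path) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()  # skip directory entries
        ]


if __name__ == "__main__":
    # Archive names are those listed above; the location is an assumption.
    for name in ("data.zip", "models.zip", "vasp_settings.zip"):
        if Path(name).exists():
            for member, size in list_archive(name):
                print(f"{name}: {member} ({size} bytes)")
```

Archives that are not found are skipped silently, so the script can be run from any directory.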

data.zip

The file data.zip contains the HEA25S dataset, which comprises five classes of HEA data used in the study, with up to 25 d-block elements (all elements of the 3d, 4d, and 5d rows except Tc, Re, Os, Hg, and Cd):

  • Dataset O: 10000 bulk HEA configurations, both ideal crystals and crystals with distorted atomic positions
  • Dataset A: 2640 HEA surface slab configurations, both ideal slabs and slabs with distorted atomic positions
  • Dataset B: 1000 HEA bulk configurations, sampled from replica-exchange MD trajectories in the temperature range from 2000 K to 4000 K
  • Dataset C: 1000 HEA surface slab configurations, sampled from replica-exchange MD trajectories in the temperature range from 300 K to 1500 K
  • Dataset D: 500 Cantor-style HEA surface slab configurations, sampled from replica-exchange MD trajectories in the temperature range from 300 K to 1500 K
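The configuration counts listed above can serve as a sanity check after loading the data. The sketch below assumes that each count refers to a dataset as a whole, summed over all of its splits; the names `EXPECTED_SIZES` and `check_size` are hypothetical and are not part of the dataset.

```python
# Configuration counts as listed above, keyed by dataset label.
EXPECTED_SIZES = {"O": 10000, "A": 2640, "B": 1000, "C": 1000, "D": 500}


def check_size(label, n_found):
    """Raise ValueError if a loaded dataset does not have the listed size."""
    expected = EXPECTED_SIZES[label]
    if n_found != expected:
        raise ValueError(
            f"Dataset {label}: expected {expected} configurations, found {n_found}"
        )
    return n_found


# Total number of configurations across the five datasets.
total = sum(EXPECTED_SIZES.values())
print(total)  # 15140
```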

Each configuration is accompanied by its energy (in eV), interatomic forces (in eV/Å), and stress tensor (in eV/Å^2), all computed using VASP. Volumes are set to random perturbations around the mean molar volume of each composition, with some structures in the ideal lattice positions and others in distorted configurations. For surface slab configurations, a vacuum layer of 20 Å was added to avoid interactions between periodic images of the surface.

Each dataset is split into training, validation, and test subsets, which were used for model training and for constructing learning curves.
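As an illustration of how multi-frame xyz files are typically laid out, the sketch below iterates over frames using only the Python standard library. It assumes the common layout of an atom-count line, a comment line, and then one line per atom starting with the element symbol. How energies, forces, and stresses are stored in the comment line or in extra columns is not specified in this README, so the comment is returned unparsed and extra numeric columns are kept as-is. The function name `iter_xyz_frames` is hypothetical. In practice, a dedicated reader such as `ase.io.read` from ASE is likely the more convenient choice.

```python
import io


def iter_xyz_frames(handle):
    """Yield (comment, symbols, rows) for each frame in an open XYZ text stream.

    rows holds every numeric column of an atom line. The first three are
    assumed to be x, y, z in Å; any further columns are kept untouched.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between frames
        n_atoms = int(header.split()[0])
        comment = handle.readline().rstrip("\n")
        symbols, rows = [], []
        for _ in range(n_atoms):
            fields = handle.readline().split()
            if not fields:
                raise ValueError("truncated frame")
            symbols.append(fields[0])
            rows.append([float(value) for value in fields[1:]])
        yield comment, symbols, rows


# Made-up two-atom frame purely for illustration (not taken from the dataset).
example = io.StringIO(
    "2\n"
    "hypothetical comment line\n"
    "Fe 0.0 0.0 0.0\n"
    "Ni 1.8 1.8 0.0\n"
)
for comment, symbols, rows in iter_xyz_frames(example):
    print(comment, symbols, [row[:3] for row in rows])
```

Passing a file opened with `open(path)` in place of the `io.StringIO` object works the same way, since only `readline` is used.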

models.zip

The file models.zip contains the HEA25-4-NN and HEA25S-4-NN models, which were trained and used in the study, in a PyTorch model dict format.

  • The HEA25-4-NN model was trained on a dataset of 25000 bulk HEA structures with both ideal and distorted atomic configurations.
  • The HEA25S-4-NN model was initialized from HEA25-4-NN and then trained on the HEA25S dataset described above.

vasp_settings.zip

The file vasp_settings.zip contains the INCAR file used for the VASP calculations.