We present the K-edge XANES database of 3d transition-metal oxide materials used for machine learning modeling in the following manuscript: S. R. Kharel et al., “A Universal Deep Learning Framework for Materials X-ray Absorption Spectra,” arXiv:2409.19552.
Structures used to generate spectra were sourced from the Materials Project using a “wildcard search” via the v2 API, e.g., “Ti-O-*” for Ti ternary oxides. The Lightshow software package (github.com/AI-multimodal/Lightshow; Journal of Open Source Software, 8, 5182 (2023)) was used to pull materials and create the input files for the spectroscopy calculations, with Pymatgen as the backend for determining symmetrically inequivalent sites. The database contains 8824 Ti, 14697 V, 4048 Cr, 19575 Mn, 14752 Fe, 13471 Co, 5335 Ni and 5299 Cu FEFF spectra, and 3941 Ti and 3242 Cu VASP spectra.
Each compound is indexed by its Materials Project ID (mpid), then by symmetrically inequivalent absorbing site, e.g. 000_Ti (site index 0, with Ti as the absorber). For VASP, an extra “SCF” directory is included for the self-consistent field calculations. Each site directory contains all input files (except the VASP POTCAR file, which requires a VASP license) and selected output files, including the spectra.
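The indexing scheme above can be walked programmatically. The following is a minimal sketch, not part of the database tooling: it assumes the extracted layout is root/mpid/site, that site directory names follow the pattern shown in the example (zero-padded index, underscore, element symbol), and that anything else, such as the “SCF” directory, should be skipped. The function names parse_site_label and list_sites are hypothetical.

```python
import re
from pathlib import Path

# Assumed label pattern, inferred from the example "000_Ti":
# site index digits, an underscore, then an element symbol.
SITE_LABEL = re.compile(r"^(\d+)_([A-Z][a-z]?)$")


def parse_site_label(label):
    """Split a site directory name such as '000_Ti' into (site_index, element)."""
    match = SITE_LABEL.match(label)
    if match is None:
        raise ValueError(f"not a site directory name: {label!r}")
    return int(match.group(1)), match.group(2)


def list_sites(root):
    """Map each compound directory under root to its parsed site directories.

    Directories that do not match the site pattern (for example an 'SCF'
    directory) are skipped rather than treated as errors.
    """
    sites = {}
    for compound_dir in sorted(Path(root).iterdir()):
        if not compound_dir.is_dir():
            continue
        parsed = []
        for site_dir in sorted(compound_dir.iterdir()):
            if site_dir.is_dir() and SITE_LABEL.match(site_dir.name):
                parsed.append(parse_site_label(site_dir.name))
        sites[compound_dir.name] = parsed
    return sites
```

For example, parse_site_label("000_Ti") gives the site index 0 and the absorber element "Ti".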
The archives can be decompressed using, e.g.: tar -xjvf FEFF.tar.bz2
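Where a shell is not available, the same extraction can be done from Python with the standard-library tarfile module. This is a minimal sketch; the helper name extract_archive and the destination directory are illustrative choices, not part of the database.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, destination):
    """Extract a bzip2-compressed tar archive and return its member names."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    # "r:bz2" opens the archive for reading with bzip2 decompression,
    # the Python counterpart of the -j flag in "tar -xjvf".
    with tarfile.open(archive_path, mode="r:bz2") as archive:
        names = archive.getnames()
        archive.extractall(destination)
    return names
```

A call such as extract_archive("FEFF.tar.bz2", "FEFF_extracted") would unpack the FEFF archive into a new directory next to it.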
The FEFF directories contain a single FEFF input file, feff.inp, which completely specifies the FEFF calculation and is generated by Lightshow using Pymatgen as a backend. The included output files are feff.out (the output log of the calculation) and xmu.dat (the output spectrum together with other information).
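As an illustration of how the spectrum might be read from xmu.dat, here is a minimal sketch in plain Python. It assumes the usual FEFF layout, in which header lines begin with “#” and each data row holds six columns in the order omega, e, k, mu, mu0, chi; this column order should be checked against the header of the files in this database. The function name read_xmu is hypothetical.

```python
def read_xmu(path):
    """Return (omega, mu) as two lists of floats from a FEFF xmu.dat file.

    Assumes header lines start with '#' and that data rows are ordered
    omega, e, k, mu, mu0, chi, so that mu is the fourth column.
    """
    omega, mu = [], []
    with open(path) as handle:
        for line in handle:
            stripped = line.strip()
            # Skip blank lines and the commented header block.
            if not stripped or stripped.startswith("#"):
                continue
            fields = stripped.split()
            if len(fields) < 4:
                raise ValueError(f"unexpected data row: {stripped!r}")
            omega.append(float(fields[0]))
            mu.append(float(fields[3]))
    return omega, mu
```

The two returned lists have equal length and can be passed directly to a plotting or interpolation routine.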
The VASP directories contain the following VASP input files:
The calculated spectra files (and related) include:
The files scfenergy.txt, scfenergy.txt and efermi.txt are used for relative edge alignment with the DeltaSCF method, as described in the multi-code benchmark paper [Phys. Rev. Materials 8, 013801 (2024)].