Self energy and Eliashberg function extraction from angle resolved photoemission spectroscopy using the xARPES code

T. P. van Waas, C. Berthod, J. Berges, N. Marzari, J. H. Dil, and S. Poncé

Python version

The Python scripts for the manuscript were executed with Python V3.10.12.

Core packages

The following package versions are recommended for reproducing the results:

abipy 0.9.8
blochpesto 1.1.1
igor2 0.5.12
lmfit 1.3.1
matplotlib 3.8.4
mpmath 1.3.0
numba 0.60.0
numpy 1.26.4
pandas 2.2.2
pymatgen 2024.10.29
scipy 1.13.0

Example of a correct Python environment using Conda

When using Conda, the suggested versions can be installed with:

conda create -n xarpes python=3.10.12
conda activate xarpes
pip install abipy==0.9.8
pip install blochpesto==1.1.1
pip install igor2==0.5.12
pip install lmfit==1.3.1
pip install matplotlib==3.8.4
pip install mpmath==1.3.0
pip install numba==0.60.0
pip install numpy==1.26.4
pip install pandas==2.2.2
pip install pymatgen==2024.10.29
pip install scipy==1.13.0
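
After installation, it can be useful to confirm that the environment matches the recommended versions. The following is a minimal sketch, not part of the repository: the function name check_versions and its lookup argument are hypothetical, and only the standard-library importlib.metadata module is used.

```python
from importlib.metadata import PackageNotFoundError, version

# Recommended versions, copied from the list above.
RECOMMENDED = {
    "abipy": "0.9.8",
    "blochpesto": "1.1.1",
    "igor2": "0.5.12",
    "lmfit": "1.3.1",
    "matplotlib": "3.8.4",
    "mpmath": "1.3.0",
    "numba": "0.60.0",
    "numpy": "1.26.4",
    "pandas": "2.2.2",
    "pymatgen": "2024.10.29",
    "scipy": "1.13.0",
}


def check_versions(recommended=RECOMMENDED, lookup=version):
    """Return {package: (installed_or_None, recommended)} for every mismatch."""
    mismatches = {}
    for name, wanted in recommended.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != wanted:
            mismatches[name] = (installed, wanted)
    return mismatches


# Print one line per package that deviates from the recommendation.
for name, (installed, wanted) in check_versions().items():
    print(f"{name}: installed {installed}, recommended {wanted}")
```

An empty output means that all listed packages are installed with the recommended versions.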

The Data folder

The Data folder contains additional data to generate the Figures. They are provided separately when deemed too large for the separate Figures_and_tables folder. In some cases, these files can be regenerated while running the postprocessing scripts. In exceptional cases, they are the output from heavier calculations. For example, the Supp_Fig_4_data folder can be copied inside Figures_and_tables/Supp_Fig_4/ to post-process the ABINIT calculations.
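
The copying step can also be scripted. The sketch below is an illustration only and assumes that the data live in Data/<name>_data and belong in Figures_and_tables/<name>/<name>_data; the helper copy_figure_data is hypothetical and not part of the repository.

```python
import shutil
from pathlib import Path


def copy_figure_data(repo_root, figure="Supp_Fig_4"):
    """Copy Data/<figure>_data into Figures_and_tables/<figure>/<figure>_data."""
    root = Path(repo_root)
    source = root / "Data" / f"{figure}_data"
    target = root / "Figures_and_tables" / figure / f"{figure}_data"
    if not source.is_dir():
        raise FileNotFoundError(f"No data folder found at {source}")
    # dirs_exist_ok=True merges into an existing (placeholder) folder
    # instead of raising an error; it requires Python 3.8 or newer.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

Calling copy_figure_data(".", "Supp_Fig_4") from the repository root would then populate Figures_and_tables/Supp_Fig_4/Supp_Fig_4_data.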

The Figures_and_tables folder

Most of the folders contain processing scripts, as well as a final generate_figure_n.py script to generate the $n^{\text{th}}$ figure. For all of the figure folders, one can either copy-paste everything from Data/Figure_n_data (or Data/Supp_Fig_n_data) into Figures_and_tables/Figure_n/Figure_n_data (Figures_and_tables/Supp_Fig_n/Supp_Fig_n_data) or copy-paste the minimum required files (specified in each case), after which the figure can be generated with generate_figure_n.py (generate_figure_sn.py).
Warnings about the pcolormesh during figure creation are intended to make the user aware of figure warping.
The Arial font has occasionally been used for panel labels and added text. If Arial is not found, the figure generation scripts resort to the Matplotlib default, DejaVu Sans.
For reproducibility purposes, a seed value of 1 has been used for the noise generation in the artificial example analyzed with Figs. 3-4 and the tables. The reader may wish to verify that the presented results are robust w.r.t. the seed value.
Additional checks and intermediate figures in the scripts are commented out, but they can be uncommented to inspect script performance in more detail.
For the output parameters, $m_{\mathrm{R}}^{\mathrm{b}}$ is equal to mrel times the default value (1.5875 $m_{\mathrm{e}}$), $k_{\mathrm{R}}^{\mathrm{F}}$ is equal to krel times the default value (0.25 Å$^{-1}$), $\Gamma_{\mathrm{R}}^{\mathrm{imp}}$ is equal to dive, $\lambda_{\mathrm{R}}^{\mathrm{el}}$ is equal to lmbe, and $h_{\mathrm{R}}$ is equal to $m_0$.
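
The conversion of the relative outputs to absolute values can be sketched as follows. This is an illustration under the assumption that mrel and krel are plain multiplicative factors, as stated above; the function to_absolute and the constant names are hypothetical and do not appear in the scripts.

```python
# Default values quoted in the text above.
DEFAULT_BAND_MASS = 1.5875        # in units of the electron mass m_e
DEFAULT_FERMI_WAVEVECTOR = 0.25   # in inverse angstrom


def to_absolute(mrel, krel):
    """Convert the relative outputs mrel and krel to absolute values."""
    band_mass = mrel * DEFAULT_BAND_MASS                # in m_e
    fermi_wavevector = krel * DEFAULT_FERMI_WAVEVECTOR  # in 1/angstrom
    return band_mass, fermi_wavevector
```

For instance, mrel = 1 and krel = 1 simply return the default values, 1.5875 $m_{\mathrm{e}}$ and 0.25 Å$^{-1}$.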

Figure_1

Figure 1 can be generated by exporting the Inkscape fig1v1.svg figure as a .pdf file.
The experimental geometry diagram is obtained by running Asymptote on fig_1a.asy, which is subsequently imported as a .png into fig1v1.svg.
The energy diagram has been added in Inkscape.

Figure_2

Figure 2 can be generated by compiling the .tex file with pdflatex.

Figure_3

Data folder Figure_3_data is initially empty, but can be populated by executing test_loop_se.py and analyse_mock.py. Alternatively, the files can be copy-pasted from Data/Figure_3_data.
Almost all preprocessing for the figure is done with generate_figure_3.py, which generates an artificial band map, and performs the desired tests on it.
The exception is panel $\textbf{f}$, whose input is generated with test_loop_se.py, which performs the self-energy extraction for different Fermi energies and energy resolutions.

Figure_4

Data folder Figure_4_data is initially empty, but it can be populated by executing analyse_mock.py to generate results for panel $\textbf{a}$. This script generates an artificial band map and then extracts the self-energy and Eliashberg function from it.
One runs analyse_ideal.py for panel $\textbf{b}$, which does the same thing as analyse_mock.py, except now with ideal data.
One runs test_loop_a2f_fast.py for panel $\textbf{c}$, which simulates data with noise for rpts cases, and may take several minutes to complete. If it is desired to quickly test the entire workflow, the variable rpts in test_loop_a2f_fast.py can be set to a lower number of iterations (say, 5). This variable represents the number of data pairs $N_{J}$ in the text.
For comparison, the anticipated output files are also provided in Data/Figure_4_data.
The optimized parameters in analyse_mock.py under the header # Optimized right can be obtained by executing mock_optimisation_right.py. After an optimisation that might take a minute or so, the optimised parameters can be copied from the file artificial_einstein_right.txt created in Figure_4_data.

Figure_5

Data folder Figure_5_data is initially empty. Copy the experimental bandmap from Data/Figure_5_data/STO_2_0010STO_2_.txt into Figures_and_tables/Figure_5/Figure_5_data to execute the scripts described below.
The main analyses with and without the matrix element correction are performed with tio2_2023_single.py and tio2_2023_nomec.py. These scripts need to be executed first. Afterwards, the Bayesian loop scripts can be executed, which may take a minute or two to complete.
The parameters optimized by the loop are generated as follows. The scripts default_inner_left.py and default_inner_right.py write to STO_2_0010STO_2__inner_left.txt and STO_2_0010STO_2__inner_right.txt for the parameters in tio2_2023_single.py. The scripts default_right.py and nomec_right.py write to STO_2_0010STO_2__right.txt and STO_2_0010STO_2__mec_pars_right.txt for the parameters in tio2_2023_nomec.py.

Figure_6

Data folder Figure_6_data is initially empty. Copy the experimental bandmap from Data/Figure_6_data/graphene_raw_cut_2.txt into Figures_and_tables/Figure_6/graphene_raw_cut_2.txt to execute the scripts described below.
The main analysis is performed in graphene_linearised.py. Afterwards, the Bayesian loop scripts left_linear_optimisation.py and right_linear_optimisation.py can be executed. The former script in particular can take longer to complete than the other optimization scripts, say 5 to 10 minutes on a workstation, as the cost function for these data is somewhat flat near the minimum.

Figure_7

Figure 7 shows the Fermi edge from Li-doped graphene. The results are obtained by executing graphene_cut2.py.

Supp_Fig_1

The placeholder directory Supp_Fig_1_data can be replaced by the version inside Data to obtain all optimized results.
The optimised results can also be generated by first generating the self-energy data with analyse_mock.py, followed by optimising them with mock_optimisation_right.py. This optimisation may take several minutes to complete, and is very sensitive to the Python, NumPy, and SciPy versions. Therefore, the latter output is postprocessed by local_s1.py, which should plot the optimised parameters in local_supfig1v2.pdf in agreement with the values reported in Supplemental Table 2, taking into consideration the reported significant figures.
Different Python/NumPy/SciPy versions may result in different parameter values being explored before reaching convergence. The parameters will generally be in agreement if the printed minus log-probability is below -474, with -487 being the best possible result. When running with Python older than V3.10, or without the latest compatible NumPy/SciPy versions, convergence is not guaranteed.
For reproducibility purposes, the true Fermi edge fit result has been used, as if taken from a carefully calibrated benchmark edge.
Comparison of the existing local_supfig1v2.pdf with supfig1v2.pdf gives an impression of how the parameters might differ during the optimization for slightly different package versions, while still arriving at the optimised results.

Supp_Fig_2

The data folder Supp_Fig_2_data is initially empty. For the current analyses, STO_2_0010STO_2_.txt must be copied (also available in Data/Supp_Fig_2_data), while the latter folder now also contains C10239BM47fine_a2f.txt, the Eliashberg function from the publication described in the main text, for panel $\textbf{d}$.
The main analysis script here is called tio2_2023_single_complete.py, since on top of the analyses of Figure_5/tio2_2023_single.py, it contains the analysis for the MEM-aided result shown in panel $\textbf{d}$.
The optimised parameters for the default inner left/right branches, used in panels $\textbf{a}-\textbf{b}$ for the former and $\textbf{a}-\textbf{c}$ for the latter, as well as the outer right default parameters used in panel $\textbf{c}$, have already been generated in the Figure_5 folder.
The new STO_2_0010STO_2__inner_right_mem.txt used for panel $\textbf{d}$ here is generated with mem_inner_right.py.

Supp_Fig_3

The data folder Supp_Fig_3_data is initially empty. For the current analyses, once again the input file graphene_raw_cut_2.txt is necessary, while for panel $\textbf{c}$ the data from Usachov et al. (2018) are necessary, found in file graphene_kink_selfs_2.csv. Copy both of them inside the Figures_and_tables/Supp_Fig_3/Supp_Fig_3_data folder.
The Fermi edge corrections of panel $\textbf{a}$ are generated with graphene_edge.py.
The momentum distribution curve maxima with/without the photoemission matrix element corrections of panel $\textbf{b}$ are generated with graphene_linearised.py.
The data for panel $\textbf{c}$ are generated with graphene_cut2_reproduce.py.
The results for panel $\textbf{d}$ are generated with graphene_linearised_uncorrected.py. These are based on new optimizations with left_linear_uncorrected.py and right_linear_uncorrected.py, writing to the respective Supp_Fig_3_data/graphene_raw_cut_2_uncl_left.csv and Supp_Fig_3_data/graphene_raw_cut_2_uncl_right.csv.
As for the right-hand side shown in Figure 6 $\textbf{b}$, the left-hand side shown in panel $\textbf{e}$ was generated with graphene_linearised.py. The optimised parameters are based on Figure_6/left_linear_optimisation.py.
The results for panel $\textbf{f}$ are generated with graphene_linearised_confined.py.

Supp_Fig_4

The placeholder directory Supp_Fig_4_data needs to be replaced by the version from Data to run some postprocessing steps.
Usage of ABINIT and the pseudopotential file Al_pbesol_ncsr_0.4.1.psp8 is described in the supplemental section of the manuscript.
Bulk structural relaxation was performed with fcc_relax.abi.
A representative vacuum thickness convergence calculation file is provided as vacuum_8.abi. Files with $n$ layers of vacuum can be generated by running the following in Bash:

create_vacuum.sh vacuum_n.abi n

Example post-processing to determine the work function is done with work_function.py.
The slab thickness convergence file and the corresponding band structure files are layer_41.abi and path_m.abi. Files with $n$ layers, where $n$ is odd, can be generated by running from the terminal:

create_layer.sh layer_n.abi n
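
To prepare inputs for several slab thicknesses at once, the command above can be wrapped in a short loop. The sketch below only builds and prints the command lines. It assumes that $n$ is the number of layers, that the script is invoked through bash, and the helper layer_commands together with the chosen values of $n$ are hypothetical.

```python
def layer_commands(layer_counts):
    """Return one create_layer.sh argument list per odd layer count."""
    commands = []
    for n in layer_counts:
        if n < 1 or n % 2 == 0:
            raise ValueError(f"Layer count must be a positive odd integer, got {n}")
        commands.append(["bash", "create_layer.sh", f"layer_{n}.abi", str(n)])
    return commands


# Print the commands for a few example thicknesses (values chosen for illustration).
for command in layer_commands([37, 39, 41]):
    print(" ".join(command))
```

Each printed line can be pasted into the terminal, or the argument lists can be passed to subprocess.run(command, check=True) from the folder containing create_layer.sh.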

Bulk self-consistent field calculations were done with bulk_bands.abi, while the bands were projected onto $k_z$ with 100_bands.abi.
Band structure postprocessing is done with postproc_bands.py. The extrema determination of the 100 projected bands inside the Supp_Fig_4_data folder may take a minute or so to complete.
Results for Fig 4 panels $\textbf{a}-\textbf{c}$ can be generated by executing analyse_aluminium.py. The optimized parameters under the header Right-hand side can be obtained by executing optimise_all.py and fix_mb.py. After optimisations that may take a few minutes, the optimised parameters are found at the bottom of Supp_Fig_4_data/raw201004071530_calib_right.txt and Supp_Fig_4_data/raw201004071530_calib_right_fixed.txt, respectively.
For each .abi file, there is a corresponding .abo file in Data/Supp_Fig_4_data that can be inspected for the anticipated calculations being performed, their output values, and runtime.

Table_1

Table 1 can be generated by running mock_optimisation_right.py and retrieving the output parameters at the bottom of the newly created artificial_einstein_right.txt. The script may take several minutes to run.
The absolute values are printed from the final code of postproc_mock.py, which already contains the optimized parameters right after the brnc header.

Table_S1

Table S1 can be generated by modifying in postproc_mock.py the parameters right after the branch name variable brnc, with the table listing rounded deviations for which the extraction will fail. For example, multiplying $m_{\text{R}}^{\text{b}}$ by 1.2 will result in failure, while the code should still complete if it is multiplied by 1.15.
If the parameters are allowed, the table values are printed by the final code of postproc_mock.py.

Table_S2

Table S2 is generated similarly to Table 1 of the main text, now using mock_optimisation_right.py to execute the Bayesian loop. The loop may take seconds to minutes to either fail or complete, especially for large allowed $h_0$. The table lists rounded ranges for which the code terminates successfully.
The parameters are written to artificial_einstein_right.txt, and upon success, should result in a minus log-probability below -474, with -487 being the best possible result. In case of success, the absolute values are also printed by the bottom lines of code in mock_optimisation_right.py.
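
The success criterion quoted above can be written down as a small check. This is a sketch only: the function assess_optimisation is hypothetical, and the value has to be read by hand from the printed output, since the layout of artificial_einstein_right.txt is not assumed here.

```python
ACCEPTABLE_THRESHOLD = -474.0  # results below this generally agree (from the text)
BEST_KNOWN_VALUE = -487.0      # best possible result quoted in the text


def assess_optimisation(minus_log_probability):
    """Compare a printed minus log-probability with the quoted values.

    Returns (acceptable, distance_from_best), where acceptable is True when
    the value lies below the threshold of -474.
    """
    acceptable = minus_log_probability < ACCEPTABLE_THRESHOLD
    distance_from_best = minus_log_probability - BEST_KNOWN_VALUE
    return acceptable, distance_from_best
```

A printed value of -480, for example, passes the check and lies 7 above the best possible result.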

Table_S3

Table S3 can be generated by uncommenting the parameter block following the Optimized right header in postproc_mock.py, followed by multiplying the parameters by the allowed deviations in the table. The absolute values are also printed by the bottom of the code.

Table_S4

Table S4 can be filled with values selected from scripts in Supp_Fig_4. The bold KS $m_{\mathrm{R}}^{\mathrm{b}}$ value is printed with postproc_shortened.py, which fits the Kohn-Sham eigenvalues.
The other parameters are printed upon running analyse_aluminium.py. The 5 parameters optimised by the Bayesian loop for the xARPES case and KS case are generated by following up with optimise_all.py and fix_mb.py, respectively.