Published December 19, 2025 | Version v1
Dataset | Open Access

A universal machine learning model for the electronic density of states

  • École Polytechnique Fédérale de Lausanne


Description

In the last few years, several "universal" interatomic potentials have appeared, using machine-learning approaches to predict the energy and forces of atomic configurations with arbitrary composition and structure, often with an accuracy comparable to that of the electronic-structure calculations they are trained on. Here we demonstrate that such generally applicable models can also be built to predict the electronic structure of materials and molecules explicitly. We focus on the electronic density of states (DOS) and develop PET-MAD-DOS, a rotationally unconstrained transformer model built on the Point Edge Transformer (PET) architecture and trained on the Massive Atomistic Diversity (MAD) dataset. We demonstrate our model's predictive abilities on samples from diverse external datasets, showing also that the DOS can be further manipulated to obtain accurate band-gap predictions. A fast evaluation of the DOS is especially useful in combination with molecular simulations probing matter under finite-temperature thermodynamic conditions. To assess the accuracy of PET-MAD-DOS in this context, we evaluate the ensemble-averaged DOS and the electronic heat capacity of three technologically relevant systems: lithium thiophosphate (LPS), gallium arsenide (GaAs), and a high-entropy alloy (HEA). By comparing with bespoke models trained exclusively on system-specific datasets, we show that our universal model achieves semi-quantitative agreement for all these tasks. Furthermore, we demonstrate that fine-tuning on a small fraction of the bespoke data yields models that are comparable to, and sometimes better than, fully trained bespoke models.
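As an illustration of the DOS post-processing mentioned above, the sketch below shows how an electronic heat capacity can be estimated from a (predicted) DOS by differentiating the Fermi-Dirac-weighted internal energy with respect to temperature. This is a minimal stand-alone example using a toy flat DOS; the function names, the fixed chemical potential, and the finite-difference derivative are illustrative assumptions, not part of the released dataset or the PET-MAD-DOS code.

```python
import numpy as np

KB = 8.617333262e-5  # Boltzmann constant, eV/K

def fermi_dirac(energy, mu, temperature):
    """Fermi-Dirac occupation; energies and mu in eV, temperature in K."""
    x = np.clip((energy - mu) / (KB * temperature), -500.0, 500.0)
    return 1.0 / (1.0 + np.exp(x))

def trapezoid(y, x):
    """Trapezoidal integration (written out to avoid the
    np.trapz -> np.trapezoid rename across NumPy versions)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def electronic_heat_capacity(energy, dos, mu, temperature, dT=1.0):
    """C_el(T) = dU/dT with U(T) = \\int g(E) f(E; mu, T) E dE,
    evaluated by a central finite difference. The chemical potential is
    held fixed here for simplicity; in practice mu(T) should be
    re-determined at each temperature to conserve the electron count."""
    def internal_energy(T):
        return trapezoid(dos * fermi_dirac(energy, mu, T) * energy, energy)
    return (internal_energy(temperature + dT)
            - internal_energy(temperature - dT)) / (2.0 * dT)

# Toy flat DOS of 1 state/eV with the Fermi level at 0 eV, for which the
# Sommerfeld result C_el = (pi^2/3) kB^2 T g(E_F) holds.
energy = np.linspace(-5.0, 5.0, 4001)
dos = np.ones_like(energy)
c_el = electronic_heat_capacity(energy, dos, mu=0.0, temperature=300.0)
print(f"C_el(300 K) = {c_el:.3e} eV/K")
```

For the flat DOS the numerical result can be checked directly against the Sommerfeld expression; for a model-predicted DOS one would instead pass the predicted grid and values, and re-solve for the chemical potential at each temperature.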

Files (36.5 GiB)

Size      MD5 checksum
1.1 GiB   md5:7bb9e3faab59f427ba64a3f9bd739134
2.3 GiB   md5:636c26cc06633081e7cb7ef27bbdfc7e
2.4 GiB   md5:19a2dcc36d0f80eb31ea3661377233e4
2.2 GiB   md5:3c6d317fba93abb2cb2a20ed5ef47281
2.4 GiB   md5:18e20e1c24c04ac14b36dead96e71ff1
2.3 GiB   md5:89e062ebcf21a6952ea74f80de0e8ca8
2.2 GiB   md5:a5af421abb47b0bacdbf8f86f1bae13f
75.2 MiB  md5:a96937eb3e079fba02ae21015ab7f644
4.8 GiB   md5:754b9c0040d9c70699c1aa49622d8cd9
2.3 GiB   md5:00f87d6d4344272f9c218091be588ba8
4.8 GiB   md5:c5506f4b37ca2affb0d951b5129de503
4.8 GiB   md5:7f46de47380f3aae2d232cb5430d4869
2.2 GiB   md5:a7d61d4afe4fe5dee9bf0cf43fe7f694
786 Bytes md5:18ca943140523c66720817bd50d2fdf7
2.3 GiB   md5:bfc5bd2e61dbff53150f4e298b7982eb
5.5 KiB   md5:2a6488bfa2af11dd07b6562199a1ee4b
6.5 KiB   md5:5b9d1cd3466b58fa031f893b14981ad7

Funding

MARVEL/P2 – Machine Learning Platform for Molecules and Materials, pillar 2 (NCCR MARVEL)
FIAMMA, grant 101001890 (European Research Council, ERC)
Grant 214879 20020 (Swiss National Science Foundation)

References

Preprint (where the data is discussed):
W. B. How, P. Febrer, S. Chong, A. Mazitov, F. Bigi, M. Kellner, S. Pozdnyakov, M. Ceriotti, arXiv (2025), doi:10.48550/arXiv.2508.17418