Score-based diffusion models for accurate crystal-structure inpainting and reconstruction of hydrogen positions
- 1. PSI Center for Scientific Computing, Theory and Data, Paul Scherrer Institute, 5232 Villigen PSI, Switzerland
- 2. Dipartimento di Scienze Matematiche, Fisiche e Informatiche, Università di Parma, I-43124 Parma, Italy
- 3. Dipartimento di Scienze Fisiche, Informatiche e Matematiche, Università degli Studi di Modena e Reggio Emilia, Modena, Italy
- 4. Centro S3, Istituto Nanoscienze-CNR, Modena, Italy
* Contact person
Description
Generative AI models, such as score-based diffusion models, have recently advanced the field of computational materials science by enabling the generation of new materials with desired properties. These models can also be leveraged to reconstruct crystal structures for which only partial information is available. One relevant example is the reliable determination of the atomic positions occupied by hydrogen atoms in hydrogen-containing crystalline materials. Although hydrogen positions are crucial to the analysis and prediction of many materials properties, identifying them can be difficult and expensive: it is challenging in X-ray scattering experiments and often requires dedicated neutron scattering measurements. As a consequence, inorganic crystallographic databases frequently report lattice structures in which hydrogen atoms have been either omitted or inserted using heuristics or chemical intuition. Here, we combine diffusion models from the field of materials science with techniques originally developed in computer vision for image inpainting. We show how this knowledge transfer across domains enables a much faster and more accurate completion of host structures compared to unconditioned diffusion models or previous approaches based solely on DFT. Overall, our approach achieves a success rate above 97% in terms of finding a structural match or predicting a more stable configuration than the initial reference, whether starting from structures that were already relaxed with DFT or directly from the experimentally determined host structures.
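The inpainting idea described above can be illustrated with a minimal sketch. It assumes a replacement-style sampler in the spirit of diffusion-based image inpainting: at every noise level the known host atoms are overwritten with freshly noised copies of their reference positions, while the missing (hydrogen) positions follow the reverse diffusion update. Everything named here is hypothetical and not taken from the XtalPaint package: `inpaint_positions`, `toy_score`, the variance-exploding noise schedule, and in particular the analytic toy score, which merely stands in for a trained score network.

```python
import numpy as np


def wrap(frac):
    """Map fractional coordinates back into the unit cell [0, 1)."""
    return frac % 1.0


def toy_score(target):
    """Hypothetical stand-in for a trained score network.

    Returns the score of a Gaussian centred on `target`, using the
    minimum-image displacement so that periodicity is respected.
    """
    def score(x, sigma):
        d = x - target
        d -= np.round(d)  # minimum-image displacement in fractional coordinates
        return -d / sigma**2
    return score


def inpaint_positions(host_frac, n_missing, score_fn, sigmas, rng):
    """Replacement-style inpainting of missing atomic positions (sketch)."""
    n_host = len(host_frac)
    known = np.zeros(n_host + n_missing, dtype=bool)
    known[:n_host] = True

    # Start from uniformly random fractional coordinates for all atoms.
    x = rng.random((n_host + n_missing, 3))

    for sigma_now, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        # Conditioning: known atoms are the reference positions noised
        # to the current noise level.
        x[known] = wrap(host_frac + sigma_now * rng.standard_normal(host_frac.shape))

        # Reverse-diffusion update for a variance-exploding schedule.
        step = sigma_now**2 - sigma_next**2
        noise = rng.standard_normal(x.shape)
        x = wrap(x + step * score_fn(x, sigma_now) + np.sqrt(step) * noise)

    # The host structure is returned exactly as given.
    x[known] = host_frac
    return x


# Toy usage: two host atoms, one missing atom with a made-up target site.
rng = np.random.default_rng(0)
host = np.array([[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]])
hydrogen_target = np.array([[0.25, 0.25, 0.75]])
score_fn = toy_score(np.vstack([host, hydrogen_target]))
sigmas = np.append(np.geomspace(0.5, 1e-3, 50), 0.0)  # decreasing noise levels
completed = inpaint_positions(host, 1, score_fn, sigmas, rng)
```

In this sketch the host positions never drift, because they are re-imposed at each step, and only the missing atoms are actually sampled. A real model would condition on the noised host atoms through the network itself rather than through an analytic target.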
Files
(27.9 GiB)
| MD5 checksum | Size |
|---|---|
| 4766c3646f15d103842ff86d14b22e8d | 101.3 MiB |
| 49478edb715e897bc74d2732624303cf | 22.0 GiB |
| 076e656461d7603d5fba74a980530571 | 3.8 GiB |
| 92fc5388a9f9d11e5f5c99b46e97a7d3 | 1.9 GiB |
| 9e70f2cf6bbf15d6f0989a447d266c53 | 10.7 KiB |
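Since the record lists an MD5 checksum for each file, a downloaded copy can be checked against it. The following is a minimal sketch using only the Python standard library; the helper names `md5sum` and `verify` are illustrative, and the local file path is whatever name the file was saved under.

```python
import hashlib
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks.

    Chunked reading keeps memory use small even for multi-GiB archives.
    """
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file matches the expected checksum.

    Accepts the checksum with or without a leading "md5:" prefix.
    """
    expected = expected_md5.lower().split(":")[-1]
    return md5sum(path) == expected
```

For example, `verify(local_path, "md5:9e70f2cf6bbf15d6f0989a447d266c53")` would check the smallest file in the list, where `local_path` is a placeholder for the location of the downloaded copy.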
References
- Preprint: T. Reents, A. Cantarella, M. Bercx, P. Bonfà and G. Pizzi, arXiv preprint arXiv:2601.01959 (2026), doi: 10.48550/arXiv.2601.01959
- Software (model checkpoints for the models discussed in the paper): Hugging Face repository t-reents/XtalPaint
- Software (code package related to the paper): XtalPaint