Published February 26, 2025 | Version v2
Dataset Open

Machine learning on multiple topological materials datasets

  • 1. UCLouvain, Institut de la Matière Condensée et des Nanosciences (IMCN), Chemin des Étoiles 8, Louvain-la-Neuve 1348, Belgium
  • 2. Beijing National Laboratory for Condensed Matter Physics and Institute of Physics, Chinese Academy of Sciences, Beijing, China

* Contact person

Description

A dataset of 35,608 materials with their topological properties is constructed by combining the density functional theory (DFT) results of Materiae and the Topological Materials Database. Using this combined dataset, machine-learning models are developed to classify materials into five topological types, with the XGBoost model reaching a classification accuracy of 85.2%. Generalization tests on different sub-datasets reveal differences between the original datasets in terms of topological types, chemical elements, unknown magnetic compounds, and feature-space coverage, and the impact of these differences on model performance is analyzed. For the simpler binary classification between trivial insulators and nontrivial topological materials, three different approaches are also tested. Key characteristics influencing material topology are identified, with the maximum packing efficiency and the fraction of p valence electrons highlighted as critical features.

Files

files_description.md

Files (637.2 MiB)

Checksum                                Size
md5:c1ea3e3515d807ab9e73f63dbefd3ea1    473 Bytes
md5:2859bad2900c0ad1057e8b3d357f452e    101.8 MiB
md5:2b92ac74ef6a143d4ca43807b007da2a    237 Bytes
md5:d9c6901e2781b55382233974c5360bfc    19.8 MiB
md5:e97367839d6f923849f6ab81ff7830f8    48.8 KiB
md5:57066e24446f9f128c63a8e1698880de    515.6 MiB
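
The MD5 checksums listed for the files can be used to confirm that a download completed without corruption. The following is a minimal sketch in Python using only the standard library; the helper names md5_of_file and matches_listed_checksum and the example path are illustrative assumptions, not part of the dataset or its documentation.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks.

    Chunked reading keeps memory use low for the larger archives
    (hundreds of MiB) in this record.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a local file against a checksum as listed on the record page.

    `listed` may carry the "md5:" prefix used in the file list, e.g.
    "md5:c1ea3e3515d807ab9e73f63dbefd3ea1"; the prefix is optional here.
    """
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Example usage (hypothetical local path; substitute the file you downloaded):
# ok = matches_listed_checksum("downloads/some_file",
#                              "md5:c1ea3e3515d807ab9e73f63dbefd3ea1")
# print("checksum OK" if ok else "checksum mismatch, re-download the file")
```

MD5 is adequate here for detecting accidental corruption during transfer; it should not be relied on as protection against deliberate tampering.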

References

Website
Yuqing He et al., Machine learning on multiple topological materials datasets

Journal reference (Paper describing the work performed)
Yuqing He et al., npj Computational Materials XX, XX (2025), doi: 10.1038/s41524-025-01687-2