This repository contains scripts to reproduce results from the paper "Comparing the latent features of universal machine-learning interatomic potentials" (preprint: https://arxiv.org/abs/2512.05717).
├── data/
│ ├── features/ # Generated features
│ └── xyz/ # ASE files with datasets
├── models/
│ ├── envs/ # Conda environments
│ └── models/ # Fetched MLIP checkpoints
├── scripts/
│ ├── data_preprocess/ # Dataset preparation and filtering
│ ├── last-layer-features/ # Feature extraction pipelines
│ ├── cumulants/ # Cumulants experiment
│ ├── dos/ # PET-MAD-DOS experiment
│ ├── fine-tuning/ # Fine-tuning experiment
│ ├── ll_vs_bb/ # Last-layer vs backbone comparison
│ ├── umlips/ # Cross-model MLIP analysis
│ └── variants/ # Model variant comparisons
├── plotting/ # Jupyter notebooks for visualization
├── results/ # Generated figures and error metrics
└── src/ # Utils
First, create the conda environments for the different MLIPs. The commands in each step assume that you start from the repository root:
cd models/
bash create_envs.sh
Second, download the model checkpoints:
cd models/
bash get_models.sh
This fetches pre-trained model weights for all MLIPs used in the analysis.
Next, prepare dataset subsets for analysis:
conda activate skmatter
cd scripts/data_preprocess/
python get_consistent_mad_test.py
python get_organic_mad_test.py
python get_consistent_salexandria.py
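The setup and preprocessing steps above can also be driven from a single Python script, which avoids depending on the current shell directory between steps. The following is a minimal sketch, not part of the repository: the names `STEPS` and `run_steps` are hypothetical, and it assumes that `conda run -n skmatter` is an acceptable substitute for `conda activate skmatter` in a non-interactive setting.

```python
import subprocess
from pathlib import Path

# Each entry: (working directory relative to the repository root, command).
# Script names and the "skmatter" environment come from the steps above;
# using `conda run` instead of `conda activate` is an assumption.
STEPS = [
    ("models", ["bash", "create_envs.sh"]),
    ("models", ["bash", "get_models.sh"]),
    ("scripts/data_preprocess",
     ["conda", "run", "-n", "skmatter", "python", "get_consistent_mad_test.py"]),
    ("scripts/data_preprocess",
     ["conda", "run", "-n", "skmatter", "python", "get_organic_mad_test.py"]),
    ("scripts/data_preprocess",
     ["conda", "run", "-n", "skmatter", "python", "get_consistent_salexandria.py"]),
]


def run_steps(repo_root, steps=STEPS, runner=subprocess.run):
    """Run each command in its own working directory; stop on the first failure."""
    root = Path(repo_root).resolve()
    for subdir, command in steps:
        # check=True makes subprocess.run raise CalledProcessError on a non-zero exit.
        runner(command, cwd=root / subdir, check=True)
```

Calling `run_steps(".")` from the repository root would execute the five commands in order. The `runner` argument exists only so the function can be exercised without launching real processes.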
Generate last-layer features from each model:
cd scripts/last-layer-features/
# Extract features for each MLIP (DPA, MACE, PET, UMA)
# See individual README files in each model directory
Run analysis scripts to compute reconstruction errors:
conda activate skmatter
# Calculate statistical moments
cd scripts/cumulants/
bash run_calculate_cumulant.sh
bash run_errors_model.sh
bash run_errors_umlip.sh
# Other analyses
# ...
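For readers unfamiliar with the terminology, the relation between statistical moments and cumulants can be illustrated on a one-dimensional sample. This is a generic textbook sketch, not the implementation used by the scripts in `scripts/cumulants/`; the function name `first_four_cumulants` is hypothetical, and population (biased) estimates are assumed.

```python
from statistics import fmean


def first_four_cumulants(sample):
    """Return (k1, k2, k3, k4) for a 1-D sample, using population estimates."""
    values = [float(v) for v in sample]
    if not values:
        raise ValueError("sample must contain at least one value")
    mean = fmean(values)
    centered = [v - mean for v in values]
    # Central moments of order 2, 3 and 4.
    m2 = fmean([c ** 2 for c in centered])
    m3 = fmean([c ** 3 for c in centered])
    m4 = fmean([c ** 4 for c in centered])
    # k1 is the mean, k2 the variance, k3 the third central moment,
    # and k4 the fourth central moment minus 3 * variance^2.
    return mean, m2, m3, m4 - 3.0 * m2 ** 2
```

The first three cumulants coincide with the mean and the second and third central moments; only from the fourth order onward do cumulants and central moments differ.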
Finally, run Jupyter notebooks to re-create figures:
pip install numpy scipy pandas matplotlib seaborn scikit-learn jupyterlab notebook
cd plotting/
jupyter notebook
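Before opening the notebooks, it can be useful to confirm that the packages from the `pip install` line are importable in the active environment. A minimal sketch follows; the helper name `missing_modules` is hypothetical, and the list maps the installed package names to their import names (`scikit-learn` is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names for the packages in the pip install command above.
NOTEBOOK_MODULES = [
    "numpy", "scipy", "pandas", "matplotlib",
    "seaborn", "sklearn", "jupyterlab", "notebook",
]


def missing_modules(modules=NOTEBOOK_MODULES):
    """Return the top-level modules that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]
```

An empty list from `missing_modules()` suggests the plotting dependencies are in place; any names returned still need to be installed.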
Generated outputs are stored in the results/ directory:
results/
├── figures/ # Plots
└── reconstruction_errors/ # Quantitative metrics in JSON format
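The JSON metrics can be collected for further inspection with a few lines of Python. This is a sketch under assumptions: the file names and the schema inside each JSON file are not specified here, so the hypothetical helper `load_json_metrics` simply loads every `*.json` file found under the directory, including subdirectories, without interpreting its contents.

```python
import json
from pathlib import Path


def load_json_metrics(directory="results/reconstruction_errors"):
    """Map each JSON file's path (relative to `directory`) to its parsed content."""
    root = Path(directory)
    metrics = {}
    # rglob also descends into subdirectories; a nested layout is an assumption.
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            metrics[path.relative_to(root).as_posix()] = json.load(handle)
    return metrics
```

Run from the repository root, `load_json_metrics()` returns a dictionary keyed by relative file path, which can then be passed to pandas or inspected directly.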
@misc{chorna2025comparinglatentfeaturesuniversal,
title={Comparing the latent features of universal machine-learning interatomic potentials},
author={Sofiia Chorna and Davide Tisi and Cesare Malosso and Wei Bin How and Michele Ceriotti and Sanggyu Chong},
year={2025},
eprint={2512.05717},
archivePrefix={arXiv},
primaryClass={physics.chem-ph},
url={https://arxiv.org/abs/2512.05717},
}