Published November 18, 2025 | Version v2
Dataset Open

Adaptive pruning for increased robustness and reduced computational overhead in Gaussian process accelerated saddle point searches

  • 1. University of Iceland
  • 2. École Polytechnique Fédérale de Lausanne


Description

Gaussian process (GP) regression provides a strategy for accelerating saddle point searches on high-dimensional energy surfaces by reducing the number of times the energy and its derivatives with respect to atomic coordinates need to be evaluated. The computational overhead of the hyperparameter optimization can, however, be large and make the approach inefficient. Failures can also occur if the search ventures too far into regions that are not represented well enough by the GP model. Here, these challenges are resolved by using geometry-aware optimal transport measures and an active pruning strategy. The pruning uses a sum of Wasserstein-1 distances, one for each atom type, in farthest-point sampling to select a fixed-size subset of geometrically diverse configurations, which avoids the rapidly increasing cost of GP updates as more observations are made. Stability is enhanced by a permutation-invariant metric that provides a reliable trust radius for early stopping, and by a logarithmic barrier penalty on the growth of the signal variance. These physically motivated algorithmic changes prove their efficacy by reducing the mean computational time to less than half on a set of 238 challenging configurations from a previously published data set of chemical reactions. With these improvements, the GP approach is established as a robust and scalable algorithm for accelerating saddle point searches when the evaluation of the energy and atomic forces requires significant computational effort.
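The pruning step described above can be illustrated with a minimal sketch. It is not the implementation from the associated repository. It assumes that each configuration is an array of Cartesian coordinates, that atoms of one type carry equal weight (so the Wasserstein-1 distance reduces to an optimal assignment), and that no rotational or translational alignment is applied. The function names `w1_same_type`, `config_distance`, and `farthest_point_subset` are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def w1_same_type(a, b):
    """Wasserstein-1 distance between two equal-size, equal-weight point sets.

    Assumption: with uniform weights and equal counts, the optimal transport
    plan is a permutation, so an assignment solver gives the exact value.
    """
    # Pairwise Euclidean distances between atoms of one type in the two frames.
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].mean()


def config_distance(x, y, types):
    """Sum of per-type W1 distances; unchanged by relabeling same-type atoms."""
    types = np.asarray(types)
    return sum(w1_same_type(x[types == t], y[types == t]) for t in np.unique(types))


def farthest_point_subset(configs, types, k, start=0):
    """Greedy farthest-point sampling: indices of at most k diverse configurations."""
    chosen = [start]
    # Distance from every configuration to its nearest chosen configuration.
    nearest = np.array([config_distance(c, configs[start], types) for c in configs])
    while len(chosen) < min(k, len(configs)):
        nxt = int(np.argmax(nearest))
        if nearest[nxt] == 0.0:
            break  # every remaining configuration duplicates a chosen one
        chosen.append(nxt)
        d_new = np.array([config_distance(c, configs[nxt], types) for c in configs])
        nearest = np.minimum(nearest, d_new)
    return chosen


# Toy example: a three-atom frame, the same frame with its two H atoms
# relabeled, and a copy displaced by 2 length units along z.
types = ["O", "H", "H"]
base = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
swapped = base[[0, 2, 1]]
shifted = base + np.array([0.0, 0.0, 2.0])
subset = farthest_point_subset([base, swapped, shifted], types, k=2)
```

In this toy case the relabeled frame is at zero distance from the original, so the sampler keeps the original and the displaced frame. Keeping the subset size fixed is what bounds the cost of each GP update, since that cost grows with the number of retained observations.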

This record contains the complete traces of dimer saddle search runs with the OT-GP (optimal transport GP) framework. This includes STDOUT and HDF5 trajectories. The record is a companion to the code in the associated GitHub repository and can be used to regenerate the figures and validate the analysis in the accompanying manuscript.

Files (388.6 MiB)

MD5 checksum                            Size
md5:e216cd25edd70575d8263b396d4c3c30    1.9 MiB
md5:2587f652d3cc1242ef62b823eca0175c    67.2 MiB
md5:b1eadb66d4f56c76746a8aca95cb50b1    5.4 KiB
md5:a2f3fa178650f09a7fa6a8a82e944557    319.5 MiB
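The MD5 checksums listed above can be used to check that a download is intact. The following is a minimal sketch using only the Python standard library. The helper name `md5_of` and the path in the usage note are hypothetical, since the file names are not shown in this record listing.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For a downloaded file at a hypothetical path `downloaded_file`, the check would be `md5_of("downloaded_file") == "e216cd25edd70575d8263b396d4c3c30"`, with the expected value taken from the matching row of the listing (without the `md5:` prefix).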

References

Software (Software collection used to generate the models and data in this record.)
R. Goswami, GitHub

Journal reference (Publication describing the method and data.)
R. Goswami, H. Jónsson, ChemPhysChem (2025) - submitted

Preprint (Preprint describing the method and data.)
R. Goswami and H. Jónsson, "Adaptive pruning for increased robustness and reduced computational overhead in Gaussian process accelerated saddle point searches," Oct. 07, 2025, arXiv:2510.06030. doi: 10.48550/arXiv.2510.06030