Publication date: Aug 10, 2021
The application of machine learning to theoretical chemistry has made it possible to combine the accuracy of quantum chemical energetics with the thorough sampling of finite-temperature fluctuations. To reach this goal, a diverse set of methods has been proposed, ranging from simple linear models to kernel regression and highly nonlinear neural networks. Here we apply two widely different approaches to the same challenging problem: sampling the conformational landscape of polypeptides at finite temperature. We develop a Local Kernel Regression (LKR) scheme coupled with a supervised sparsity method and compare it with a more established approach based on Behler-Parrinello-type neural networks. In the context of the LKR, we discuss how the supervised selection of the reference pool of environments is crucial to achieving accurate potential energy surfaces at a competitive computational cost. We also leverage the locality of the model to infer which chemical environments are poorly described by the DFTB baseline. We then discuss the relative merits of the two frameworks and perform Hamiltonian-reservoir replica-exchange Monte Carlo sampling and metadynamics simulations, respectively, to demonstrate that both frameworks can achieve converged and transferable sampling of the conformational landscape of complex and flexible biomolecules with comparable accuracy and computational cost.
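To make the idea of a local kernel model on top of a baseline more concrete, the following is a minimal sketch and not the implementation used in the manuscript. It assumes a Gaussian kernel, a plain ridge regularizer, per-atom environment descriptors supplied as NumPy arrays, and a reference pool `refs` that is simply given; the supervised selection of that pool described above is not reproduced. All function names and parameters are hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel between environments a (n, d) and references b (m, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def molecule_features(envs_per_molecule, refs, sigma=1.0):
    """Sum the kernel over the atoms of each molecule: one row per molecule."""
    return np.stack([gaussian_kernel(envs, refs, sigma).sum(axis=0)
                     for envs in envs_per_molecule])

def fit_local_kernel(envs_per_molecule, delta_energies, refs, sigma=1.0, reg=1e-8):
    """Fit weights so that summed atomic terms match (reference - baseline) energies.

    Assumption: ordinary ridge regression on the sparse kernel features.
    """
    K = molecule_features(envs_per_molecule, refs, sigma)
    A = K.T @ K + reg * np.eye(refs.shape[0])
    return np.linalg.solve(A, K.T @ delta_energies)

def predict_correction(envs_per_molecule, refs, weights, sigma=1.0):
    """Predicted energy correction for each molecule."""
    return molecule_features(envs_per_molecule, refs, sigma) @ weights

def atomic_contributions(envs, refs, weights, sigma=1.0):
    """Per-atom share of the correction for one molecule."""
    return gaussian_kernel(envs, refs, sigma) @ weights
```

Because the predicted correction is a sum of per-atom terms, `atomic_contributions` gives one value per environment. In a sketch like this, environments with large contributions would be the ones where the baseline deviates most from the reference energies.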
| 126.1 MiB | Compressed tarball containing the molecular structures and energies necessary to reproduce the results in the manuscript |