
On-the-Fly Active Learning of Interpretable Bayesian Force Fields for Atomistic Rare Events

Jonathan Vandermause1*, Steven B. Torrisi2, Simon Batzner1, Yu Xie1, Lixin Sun1, Alexie M. Kolpak3, Boris Kozinsky1*

1 John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA

2 Department of Physics, Harvard University, Cambridge, MA 02138, USA

3 Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA

* Corresponding authors' emails: jonathan_vandermause@g.harvard.edu, bkoz@g.harvard.edu
DOI: 10.24435/materialscloud:2020.0017/v1 [version v1]

Publication date: Jan 28, 2020

How to cite this record

Jonathan Vandermause, Steven B. Torrisi, Simon Batzner, Yu Xie, Lixin Sun, Alexie M. Kolpak, Boris Kozinsky, On-the-Fly Active Learning of Interpretable Bayesian Force Fields for Atomistic Rare Events, Materials Cloud Archive 2020.0017/v1 (2020), doi: 10.24435/materialscloud:2020.0017/v1.


Machine learned force fields typically require manual construction of training sets consisting of thousands of first principles calculations, which can result in low training efficiency and unpredictable errors when applied to structures not represented in the training set of the model. This severely limits the practical application of these models in systems with dynamics governed by important rare events, such as chemical reactions and diffusion. We present an adaptive Bayesian inference method for automating the training of interpretable, low-dimensional, and multi-element interatomic force fields using structures drawn on the fly from molecular dynamics simulations. Within an active learning framework, the internal uncertainty of a Gaussian process regression model is used to decide whether to accept the model prediction or to perform a first principles calculation to augment the training set of the model. The method is applied to a range of single- and multi-element systems and shown to achieve a favorable balance of accuracy and computational efficiency, while requiring a minimal amount of ab initio training data. We provide a fully open-source implementation of our method, as well as a procedure to map trained models to computationally efficient tabulated force fields.
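
The accept-or-recompute decision described in the abstract can be illustrated with a toy sketch. This is not the FLARE implementation or its API: the "first principles calculation" is replaced by a cheap one-dimensional function, the trajectory is a fixed sweep over inputs rather than molecular dynamics, and every name here (`rbf_kernel`, `gp_predict`, `reference_calculation`, `active_learning_run`), the kernel hyperparameters, and the uncertainty threshold are assumptions chosen for illustration only.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.5, signal_std=1.0):
    """Squared-exponential kernel between two 1D arrays of inputs."""
    diff = a[:, None] - b[None, :]
    return signal_std**2 * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_predict(x_train, y_train, x_query, noise_std=1e-3):
    """Return the GP posterior mean and standard deviation at x_query."""
    k_tt = rbf_kernel(x_train, x_train) + noise_std**2 * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_qt @ alpha
    v = np.linalg.solve(chol, k_qt.T)
    var = np.diag(rbf_kernel(x_query, x_query)) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))


def reference_calculation(x):
    """Toy stand-in (assumption) for an expensive first-principles calculation."""
    return np.sin(3.0 * x)


def active_learning_run(trajectory, std_threshold=0.05):
    """Sweep a trajectory, calling the reference only when the GP is uncertain."""
    # Seed the training set with a single reference calculation.
    x_train = np.array([trajectory[0]])
    y_train = reference_calculation(x_train)
    predictions, n_reference_calls = [], 1
    for x in trajectory:
        xq = np.array([x])
        mean, std = gp_predict(x_train, y_train, xq)
        if std[0] > std_threshold:
            # Uncertainty too high: run the reference and augment the training set.
            x_train = np.append(x_train, xq)
            y_train = np.append(y_train, reference_calculation(xq))
            n_reference_calls += 1
            mean, std = gp_predict(x_train, y_train, xq)
        # Otherwise the model prediction is accepted as-is.
        predictions.append(mean[0])
    return np.array(predictions), n_reference_calls


trajectory = np.linspace(0.0, 3.0, 200)
predictions, n_calls = active_learning_run(trajectory)
print(f"reference calls: {n_calls} of {len(trajectory)} steps")
```

In this sketch the threshold trades accuracy against cost: lowering `std_threshold` triggers more reference calculations and a larger training set, while raising it accepts more model predictions at higher risk of error.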



File name | Size | Description
 | 926 Bytes | README file describing the contents of the Models directory.
 | 1.2 GiB | Input and output files for the nine on-the-fly training simulations described in the main text (Figs. 3-6), as well as the final trained Gaussian process models. The models are stored as both pickled Python objects and .json files, and can be loaded with the open-source FLARE code (available at https://github.com/mir-group/flare).


Files and data are licensed under the terms of the following license: Creative Commons Attribution 4.0 International.
Metadata, except for email addresses, are licensed under the Creative Commons Attribution Share-Alike 4.0 International license.


Keywords: machine learning, molecular dynamics, Bayesian inference

Version history:

2020.0017/v1 (version v1) [This version] Jan 28, 2020 DOI: 10.24435/materialscloud:2020.0017/v1