This record has versions v1, v2, v3, v4. This is version v1.

Simulating solvation and acidity in complex mixtures with first-principles accuracy: the case of CH₃SO₃H and H₂O₂ in phenol

Kevin Rossi¹*, Veronika Juraskova²*, Raphael Wischert³, Laurent Garel⁴, Clemence Corminboeuf²*, Michele Ceriotti¹*

¹ Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, 1015, Switzerland

² Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, 1015, Switzerland

³ Eco-Efficient Products and Processes Laboratory, Solvay, RIC Shanghai, China

⁴ Aroma Performance Laboratory, Solvay, RIC Lyon, France

* Corresponding authors
DOI: 10.24435/materialscloud:z9-zr [version v1]

Publication date: Jun 22, 2020

How to cite this record

Kevin Rossi, Veronika Juraskova, Raphael Wischert, Laurent Garel, Clemence Corminboeuf, Michele Ceriotti, Simulating solvation and acidity in complex mixtures with first-principles accuracy: the case of CH₃SO₃H and H₂O₂ in phenol, Materials Cloud Archive 2020.64 (2020),


Set of inputs to perform the calculations reported in the paper. The i-PI input is used to run molecular dynamics, metadynamics, REMD, and PIMD simulations with suitable thermostats. The DFTB and LAMMPS inputs are used to compute forces and energies within the DFTB and neural network force field frameworks, respectively. The CP2K input files are used to compute forces and energies at the PBE and PBE0 levels. The latter is used as the reference to train the neural network correction on top of DFTB.

Brief description of the work: We present a generally applicable computational framework for the efficient and accurate characterization of molecular structural patterns and acid properties in explicit solvent, using H₂O₂ and CH₃SO₃H in phenol as an example. To address the challenges posed by the complexity of the problem, we resort to a set of data-driven methods and enhanced sampling algorithms. The synergistic application of these techniques makes the first-principles estimation of the chemical properties feasible without giving up explicit solvation, which involves extensive statistical sampling.

Ensembles of neural network (NN) potentials are trained on a set of configurations carefully selected from preliminary simulations performed at a low-cost density-functional tight-binding (DFTB) level. Energies and forces of these configurations are then recomputed at the hybrid density functional theory (DFT) level and used to train the neural networks. The stability of the NN model is enhanced by using DFTB energetics as a baseline, while the efficiency of the direct (i.e., baseline-free) NN is exploited via a multiple time step integrator. The neural network potentials are combined with enhanced sampling techniques, such as replica exchange and metadynamics, and used to characterize the relevant protonated species and dominant non-covalent interactions in the mixture, also considering nuclear quantum effects.
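The multiple time step idea mentioned in the description can be illustrated with a minimal Python sketch. It is not the i-PI input from this record and makes no claim about the authors' implementation: the one-dimensional toy forces `fast_force` and `full_force` are hypothetical stand-ins for the cheap baseline-free NN and for the more expensive baselined model (DFTB plus NN correction), and all parameter values are chosen only for illustration. The cheap force is integrated with a short inner step, while the difference between the two models is applied only once per outer step.

```python
def fast_force(x):
    # Hypothetical stand-in for the cheap, baseline-free NN force (toy harmonic term).
    return -1.0 * x

def full_force(x):
    # Hypothetical stand-in for the expensive baselined model (toy anharmonic term added).
    return -1.0 * x - 0.05 * x**3

def slow_force(x):
    # Correction applied only on the outer time step: expensive minus cheap model.
    return full_force(x) - fast_force(x)

def mts_step(x, v, dt_outer, n_inner, mass=1.0):
    """One RESPA-style multiple time step velocity Verlet step."""
    dt_inner = dt_outer / n_inner
    # Half kick with the slowly varying correction force.
    v += 0.5 * dt_outer * slow_force(x) / mass
    # Inner loop: plain velocity Verlet with the cheap force only.
    for _ in range(n_inner):
        v += 0.5 * dt_inner * fast_force(x) / mass
        x += dt_inner * v
        v += 0.5 * dt_inner * fast_force(x) / mass
    # Closing half kick with the correction force at the new position.
    v += 0.5 * dt_outer * slow_force(x) / mass
    return x, v

def total_energy(x, v, mass=1.0):
    # Kinetic energy plus the potential whose gradient gives full_force.
    return 0.5 * mass * v**2 + 0.5 * x**2 + 0.0125 * x**4

def run(n_steps=1000, dt_outer=0.05, n_inner=4):
    # Start from x = 1, v = 0 and record the total energy after every outer step.
    x, v = 1.0, 0.0
    energies = []
    for _ in range(n_steps):
        x, v = mts_step(x, v, dt_outer, n_inner)
        energies.append(total_energy(x, v))
    return energies

if __name__ == "__main__":
    energies = run()
    print(min(energies), max(energies))
```

In this scheme the expensive model is evaluated once per outer step and the cheap model once per inner step (when forces are cached between consecutive half kicks), so the cost is dominated by the cheap model whenever `n_inner` is large. The total energy of the expensive model should stay close to its initial value as long as the correction force varies slowly compared with the outer step.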

Materials Cloud sections using this data

No Explore or Discover sections associated with this archive record.


File name Size Description
3.7 KiB i-PI input to run basic MD
1.5 KiB example PBE input for DFT calculations
3.0 KiB example PBE0 input for DFT calculations
1.2 KiB example DFTB+ input for DFTB calculations
22.6 KiB example input.nn input to train and use a neural network for force and energy predictions
3.5 KiB example LAMMPS input for MD calculations via i-PI and using neural network potentials


Files and data are licensed under the terms of the following license: Creative Commons Attribution 4.0 International.
Metadata, except for email addresses, are licensed under the Creative Commons Attribution Share-Alike 4.0 International license.


machine learning, solution chemistry, acid, homogeneous catalysis, catalysis, artificial intelligence, reaction