
Bias free multiobjective active learning for materials design and discovery

Kevin Maik Jablonka1*, Giriprasad Melpatti Jothiappan2, Shefang Wang2, Berend Smit1*, Brian Yoo2*

1 Laboratory of Molecular Simulation, Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), CH-1951 Sion, Valais, Switzerland

2 BASF Corporation, 540 White Plains Road, Tarrytown, New York, 10591, USA

* Corresponding authors emails: kevin.jablonka@epfl.ch, berend.smit@epfl.ch, brian.yoo@basf.com
DOI: 10.24435/materialscloud:8m-6d [version v1]

Publication date: Feb 22, 2021

How to cite this record

Kevin Maik Jablonka, Giriprasad Melpatti Jothiappan, Shefang Wang, Berend Smit, Brian Yoo, Bias free multiobjective active learning for materials design and discovery, Materials Cloud Archive 2021.34 (2021), doi: 10.24435/materialscloud:8m-6d.


The design rules for materials are clear for applications with a single objective. Most applications, however, involve multiple, sometimes competing objectives for which there is no single best material, and the design task becomes finding the set of Pareto optimal materials. In this work, we introduce an active learning algorithm that directly uses the Pareto dominance relation to compute the set of Pareto optimal materials with a desired accuracy. We apply our algorithm to de novo polymer design with a prohibitively large search space. Using molecular simulations, we compute key descriptors for dispersant applications and reduce the number of materials that need to be evaluated to reconstruct the Pareto front with a desired confidence by over 98% compared to random search. This work showcases how simulation and machine learning techniques can be coupled to discover materials within a design space that would be intractable using conventional screening approaches.
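To make the Pareto dominance relation mentioned in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' active learning algorithm and does not use the archived data: it only illustrates the dominance test and a brute-force filter for the non-dominated set. It assumes every objective is to be maximized, and the function names `dominates` and `pareto_optimal_indices` and the toy objective values are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if point a Pareto-dominates point b.

    Assumption: all objectives are maximized. a dominates b when a is
    at least as good in every objective and strictly better in one.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_optimal_indices(points: Sequence[Sequence[float]]) -> List[int]:
    """Indices of points that no other point dominates (O(n^2) scan)."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Toy two-objective values, invented purely for illustration.
objectives = [
    (1.0, 5.0),
    (2.0, 4.0),
    (3.0, 3.0),
    (2.0, 2.0),  # dominated by (2.0, 4.0) and (3.0, 3.0)
    (1.0, 1.0),  # dominated by every other point
]

front = pareto_optimal_indices(objectives)
print(front)
```

The brute-force scan compares every pair of candidates, which is fine for small sets. An active learning approach such as the one described in this record instead works with model predictions and their uncertainties, so that most candidates never need to be evaluated directly.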

Materials Cloud sections using this data

No Explore or Discover sections associated with this archive record.


File name Size Description
430.5 KiB Features and labels for machine learning (zipped folder of CSV files)
416.6 MiB LAMMPS input files for the calculation of the radii of gyration.
1.1 KiB Detailed description of the file contents.
3.4 GiB LAMMPS and SSAGES input files for the calculation of the dimer free energy.
3.4 GiB LAMMPS and SSAGES input files for the calculation of the free energy of adsorption.


Files and data are licensed under the terms of the following license: Creative Commons Attribution 4.0 International.
Metadata, except for email addresses, are licensed under the Creative Commons Attribution Share-Alike 4.0 International license.

External references

Software (Script that can be used to reproduce the main results.)
Software (General-purpose implementation of the active learning algorithm.)
Preprint (Preprint where the data is discussed.)


MARVEL ERC SNSF machine learning polymers multiobjective active learning

Version history:

2021.34 (version v1) [This version] Feb 22, 2021 DOI: 10.24435/materialscloud:8m-6d