This record has versions v1, v2. This is version v1.

Ranking the synthesizability of hypothetical zeolites with the sorting hat

Benjamin A. Helfrecht1, Giovanni Pireddu2, Rocio Semino3, Scott M. Auerbach4*, Michele Ceriotti1*

1 Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland

2 PASTEUR, Département de Chimie, École Normale Supérieure, PSL University, Sorbonne Université, CNRS, 24 Rue Lhomond, 75005 Paris, France

3 ICGM, Université de Montpellier, CNRS, ENSCM, Montpellier, France

4 Department of Chemistry and Department of Chemical Engineering, University of Massachusetts, Amherst, Amherst, Massachusetts 01003, USA

* Corresponding authors

DOI: 10.24435/materialscloud:sd-j6 [version v1]

Publication date: Jun 10, 2022

How to cite this record

Benjamin A. Helfrecht, Giovanni Pireddu, Rocio Semino, Scott M. Auerbach, Michele Ceriotti, Ranking the synthesizability of hypothetical zeolites with the sorting hat, Materials Cloud Archive 2022.72 (2022), doi: 10.24435/materialscloud:sd-j6.


Zeolites are nanoporous alumino-silicate frameworks widely used as catalysts and adsorbents. Even though millions of siliceous networks can be generated by computer-aided searches, no new hypothetical framework has yet been synthesized. The needle-in-a-haystack problem of finding promising candidates among large databases of predicted structures has intrigued materials scientists for decades; yet, most work to date on the zeolite problem has been limited to intuitive structural descriptors. Here, we tackle this problem through a rigorous data science scheme—the “zeolite sorting hat”—that exploits interatomic correlations to discriminate between real and hypothetical zeolites and to partition real zeolites into compositional classes that guide synthetic strategies for a given hypothetical framework. We find that, regardless of the structural descriptor used by the zeolite sorting hat, there remain hypothetical frameworks that are incorrectly classified as real ones, suggesting that they might be good candidates for synthesis. We seek to minimize the number of such misclassified frameworks by using as complete a structural descriptor as possible, thus focusing on truly viable synthetic targets, while discovering structural features that distinguish real and hypothetical frameworks as an output of the zeolite sorting hat. Further ranking of the candidates can be achieved based on thermodynamic stability and/or their suitability for the desired applications. Based on this workflow, we propose three hypothetical frameworks differing in their molar volume range as the top targets for synthesis, each with a composition suggested by the zeolite sorting hat. 
Finally, we analyze the behavior of the zeolite sorting hat with a hierarchy of structural descriptors including intuitive descriptors reported in previous studies, finding that intuitive descriptors produce significantly more misclassified hypothetical frameworks, and that more rigorous interatomic correlations point to second-neighbor Si-O distances around 3.2–3.4 Å as the key discriminatory factor.
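
The core idea described above, flagging hypothetical frameworks that a real-versus-hypothetical classifier labels as real, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-dimensional descriptor vectors are synthetic random numbers, the nearest-centroid rule stands in for whatever classifier the workflow actually uses, and the function names (`make_toy_data`, `find_candidates`) are hypothetical. It is not the authors' implementation and does not read the archived input data.

```python
import random


def centroid(rows):
    """Mean vector of a list of equal-length descriptor vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(x, c_real, c_hypo):
    """Nearest-centroid rule (a stand-in classifier, assumed for illustration)."""
    return "real" if sq_dist(x, c_real) < sq_dist(x, c_hypo) else "hypothetical"


def make_toy_data(n_real=50, n_hypo=500, seed=0):
    """Synthetic 2D descriptors; real frameworks would need computed descriptors."""
    rng = random.Random(seed)
    real = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(n_real)]
    hypo = [[rng.gauss(2.5, 1.0), rng.gauss(2.5, 1.0)] for _ in range(n_hypo)]
    return real, hypo


def find_candidates(real, hypo):
    """Indices of hypothetical frameworks that the classifier labels as real."""
    c_real, c_hypo = centroid(real), centroid(hypo)
    return [i for i, x in enumerate(hypo) if classify(x, c_real, c_hypo) == "real"]


real, hypo = make_toy_data()
candidates = find_candidates(real, hypo)
print(f"{len(candidates)} of {len(hypo)} hypothetical frameworks classified as real")
```

In this toy setting the misclassified indices play the role of the synthesis candidates; a richer descriptor would correspond to better-separated point clouds and therefore a shorter candidate list, which would then be ranked further by stability or application criteria.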

Materials Cloud sections using this data

No Explore or Discover sections associated with this archive record.


File name Size Description
631.0 MiB Input data for the machine learning workflow and a visualization of the most promising synthesis candidates


Files and data are licensed under the terms of the following license: Creative Commons Attribution 4.0 International.
Metadata, except for email addresses, are licensed under the Creative Commons Attribution Share-Alike 4.0 International license.

External references

Preprint (Paper for which the data were generated)
Journal reference (Paper in which the subset of 10,000 structures from the Deem database was originally described)
B. A. Helfrecht, R. Semino, G. Pireddu, S. M. Auerbach, M. Ceriotti, J. Chem. Phys. 151, 154112 (2019). doi:10.1063/1.5119751
Journal reference (Paper in which the Deem database of hypothetical zeolites was originally created, described, and used)
R. Pophale, P. A. Cheeseman, M. W. Deem, Phys. Chem. Chem. Phys. 13, 12407-12412 (2011). doi:10.1039/C0CP02255A
Website (Archive of the original Deem database of hypothetical zeolite structures)
M. W. Deem, Michael Deem's PCOD and PCOD2 databases of zeolitic structures [Zenodo data set] (2020). doi:10.5281/zenodo.4030232


Keywords: MARVEL, EPFL, ERC, H2020, machine learning, zeolites

Version history:

2022.129 (version v2) Oct 25, 2022 DOI: 10.24435/materialscloud:xw-5k
2022.72 (version v1) [This version] Jun 10, 2022 DOI: 10.24435/materialscloud:sd-j6