Ranking the synthesizability of hypothetical zeolites with the sorting hat


JSON Export

{
  "revision": 7, 
  "metadata": {
    "publication_date": "Jun 10, 2022, 08:38:28", 
    "_oai": {
      "id": "oai:materialscloud.org:1372"
    }, 
    "license": "Creative Commons Attribution 4.0 International", 
    "description": "Zeolites are nanoporous alumino-silicate frameworks widely used as catalysts and adsorbents. Even though millions of siliceous networks can be generated by computer-aided searches, no new hypothetical framework has yet been synthesized. The needle-in-a-haystack problem of finding promising candidates among large databases of predicted structures has intrigued materials scientists for decades; yet, most work to date on the zeolite problem has been limited to intuitive structural descriptors. Here, we tackle this problem through a rigorous data science scheme\u2014the \u201czeolite sorting hat\u201d\u2014that exploits interatomic correlations to discriminate between real and hypothetical zeolites and to partition real zeolites into compositional classes that guide synthetic strategies for a given hypothetical framework. We find that, regardless of the structural descriptor used by the zeolite sorting hat, there remain hypothetical frameworks that are incorrectly classified as real ones, suggesting that they might be good candidates for synthesis. We seek to minimize the number of such misclassified frameworks by using as complete a structural descriptor as possible, thus focusing on truly viable synthetic targets, while discovering structural features that distinguish real and hypothetical frameworks as an output of the zeolite sorting hat. Further ranking of the candidates can be achieved based on thermodynamic stability and/or their suitability for the desired applications. Based on this workflow, we propose three hypothetical frameworks differing in their molar volume range as the top targets for synthesis, each with a composition suggested by the zeolite sorting hat. 
Finally, we analyze the behavior of the zeolite sorting hat with a hierarchy of structural descriptors including intuitive descriptors reported in previous studies, finding that intuitive descriptors produce significantly more misclassified hypothetical frameworks, and that more rigorous interatomic correlations point to second-neighbor Si-O distances around 3.2\u20133.4 \u00c5 as the key discriminatory factor.", 
    "contributors": [
      {
        "familyname": "Helfrecht", 
        "affiliations": [
          "Laboratory of Computational Science and Modeling, Institut des Mat\u00e9riaux, \u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne, 1015 Lausanne, Switzerland"
        ], 
        "givennames": "Benjamin A."
      }, 
      {
        "familyname": "Pireddu", 
        "affiliations": [
          "PASTEUR, D\u00e9partement de Chimie, \u00c9cole Normale Sup\u00e9rieure, PSL University, Sorbonne Universit\u00e9, CNRS, 24 Rue Lhomond, 75005 Paris, France"
        ], 
        "givennames": "Giovanni"
      }, 
      {
        "familyname": "Semino", 
        "affiliations": [
          "ICGM, Universit\u00e9 de Montpellier, CNRS, ENSCM, Montpellier, France"
        ], 
        "givennames": "Rocio"
      }, 
      {
        "familyname": "Auerbach", 
        "affiliations": [
          "Department of Chemistry and Department of Chemical Engineering, University of Massachusetts, Amherst, Amherst, Massachusetts 01003, USA"
        ], 
        "email": "auerbach@umass.edu", 
        "givennames": "Scott M."
      }, 
      {
        "familyname": "Ceriotti", 
        "affiliations": [
          "Laboratory of Computational Science and Modeling, Institut des Mat\u00e9riaux, \u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne, 1015 Lausanne, Switzerland"
        ], 
        "email": "michele.ceriotti@epfl.ch", 
        "givennames": "Michele"
      }
    ], 
    "edited_by": 576, 
    "title": "Ranking the synthesizability of hypothetical zeolites with the sorting hat", 
    "conceptrecid": "1371", 
    "license_addendum": null, 
    "doi": "10.24435/materialscloud:sd-j6", 
    "mcid": "2022.72", 
    "_files": [
      {
        "size": 661644669, 
        "key": "archive.tar.gz", 
        "checksum": "md5:9b02cad5aa88baffe6af9f4ff070f2c1", 
        "description": "Input data for the machine learning workflow and a visualization of the most promising synthesis candidates"
      }
    ], 
    "id": "1372", 
    "keywords": [
      "MARVEL", 
      "EPFL", 
      "ERC", 
      "H2020", 
      "machine learning", 
      "zeolites"
    ], 
    "is_last": true, 
    "status": "published", 
    "references": [
      {
        "doi": "https://doi.org/10.48550/arXiv.2110.13764", 
        "url": "https://arxiv.org/abs/2110.13764", 
        "comment": "Paper for which the data were generated", 
        "type": "Preprint", 
        "citation": "B. A. Helfrecht, G. Pireddu, R. Semino, S. M. Auerbach, M. Ceriotti. arXiv:2110.13764v1 (2021)"
      }, 
      {
        "doi": "10.1063/1.5119751", 
        "comment": "Paper in which the subset of 10,000 structures from the Deem database was originally described", 
        "type": "Journal reference", 
        "citation": "B. A. Helfrecht, R. Semino, G. Pireddu, S. M. Auerbach, M. Ceriotti, J. Chem. Phys. 151, 154112 (2019)"
      }, 
      {
        "doi": "10.1039/C0CP02255A", 
        "comment": "Paper in which the Deem database of hypothetical zeolites was originally created, described, and used", 
        "type": "Journal reference", 
        "citation": "R. Pophale, P. A. Cheeseman, M. W. Deem, Phys. Chem. Chem. Phys. 13, 12407-12412 (2011)."
      }, 
      {
        "doi": "10.5281/zenodo.4030232", 
        "comment": "Archive of the original Deem database of hypothetical zeolite structures", 
        "type": "Website", 
        "citation": "M. W. Deem, Michael Deem's PCOD and PCOD2 databases of zeolitic structures [Zenodo data set] (2020)."
      }
    ], 
    "version": 1, 
    "owner": 28
  }, 
  "id": "1372", 
  "created": "2022-06-02T14:28:07.094815+00:00", 
  "updated": "2022-06-10T06:38:28.060288+00:00"
}
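
The `_files` entry in the record lists a byte size and a checksum of the form `md5:<hexdigest>` for `archive.tar.gz`. A downloaded copy could be checked against those two fields roughly as follows. This is a minimal sketch, not part of the record: the helper names `parse_checksum` and `verify_file` are hypothetical, and it assumes the archive has already been saved locally under the name given in `key`.

```python
import hashlib
import os


def parse_checksum(field):
    """Split a record checksum such as 'md5:9b02...' into (algorithm, hex digest)."""
    algorithm, _, digest = field.partition(":")
    if not digest:
        raise ValueError(f"expected '<algorithm>:<hexdigest>', got {field!r}")
    return algorithm.lower(), digest.lower()


def verify_file(path, file_entry, chunk_size=1 << 20):
    """Return True if the file at `path` matches the size and checksum of a `_files` entry."""
    # Cheap check first: a size mismatch means the download is incomplete or wrong.
    if os.path.getsize(path) != file_entry["size"]:
        return False
    algorithm, expected = parse_checksum(file_entry["checksum"])
    hasher = hashlib.new(algorithm)
    # Read in chunks so a ~660 MB archive is never held in memory at once.
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == expected


# Values copied from the `_files` entry of the record above.
ARCHIVE_ENTRY = {
    "size": 661644669,
    "key": "archive.tar.gz",
    "checksum": "md5:9b02cad5aa88baffe6af9f4ff070f2c1",
}

if __name__ == "__main__":
    # Assumption: the archive sits in the current working directory.
    if os.path.exists(ARCHIVE_ENTRY["key"]):
        print(verify_file(ARCHIVE_ENTRY["key"], ARCHIVE_ENTRY))
```

Comparing the size before hashing avoids reading a truncated download in full; the MD5 digest here serves only as an integrity check against the value published in the record.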