Dictionary of 140k GDB and ZINC derived AMONs


JSON Export

{
  "revision": 5, 
  "id": "819", 
  "created": "2021-04-10T08:27:41.224956+00:00", 
  "metadata": {
    "doi": "10.24435/materialscloud:1s-51", 
    "status": "published", 
    "title": "Dictionary of 140k GDB and ZINC derived AMONs", 
    "mcid": "2021.61", 
    "license_addendum": null, 
    "_files": [
      {
        "description": "README file", 
        "key": "README.md", 
        "size": 2983, 
        "checksum": "md5:a84a7b7306bc700b92254f3b47c19041"
      }, 
      {
        "description": "SMILES strings of AMONs unique to ZINC", 
        "key": "zinc.can", 
        "size": 1265017, 
        "checksum": "md5:2ae912085b1226832a2734e723b5cf3b"
      }, 
      {
        "description": "SMILES strings of AMONs unique to GDB17", 
        "key": "gdb17.can", 
        "size": 177166, 
        "checksum": "md5:cb8994a40f3aa64b778b33d2dcecd68a"
      }, 
      {
        "description": "SMILES strings of AMONs shared by ZINC & GDB17", 
        "key": "gdb17-zinc-comm.can", 
        "size": 353455, 
        "checksum": "md5:f7311fd7a704b54e830829195335f691"
      }, 
      {
        "description": "json files containing geometry and properties for all ZINC AMONs", 
        "key": "zinc.tar.gz", 
        "size": 665381412, 
        "checksum": "md5:90b07d3d60018e6ee6cfec0bd90c36a9"
      }, 
      {
        "description": "json files containing geometry and properties for all GDB17 AMONs", 
        "key": "gdb17.tar.gz", 
        "size": 270819891, 
        "checksum": "md5:be6cc52e6d6308a4e05f310bc19de083"
      }
    ], 
    "owner": 366, 
    "_oai": {
      "id": "oai:materialscloud.org:819"
    }, 
    "keywords": [
      "building blocks", 
      "quantum machine learning", 
      "organic chemical space", 
      "SNSF", 
      "MARVEL", 
      "ERC"
    ], 
    "conceptrecid": "818", 
    "is_last": true, 
    "references": [
      {
        "type": "Preprint", 
        "url": "https://arxiv.org/abs/2008.05260", 
        "citation": "B. Huang, O. A. von Lilienfeld, arXiv:2008.05260 [physics.chem-ph]"
      }
    ], 
    "publication_date": "Apr 11, 2021, 12:39:48", 
    "license": "Creative Commons Attribution 4.0 International", 
    "id": "819", 
    "description": "We present all AMONs for the GDB and ZINC databases with no more than 7 non-hydrogen atoms (AGZ7), a calculated organic chemistry building-block dictionary based on the AMON approach [Huang and von Lilienfeld, Nature Chemistry (2020)]. AGZ7 records Cartesian coordinates of compositional and constitutional isomers, as well as properties, for \u223c140k small organic molecules obtained by systematically fragmenting all molecules of ZINC and the majority of GDB17 into smaller entities that are saturated with hydrogens and contain no more than 7 heavy (non-hydrogen) atoms. AGZ7 covers the elements H, B, C, N, O, F, Si, P, S, Cl, Br, Sn and I and includes optimized geometries, total energy and its decomposition, Mulliken atomic charges, dipole moment vectors, quadrupole tensors, electronic spatial extent, eigenvalues of all occupied orbitals, LUMO, gap, isotropic polarizability, harmonic frequencies, reduced masses, force constants, IR intensity, normal coordinates, rotational constants, zero-point energy, internal energy, enthalpy, entropy, free energy, and heat capacity (all at ambient conditions), computed at the B3LYP/cc-pVTZ level of theory (pseudopotentials were used for Sn and I). We exemplify the usefulness of this data set with AMON-based machine learning models that predict the total potential energy of seven of the most rigid GDB-17 molecules.", 
    "version": 1, 
    "contributors": [
      {
        "email": "hbdft2008@gmail.com", 
        "affiliations": [
          "Department of Chemistry, University of Basel, CH-4056 Basel, Switzerland", 
          "Faculty of Physics, University of Vienna, 1090 Wien, Austria"
        ], 
        "familyname": "Huang", 
        "givennames": "Bing"
      }, 
      {
        "email": "anatole.vonlilienfeld@gmail.com", 
        "affiliations": [
          "Department of Chemistry, University of Basel, CH-4056 Basel, Switzerland", 
          "Faculty of Physics, University of Vienna, 1090 Wien, Austria"
        ], 
        "familyname": "von Lilienfeld", 
        "givennames": "Anatole"
      }
    ], 
    "edited_by": 100
  }, 
  "updated": "2021-12-06T14:13:17.113012+00:00"
}
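
Each entry under `metadata._files` in the record above carries a `key`, a `size` in bytes, and a `checksum` of the form `md5:<hex digest>`. A minimal sketch of how downloaded files could be checked against those fields follows. The function names `md5_of_file` and `verify_files`, and the idea of a local download directory, are illustrative assumptions and not part of the record or of any Materials Cloud tooling.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use bounded for large archives
    such as zinc.tar.gz (about 665 MB according to the record).
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_files(record, download_dir):
    """Check local files against record["metadata"]["_files"].

    `record` is the parsed JSON export; `download_dir` is a
    hypothetical directory holding the downloaded files.
    Returns a dict mapping each file key to a status string.
    """
    results = {}
    for entry in record["metadata"]["_files"]:
        key = entry["key"]
        path = Path(download_dir) / key
        if not path.is_file():
            results[key] = "missing"
            continue
        # Cheap check first: compare the byte size from the record.
        if path.stat().st_size != entry["size"]:
            results[key] = "size mismatch"
            continue
        # The checksum field is "<algorithm>:<hex digest>".
        algorithm, _, expected = entry["checksum"].partition(":")
        if algorithm != "md5":
            results[key] = "unsupported checksum"
            continue
        actual = md5_of_file(path)
        results[key] = "ok" if actual == expected.lower() else "checksum mismatch"
    return results
```

Assuming the export has been saved locally (for example as `record.json`, a hypothetical filename), it could be loaded with `json.load` and passed to `verify_files(record, "downloads")`. The size comparison runs before hashing so that truncated downloads are reported without reading the whole file.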