Data-driven discovery of organic electronic materials enabled by hybrid top-down/bottom-up design

doi:10.24435/materialscloud:nh-gb

materialscloud:2023.47

Published March 22, 2023 | Version v2

Dataset Open

Blaskovits, J. Terence¹

Laplaza, R.¹

Vela, S.¹

Corminboeuf, C.¹*

1. Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland

* Contact person

The high-throughput exploration and screening of organic electronic materials often starts with either a 'top-down' mining of existing repositories or a 'bottom-up' assembly of fragments based on predetermined rules and known synthetic templates. In both cases, the datasets are usually produced on a case-by-case basis and require high-quality computation of electronic properties and extensive user input: curation in the top-down approach, and the construction of a fragment library and of rules for linking the fragments in the bottom-up approach. Both approaches are time-consuming and require significant computational resources. Here, we generate a top-down set named FORMED, consisting of 117K synthesized molecules together with their optimized structures, associated electronic and topological properties, and chemical composition, and use these structures as a vast library of molecular building blocks for bottom-up, fragment-based materials design. A tool is developed to automate the coupling of these building blocks through their available C(sp2)-H bonds, thus providing a fundamental link between the two philosophies of dataset construction. Statistical models are trained on this dataset and on a subset of the resulting hybrid top-down/bottom-up compounds (selected dimers). These models enable on-the-fly prediction of key ground-state (frontier molecular orbital gaps) and excited-state (S1 and T1 energies) properties from molecular geometries with high accuracy across all known p-block organic compound space. With ab initio-quality optical properties in hand, this bottom-up pipeline, which uses existing compounds as molecular building blocks, can be applied to any materials design campaign.
To illustrate this, we construct and screen over a million molecular candidates (predicted dimers) for efficient intramolecular singlet fission. The leading candidates provide insight into the structural features that may promote this multiexciton-generating process.
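The screening step above relies on predicted S1 and T1 energies. As a rough illustration, the sketch below applies one commonly used energetic condition for singlet fission, E(S1) ≥ 2E(T1), to rows read from a CSV file. The column names `name`, `S1` and `T1`, the eV units, the tolerance parameter and the demo values are all hypothetical; the scoring actually used by the authors is not described in this record and may involve further criteria, so files_description.md should be consulted for the real column names.

```python
import csv
import io


def sf_driving_force(s1_ev, t1_ev):
    """Return E(S1) - 2*E(T1) in eV (non-negative means fission is not uphill)."""
    return s1_ev - 2.0 * t1_ev


def screen_candidates(rows, name_col="name", s1_col="S1", t1_col="T1",
                      tolerance_ev=0.0):
    """Keep rows with E(S1) - 2*E(T1) >= -tolerance_ev.

    Column names are assumptions for illustration only, not the
    documented headers of the archived CSV files.
    """
    hits = []
    for row in rows:
        delta = sf_driving_force(float(row[s1_col]), float(row[t1_col]))
        if delta >= -tolerance_ev:
            hits.append((row[name_col], delta))
    # Sorted by decreasing E(S1) - 2*E(T1); this ordering is a convenience,
    # not a claim about which candidate is best.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Made-up demo data, not taken from the dataset.
DEMO_CSV = """name,S1,T1
dimer_a,2.50,1.20
dimer_b,3.00,1.80
dimer_c,2.00,1.00
"""

if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(DEMO_CSV))
    for name, delta in screen_candidates(reader):
        print(f"{name}: E(S1) - 2E(T1) = {delta:.2f} eV")
```

To run this against one of the archived tables, the `io.StringIO` demo would be replaced by an opened CSV file and the three column arguments set to the headers given in files_description.md.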

Files

Files (2.1 GiB)

Name	MD5	Size
files_description.md	e474371fd63e73695f44aed627d35320	1.7 KiB
chemiscopify.ipynb	3885bfbfcba50076deb2914be9e52979	36.1 MiB
Data_dimers_predicted.csv	ae8916898d920626f346d6c6ba8dd42b	100.4 MiB
Data_dimers_selected.csv	4aad2e015bacc3f7be16b63a6678602b	680.7 KiB
Data_FORMED.csv	9f31404de41180f603c86027993b8677	95.1 MiB
Data_FORMED_scored.csv	7f6c580975810525cffeb8cc63cf173f	116.5 MiB
Data_top_1500_dimers_scored.csv	45a5fe3952e6308e0ee7a9add3f0052a	283.1 KiB
Dimers_predicted_chemiscope.json.gz	1682a79e2f29fffadffeadb7173a4733	785.9 MiB
Dimers_selected_chemiscope.json.gz	547a7fb2245ae0a5ef4d4edd1752c1f8	2.0 MiB
FORMED_chemiscope.json.gz	05236f475f8c01672bc313480df7a549	89.7 MiB
README.txt	9cfe7467dc8b61f90bcdac45a6174ddb	984 Bytes
XYZ_dimers_predicted.tar.gz	2f54d2274ed5fe3ceda5027ea7d567a9	855.0 MiB
XYZ_dimers_selected.tar.gz	20789d5a174f5fa27cd0226c5ca2ffa8	2.3 MiB
XYZ_FORMED.tar.gz	584c00f6fbd6d56b0055685938848654	94.8 MiB
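Since each file in the listing above carries an MD5 digest, downloads can be checked for corruption before use. The following is a minimal sketch using only the Python standard library; the digests are copied from the listing (a subset is shown), while the helper names `md5_of` and `verify_downloads` and the assumption that the files sit in the current directory are illustrative.

```python
import hashlib
from pathlib import Path

# Expected MD5 digests copied from the file listing (subset shown).
EXPECTED_MD5 = {
    "README.txt": "9cfe7467dc8b61f90bcdac45a6174ddb",
    "files_description.md": "e474371fd63e73695f44aed627d35320",
    "Data_FORMED.csv": "9f31404de41180f603c86027993b8677",
    "XYZ_FORMED.tar.gz": "584c00f6fbd6d56b0055685938848654",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Return {filename: status}, status being 'ok', 'mismatch' or 'missing'."""
    results = {}
    for name, expected_digest in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of(path) == expected_digest:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    # Assumes the downloaded files are in the current working directory.
    for name, status in verify_downloads(".").items():
        print(f"{name}: {status}")
```

Reading in chunks keeps memory use flat for the larger archives, such as the 855.0 MiB XYZ_dimers_predicted.tar.gz.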

References

Journal reference (Manuscript to be submitted. Reference will be updated shortly.)
J. T. Blaskovits, R. Laplaza, S. Vela, C. Corminboeuf, To be submitted (2022)

