Published January 10, 2025 | Version v1
Dataset Open

HTA - An open-source software for assigning heads and tails to SMILES in polymerization reactions

  • 1. University of Jyväskylä - Department of Chemistry - Seminaarinkatu 15, PL 35, 40014 Jyväskylän yliopisto - Finland
  • 2. Rd J Fco Aguirre Proença Km 9 Sp101. Chacara Assay Hortolândia, SP 13186-900, Brazil
  • 3. IBM Research Brazil - Avenida República do Chile, 330 - 11o. e 12. andares Rio De Janeiro, RJ 20031-170, Brazil

* Contact person

Description

Polymers are versatile materials with a wide range of applications. Improving polymer properties raises the importance of how the repeating units are connected (head-to-tail, head-to-head, tail-to-tail) to build the polymer structure, since this connectivity directly influences the morphology, the chain topology and, consequently, the properties of the material. Artificial intelligence (AI) based approaches are beginning to impact several domains of human life, science and technology. Polymer informatics is one such domain, where AI and machine learning (ML) tools are being used in the efficient development, design and discovery of polymers. A key enabling factor for polymer informatics is a machine-readable polymer representation. Polymers have been represented in a string format, with special characters used to tag the head and tail positions that indicate where the linking bond forms between repeat units. The available tools for assigning the head and tail positions are limited in their applicability. In this work we present a new tool to assign the head and tail atoms for a given monomer. On a database of 206 polymer precursors curated from the literature, our algorithm correctly predicted the class of 201 data points (97.6% accuracy) and correctly assigned the head and tail positions for 188 data points (91.3% accuracy).
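The two accuracy figures quoted in the description follow directly from the reported counts. As a minimal check, the arithmetic can be reproduced as below; the function name `accuracy_percent` is a hypothetical helper introduced only for this illustration and is not part of the HTA software.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return the share of correct predictions as a percentage, rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be a positive integer")
    return round(100.0 * correct / total, 1)


# Counts taken from the description: 206 curated polymer precursors.
TOTAL_PRECURSORS = 206
class_accuracy = accuracy_percent(201, TOTAL_PRECURSORS)      # class prediction
head_tail_accuracy = accuracy_percent(188, TOTAL_PRECURSORS)  # head/tail assignment

print(f"class prediction: {class_accuracy}%")
print(f"head/tail assignment: {head_tail_accuracy}%")
```

Both values agree with the percentages stated in the description when rounded to one decimal place.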

Files

File preview: files_description.md

Files (49.8 KiB)

Checksum Size
md5:2946167aafb659b8b11d648c5085c0e6 534 Bytes
md5:df1241b40f9b0329dfd5ad070754d94e 17.4 KiB
md5:32007eea133418e68723f6684626fefa 31.8 KiB

References

Journal reference (Paper in which the method is described and the data is discussed)
B. S. Ferrari, R. Giro and M. B. Steiner, Journal of Chemical Theory and Computation (submitted)

Preprint
B. S. Ferrari, R. Giro and M. B. Steiner, ChemRxiv (submitted)