Publication date: Oct 16, 2024
In recent years, there has been a surge of interest in predicting computed activation barriers in order to accelerate the automated exploration of reaction networks. Consequently, various predictive approaches have emerged, ranging from graph-based models to methods based on the three-dimensional structure of reactants and products. In tandem, many representations have been developed to predict experimental targets, which may hold promise for barrier prediction as well. Here, we bring together all of these efforts and benchmark various methods (Morgan fingerprints, the DRFP, the CGR representation-based Chemprop, SLATM_d, B²R²_l, EquiReact and the language model BERT + RXNFP) for the prediction of computed activation barriers on three diverse datasets. This record contains the data supporting the article "Benchmarking machine-readable vectors of chemical reactions on computed activation barriers". It accompanies the GitHub repository https://github.com/lcmd-epfl/benchmark-barrier-learning, which contains the code and a copy of the data.
File name | Size | Description |
---|---|---|
datasets.tar.gz (MD5: 4a979a671d7fdbcdc99bbe4578c36a0f) | 944.0 MiB | The tarball `datasets.tar.gz` contains three folders, one for each dataset used in the article. Each folder contains the geometries (xyz files), the SMILES and properties (CSV file), and the raw binary data (data splits, results, and fingerprints/representations). See README.txt for more information. |
README.txt (MD5: f9d6e150a6b9bc932fcc7ca2bd535e97) | 5.2 KiB | README |
2024.163 (version v1) [This version] | Oct 16, 2024 | DOI: 10.24435/materialscloud:xd-10 |