Each dataset directory (`gdb/` for GDB7-22-TS [1], `cyclo/` for Cyclo-23-TS [2], `proparg/` for Proparg-21-TS [3,4]) contains:

- `xyz/`: the original (DFT) geometries.
- `xyz-xtb/`: GFN2-xTB geometries.
- `{dataset}.csv`: the CSV file, with the following columns:
  - `idx` / `rxn_id` / (`mol`, `enan`): reaction indices used to find the corresponding xyz files (the column name depends on the dataset).
  - `dE0` / `G_act` / `Eafw`: target property (the column name depends on the dataset).
  - `rxn_smiles`: unmapped reaction SMILES.
  - `rxn_smiles_mapped`: the original ("true") atom-mapped SMILES.
  - `rxn_smiles_rxnmapper`: SMILES mapped by RXNMapper [5].
  - `rxn_smiles_rxnmapper_full`: SMILES mapped by RXNMapper, including hydrogens.
  - `bad_xtb`: whether the reaction is excluded from the geometry quality tests (xTB optimization failed).

Additionally, `proparg/proparg-weird-smiles.csv` contains "bad" SMILES for Proparg-21-TS, automatically obtained from xyz, taken from [6]. They are also mapped by RXNMapper but were not used to produce the results of the paper.
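As an illustration of the column layout described above, here is a minimal sketch of reading one `{dataset}.csv` with the Python standard library. It is not part of the repository. The function name `load_reactions` is hypothetical, and the encoding of `bad_xtb` (assumed here to be something like `True`/`False` or `1`/`0`) is an assumption that should be checked against the actual files.

```python
import csv

# Target column names listed above; the one present in a given file is detected at runtime.
TARGET_COLUMNS = ("dE0", "G_act", "Eafw")

# Assumption: bad_xtb flags are stored as text such as "True"/"False" or "1"/"0".
TRUTHY = {"1", "true", "yes"}


def load_reactions(csv_path, skip_bad_xtb=True):
    """Read one {dataset}.csv and return (target_column, rows).

    Each row is a dict keyed by column name, with the target value
    converted to float. Rows flagged in bad_xtb are dropped when
    skip_bad_xtb is True.
    """
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        fieldnames = reader.fieldnames or []
        target = next((c for c in TARGET_COLUMNS if c in fieldnames), None)
        if target is None:
            raise ValueError(f"no known target column in {csv_path}")

        rows = []
        for row in reader:
            flag = (row.get("bad_xtb") or "").strip().lower()
            if skip_bad_xtb and flag in TRUTHY:
                continue  # xTB optimization failed for this reaction
            row[target] = float(row[target])
            rows.append(row)
    return target, rows
```

A call such as `target, rows = load_reactions("path/to/{dataset}.csv")` would then give the name of the target column and the list of reactions; the SMILES columns (`rxn_smiles`, `rxn_smiles_mapped`, and so on) are left untouched as strings.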