The tarball `datasets.tar.gz` contains three folders, one for each dataset used in the article.
Each folder contains the geometries (xyz-files), the SMILES and properties (CSV-files), and the raw binary data (data-splits, results, and fingerprints/representations).

./cyclo:
    full_dataset.csv            full dataset and target properties
    dataset_subset_750.csv      subset splitting and properties
                                            B2R2(l)-model 
    b2r2_l_10_fold.npy          results on the 10 fold cross-validation datasplits
    b2r2_l_10_fold_xtb.npy      results on the 10 fold cross-validation datasplits (xtb geometries)
    b2r2_l.npy                  representations for the full dataset
    b2r2_l_xtb.npy              representations for the full dataset (xtb geometries)
                                            DRFP-model
    drfp_10_fold.npy            results on the 10 fold cross-validation datasplits
    drfp.npy                    representations for the full dataset
                                            MFP-model
    mfp_10_fold.npy             results on the 10 fold cross-validation datasplits
    mfp.npy                     representations for the full dataset    
                                            SLATM-model
    slatm_10_fold.npy           results on the 10 fold cross-validation datasplits
    slatm_10_fold_xtb.npy       results on the 10 fold cross-validation datasplits (xtb geometries)
                                            Geometries
    xyz                         DFT-level geometries
    xyz-xtb                     xTB-level geometries

./gdb7-22-ts:
    ccsdtf12_dz.csv             CCSD-level computed data and target properties
    ccsdtf12_dz_subset_750.csv  subset of the CCSD-level computed data and target properties
    tr_sizes.npy                training sizes for each split
                                            B2R2(l)-model 
    b2r2_l_10_fold.npy          results on the 10 fold cross-validation datasplits
    b2r2_l_10_fold_xtb.npy      results on the 10 fold cross-validation datasplits (xtb geometries)
    b2r2_l.npy                  representations for the full dataset
    b2r2_l_xtb.npy              representations for the full dataset (xtb geometries)
                                            DRFP-model
    drfp_10_fold.npy            results on the 10 fold cross-validation datasplits
    drfp.npy                    representations for the full dataset
                                            MFP-model
    mfp_10_fold.npy             results on the 10 fold cross-validation datasplits
    mfp.npy                     representations for the full dataset    
                                            SLATM-model
    slatm_10_fold.npy           results on the 10 fold cross-validation datasplits
    slatm_10_fold_xtb.npy       results on the 10 fold cross-validation datasplits (xtb geometries)
                                            Geometries
    xyz                         DFT-level geometries
    xyz-xtb                     xTB-level geometries
    

./proparg:
    data.csv                        full dataset and target properties
    data_fixarom_smiles.csv         fixed aromaticity
    data_fixarom_smiles_stereo.csv  fixed stereochemistry
    data_subset_750.csv             subset splitting
                                                B2R2(l)-model 
    b2r2_l_10_fold.npy              results on the 10 fold cross-validation datasplits
    b2r2_l_10_fold_xtb.npy          results on the 10 fold cross-validation datasplits (xtb geometries)
    b2r2_l.npy                      representations for the full dataset
    b2r2_l_xtb.npy                  representations for the full dataset (xtb geometries)
                                                DRFP-model
    drfp.npy                        representations for the full dataset
    drfp_10_fold.npy                results on the 10 fold cross-validation datasplits
    drfp_combinatorial.npy          representations for the full dataset
    drfp_combinatorial_10_fold.npy  results on the 10 fold cross-validation datasplits
    drfp_stereo.npy                 representations for the full dataset (including stereochemistry)
    drfp_stereo_10_fold.npy         results on the 10 fold cross-validation datasplits
                                                MFP-model
    mfp.npy                         representations for the full dataset
    mfp_10_fold.npy                 results on the 10 fold cross-validation datasplits
    mfp_combinatorial_10_fold.npy   results on the 10 fold cross-validation datasplits
    mfp_combinatorial.npy           representations for the full dataset
    mfp_stereo_10_fold.npy          results on the 10 fold cross-validation datasplits
    mfp_stereo.npy                  representations for the full dataset (including stereochemistry)
                                                SLATM-model
    slatm_10_fold.npy               results on the 10 fold cross-validation datasplits
    slatm_10_fold_xtb.npy           results on the 10 fold cross-validation datasplits (xtb geometries)
                                                Geometries
    xyz                             DFT-level geometries
    xyz-xtb                         xTB-level geometries
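
The listing above can be checked against the archive itself. The following is a minimal Python sketch, not part of the original data release. It assumes that the three dataset folders sit at the top level of the tarball, as the listing suggests. The helper names `files_by_dataset` and `load_npy` are hypothetical and chosen only for illustration. The internal layout of the .npy arrays is not documented here, so the sketch only loads them and makes no claim about their shape or contents.

```python
import tarfile
from collections import defaultdict
from pathlib import PurePosixPath


def files_by_dataset(archive_path):
    """Map each top-level folder in the archive to its sorted member files.

    Assumption: dataset folders (e.g. cyclo) are at the top level of the archive.
    """
    groups = defaultdict(list)
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories and links
            # Drop a leading "." so "./cyclo/x" and "cyclo/x" are treated alike.
            parts = [p for p in PurePosixPath(member.name).parts if p != "."]
            if len(parts) < 2:
                continue  # skip files that are not inside a dataset folder
            groups[parts[0]].append("/".join(parts[1:]))
    return {name: sorted(files) for name, files in groups.items()}


def load_npy(path):
    """Load one extracted .npy file with NumPy's default (safe) settings.

    NumPy is imported here so that listing the archive works without it.
    If a file stores pickled Python objects, np.load raises a ValueError;
    pass allow_pickle=True only for files you trust.
    """
    import numpy as np

    return np.load(path)
```

For example, `files_by_dataset("datasets.tar.gz")` should return one entry per dataset folder, and after extracting the archive, `load_npy("cyclo/drfp.npy")` would load the DRFP representations for the cyclo dataset (the extracted path is an assumption based on the listing).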