This folder contains the sets of structures and electronic energies used to recreate the figures from the article.

The directory "training_data" contains the data used to train the ML models:
- xyz_files: .xyz files of 2000 structures extracted using FPS on 300 K canonical sampling for each dipeptide, plus 3376 structures of peptide dimers.
- dftb_d3bj: DFTB-D3(BJ) energies (kcal/mol) for each structure in xyz_files.
- pbe0_d3bj: PBE0-D3 energies (kcal/mol) for each structure in xyz_files.

The directory "tripeptide_ml_test_data" contains the data used to evaluate the ML performance on the tripeptide:
- dftb & dft 0K: 100 optimized geometries and their electronic energies, generated with DFTB-D3(BJ) and PBE0-D3.
- dftb & dft 300K: 900 geometries and their electronic energies, obtained from 300 K canonical sampling and generated with DFTB-D3(BJ) and PBE0-D3.

The directory "tripeptide_samplings" contains the results of the sampling simulations:
- dftb_sampling.xyz: structures generated from 300 K sampling using DFTB-D3(BJ) and Temperature Replica Exchange.
- lkr_sampling.xyz: structures generated from 300 K sampling using DFTB-D3(BJ) + LKR and Hamiltonian-Reservoir Replica Exchange.
- nn_sampling.xyz: structures generated from 300 K sampling using DFTB-D3(BJ) + NN and ATLAS metadynamics. The weights of the structures are in the property lines of the xyz entries.
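
Since the weights are stored in the property lines (the second line of each xyz entry), a small reader is enough to recover them. The following is a minimal sketch, not a script shipped with this data set. The function names read_xyz_frames and parse_weight are hypothetical, and the layout of the property line is an assumption: the sketch accepts either a bare number or a "weight=value" token, so the key may need to be adapted after inspecting the files.

```python
from typing import Iterator, List, Tuple

Atom = Tuple[str, float, float, float]


def read_xyz_frames(path: str) -> Iterator[Tuple[str, List[Atom]]]:
    """Yield (property_line, atoms) for each entry of a multi-structure .xyz file."""
    with open(path) as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # tolerate blank lines between entries
            n_atoms = int(header.split()[0])
            property_line = handle.readline().strip()
            atoms: List[Atom] = []
            for _ in range(n_atoms):
                symbol, x, y, z = handle.readline().split()[:4]
                atoms.append((symbol, float(x), float(y), float(z)))
            yield property_line, atoms


def parse_weight(property_line: str, key: str = "weight") -> float:
    """Extract a weight from a property line.

    ASSUMPTION: the line is either a bare number or contains a
    `key=value` token; the actual layout of the files may differ.
    """
    tokens = property_line.split()
    if not tokens:
        raise ValueError("empty property line")
    for token in tokens:
        name, sep, value = token.partition("=")
        if sep and name.lower() == key:
            return float(value)
    # Fall back to interpreting the first token as the weight.
    return float(tokens[0])
```

For example, `[parse_weight(p) for p, _ in read_xyz_frames("nn_sampling.xyz")]` would collect one weight per structure, provided the property lines match one of the two assumed layouts.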