The data contained in OTPD_data.tar.gz are organized in 4 folders:

1) AbInitioData: contains 1 compressed folder (AbInitioDATA.tar.gz) and 2 text files (Active_space_definitions.dat, All_C0_2.dat).

    AbInitioDATA.tar.gz: contains one folder for each of the 3165 molecules in the GDB11-AD-3165 database. Each folder contains 3 text files:
        A) MCPDFT.input: input for the MCPDFT computation in OpenMolcas.
        B) MCPDFT.log: output of the MCPDFT computation.
        C) MCPDFT.McDens (8 columns): as given by the MCPDFT computation.
            1st, 2nd, 3rd column: x, y, z coordinates of the atom-centered integration grid.
            4th and 5th column: amplitude of the modified MCPDFT spin-densities (alpha and beta) on the grid.
            6th column: electron density amplitudes on the grid.
            7th column: on-top pair density amplitudes on the grid.
            8th column: on-top ratio on the grid.

    Active_space_definitions.dat (4 columns): contains the essential information for the definition of the active space in OpenMolcas.
        1st column: name of the compound as it appears in the GDB11-AD-3165 database.
        2nd column: number of electrons in the active space.
        3rd column: number of orbitals in the active space.
        4th column: number of inactive (doubly occupied) orbitals.

    All_C0_2.dat (2 columns): contains the weight of the dominant electronic configuration for each molecule.
        1st column: name of the compound as it appears in the GDB11-AD-3165 database.
        2nd column: weight of the dominant electronic configuration (|C0|^2).

2) Basis_Decomposition: contains 2 text files.

    Decomposition_error_OTPD_basis.txt (3 columns):
        1st column: name of the compound as it appears in the GDB11-AD-3165 database.
        2nd column: absolute decomposition error of the on-top pair density field (as defined in the manuscript).
        3rd column: relative decomposition error of the on-top pair density field (as defined in the manuscript).

    OTPD_basis.txt: contains the optimized exponents of the specialized OTPD basis used in this work.
        The basis is formatted as a standard Python dictionary:
        A) the keys are the element symbols (H, C, O, N);
        B) the values are lists specifying the angular momentum, the exponent and the contraction coefficient of each basis function.

3) Geometries: contains 2 subfolders.

    train: geometry of each compound included in the training set (xyz format).
    test: geometry of each compound included in the test set (xyz format).

4) Predictions: contains 1 text file (OTPD_Prediction_error_test_set.dat) and 2 subfolders (PI and RHO2).

    OTPD_Prediction_error_test_set.dat (3 columns): contains the prediction errors of the on-top pair density for the test set.
        1st column: name of the compound as it appears in the GDB11-AD-3165 database.
        2nd column: weight of the dominant electronic configuration (|C0|^2).
        3rd column: relative prediction error of the on-top pair density field (as defined in the manuscript).

    PI: contains the basis set expansion coefficients of the on-top pair density (Eq. 2, main text) predicted for each of the molecules in the test set. The coefficients are stored in the order:

        for each_atom(i) in molecule
            for each_radial_channel(n) in basis
                for each_angular_momentum(l) in radial_channel
                    for each_magnetic_quantum_number(m) in l

    RHO2: contains the basis set expansion coefficients of the square of the electron density, predicted for each of the molecules in the test set. The coefficients are stored in the same order as PI.
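
The storage order of the PI and RHO2 coefficients can be illustrated with a short Python sketch that labels each coefficient position with its (atom, n, l, m) indices. This is only an illustration under stated assumptions: the function coefficient_labels, the toy_channels layout (element symbol mapped to a list of radial channels, each holding a list of angular momenta) and the m ordering from -l to +l are hypothetical and are not taken from the data set. The actual channel layout should be read from OTPD_basis.txt, and the m convention should be checked against the manuscript.

```python
def coefficient_labels(atoms, channels):
    """Return one (atom_index, n, l, m) tuple per coefficient, in the
    nested order atom -> radial channel -> angular momentum -> m."""
    labels = []
    for i, element in enumerate(atoms):                    # each atom in the molecule
        for n, l_values in enumerate(channels[element]):   # each radial channel
            for l in l_values:                             # each angular momentum
                # ASSUMPTION: m runs from -l to +l; the real convention may differ.
                for m in range(-l, l + 1):
                    labels.append((i, n, l, m))
    return labels


# Hypothetical toy layout, NOT the OTPD basis shipped in OTPD_basis.txt.
toy_channels = {
    "H": [[0, 1]],        # one radial channel carrying l = 0 and l = 1
    "O": [[0, 1], [0]],   # two radial channels
}

# Atom order follows the order of the atoms in the xyz geometry file.
labels = coefficient_labels(["O", "H", "H"], toy_channels)
for position, label in enumerate(labels):
    print(position, label)
```

With such a list, the k-th number in a coefficient file can be matched to labels[k], provided that the channel layout and the m convention agree with those used to generate the data.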