Description

This directory contains a dataset for constructing HDNNPs (input.data) and local minima for different sizes of NaCl clusters predicted by the ee4G-HDNNP (local_minima.tar.gz). The input file for the DFT calculations is also included. The dataset contains structures with energies and forces obtained from DFT calculations. Calculations were performed using FHI-aims (version: fhi-aims.200112_2) with the PBE functional and 'light' settings.

Data format

The datasets are provided in input.data, in the format used by RuNNer. Each file contains multiple structures. The description of each structure begins with a line containing the keyword 'begin' and ends with a line containing the keyword 'end'. Each structure can contain lines starting with the following keywords:

- lattice [X] [Y] [Z] (one of the three lattice vectors)
- energy [energy] (total energy of the structure)
- charge [charge] (total charge of the structure)
- atom [X] [Y] [Z] [element] [charge] [energy] [Fx] [Fy] [Fz] (position, element, charge and force of each atom. [energy] could be used to define an atomic energy; this is not used in the datasets and is therefore set to 0)

Units

All quantities are given in atomic units: Bohr for distances, Ha for energies, the elementary charge for charges, and Ha/Bohr for forces.

Datasets

NaCl clusters

All DFT calculations have been carried out using the electronic structure code FHI-aims (version: 200112_2) with 'light' settings. The PBE functional has been employed to describe electronic exchange and correlation. The SCF convergence criteria for the total energy, the volume-integrated root mean square change of the charge density, and the sum of eigenvalues have been set to 10e-5 eV, 10e-5 e, and 10e-2 eV, respectively. The dataset consists of Na_{n}Cl_{n} and Na_{n}Cl_{n+1}^{-} clusters with n=16 and n=24, and the data generation can be divided into two stages.
In the first stage, 10 smaller and 5 larger clusters for each charge state have been chosen to perform Born-Oppenheimer molecular dynamics simulations at 1000 K for 5000 steps with a time step of 1.0 fs. The configuration at every 5th step of each trajectory has been included in the dataset. The initial structures for these trajectories have been obtained from minima hopping using BigDFT and then re-optimized with FHI-aims using a force threshold of 0.01 eV/Å. A Nosé-Hoover thermostat was applied to run the simulations in the canonical ensemble, and the effective mass was set to 1700 cm^{-1}. In addition, the paths of the geometry optimizations have been included in the dataset to ensure sufficient sampling of the lower-energy region of configuration space.

In the second stage, extrapolating structures that are not well sampled in the configuration space at low temperature have been identified by running short MD simulations driven by a preliminary ee4G-HDNNP. Moreover, local minima that have been generated with the Coulomb Lennard-Jones empirical force field using Artificial Bee Colony algorithms and relaxed with the preliminary potential have also been included in the dataset if the optimized geometry exhibits large structural deviations from the DFT result.

The final dataset consists of 33,592 structures including 10,627 Na_{16}Cl_{16} clusters, 5,822 Na_{24}Cl_{24} clusters, 11,217 Na_{16}Cl_{17}^{-} clusters, and 5,927 Na_{24}Cl_{25}^{-} clusters. 31,253 of these clusters have been obtained in the first stage, and 2,339 in the second stage.

control.in

The basic control file for DFT calculations using FHI-aims, including Hirshfeld charge calculations.

local_minima.tar.gz

The compressed archive contains six directories, one per system size (Na24Cl24_minima, Na24Cl25_minima, Na25Cl24_minima, Na62Cl62_minima, Na62Cl63_minima, Na63Cl62_minima).
Each directory contains 50 files, 0.xyz to 49.xyz, which represent different local minima in xyz format predicted by the ee4G-HDNNP. The first column gives the element, and the second to fourth columns give the atomic coordinates in angstrom.

Reference

[1] Ko, T. W., Finkler, J. A., Goedecker, S., & Behler, J. (2023). Accurate Fourth-Generation Machine Learning Potentials by Electrostatic Embedding. Journal of Chemical Theory and Computation, accepted.
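
Example: reading input.data

The keyword layout described under "Data format" can be read with a few lines of Python. The following is a minimal sketch, not part of the dataset or of RuNNer: the names Atom, Structure and parse_structures are hypothetical, the parser assumes whitespace-separated fields in exactly the order listed above, and the two-atom sample is invented for illustration and is not taken from the dataset.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Atom:
    position: Vec3   # Bohr
    element: str
    charge: float    # elementary charge
    energy: float    # atomic energy; set to 0 in these datasets
    force: Vec3      # Ha/Bohr


@dataclass
class Structure:
    lattice: List[Vec3] = field(default_factory=list)  # empty for clusters
    atoms: List[Atom] = field(default_factory=list)
    energy: Optional[float] = None   # total energy in Ha
    charge: Optional[float] = None   # total charge in e


def parse_structures(lines: Iterable[str]) -> Iterator[Structure]:
    """Yield one Structure per 'begin' ... 'end' block."""
    current = None
    for raw in lines:
        tokens = raw.split()
        if not tokens:
            continue
        key = tokens[0]
        if key == "begin":
            current = Structure()
        elif current is None:
            continue  # ignore anything outside a begin/end block
        elif key == "lattice":
            current.lattice.append(tuple(float(t) for t in tokens[1:4]))
        elif key == "atom":
            # atom [X] [Y] [Z] [element] [charge] [energy] [Fx] [Fy] [Fz]
            x, y, z = (float(t) for t in tokens[1:4])
            charge, energy, fx, fy, fz = (float(t) for t in tokens[5:10])
            current.atoms.append(
                Atom((x, y, z), tokens[4], charge, energy, (fx, fy, fz))
            )
        elif key == "energy":
            current.energy = float(tokens[1])
        elif key == "charge":
            current.charge = float(tokens[1])
        elif key == "end":
            yield current
            current = None


# Invented sample values, for illustration only (not from the dataset).
SAMPLE = """begin
atom 0.0 0.0 0.0 Na 0.8 0.0 0.01 0.0 0.0
atom 4.5 0.0 0.0 Cl -0.8 0.0 -0.01 0.0 0.0
energy -622.5
charge 0.0
end
"""

structures = list(parse_structures(SAMPLE.splitlines()))
```

For the real file, the same generator can be fed an open file handle, for example `with open("input.data") as fh: for s in parse_structures(fh): ...`, which avoids loading all structures into memory at once.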