This is part of the Supporting Information for "Improving the Silicon Interactions of GFN-xTB" by Komissarov and Verstraelen.

Repository contents

  • npz: Directory containing the geometry optimization trajectories of 10k organosilicon compounds, calculated with the revPBE functional as implemented in the Amsterdam Modeling Suite 2021.203. Each file holds the optimization trajectory of one compound and is named after the compound's PubChem compound ID. Data is stored in the NumPy npz format and can be loaded with np.load. Each file stores the following keys and their values (R: number of atoms in the system, N: number of optimization steps):

    • numbers: The compound's atomic numbers. Shape: (R,). Unit: -
    • xyz: Atomic positions at each geometry optimization step. Shape: (N, R, 3). Unit: angstrom
    • gradients: Atomic gradients at each geometry optimization step. Shape: (N, R, 3). Unit: hartree/bohr
    • energy: System energy at each geometry optimization step. Shape: (N,). Unit: hartree

    Index 0 along the N axis corresponds to the input structure; index -1 corresponds to the final, optimized structure.

  • sha1: Directory with SHA-1 hashes of all files in the npz directory. Together with the check_hash.py script, they can be used to check the npz files for consistency.

  • gfn-xtb-si: Directory with the optimized parameters in the AMS format

  • gfn-xtb: Directory with the initial GFN1-xTB parameters, as published by Grimme et al. The parameter K_SiO=1 was added explicitly for the reparametrization.
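
The npz layout described above can be read with a few lines of NumPy. The following is a minimal sketch, not part of the repository: the helper names load_trajectory, summarize and sha1_of_file are hypothetical, and the hashing helper only illustrates the general idea of a SHA-1 check. It does not reproduce check_hash.py, and it makes no assumption about how the files in the sha1 directory are organized.

```python
import hashlib

import numpy as np

# Keys documented in this README for every npz file.
KEYS = ("numbers", "xyz", "gradients", "energy")


def load_trajectory(path):
    """Load one optimization trajectory into a plain dict of arrays."""
    with np.load(path) as data:
        # Indexing the NpzFile reads each array fully before the file closes.
        return {key: data[key] for key in KEYS}


def summarize(traj):
    """Collect a few quantities from a trajectory (hypothetical helper)."""
    n_steps, n_atoms, _ = traj["xyz"].shape
    return {
        "n_atoms": n_atoms,
        "n_steps": n_steps,
        "initial_xyz": traj["xyz"][0],    # input structure, angstrom
        "final_xyz": traj["xyz"][-1],     # optimized structure, angstrom
        # Energy difference between last and first step, hartree.
        "energy_change": float(traj["energy"][-1] - traj["energy"][0]),
        # Largest absolute gradient component at the last step, hartree/bohr.
        "max_final_gradient": float(np.abs(traj["gradients"][-1]).max()),
    }


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A call such as summarize(load_trajectory(path)), with path pointing to one of the files in the npz directory, would then return the number of atoms and steps together with the first and last geometries.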

Files needed to start the parametrization with the ParAMS package

  • trainset.yaml.gz: Properties used for the parameter optimization (gzip-compressed YAML file)

  • valset.yaml.gz: Properties used to validate the optimized parameters (gzip-compressed YAML file)

  • jobcollection.yaml.gz: Job collection containing all jobs needed to calculate the properties in trainset.yaml.gz and valset.yaml.gz

  • optimize.py: Optimization script
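
The gzip-compressed YAML files listed above can be inspected as plain text without unpacking them on disk. The following is a minimal sketch that uses only the Python standard library; the helper name preview_gzip_text is hypothetical, and the sketch does not parse the YAML or use ParAMS itself.

```python
import gzip
from itertools import islice


def preview_gzip_text(path, n_lines=20, encoding="utf-8"):
    """Return the first n_lines of a gzip-compressed text file."""
    # Mode "rt" decompresses on the fly and decodes bytes to str.
    with gzip.open(path, "rt", encoding=encoding) as handle:
        return [line.rstrip("\n") for line in islice(handle, n_lines)]
```

For example, preview_gzip_text("trainset.yaml.gz", n_lines=10) would return the first ten lines of the training set file as a list of strings.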