This is the landing page for the ReDD-COFFEE database, the Ready-to-use and Diverse Database of Covalent Organic Frameworks with Force field based Energy Evaluation. This database contains a diverse set of 268 687 covalent organic frameworks (COFs) and accompanying ab initio derived, system-specific force fields.
If you use any of this data, please cite the corresponding paper.
All 268 687 COF structures are collected in the provided tar files, which are ordered by linkage type and dimensionality. To extract an archive, run:
tar -xf Linkage_ND.tar.xz
Each optimized structure file is named top_SBU1_SBU2_..._SBUN_optimized.chk, where top is the topology of the structure and SBUX is the SBU placed on the X-th Wyckoff set of the topology. The structure files are provided in the molmod .chk format (see molmod.github.io/molmod), but can be converted to .cif files using one of the post-processing scripts.
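The naming convention above can be unpacked programmatically. The following is a minimal sketch, not part of the database tooling: the helper name parse_structure_name and the example file name are hypothetical, and the sketch assumes that neither the topology label nor the SBU labels contain underscores.

```python
from pathlib import Path


def parse_structure_name(filename):
    """Split 'top_SBU1_..._SBUN_optimized.chk' into (topology, [SBU1, ..., SBUN])."""
    name = Path(filename).name
    suffix = "_optimized.chk"
    if not name.endswith(suffix):
        raise ValueError(f"unexpected structure file name: {filename}")
    # Assumption: topology and SBU labels contain no underscores themselves.
    parts = name[: -len(suffix)].split("_")
    if len(parts) < 2:
        raise ValueError(f"expected a topology and at least one SBU: {filename}")
    return parts[0], parts[1:]


# Hypothetical file name, used only to illustrate the convention.
topology, sbus = parse_structure_name("BoronateEster_2D/hcb_sbuA_sbuB_optimized.chk")
```

If the actual labels do contain underscores, the split would have to be adapted to the real label set.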
The force field parameters of all structures are collected in the pars.txt file. This is a Yaff parameter file (see molmod.github.io/yaff), which can be used directly to start molecular simulations in Yaff. To generate the force field for a specific structure, run the following lines in a Python interpreter:
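from yaff import System, ForceField
# Load the optimized structure; the file name must be passed as a string
system = System.from_file('structure.chk')
# kwargs stands for any optional keyword arguments of ForceField.generate
ff = ForceField.generate(system, 'pars.txt', **kwargs)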
For more information on how to start molecular simulations, or convert the files to other formats, we refer to the molmod and Yaff documentation.
extract_subset.sh
As discussed in the original paper, a subset can be extracted from the ReDD-COFFEE database that has approximately the same variety and disparity as the full database. To this end, the structures are sorted by a maxmin algorithm, and the resulting order is given in the ordered_maxmin.txt file. To extract a subset, run the extract_subset.sh script with the following command in a terminal:
bash extract_subset.sh N
where N is the number of structures required in the subset. If N is not provided, the subset will contain 10 000 structures, and the same subset as mentioned in the paper will be obtained.
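To illustrate why the first N entries of a maxmin ordering form a diverse subset, here is a minimal sketch of a generic greedy maxmin (farthest-point) ordering. It is not the implementation used to build ordered_maxmin.txt: the function name maxmin_order, the Euclidean distance, the choice of starting point, and the toy feature vectors are all assumptions made for illustration.

```python
import math


def maxmin_order(points, start=0):
    """Return the indices of `points` in greedy maxmin (farthest-point) order."""
    if not points:
        return []
    order = [start]
    # Distance from every point to its nearest already-selected point.
    nearest = [math.dist(p, points[start]) for p in points]
    nearest[start] = -1.0  # negative value marks a point as already selected
    while len(order) < len(points):
        # Select the point that is farthest from everything selected so far.
        nxt = max(range(len(points)), key=nearest.__getitem__)
        order.append(nxt)
        for i, p in enumerate(points):
            if nearest[i] >= 0.0:
                nearest[i] = min(nearest[i], math.dist(p, points[nxt]))
        nearest[nxt] = -1.0
    return order


# Toy 2D feature vectors (hypothetical); real descriptors would be used instead.
features = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (0.5, 0.0)]
order = maxmin_order(features)
subset = order[:2]  # the first N entries give the N-structure subset
```

Because each newly selected point maximizes its distance to the points chosen before it, any prefix of the ordering spreads over the feature space, which is why a single sorted list can serve subsets of every size.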
extract_pars.sh
Since the pars.txt file contains the force field parameters of all structures, most of its lines are redundant when loading the parameters for one specific structure. To extract only the necessary parameters, use the extract_pars.sh script:
bash extract_pars.sh struct.chk
bash extract_pars.sh struct1.chk struct2.chk
bash extract_pars.sh BoronateEster_2D/*.chk
bash extract_pars.sh */*.chk
For each structure argument, the script generates a separate file pars_STRUCT.txt, with STRUCT being the structure name, containing only the force field parameters required to generate the force field for that specific structure. This file can replace the pars.txt argument in the Python command above. It is placed in the same folder as the .chk file given in the argument.
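When processing many structures, it can be convenient to derive the location of the reduced parameter file from the structure path. The sketch below is a hypothetical helper, not part of the provided scripts, and it assumes that STRUCT is the .chk file name without its extension.

```python
from pathlib import Path


def pars_file_for(chk_path):
    """Return the expected path of the pars_STRUCT.txt file for a .chk file."""
    chk = Path(chk_path)
    # Assumption: STRUCT is the .chk file name without its extension.
    return chk.with_name(f"pars_{chk.stem}.txt")


pars_path = pars_file_for("BoronateEster_2D/struct.chk")
```

The resulting path, converted with str(pars_path), could then be passed to ForceField.generate in place of 'pars.txt'.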
write_to_cif.py
To convert the molmod .chk format to the common .cif format, use the write_to_cif.py script. (This script requires the molmod and Yaff packages to be installed.)
python write_to_cif.py struct.chk
python write_to_cif.py struct1.chk struct2.chk
python write_to_cif.py BoronateEster_2D/*.chk
python write_to_cif.py */*.chk
For each structure argument, the script generates a separate file STRUCT.cif, with STRUCT being the structure name, in the same folder as the .chk file.