Welcome! The data in this repository accompanies the publication entitled:

"cell2mol: Encoding Chemistry to Interpret Crystallographic Data"

by the authors:     
 
Sergi Vela,             sergi.vela@gmail.com // sergi.vela@ub.edu
Ruben Laplaza,          ruben.laplazasolanas@epfl.ch
Yuri Cho,               yuri.cho@epfl.ch
Clémence Corminboeuf,   clemence.corminboeuf@epfl.ch

DOI: XXXX

and the code cell2mol (https://github.com/lcmd-epfl/cell2mol), developed by the same authors in the LCMD group at EPFL, Lausanne.

##############

The data is structured in 4 different categories and 11 tar.zip files.

- Files 1-8 contain interpreted UNIT CELLs for different transition metals.
- T-TMCs.tar.zip contains interpreted unique Transition Metal Complexes (TMCs), organized in subdirectories per transition metal.
- L-Ligands.tar.zip contains interpreted unique LIGANDS.
- O-Other.tar.zip contains interpreted unique species found in the unit cells.

The UNIT CELL folders include all unit cells of mono-metallic complexes that could be successfully interpreted by cell2mol, and for which the metal oxidation state (OS) matches the respective .cif file. There should be 31019 entries in total. Each unit cell is stored as a "cell" class object that includes the cell2mol interpretation. Instructions on how to read a cell object are given below.

The TMC, LIGANDS and OTHER files contain the unique (i.e. non-repeated) species that appear in those unit cells. For instance, O-Other.tar.zip contains all unique non-complex molecules (i.e. counterions, solvents) that appear in the 31019 unit cells. The .gmol files in the TMC, LIGANDS and OTHER folders are "Molecule", "Ligand" and "Molecule" class objects, respectively. Instructions on how to read them are also given below.

###############

The cell, Ligand, and Molecule objects can be loaded with pickle, and their class definitions are found in the tmcharge_common.py file of cell2mol. Additionally, simple Python scripts are provided in the "utils" folder of the GitHub repository. An example is also provided in this repository: "check_gmol.py". This script should read any of the object files in this repository and print some basic information about it.
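
As an illustration of the loading step described above, here is a minimal sketch in Python. It is not the "check_gmol.py" script itself. The helper names (load_object, describe) and the file name "example.gmol" are hypothetical, and no attribute names of the cell2mol classes are assumed: the sketch only reports the class name and whatever attributes the loaded object carries. It assumes cell2mol is installed and importable, since pickle needs the class definitions to rebuild the objects.

```python
import pickle
import sys


def load_object(path):
    """Load one pickled object (e.g. a .gmol file) from disk.

    Assumption: the cell2mol package is importable, so that pickle can
    locate the cell / Ligand / Molecule class definitions.
    """
    with open(path, "rb") as handle:
        return pickle.load(handle)


def describe(obj):
    """Return the class name and the sorted attribute names of an object."""
    attributes = sorted(vars(obj)) if hasattr(obj, "__dict__") else []
    return {"class": type(obj).__name__, "attributes": attributes}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python this_script.py example.gmol
    info = describe(load_object(sys.argv[1]))
    print("Class:", info["class"])
    for name in info["attributes"]:
        print("  attribute:", name)
```

Listing the attribute names first is a convenient way to find out what a given object stores before looking up the definitions in tmcharge_common.py.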

For additional questions, don't hesitate to contact any of the cell2mol developers at the email addresses given above.

#############