Training sets based on uncertainty estimates in the cluster-expansion method


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator>Kleiven, David</dc:creator>
  <dc:creator>Akola, Jaakko</dc:creator>
  <dc:creator>Peterson, Andrew</dc:creator>
  <dc:creator>Vegge, Tejs</dc:creator>
  <dc:creator>Chang, Jin Hyun</dc:creator>
  <dc:date>2022-02-03</dc:date>
  <dc:description>Cluster expansion (CE) has become increasingly popular in recent years, and many strategies have been proposed for training and fitting CE models to first-principles calculation results. The paper reports a new strategy for constructing a training set in which structures are chosen based on their relevance in Monte Carlo sampling for statistical analysis and reduction of the expected error. We call the new strategy the "bootstrapping uncertainty structure selection" (BUSS) scheme and compare its performance against a popular scheme that combines random structure generation with a ground-state search (referred to as RGS). The provided dataset contains the training sets generated using BUSS and RGS for constructing a CE model of the disordered material Cu2ZnSnS4. The files are in the Atomic Simulation Environment (ASE) database format (for more information, please refer to the ASE documentation: https://wiki.fysik.dtu.dk/ase/index.html). Each `.db` file contains 100 DFT calculations, which were generated in iteration cycles. Each iteration cycle is referred to as a generation (marked with the `gen` key in the database), and each database contains 10 generations of 10 training structures each. See the paper for more details.</dc:description>
  <dc:identifier>https://archive.materialscloud.org/record/2022.21</dc:identifier>
  <dc:identifier>doi:10.24435/materialscloud:ha-ca</dc:identifier>
  <dc:identifier>mcid:2022.21</dc:identifier>
  <dc:identifier>oai:materialscloud.org:1240</dc:identifier>
  <dc:language>en</dc:language>
  <dc:publisher>Materials Cloud</dc:publisher>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:rights>Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode</dc:rights>
  <dc:subject>BIG-MAP</dc:subject>
  <dc:subject>cluster expansion</dc:subject>
  <dc:subject>Monte Carlo</dc:subject>
  <dc:subject>phase transition</dc:subject>
  <dc:subject>bootstrapping</dc:subject>
  <dc:subject>machine learning</dc:subject>
  <dc:subject>energy materials</dc:subject>
  <dc:title>Training sets based on uncertainty estimates in the cluster-expansion method</dc:title>
  <dc:type>Dataset</dc:type>
</oai_dc:dc>