Training sets based on uncertainty estimates in the cluster-expansion method

Cluster expansion (CE) has become increasingly popular in recent years, and many strategies have been proposed for training and fitting CE models to first-principles calculation results. The paper reports a new strategy for constructing a training set, in which structures are selected based on their relevance in Monte Carlo sampling for statistical analysis and on the reduction of the expected error. We call the new strategy the "bootstrapping uncertainty structure selection" (BUSS) scheme and compare its performance against a popular scheme that combines random structure generation with a ground-state search (referred to as RGS). The provided dataset contains the training sets generated using BUSS and RGS for constructing a CE model of disordered Cu2ZnSnS4. The files are in the database format of the Atomic Simulation Environment (ASE); please refer to the ASE documentation for more information (https://wiki.fysik.dtu.dk/ase/index.html). Each .db file contains 100 DFT calculations, which were generated in iteration cycles. Each iteration cycle is referred to as a generation (marked with the gen key in the database), and each database contains 10 generations of 10 training structures each. See the paper for more details.
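As an illustration of the layout described above, the following minimal sketch groups the rows of one database by generation. It assumes that ASE is installed and that the gen key is stored as a regular key-value pair on each row. The file name buss.db and the helper names load_rows and group_by_generation are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def load_rows(db_path):
    """Read all rows from an ASE database file (requires ASE)."""
    from ase.db import connect  # imported here so the grouping helper works without ASE

    db = connect(db_path)
    return list(db.select())


def group_by_generation(rows, key="gen"):
    """Group rows by the value of their generation key.

    Assumes every row exposes a `key_value_pairs` dict containing `key`,
    as ASE database rows do for user-defined keys.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row.key_value_pairs[key]].append(row)
    # Return a plain dict ordered by generation number.
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    # "buss.db" is a placeholder; substitute the actual file from the record.
    rows = load_rows("buss.db")
    for gen, members in group_by_generation(rows).items():
        print(f"generation {gen}: {len(members)} structures")
```

If the files follow the description above, each database should yield 10 generations with 10 structures each. The atomic structure of a row can then be obtained with row.toatoms().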

Identifier
Source https://archive.materialscloud.org/record/2022.21
Metadata Access https://archive.materialscloud.org/xml?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:materialscloud.org:1240
Provenance
Creator Kleiven, David; Akola, Jaakko; Peterson, Andrew; Vegge, Tejs; Chang, Jin Hyun
Publisher Materials Cloud
Publication Year 2022
Rights info:eu-repo/semantics/openAccess; Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode
OpenAccess true
Contact archive(at)materialscloud.org
Representation
Language English
Resource Type Dataset
Discipline Materials Science and Engineering