Electronic excited states from physically-constrained machine learning
JSON Export
{
"revision": 5,
"id": "2055",
"created": "2024-01-17T09:37:37.613035+00:00",
"metadata": {
"doi": "10.24435/materialscloud:5s-gm",
"status": "published",
"title": "Electronic excited states from physically-constrained machine learning",
"mcid": "2024.11",
"license_addendum": null,
"_files": [
{
"description": "README describing the repository architecture and data",
"key": "README.md",
"size": 2790,
"checksum": "md5:2dbafadea9f66beacf24bd2a8c703399"
},
{
"description": "Dataset of hydrocarbons",
"key": "dataset.tar.gz",
"size": 4255817929,
"checksum": "md5:e721a9e7455782365b96eb5c9d4a2ae2"
}
],
"owner": 219,
"_oai": {
"id": "oai:materialscloud.org:2055"
},
"keywords": [
"ERC",
"hamiltonian",
"excited states",
"machine learning",
"EPFL",
"FIAMMA",
"LIFETimeS"
],
"conceptrecid": "2054",
"is_last": false,
"references": [
{
"type": "Preprint",
"url": "https://arxiv.org/abs/2311.00844",
"comment": "Preprint in which the data is described",
"citation": "E. Cignoni, D. Suman, J. Nigam, L. Cupellini, B. Mennucci, and M. Ceriotti, arXiv preprint arXiv:2311.00844."
},
{
"type": "Software",
"url": "https://github.com/ecignoni/halex/",
"comment": "GitHub repository with the code for generating data and machine learning",
"citation": "E. Cignoni, Hamiltonian learning for excited states (HaLEx)"
}
],
"publication_date": "Jan 23, 2024, 11:06:07",
"license": "Creative Commons Attribution 4.0 International",
"id": "2055",
"description": "Data-driven techniques are increasingly used to replace electronic-structure calculations of matter. In this context, a relevant question is whether machine learning (ML) should be applied directly to predict the desired properties or be combined explicitly with physically-grounded operations. We present an example of an integrated modeling approach, in which a symmetry-adapted ML model of an effective Hamiltonian is trained to reproduce electronic excitations from a quantum-mechanical calculation. The resulting model can make predictions for molecules that are much larger and more complex than those that it is trained on, and allows for dramatic computational savings by indirectly targeting the outputs of well-converged calculations while using a parameterization corresponding to a minimal atom-centered basis. Our results on a comprehensive dataset of hydrocarbons emphasize the merits of intertwining data-driven techniques with physical approximations, improving the transferability and interpretability of ML models without affecting their accuracy and computational efficiency, and providing a blueprint for developing ML-augmented electronic-structure methods.\nThis record contains the hydrocarbon dataset accompanying the paper referenced below. It covers ethane, ethene, butadiene, hexane, hexatriene, isoprene, styrene, polyalkenes (dodecahexaene, tetradecaheptaene, hexadecaoctaene, octadecanonaene, eicosadecaene), aromatics (benzene, azulene, naphthalene, biphenyl), anthracene, beta-carotene, and fullerene. We also provide scripts to generate the Fock and overlap matrices in this dataset. The machine-learning code is available at the Software reference below.",
"version": 1,
"contributors": [
{
"email": "edoardo.cignoni@phd.unipi.it",
"affiliations": [
"Dipartimento di Chimica e Chimica Industriale, Universit\u00e0 di Pisa, Pisa, Italy"
],
"familyname": "Cignoni",
"givennames": "Edoardo"
},
{
"affiliations": [
"Laboratory of Computational Science and Modeling, IMX, \u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne, 1015 Lausanne, Switzerland"
],
"familyname": "Suman",
"givennames": "Divya"
},
{
"email": "jigyasa.nigam@epfl.ch",
"affiliations": [
"Laboratory of Computational Science and Modeling, IMX, \u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne, 1015 Lausanne, Switzerland"
],
"familyname": "Nigam",
"givennames": "Jigyasa"
},
{
"affiliations": [
"Dipartimento di Chimica e Chimica Industriale, Universit\u00e0 di Pisa, Pisa, Italy"
],
"familyname": "Cupellini",
"givennames": "Lorenzo"
},
{
"affiliations": [
"Dipartimento di Chimica e Chimica Industriale, Universit\u00e0 di Pisa, Pisa, Italy"
],
"familyname": "Mennucci",
"givennames": "Benedetta"
},
{
"affiliations": [
"Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, USA",
"Laboratory of Computational Science and Modeling, IMX, \u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne, 1015 Lausanne, Switzerland"
],
"familyname": "Ceriotti",
"givennames": "Michele"
}
],
"edited_by": 576
},
"updated": "2024-02-20T13:55:46.901457+00:00"
}