Scripts for MFDeltaML and for data efficiency benchmarks of DeltaML, MFML, o-MFML, MFDeltaML, and o-MFDeltaML on the QeMFi dataset. The benchmarks are performed for the following quantum chemistry (QC) properties:
- ground state energies
- first vertical excitation energies
- second vertical excitation energies
- magnitude of electric contribution to molecular dipole moment
The scripts provided in this code repository can be used to reproduce the results of the manuscript titled 'Benchmarking Data Efficiency in Delta-ML and Multifidelity Models for Quantum Chemistry', available as a preprint at https://arxiv.org/abs/2410.11391. The Python library requirements can be found in requirements.txt in this code repository.
The data used in this work is the QeMFi dataset, which can be accessed at the following URL: https://zenodo.org/records/12734761
- Model_MFML.py is the main Python module required to run the MFML and o-MFML models.
- SF_DeltaML.py generates the DeltaML learning curves for different QC baselines.
- MFML_LCsAll.py generates the data for the MFML and o-MFML learning curves.
- MFDeltaML_LCs.py generates the data for the MFDeltaML and o-MFDeltaML models, two new methods introduced in this work.
- PredictLowest.py generates the data required for Fig. 5 of the manuscript. It builds a hybrid DeltaML model in which the baseline is not computed with QC but is instead predicted by a single-fidelity ML model.
- The Jupyter notebook DeltaMFML.ipynb contains all the plotting routines and the code for splitting the data into training, validation, and test sets.
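
To make the DeltaML and hybrid DeltaML ideas described above more concrete, the following is a minimal sketch and not the code from this repository. It assumes a Gaussian-kernel ridge regression written in plain NumPy, synthetic data in place of QeMFi, and hypothetical helper names (`gaussian_kernel`, `krr_fit`, `krr_predict`, `split_indices`). It splits the data into training, validation, and test sets, learns the difference between a cheap baseline and an expensive target, and then replaces the computed baseline with one predicted by a single-fidelity model.

```python
import numpy as np


def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def krr_fit(X, y, sigma, reg):
    """Solve (K + reg * I) alpha = y for the regression coefficients."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + reg * np.eye(len(X)), y)


def krr_predict(X_train, alpha, X_new, sigma):
    """Predict at X_new from coefficients fitted on X_train."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha


def split_indices(n, n_train, n_val, seed=0):
    """Shuffle n indices and cut them into train, validation, and test parts."""
    perm = np.random.default_rng(seed).permutation(n)
    return perm[:n_train], perm[n_train:n_train + n_val], perm[n_train + n_val:]


# Synthetic stand-in data (assumption): 3 features, a cheap "baseline"
# property and an expensive "target" property that differ by a smooth term.
rng = np.random.default_rng(42)
X = rng.uniform(-1.0, 1.0, size=(300, 3))
y_base = np.sin(X).sum(axis=1)
y_target = y_base + 0.1 * np.cos(2.0 * X[:, 0])

train, val, test = split_indices(len(X), n_train=200, n_val=50)
reg = 1e-6  # illustrative regularization strength

# Use the validation set to pick the kernel width for the difference model.
best_sigma, best_val_mae = None, np.inf
for sigma in (0.5, 1.0, 2.0):
    alpha = krr_fit(X[train], y_target[train] - y_base[train], sigma, reg)
    val_pred = y_base[val] + krr_predict(X[train], alpha, X[val], sigma)
    val_mae = np.mean(np.abs(val_pred - y_target[val]))
    if val_mae < best_val_mae:
        best_sigma, best_val_mae = sigma, val_mae

# DeltaML: computed baseline plus the learned difference.
alpha_delta = krr_fit(X[train], y_target[train] - y_base[train], best_sigma, reg)
delta_test = krr_predict(X[train], alpha_delta, X[test], best_sigma)
pred_delta = y_base[test] + delta_test

# Hybrid DeltaML: the baseline itself comes from a single-fidelity ML model.
alpha_base = krr_fit(X[train], y_base[train], best_sigma, reg)
base_test = krr_predict(X[train], alpha_base, X[test], best_sigma)
pred_hybrid = base_test + delta_test

mae_delta = np.mean(np.abs(pred_delta - y_target[test]))
mae_hybrid = np.mean(np.abs(pred_hybrid - y_target[test]))
print(f"DeltaML test MAE: {mae_delta:.2e}")
print(f"Hybrid DeltaML test MAE: {mae_hybrid:.2e}")
```

In this sketch the hybrid model needs no baseline calculation at prediction time, at the cost of inheriting the error of the predicted baseline. The kernel, regularization, and split sizes are illustrative choices only.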