Investigating Data Hierarchies in Multifidelity Machine Learning for Excitation Energies

This repository contains the scripts for the multifidelity models used to assess data hierarchy scaling for the prediction of excitation energies of molecules, including the scripts that generate the newly introduced Gamma curve. The scripts can be used to generate the figures from the manuscript of the same title, hosted at https://arxiv.org/abs/2410.11392. The requirements.txt file lists all packages required to run the scripts. The QeMFi dataset used in this work is freely available on Zenodo.

The Python file Model_MFML.py is the module that was developed in a previous work; it contains both the MFML and o-MFML implementations used in this work.
PrepFromQeMFi.py separates the data from the QeMFi dataset into train, test, and validation datasets.
The Jupyter notebook Plots.ipynb provides the functions to reproduce the plots from the manuscript.
LearningCurves.py generates the data needed to assess the different fixed scaling factors ($\gamma$).
The script RatioTimeBasedScalingFactor.py produces the learning curves for the scaling factors defined as $\theta_{f-1}^f$.
TargetFidelityTimeRatioScalingFactor.py generates the data for scaling factors defined as $\theta_f^F$ in the manuscript.
The script ErrorContours_gamma2.py generates the data needed to plot the error contours of MFML (Fig. 6 of the manuscript).
GammaCurve.py creates all the data points needed to assess the different $\Gamma(N_{train}^{TZVP})$ curves from the manuscript. The value of ntop can be changed to match the desired $N_{train}^{TZVP}$.
The scripts saveindexfortimevsmae.py and saveindex_extendedgamma.py save the indices of the training samples used, so that they can be reused to generate the time-cost plots.
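
As a rough illustration of what a fixed scaling factor $\gamma$ means for the training data, the sketch below computes the number of training samples at each fidelity. It assumes that each cheaper fidelity uses $\gamma$ times as many samples as the next more expensive one, starting from ntop samples at the target (TZVP) fidelity. The function name training_set_sizes and its arguments are hypothetical and are not taken from the scripts in this repository. The geometric scaling rule is likewise an assumption about how $\gamma$ is applied, so consult the manuscript for the exact definition.

```python
def training_set_sizes(n_top, gamma, n_fidelities):
    """Return training-set sizes ordered from cheapest to target fidelity.

    Hypothetical helper (not part of this repository). Assumes the
    geometric rule N_train^{f-1} = gamma * N_train^{f}, with n_top
    samples at the target (most expensive) fidelity.
    """
    if n_top < 1 or gamma < 1 or n_fidelities < 1:
        raise ValueError("n_top, gamma and n_fidelities must all be >= 1")
    # The cheapest fidelity (index 0) is n_fidelities - 1 steps below the target.
    return [n_top * gamma ** (n_fidelities - 1 - f) for f in range(n_fidelities)]


# Example: five fidelities, 4 samples at the target fidelity, gamma = 2.
sizes = training_set_sizes(n_top=4, gamma=2, n_fidelities=5)
print(sizes)  # [64, 32, 16, 8, 4]
```

Under this assumption, increasing $\gamma$ shifts the training cost towards the cheaper fidelities while the number of samples at the target fidelity stays fixed.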
