Skip to content

imatrix: calculate activation-based statistics for new format (GGUF) imatrices #14891

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

EAddario
Copy link
Contributor

Following up from #9400 and #12718, I've started tinkering with activation-based statistics, in addition to what's currently available via --show-statistics.

At the moment, I'm exploring three options going from from easy to implement and OK approximation, to some assembly required but fairly accurate:

  1. L2 norm of activation difference: where larger values would suggest the tensor has significantly transformed the input with respect to the previous layer.
  2. KL Divergence reduction using a pre-computed logit file: using a similar approach as described by nostalgebraist in logit lens, and based on a pre-computed logit file (e.g. from a previous llama-perplexity --save-all-logits run)
  3. Given that llama-imatrix already generates the actual logits to compute PPL, use Thông T. Nguyễn's logit prism approach to calculate the exact contribution of each layer to the final logit scores

Sharing with the readers, and in particular @compilade and @jukofyork, in case anyone's willing to double check assumptions and/or suggest alternative approaches I haven't considered.

@EAddario EAddario marked this pull request as draft July 26, 2025 16:47
if (!stat.activations.empty()) {
const int32_t nact = (int32_t) stat.activations.size();
struct ggml_tensor * in_sum = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, nact / nmat, nmat);
ggml_format_name(in_sum, "%s.in_sum", name.c_str()); // ToDo: consider a better name. 'in_act' maybe?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in_sum is fine, this fits with the intention of in_sum2.

Comment on lines +41 to 42
std::vector<float> activations;
std::vector<float> values;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to rename Stats.values to Stats.in_sum2, and Stats.activations to Stats.in_sum.

It should make it more obvious what maps to what in the resulting GGUF.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants