
Remove indexPath when computing hash value used for naming exact NN file #403


Open: 4 commits to be merged into base: main


huynmg (Contributor) commented Jun 4, 2025

#391

Exact NN results should be independent of the indexPath, so we should remove it when computing the hash value used for naming the exact NN file. This avoids redundant computation when the indexPath changes but the exact NNs don't, e.g. when benchmarking different quantization levels on the same data.
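To illustrate the idea, here is a minimal Python sketch, not the actual change in this PR. The parameter names (`docs_path`, `query_path`, `num_docs`, `top_k`, `index_path`), the helper `exact_nn_cache_name`, and the `nn-<hash>.bin` file naming are all hypothetical; the real benchmark code may hash different fields in a different way. The point is only that the cache key is built from the inputs that determine the exact NN results, with the index path left out.

```python
import hashlib
import json

# Hypothetical: fields that do not influence the exact NN results
# and are therefore left out of the cache key.
EXCLUDED_FROM_HASH = {"index_path"}


def exact_nn_cache_name(params: dict) -> str:
    """Build a cache file name from the parameters that affect exact NN."""
    relevant = {k: v for k, v in params.items() if k not in EXCLUDED_FROM_HASH}
    # Sort keys so the same parameters always serialize identically.
    canonical = json.dumps(relevant, sort_keys=True)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"nn-{digest}.bin"


# Two runs over the same data that differ only in where the index is written
# (for example, one index per quantization level).
run_4bit = {
    "docs_path": "docs.vec",
    "query_path": "queries.vec",
    "num_docs": 100000,
    "top_k": 100,
    "index_path": "indices/hnsw-4bit",
}
run_7bit = {**run_4bit, "index_path": "indices/hnsw-7bit"}

# Same cache file name, so the exact NN results can be reused.
same_name = exact_nn_cache_name(run_4bit) == exact_nn_cache_name(run_7bit)
```

With the index path excluded, both runs map to the same file name, so the second run can load the cached exact NNs instead of recomputing them. Changing a field that does affect the results, such as `top_k`, still produces a different name.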

Re-running the benchmarks before & after the change:

Before (exact NNs are computed twice because indexPath is part of the hash computation)

recall  latency(ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.832        0.630  100000   100      50       64        250     4 bits     16.05       6231.31             1          116.66       110.245       12.589       HNSW
 0.898        0.810  100000   100      50       64        250     7 bits     22.29       4487.32             1          129.09       122.452       24.796       HNSW

After (exact NN is reused)

recall  latency(ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.837        0.540  100000   100      50       64        250     4 bits     17.96       5567.62             1          116.68       110.245       12.589       HNSW
 0.897        0.800  100000   100      50       64        250     7 bits     25.18       3971.09             1          128.89       122.452       24.796       HNSW


This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

github-actions bot added the Stale label on Jun 19, 2025