Skip to content

Speed up (filtered) KNN queries for flat vector fields #130251

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

jimczi
Copy link
Contributor

@jimczi jimczi commented Jun 27, 2025

For dense vector fields using the flat index, we already know a brute-force search will be used, so there’s no need to go through the codec’s approximate KNN logic. This change skips that step and builds the brute-force query directly, making things faster and simpler.

I tested this on a setup with 10 million random vectors, each with 1596 dimensions and 17,500 partitions, using the random_vector track. The results:

Performance Comparison

Metric Before After Change
Throughput 221 ops/s 2762 ops/s 🟢 +1149%
Latency (p50) 29.2 ms 1.6 ms 🔻 -94.4%
Latency (p99) 81.6 ms 3.5 ms 🔻 -95.7%

Filtered KNN queries on flat vectors are now over 10x faster on my laptop!

For dense vector fields using the `flat` index, we already know a brute-force search will be used—so there’s no need to go through the codec’s approximate KNN logic. This change skips that step and builds the brute-force query directly, making things faster and simpler.

I tested this on a setup with **10 million random vectors**, each with **1596 dimensions** and **17,500 partitions**, using the `random_vector` track.
The results:

### Performance Comparison

| Metric            | Before    | After      | Change    |
| ----------------- | --------- | ---------- | --------- |
| **Throughput**    | 221 ops/s | 2762 ops/s | 🟢 +1149% |
| **Latency (p50)** | 29.2 ms   | 1.6 ms     | 🔻 -94.4% |
| **Latency (p99)** | 81.6 ms   | 3.5 ms     | 🔻 -95.7% |

Filtered KNN queries on flat vectors are now over 10x faster on my laptop!
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Jun 27, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @jimczi, I've created a changelog YAML for you.

Copy link
Member

@benwtrent benwtrent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am loving these numbers. Thank you for digging into this!

Comment on lines 2549 to 2563
Query knnQuery;
if (indexOptions != null && indexOptions.isFlat()) {
knnQuery = filter == null
? createExactKnnBitQuery(queryVector)
: new BooleanQuery.Builder().add(createExactKnnBitQuery(queryVector), BooleanClause.Occur.SHOULD)
.add(filter, BooleanClause.Occur.FILTER)
.build();
if (parentFilter != null) {
knnQuery = new ToParentBlockJoinQuery(knnQuery, parentFilter, ScoreMode.Max);
}
} else {
knnQuery = parentFilter != null
? new ESDiversifyingChildrenByteKnnVectorQuery(name(), queryVector, filter, k, numCands, parentFilter, searchStrategy)
: new ESKnnByteVectorQuery(name(), queryVector, k, numCands, filter, searchStrategy);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic makes me think that indexOptions should satisfy some interfaces for creating queries....but I think that refactor can happen later.

Comment on lines 198 to 207
// Retrieve top rescoreK documents from the inner query
var topDocs = searcher.search(innerQuery, rescoreK);
vectorOperations = topDocs.totalHits.value();

// Retrieve top k documents from the top rescoreK query
var topDocsQuery = new KnnScoreDocQuery(topDocs.scoreDocs, searcher.getIndexReader());
var valueSource = new VectorSimilarityFloatValueSource(fieldName, floatTarget, vectorSimilarityFunction);
var rescoreQuery = new FunctionScoreQuery(topDocsQuery, valueSource);
var rescoreTopDocs = searcher.search(rescoreQuery.rewrite(searcher), k);
return new KnnScoreDocQuery(rescoreTopDocs.scoreDocs, searcher.getIndexReader());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need this because the exact queries don't actually do their own k limitations. Instead we must apply them with the searcher, right?

Just wanting to clarify, this looks very similar to what it was before other than forcing rescoreK in searcher.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, two steps reduction since the first query is not a top k query

Comment on lines 2597 to 2599
if (parentFilter != null) {
knnQuery = new ToParentBlockJoinQuery(knnQuery, parentFilter, ScoreMode.Max);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 I gotta think about this one...ESDiversifyingChildrenByteKnnVectorQuery actually returns the children docs so that nested queries work in top-level knn and when under a nested vector query.

If no tests fail here, I am thinking we don't actually have coverage for top-level & query level nested knn over flat indices.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am writing up some yaml tests now to add to main to cover flat & nested

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right good catch. I pushed a fix to remain at the nested level but I realise now that rescoring in this mode is different. We'll rescore non-diversified children so we need to apply collapsing or similar. I'll think more about it while you're adding more tests, thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Search Relevance/Vectors Vector search Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v9.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants