@jjezra jjezra commented Oct 21, 2025

Add support for index filters by predicates or by IndexMaintenanceFilter.
However, Lucene can only support ALL or NONE. Filtering SOME will cause an exception.

Resolves #3065

@jjezra jjezra added the bug fix Change that fixes a bug label Oct 22, 2025
@jjezra jjezra requested a review from ohadzeliger October 22, 2025 16:14
@jjezra jjezra marked this pull request as ready for review October 22, 2025 16:26
        .setSerializer(TextIndexTestUtils.COMPRESSING_SERIALIZER);
if (filterOut) {
    recordStore = builder
            .setIndexMaintenanceFilter((i, r) -> IndexMaintenanceFilter.IndexValues.NONE)
Collaborator

This is probably more absolute than we want.
It seems valuable to have this be a filter that is conditional in some way on the record, e.g. some field is even, or it is equal to (or not equal to) a specific value.
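
A conditional filter of the kind suggested here might look like the sketch below. It uses plain Java stand-ins rather than Record Layer types: `Doc`, the local `IndexValues` enum, `EVEN_ID_FILTER`, and `indexedIds` are all hypothetical names for illustration, and the real filter lambda takes both the index and the record, as in the `(i, r) -> ...` snippet above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class ConditionalFilterSketch {
    // Local stand-in for the three possible filter outcomes.
    enum IndexValues { ALL, SOME, NONE }

    // Hypothetical record with a numeric id and a text field.
    static final class Doc {
        final long docId;
        final String text;

        Doc(long docId, String text) {
            this.docId = docId;
            this.text = text;
        }
    }

    // Keep only records whose id is even. The filter never returns SOME,
    // which the PR description says Lucene cannot support.
    static final Function<Doc, IndexValues> EVEN_ID_FILTER =
            doc -> doc.docId % 2 == 0 ? IndexValues.ALL : IndexValues.NONE;

    // Collect the ids of the records that the filter lets through.
    static List<Long> indexedIds(List<Doc> docs, Function<Doc, IndexValues> filter) {
        List<Long> ids = new ArrayList<>();
        for (Doc doc : docs) {
            if (filter.apply(doc) == IndexValues.ALL) {
                ids.add(doc.docId);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(new Doc(1L, "a"), new Doc(2L, "b"), new Doc(4L, "c"));
        System.out.println(indexedIds(docs, EVEN_ID_FILTER));
    }
}
```

Because the decision depends on the record, a test built on such a filter ends up with a mix of indexed and unindexed records, which an unconditional NONE cannot exercise.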

Contributor Author

Added tests in LuceneScanAllEntriesTest.

        .setSerializer(TextIndexTestUtils.COMPRESSING_SERIALIZER);
if (filterOut) {
    recordStore = builder
            .setIndexMaintenanceFilter((i, r) -> IndexMaintenanceFilter.IndexValues.NONE)
Collaborator

It seems worthwhile to test predicate support too, perhaps by parameterizing this method, and the eventual test, on whether to filter with a predicate or with a filter.
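
One way to picture that parameterization is the sketch below, which runs the same check once per filtering mechanism. The `FilterMode` enum, `isIndexed`, and `sameResultForAllModes` are hypothetical names, and both branches are plain Java stand-ins rather than a real index predicate or `IndexMaintenanceFilter`. In a JUnit 5 test the loop would more likely be a `@ParameterizedTest` with `@EnumSource`.

```java
import java.util.function.LongPredicate;

public class FilterModeSketch {
    // The two mechanisms named in the PR description (names are illustrative).
    enum FilterMode { PREDICATE, MAINTENANCE_FILTER }

    // Local stand-in for the filter outcomes.
    enum IndexValues { ALL, SOME, NONE }

    // Express the same rule, "keep even ids", through either mechanism.
    static boolean isIndexed(FilterMode mode, long id) {
        switch (mode) {
            case PREDICATE:
                // Stand-in for a predicate attached to the index definition.
                LongPredicate predicate = value -> value % 2 == 0;
                return predicate.test(id);
            case MAINTENANCE_FILTER:
                // Stand-in for a maintenance filter set on the store builder.
                IndexValues decision = id % 2 == 0 ? IndexValues.ALL : IndexValues.NONE;
                return decision == IndexValues.ALL;
            default:
                throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }

    // Run the same assertion once per mode, as a parameterized test would.
    static boolean sameResultForAllModes(long id) {
        boolean expected = id % 2 == 0;
        for (FilterMode mode : FilterMode.values()) {
            if (isIndexed(mode, id) != expected) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameResultForAllModes(154L) && sameResultForAllModes(77L));
    }
}
```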

for (int i = 0; i < iLast; i++) {
    recordStore.saveRecord(multiEntryMapDoc(77L * i, ENGINEER_JOKE + iLast, group));
}
final Set<Index> indexSet = recordStore.getIndexDeferredMaintenanceControl().getMergeRequiredIndexes();
Collaborator

This doesn't actually assert that anything was filtered out, just that Lucene didn't connect to the deferred maintenance control. It seems better to be testing the actual desired behavior, which is that if you filter something out, it doesn't show up when you scan the index.
LuceneIndexTestDataModel may be helpful here, by removing everything from the data model that should be filtered out before validating that the index is consistent.
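
The shape of such a test can be sketched as follows: save records through a filter, drop the filtered-out ones from an in-memory model, then check that an index scan returns exactly what the model holds. This is a minimal sketch with plain Java stand-ins. `FakeIndex`, `expectedAfterFilter`, and `indexMatchesModel` are hypothetical, and the sketch does not use or imitate the actual API of LuceneIndexTestDataModel.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

public class FilteredScanCheckSketch {
    // Stand-in for an index: it stores only the ids that pass the filter at save time.
    static final class FakeIndex {
        private final Predicate<Long> keep;
        private final Set<Long> entries = new TreeSet<>();

        FakeIndex(Predicate<Long> keep) {
            this.keep = keep;
        }

        void save(long id) {
            if (keep.test(id)) {
                entries.add(id);
            }
        }

        Set<Long> scan() {
            return new TreeSet<>(entries);
        }
    }

    // Build the expected contents by removing everything the filter rejects from the model.
    static Set<Long> expectedAfterFilter(Collection<Long> savedIds, Predicate<Long> keep) {
        Set<Long> expected = new TreeSet<>(savedIds);
        expected.removeIf(keep.negate());
        return expected;
    }

    // The assertion of interest: a scan returns exactly the records that were not filtered out.
    static boolean indexMatchesModel(List<Long> savedIds, Predicate<Long> keep) {
        FakeIndex index = new FakeIndex(keep);
        savedIds.forEach(index::save);
        return index.scan().equals(expectedAfterFilter(savedIds, keep));
    }

    public static void main(String[] args) {
        // Ids follow the 77L * i pattern used in the test under review.
        List<Long> ids = Arrays.asList(0L, 77L, 154L, 231L);
        System.out.println(indexMatchesModel(ids, id -> id % 2 == 0));
    }
}
```

Unlike a check on the deferred maintenance control, this kind of assertion fails if a filtered-out record still shows up in the scan.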

Comment on lines 787 to 788
int groupingCount = 1;
final int groupedCount = 4 - groupingCount;
Collaborator

These two could probably be inlined, although that might become moot if you use LuceneIndexTestDataModel

}

@Nullable
private <M extends Message> FDBIndexableRecord<M> maybeFilterRecord(FDBIndexableRecord<M> record) {
Collaborator

rec, rather than record, is the name I believe I have seen in most places.

@jjezra jjezra requested a review from ScottDugas October 24, 2025 15:49
assertNotEquals(needMerge, filterOut);
}

private TestRecordsTextProto.MapDocument multiEntryMapDoc(long id, String text, int group) {
Contributor

Maybe add a test that fails:

  • filter returns SOME
  • filter throws exception
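
Both failure cases could be sketched along these lines. Everything here is a stand-in: `saveToIndex` is a hypothetical save path, `throwsType` is a minimal substitute for JUnit's `assertThrows`, and the exception types are assumptions, since the PR description only says that filtering SOME causes an exception.

```java
import java.util.function.Function;

public class FailingFilterSketch {
    // Local stand-in for the filter outcomes.
    enum IndexValues { ALL, SOME, NONE }

    // Hypothetical save path: applies the filter, rejects SOME,
    // and lets any exception thrown by the filter propagate to the caller.
    static boolean saveToIndex(long id, Function<Long, IndexValues> filter) {
        IndexValues decision = filter.apply(id);
        if (decision == IndexValues.SOME) {
            // Exception type is an assumption made for this sketch.
            throw new UnsupportedOperationException("SOME is not supported");
        }
        return decision == IndexValues.ALL;
    }

    // Returns true only if the action throws an exception of the expected type.
    static boolean throwsType(Class<? extends Throwable> expected, Runnable action) {
        try {
            action.run();
            return false;
        } catch (Throwable t) {
            return expected.isInstance(t);
        }
    }

    public static void main(String[] args) {
        // Case 1: the filter returns SOME.
        boolean someRejected = throwsType(UnsupportedOperationException.class,
                () -> saveToIndex(1L, id -> IndexValues.SOME));
        // Case 2: the filter itself throws.
        boolean failurePropagates = throwsType(IllegalStateException.class,
                () -> saveToIndex(1L, id -> { throw new IllegalStateException("filter failed"); }));
        System.out.println(someRejected + " " + failurePropagates);
    }
}
```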

Labels

bug fix Change that fixes a bug

Development

Successfully merging this pull request may close these issues.

Lucene should support index filters

3 participants