BE: Full text search support #1267

germanosin · 2025-08-14T08:16:13Z

Breaking change? (if so, please describe the impact and migration path for existing application instances)

What changes did you make? (Give an overview)

Is there anything you'd like reviewers to focus on?

How Has This Been Tested? (put an "x" (case-sensitive!) next to an item)

Checklist (put an "x" (case-sensitive!) next to all the items, otherwise the build will fail)

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation (e.g. ENVIRONMENT VARIABLES)
My changes generate no new warnings (e.g. Sonar is happy)
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
Any dependent changes have been merged

Check out Contributing and Code of Conduct

A picture of a cute animal (not mandatory but encouraged)

fallen-up · 2025-08-18T11:58:28Z

@germanosin is the option KAFKA_FTS_ENABLED: "true" enough for testing?

germanosin · 2025-08-18T13:22:11Z

@germanosin is the option KAFKA_FTS_ENABLED: "true" enough for testing?

Yes, that's correct. You can also adjust the n-gram sizes with:
KAFKA_FTS_TOPICSMINNGRAM, KAFKA_FTS_TOPICSMAXNGRAM for the topics index
KAFKA_FTS_FILTERMINNGRAM, KAFKA_FTS_FILTERMAXNGRAM for other searches
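
To make the min/max settings concrete, here is a minimal sketch of character n-gram generation, assuming the options bound the length of the fragments that get indexed. The class NgramSketch and its ngrams method are hypothetical helpers written for illustration; they are not code from this PR and do not reproduce Lucene's NGramTokenFilter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the PR: lists the character n-grams
// of a term for every length between minGram and maxGram (inclusive).
final class NgramSketch {

  static List<String> ngrams(String term, int minGram, int maxGram) {
    if (minGram < 1 || maxGram < minGram) {
      throw new IllegalArgumentException("expected 1 <= minGram <= maxGram");
    }
    List<String> result = new ArrayList<>();
    for (int start = 0; start < term.length(); start++) {
      // Emit every fragment of an allowed length that starts at this offset.
      for (int len = minGram; len <= maxGram && start + len <= term.length(); len++) {
        result.add(term.substring(start, start + len));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // 3 and 5 are the topics defaults shown in this PR's ClustersProperties diff.
    System.out.println(ngrams("orders", 3, 5));
  }
}
```

A smaller minimum lets very short queries match but produces more fragments per name, which is presumably why the topics index and the other filters get separate settings.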

germanosin · 2025-08-21T17:24:56Z

@fallen-up new config:
KAFKA_FTS_TOPICSNGRAMENABLED: true / false

iliax · 2025-08-23T05:08:45Z

api/src/main/java/io/kafbat/ui/service/index/ShortWordAnalyzer.java

+import org.apache.lucene.analysis.miscellaneous.WordDelimiterGraphFilter;
+import org.apache.lucene.analysis.standard.StandardTokenizer;
+
+public class ShortWordAnalyzer extends Analyzer {


package-private

iliax · 2025-08-23T05:09:00Z

api/src/main/java/io/kafbat/ui/service/index/ShortWordNGramAnalyzer.java

+import org.apache.lucene.analysis.ngram.NGramTokenFilter;
+import org.apache.lucene.analysis.standard.StandardTokenizer;
+
+public class ShortWordNGramAnalyzer extends Analyzer {


package-private

iliax · 2025-08-23T05:10:25Z

api/src/main/java/io/kafbat/ui/service/index/NgramFilter.java

+
+  protected abstract List<Tuple2<List<String>, T>> getItems();
+
+  private static Map<String, List<String>> cache = new ConcurrentHashMap<>();


private static final Map<String, List<String>> cache = CacheBuilder.newBuilder()
    .maximumSize(1_000)
    .<String, List<String>>build()
    .asMap();
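
The point of the suggestion is that an unbounded ConcurrentHashMap can grow without limit as new search strings arrive. For comparison, here is a JDK-only sketch of a bounded cache; it swaps Guava's CacheBuilder for a LinkedHashMap in access order, and the names BoundedCacheSketch and boundedLruMap are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the Guava-based cache, using only the JDK.
final class BoundedCacheSketch {

  // Returns a synchronized map that evicts the least recently accessed
  // entry once it holds more than maxEntries items.
  static <K, V> Map<K, V> boundedLruMap(int maxEntries) {
    return Collections.synchronizedMap(new LinkedHashMap<K, V>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
      }
    });
  }

  public static void main(String[] args) {
    Map<String, List<String>> cache = boundedLruMap(2);
    cache.put("a", List.of("a"));
    cache.put("b", List.of("b"));
    cache.get("a");               // touch "a" so "b" becomes the eldest entry
    cache.put("c", List.of("c")); // exceeds the limit, so "b" is evicted
    System.out.println(cache.keySet());
  }
}
```

The synchronized wrapper serializes all access, so under heavy concurrent load the Guava variant suggested above is likely the better fit.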

iliax · 2025-08-23T05:13:22Z

api/src/main/java/io/kafbat/ui/service/index/NgramFilter.java

+  }
+
+
+  public static List<String> tokenizeString(Analyzer analyzer, String text) throws IOException {


package-private

iliax · 2025-08-23T05:13:27Z

api/src/main/java/io/kafbat/ui/service/index/NgramFilter.java

+  }
+
+  @SneakyThrows
+  public static List<String> tokenizeStringSimple(Analyzer analyzer, String text) {


package-private

iliax · 2025-08-23T05:15:56Z

api/src/main/java/io/kafbat/ui/service/metrics/scrape/ScrapedClusterState.java

+        return new TopicsIndex(topicStates.values().stream().map(
+            topicState -> buildInternalTopic(topicState, clustersProperties)
+        ).toList(), fts.isTopicsNgramEnabled(), fts.getTopicsMinNGram(), fts.getTopicsMaxNGram());


Make the TopicsIndex constructor take FtsProperties as an argument; this constructor call looks overloaded.

iliax · 2025-08-23T05:21:28Z

api/src/main/java/io/kafbat/ui/service/index/TopicsIndex.java

+        doc.add(new TextField(FIELD_NAME, topic.getName(), Field.Store.NO));
+        doc.add(new IntPoint(FIELD_PARTITIONS, topic.getPartitionCount()));
+        doc.add(new IntPoint(FIELD_REPLICATION, topic.getReplicationFactor()));
+        doc.add(new LongPoint(FIELD_SIZE, topic.getSegmentSize()));


It looks like the segment size can be 0 in two cases: when it is unknown, and when the topic is empty. For the first case, maybe we should not index this field?

iliax · 2025-08-23T05:31:05Z

api/src/main/java/io/kafbat/ui/service/index/SchemasFilter.java

+  public List<String> find(String search) {
+    if (fts) {
+      return super.find(search);
+    } else {
+      return this.subjects
+          .stream()
+          .map(Tuple2::getT2)
+          .filter(subj -> search == null || CI.contains(subj, search))
+          .sorted().toList();
+    }


Let's either do this check inside all filters or check whether FTS is enabled at the upper level (as is done for ConsumerGroupFilter).

I suggest always creating the filters and putting this check inside them, or creating different implementations. That will make the calling code cleaner.

Also, maybe rename NgramFilter to SearchFilter or something similar and implement the search algorithm (fts/ngram/etc.) according to the properties.

Something like this:

public class SearchFilters {
  public static SearchFilter consumerGroupFilter(Collection<String> g, FtsProperties fts) {
    if (fts.isEnabled()) {
      return new ConsumerGroupNgramFilter(g, fts.getFilterMinNGram(), fts.getFilterMaxNGram());
    }
    return new CaseInsensitiveContainsFilter(g);
  }
  ...
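
A self-contained sketch of that factory shape, under stated assumptions: every type here is hypothetical, the flag is passed as a plain boolean instead of FtsProperties, and TokenPrefixFilter is only a simple stand-in for the Lucene n-gram filter, not its real behavior.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical common interface that callers depend on.
interface SearchFilter {
  List<String> find(String search);
}

// Fallback when FTS is disabled: case-insensitive substring match.
final class CaseInsensitiveContainsFilter implements SearchFilter {
  private final Collection<String> items;

  CaseInsensitiveContainsFilter(Collection<String> items) {
    this.items = items;
  }

  @Override
  public List<String> find(String search) {
    String needle = search == null ? "" : search.toLowerCase(Locale.ROOT);
    return items.stream()
        .filter(item -> item.toLowerCase(Locale.ROOT).contains(needle))
        .sorted()
        .collect(Collectors.toList());
  }
}

// Stand-in for the n-gram filter: every query token must be a prefix
// of some token of the item. NOT the PR's Lucene-based implementation.
final class TokenPrefixFilter implements SearchFilter {
  private final Collection<String> items;

  TokenPrefixFilter(Collection<String> items) {
    this.items = items;
  }

  private static List<String> tokens(String text) {
    return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"))
        .filter(token -> !token.isEmpty())
        .collect(Collectors.toList());
  }

  @Override
  public List<String> find(String search) {
    List<String> query = tokens(search == null ? "" : search);
    return items.stream()
        .filter(item -> {
          List<String> itemTokens = tokens(item);
          return query.stream()
              .allMatch(q -> itemTokens.stream().anyMatch(t -> t.startsWith(q)));
        })
        .sorted()
        .collect(Collectors.toList());
  }
}

// The factory hides the FTS check, so callers never branch on it.
final class SearchFiltersSketch {
  static SearchFilter consumerGroupFilter(Collection<String> groups, boolean ftsEnabled) {
    return ftsEnabled
        ? new TokenPrefixFilter(groups)
        : new CaseInsensitiveContainsFilter(groups);
  }
}
```

Calling code then reduces to SearchFiltersSketch.consumerGroupFilter(groups, enabled).find(query), with no branching on the flag at the call site.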

iliax · 2025-08-23T05:40:16Z

api/src/main/java/io/kafbat/ui/config/ClustersProperties.java

+    int topicsMinNGram = 3;
+    int topicsMaxNGram = 5;
+    int filterMinNGram = 1;
+    int filterMaxNGram = 4;


Maybe create

class NgramSettings {
  int minNGram = 1;
  int maxNGram = 4;
}

and tune it for each search:

public static class FtsProperties {
  ...
  NgramSettings topicsNgram = new NgramSettings(3, 5);
  NgramSettings schemasNgram = new NgramSettings(1, 4);
  NgramSettings consumerGroupsNgram = new NgramSettings(1, 4);
}
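
One way this grouping could look as compilable Java, offered as a sketch only: the record form, the validation rule, and the class name FtsPropertiesSketch are assumptions for illustration, and configuration binding (setters, annotations) is deliberately left out.

```java
// Hypothetical per-search n-gram settings; field names mirror the suggestion.
record NgramSettings(int minNGram, int maxNGram) {
  NgramSettings {
    // Reject ranges that could never produce a fragment.
    if (minNGram < 1 || maxNGram < minNGram) {
      throw new IllegalArgumentException(
          "expected 1 <= minNGram <= maxNGram, got " + minNGram + ".." + maxNGram);
    }
  }
}

// Plain holder with one settings object per search; defaults are taken
// from the values shown in this review comment.
final class FtsPropertiesSketch {
  boolean enabled = false;
  NgramSettings topicsNgram = new NgramSettings(3, 5);
  NgramSettings schemasNgram = new NgramSettings(1, 4);
  NgramSettings consumerGroupsNgram = new NgramSettings(1, 4);
}
```

Grouping the two bounds in one type keeps each min/max pair together and gives a single place to validate them.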

germanosin added 29 commits June 10, 2025 00:15

Use typespec contracts

8a03bfd

Merge branch 'main' into typespec

7d5af90

Fixed styling

b8b91fc

Merge remote-tracking branch 'origin/typespec' into typespec

155b038

Fixed styling

a2175a1

enable typespec by default

5d4f419

Added info

8cd0a0a

Frontend use new contracts

6ce97f8

Fixes in contracts for backward compatibility

a7a59d8

use typespec build

d0ef634

Install pnpm dependencies

c29091f

Merged main

d67068c

Actualize typespec

f114d44

Actualize typespec

4279d6c

fixed frontend build

a1c261b

fixed frontend build

ea75152

fixed frontend build

5341c20

fixed frontend build

c61da13

fixed frontend build

2c670f6

fixed frontend build

4bf3779

fixed frontend build

fa8b03b

Enabled contract validation

bcae096

Removed frontend test report

db746fa

Merge branch 'main' into typespec

a42e81e

Synced with main

38ee6c9

Synced with main

0474fd7

Added lucene

c27b32d

Close stats

153c44d

fts

c9a6d94

germanosin requested a review from a team as a code owner August 14, 2025 08:16

germanosin requested review from a team as code owners August 14, 2025 08:16

kapybro bot added status/triage Issues pending maintainers triage status/triage/completed Automatic triage completed and removed status/triage Issues pending maintainers triage labels Aug 14, 2025

Merge branch 'main' into issues/1087-fts

3706fb2

germanosin marked this pull request as draft August 14, 2025 08:16

germanosin changed the title ~~Issues/1087 fts~~ BE: Full text search support Aug 14, 2025

germanosin added 2 commits August 14, 2025 19:26

Merge branch 'main' into issues/1087-fts

d887668

Merge branch 'main' into issues/1087-fts

56889da

germanosin marked this pull request as ready for review August 15, 2025 12:01

Haarolean added this to the 1.4 milestone Aug 15, 2025

github-project-automation bot added this to Release 1.4 Aug 15, 2025

github-project-automation bot moved this to Todo in Release 1.4 Aug 15, 2025

Haarolean added scope/backend Related to backend changes type/feature A brand new feature area/ux User experiense issues labels Aug 15, 2025

Merge branch 'main' into issues/1087-fts

bb49399

germanosin linked an issue Aug 18, 2025 that may be closed by this pull request

BE: Add a full text search #1087

Open

2 tasks

germanosin added 2 commits August 21, 2025 19:20

Ngram config and toics optimizations

6da3269

Merge branch 'main' into issues/1087-fts

41a4b52

iliax reviewed Aug 23, 2025
