Kafka 4 Share Groups in Logstash Kafka Input Plugin #346

@metalshanked

Description

Is your feature request related to a problem? Please describe.

The current Logstash Kafka input plugin relies on traditional Kafka consumer groups. This model works well for many use cases, but each partition is assigned to at most one consumer in a group, so the partition count caps how far consumers can scale. That is a real limitation for high-throughput data that does not need ordering. Kafka Share Groups (KIP-932), introduced as an early-access feature in Kafka 4.0, offer a different model: queue-like semantics in which consumers cooperatively consume from the same partitions, so a group can usefully have more consumers than partitions.
This feature request aims to enhance the Logstash Kafka input plugin by adding support for Kafka Share Groups. This would enable Logstash to leverage the benefits of Share Groups for suitable data ingestion pipelines.
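To make the scaling limit concrete, here is a rough sketch (a toy model, not Kafka's actual assignor or broker logic). It assumes 3 partitions and 5 consumers, and it approximates share-group delivery as simple round-robin, which is a simplification: in a real share group, records go to whichever consumer fetches next. The function names are made up for illustration.

```python
from collections import Counter
from itertools import cycle


def classic_assignment(partitions, consumers):
    """Toy classic-group model: each partition is owned by exactly one consumer."""
    owners = {c: [] for c in consumers}
    for partition, consumer in zip(partitions, cycle(consumers)):
        owners[consumer].append(partition)
    return owners


def share_distribution(records, consumers):
    """Toy share-group model: any consumer may acquire the next record.

    Round-robin is an assumption made here to keep the example deterministic.
    """
    acquired = Counter({c: 0 for c in consumers})
    for _record, consumer in zip(records, cycle(consumers)):
        acquired[consumer] += 1
    return acquired


partitions = ["p0", "p1", "p2"]
consumers = ["c0", "c1", "c2", "c3", "c4"]

classic = classic_assignment(partitions, consumers)
idle = [c for c, owned in classic.items() if not owned]
shared = share_distribution(range(100), consumers)

print("idle in classic group:", idle)
print("records per consumer in share model:", dict(shared))
```

In the classic model two of the five consumers own nothing and sit idle, while in the share model every consumer does work regardless of the partition count.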

Describe the solution you'd like

We propose adding a new configuration option to the Logstash Kafka input plugin to enable the use of Kafka Share Groups. This option could be a boolean flag or an enum allowing users to choose between the traditional consumer group model and the new Share Group model.
When Share Group mode is enabled, the plugin should:
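Create a share consumer rather than a classic consumer. In the Kafka 4.0 Java client this is a separate class, KafkaShareConsumer, not a group-type setting on the existing KafkaConsumer.
Use the share consumer's APIs for subscribing to topics and consuming messages.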
Implement the message acknowledgement (ACCEPT, RELEASE, REJECT) mechanism as required by the Share Group protocol. This might involve introducing new configuration options to control retry behavior (e.g., number of delivery attempts, dead-letter topic for rejected messages).
Potentially expose new metrics related to Share Group consumption, such as the number of acquired, released, and archived messages.
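For anyone less familiar with the acknowledgement model, the record lifecycle could be sketched roughly like this. It is a simplified in-memory model based on my reading of KIP-932, not the Kafka client API; `ShareRecord`, `acquire`, `acknowledge`, and `max_delivery_attempts` are hypothetical names, and the default of 5 attempts is only an assumption for the example.

```python
from enum import Enum


class AckType(Enum):
    ACCEPT = "accept"    # processed successfully
    RELEASE = "release"  # give the record back for another delivery attempt
    REJECT = "reject"    # unprocessable, do not deliver again


class State(Enum):
    AVAILABLE = "available"
    ACQUIRED = "acquired"
    ACKNOWLEDGED = "acknowledged"
    ARCHIVED = "archived"


class ShareRecord:
    """Hypothetical stand-in for one record tracked by a share group."""

    def __init__(self, offset):
        self.offset = offset
        self.state = State.AVAILABLE
        self.delivery_count = 0


def acquire(record):
    """A consumer takes the record; every acquisition counts as a delivery."""
    if record.state is not State.AVAILABLE:
        raise ValueError(f"record {record.offset} is not available")
    record.state = State.ACQUIRED
    record.delivery_count += 1


def acknowledge(record, ack, max_delivery_attempts=5):
    """Apply an acknowledgement to an acquired record."""
    if record.state is not State.ACQUIRED:
        raise ValueError(f"record {record.offset} is not acquired")
    if ack is AckType.ACCEPT:
        record.state = State.ACKNOWLEDGED
    elif ack is AckType.REJECT:
        record.state = State.ARCHIVED
    elif record.delivery_count >= max_delivery_attempts:
        # Released, but out of attempts: stop redelivering.
        record.state = State.ARCHIVED
    else:
        record.state = State.AVAILABLE


record = ShareRecord(offset=42)
for _ in range(3):
    acquire(record)
    acknowledge(record, AckType.RELEASE, max_delivery_attempts=3)
print(record.state, record.delivery_count)
```

With a limit of 3, the record is released twice and archived on the third release. A plugin option for the number of delivery attempts would control this kind of cutoff, and the acquired, released, and archived counts are the sort of metrics mentioned above.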

Describe alternatives you've considered

The alternative would be to continue using the traditional consumer group model, which, as described in the accompanying article, has inherent scaling limitations for certain types of workloads. Users might also try to implement custom logic outside of Logstash to achieve similar queueing semantics, but this would add significant complexity and operational overhead.

Additional context

The attached article provides a detailed explanation of Kafka Share Groups, their benefits, and use cases, particularly highlighting their value in high-throughput, independent task processing scenarios like real-time cybersecurity analytics. Integrating Share Group support into Logstash would make it a more versatile and scalable data ingestion tool for a wider range of Kafka-based architectures.

Importance

Major

Use cases

High-volume log ingestion where individual log lines can be processed independently and scaling the number of processing instances beyond the number of partitions is desired.
Real-time processing of security events or other telemetry data where rapid horizontal scaling is crucial during traffic spikes.
Scenarios where resilience to "poison pill" messages and fine-grained control over message acknowledgment are important.
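As an illustration of the poison-pill case, a pipeline could map processing outcomes to acknowledgement types along these lines. This is only a sketch of one possible policy, not proposed plugin code: `classify` is a hypothetical helper, and the choice to treat timeouts and connection errors as retryable and parse errors as permanent is an assumption.

```python
import json


def classify(raw_event, handler):
    """Run handler on one event and return the acknowledgement type to send.

    Assumed policy: transient failures are released for redelivery,
    malformed input is rejected so it is never delivered again.
    Any other exception propagates to the caller.
    """
    try:
        handler(raw_event)
    except (TimeoutError, ConnectionError):
        return "RELEASE"
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return "REJECT"
    return "ACCEPT"


def flaky_handler(_raw_event):
    """Simulates a downstream dependency that is temporarily unavailable."""
    raise TimeoutError("downstream unavailable")


results = [
    classify('{"user": "alice", "action": "login"}', json.loads),
    classify('{not valid json', json.loads),
    classify('{"user": "bob"}', flaky_handler),
]
print(results)
```

The malformed event is rejected once instead of blocking the partition or being retried forever, while the event that hit a transient failure becomes available to another consumer.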
