[fix][broker]excessive replication speed leads to error: Producer send queue is full #24189
@poorbarcode is there any relationship to PIP-269: Add an epoch of cursor to discard outdated reading, or to any other previously reported issues?
It does not relate to
/pulsarbot rerun-failure-checks
424c175 to 1d6d5be
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #24189      +/-   ##
============================================
+ Coverage     73.57%   74.21%    +0.64%
- Complexity    32624    32683       +59
============================================
  Files          1877     1867       -10
  Lines        139502   145368     +5866
  Branches      15299    16629     +1330
============================================
+ Hits         102638   107891     +5253
- Misses        28908    28931       +23
- Partials       7956     8546      +590
...roker/src/main/java/org/apache/pulsar/broker/service/persistent/GeoPersistentReplicator.java
...r-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentReplicator.java
Outdated
...r-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentReplicator.java
Outdated
test failure:
Could be a flaky test; an attempt to fix it was #24141.
There seems to be more flakiness in the tests. I can see that the first attempt failed in a replication test as well.
Please debug the flaky test failures in replication tests before we merge this change.
/pulsarbot rerun-failure-checks
Fixed
Motivation
Background
replicationProducerQueueSize
pendingMessages, which records how many messages are pending to be published
fetchSchemaInProgress as true
waitForCursorRewinding as true
inflight cursor reading, which was limited by havePendingRead
inflight publishing
Issue: The multiple mechanisms described above do not work well together
read more entries A
read more entries B
1000
1000
havePendingRead -> true
1000 msgs
havePendingRead -> true
1000 msgs are still in-progress
1000 msgs
2000 msgs in publishing, which is more than expected, and the error "Producer send queue is full" is raised
read more entries, which makes the situation worse
Modifications
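To make the arithmetic of the Issue scenario concrete, here is a minimal, self-contained sketch. It is a toy model, not Pulsar's replicator code: the class `ReplicatorPermitSketch`, the methods `naivePermits` and `reservedPermits`, and the `inflightReads` counter are all hypothetical names introduced for illustration, and `QUEUE_SIZE` merely stands in for `replicationProducerQueueSize`. The sketch assumes that the second read computes its permits before the first read's messages are counted as pending, and it contrasts that with one possible accounting scheme that reserves permits as soon as a read is issued. The latter is shown only as an illustration and is not claimed to be what this PR implements.

```java
/** Toy model of the over-read scenario; all names here are hypothetical. */
class ReplicatorPermitSketch {
    // Stand-in for replicationProducerQueueSize (assumed value from the scenario).
    static final int QUEUE_SIZE = 1000;

    /** Permits derived only from messages already counted as pending. */
    static int naivePermits(int pendingMessages) {
        return Math.max(0, QUEUE_SIZE - pendingMessages);
    }

    /** Permits that also subtract entries already requested but not yet published. */
    static int reservedPermits(int pendingMessages, int inflightReads) {
        return Math.max(0, QUEUE_SIZE - pendingMessages - inflightReads);
    }

    /** Two reads are triggered before either one updates pendingMessages. */
    static int simulateNaive() {
        int pendingMessages = 0;
        int readA = naivePermits(pendingMessages); // read A asks for 1000
        int readB = naivePermits(pendingMessages); // read B also asks for 1000
        return readA + readB;                      // total handed to the producer
    }

    /** Same two triggers, but each read reserves its permits immediately. */
    static int simulateReserved() {
        int pendingMessages = 0;
        int inflightReads = 0;
        int readA = reservedPermits(pendingMessages, inflightReads);
        inflightReads += readA;                    // reserve before the read completes
        int readB = reservedPermits(pendingMessages, inflightReads);
        inflightReads += readB;
        return readA + readB;
    }

    public static void main(String[] args) {
        System.out.println("naive total in publishing: " + simulateNaive());
        System.out.println("reserved total in publishing: " + simulateReserved());
    }
}
```

In this model the naive accounting hands 2000 messages to a queue sized for 1000, which matches the overflow described above, while the reserving variant stays within the limit.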
Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: x