Partitions in Zeebe are stuck at 100% backpressure forever #4482
Comments
[Update] I had to manually analyse the topology and restart the throttling leader partition brokers one by one (a topology check along these lines is sketched below). I noticed a pattern: we see this state whenever we run a backup.
The ticket was incorrectly opened for Camunda 7. The user has already reported it for Camunda 8: camunda/camunda#20126
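For reference, the topology check mentioned in the update above could look roughly like the following, using the Zeebe Java client. This is a minimal sketch, not taken from the report: the gateway address and plaintext connection are assumptions and would need to match the actual cluster.

```java
import io.camunda.zeebe.client.ZeebeClient;
import io.camunda.zeebe.client.api.response.Topology;

public class TopologyCheck {
  public static void main(String[] args) {
    // Assumed gateway address; replace with the real gateway of the cluster.
    try (ZeebeClient client = ZeebeClient.newClientBuilder()
        .gatewayAddress("localhost:26500")
        .usePlaintext()
        .build()) {

      Topology topology = client.newTopologyRequest().send().join();

      // Print the leader broker and reported health for every partition.
      topology.getBrokers().forEach(broker ->
          broker.getPartitions().forEach(partition -> {
            if (partition.isLeader()) {
              System.out.printf("partition %d: leader=%s health=%s%n",
                  partition.getPartitionId(),
                  broker.getAddress(),
                  partition.getHealth());
            }
          }));
    }
  }
}
```

In the case reported here the partitions showed as healthy while still rejecting requests, so the restart candidates were identified via the backpressure metrics per leader rather than via the reported partition health.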
Environment (Required on creation)
Zeebe: 8.5.2
Total Partitions: 16
Nodes: 8
Each Zeebe node is a 16 GB / 4-core pod
Description (Required on creation; please attach any relevant screenshots, stacktraces, log files, etc. to the ticket)
We have noticed that some partitions permanently start firing 100% backpressure even though the load is limited.
We see all partitions as healthy, but the backpressure percentage is 100 for some of the partitions.
From the metric observations I see this:
- Jobs activated per second is also 0 (a minimal activation probe is sketched below)
- PVC / CPU / memory usage is normal
Some stack traces from one of the brokers:
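Since jobs activated per second drops to 0, one way to confirm that activation is actually stalled (rather than simply not being measured) is a one-off activation request against the cluster. Below is a minimal sketch with the Zeebe Java client; the gateway address and job type are placeholders, not taken from this report.

```java
import io.camunda.zeebe.client.ZeebeClient;
import io.camunda.zeebe.client.api.response.ActivateJobsResponse;
import java.time.Duration;

public class ActivationProbe {
  public static void main(String[] args) {
    // Assumed gateway address; replace with the real gateway of the cluster.
    try (ZeebeClient client = ZeebeClient.newClientBuilder()
        .gatewayAddress("localhost:26500")
        .usePlaintext()
        .build()) {

      ActivateJobsResponse response = client.newActivateJobsCommand()
          .jobType("some-job-type")          // hypothetical job type, replace as needed
          .maxJobsToActivate(1)
          .requestTimeout(Duration.ofSeconds(10))
          .send()
          .join();

      // With a partition throttled at 100% backpressure, this typically returns
      // no jobs, or the request may be rejected with RESOURCE_EXHAUSTED.
      System.out.println("activated jobs: " + response.getJobs().size());
    }
  }
}
```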
Steps to reproduce (Required on creation)
Not really sure; it appears to correlate with running a backup (see the update comment above).
Observed Behavior (Required on creation)
Partitions are stuck at 100% backpressure and the system is not responding.
Expected behavior (Required on creation)
Backpressure should be released automatically and the partitions should start accepting requests again.
Root Cause (Required on prioritization)
Solution Ideas
Hints
Links
Logs from when this happened are attached. Note that the log timestamps are in UTC; the diagrams use IST, i.e. UTC + 5:30. Sorry for this.
logs-insights-results (4).csv
Breakdown
Pull Requests
Dev2QA handover