[alerter]bugfix: preserve group alert recovery events #4168
Open
hutiefang76 wants to merge 1 commit into
Pull request overview
This PR fixes two alert-group lifecycle edge cases in the alerter pipeline: ensuring recovery (resolved) events are not suppressed by the firing repeat-interval throttle, and ensuring persisted group status reflects the aggregate state of all known member alerts (not just the current push window).
Changes:
- Bypass the group repeat-interval throttle when the group payload includes any resolved member alerts.
- Recompute a group’s persisted status in the DB store layer from all member alert statuses (by fingerprint), and add regression tests for both reducer and store paths.
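The first change can be illustrated with a minimal sketch. This is not the code from `AlarmGroupReduce`; the class name, method name, parameters, and the `"firing"`/`"resolved"` status strings are assumptions chosen for illustration. It only shows the decision described above: a group payload containing any resolved member is dispatched regardless of the repeat interval.

```java
import java.util.List;

// Hypothetical, simplified model of the throttle decision; not the PR's actual classes.
class RepeatThrottleSketch {
    // Assumed status values, used here for illustration only.
    static final String FIRING = "firing";
    static final String RESOLVED = "resolved";

    /**
     * Returns true when the group should be dispatched now.
     *
     * @param memberStatuses   statuses of the alerts in the current group payload
     * @param lastSentMs       time the group was last dispatched, in milliseconds
     * @param nowMs            current time, in milliseconds
     * @param repeatIntervalMs configured repeat interval, in milliseconds
     */
    static boolean shouldDispatch(List<String> memberStatuses,
                                  long lastSentMs, long nowMs, long repeatIntervalMs) {
        boolean hasResolved = memberStatuses.stream().anyMatch(RESOLVED::equals);
        if (hasResolved) {
            // Recovery events bypass the repeat-interval throttle.
            return true;
        }
        // Firing-only payloads are still throttled by the repeat interval.
        return nowMs - lastSentMs >= repeatIntervalMs;
    }
}
```

Under these assumptions, a payload of one firing and one resolved alert sent one second after the previous dispatch is let through, while a firing-only payload at the same moment is held back until the interval elapses.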
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| hertzbeat-alerter/src/main/java/org/apache/hertzbeat/alert/reduce/AlarmGroupReduce.java | Adjusts repeat-throttle behavior so mixed firing groups with resolved members are still dispatched. |
| hertzbeat-alerter/src/main/java/org/apache/hertzbeat/alert/notice/impl/DbAlertStoreHandlerImpl.java | Recomputes persisted group status from stored member alerts before saving. |
| hertzbeat-alerter/src/test/java/org/apache/hertzbeat/alert/reduce/AlarmGroupReduceTest.java | Adds a regression test for “resolved bypasses repeat throttle”. |
| hertzbeat-alerter/src/test/java/org/apache/hertzbeat/alert/notice/impl/DbAlertStoreHandlerImplTest.java | Adds a regression test ensuring group status stays firing when other members still fire. |
Comment on lines 294 to 296

    AlertGroupConverge ruleConfig = groupDefines.get(cache.getGroupDefineName());
    long repeatInterval = ruleConfig.getRepeatInterval() != null
            ? ruleConfig.getRepeatInterval() * MS_PER_SECOND : DEFAULT_REPEAT_INTERVAL;
Comment on lines +142 to +147

    List<SingleAlert> alerts = singleAlertDao.findSingleAlertsByFingerprintIn(alertFingerprints);
    if (alerts == null) {
        return;
    }
    boolean hasFiringAlert = alerts.stream()
            .anyMatch(alert -> CommonConstants.ALERT_STATUS_FIRING.equals(alert.getStatus()));
Comment on lines 128 to 131

    // Save alert group
    groupAlert.setAlertFingerprints(alertFingerprints.stream().toList());
    refreshGroupStatus(groupAlert);
    GroupAlert savedGroupAlert = groupAlertDao.save(groupAlert);
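The status recomputation performed before the save can be sketched as follows. This is a simplified illustration, not the body of `refreshGroupStatus`: the class and method names, the plain-string statuses, and the handling of an empty result (keeping the current status) are assumptions. The only behavior taken from the snippets above is that a null lookup result leaves the status untouched and that any firing member is detected with `anyMatch`.

```java
import java.util.List;

// Hypothetical, simplified model of aggregating a group's status from its members.
class GroupStatusSketch {
    // Assumed status values, used here for illustration only.
    static final String FIRING = "firing";
    static final String RESOLVED = "resolved";

    /**
     * Derives the group status from the statuses of all stored member alerts.
     *
     * @param memberStatuses statuses of every stored alert whose fingerprint belongs to the group
     * @param currentStatus  status currently set on the group
     */
    static String aggregateStatus(List<String> memberStatuses, String currentStatus) {
        if (memberStatuses == null || memberStatuses.isEmpty()) {
            // Assumption: with no known members there is nothing to aggregate,
            // so the current status is kept.
            return currentStatus;
        }
        boolean hasFiringAlert = memberStatuses.stream().anyMatch(FIRING::equals);
        // The group stays firing while at least one member still fires.
        return hasFiringAlert ? FIRING : RESOLVED;
    }
}
```

With this shape, a push window that contains only resolved alerts does not flip the group to resolved as long as another stored member is still firing, which is the case the store-layer regression test describes.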
What's changed
Close #4160.
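This fixes two group alert lifecycle cases:
- Recovery (resolved) events in a group are no longer suppressed by the firing repeat-interval throttle.
- The persisted group status is recomputed from all known member alerts, not just those in the current push window.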
Regression tests cover both the in-memory group reducer path and the database store path.
Verification
    JAVA_HOME=$(/usr/libexec/java_home -v 25 2>/dev/null || /usr/libexec/java_home) ./mvnw -pl hertzbeat-alerter -am test
    git diff --check