-
Notifications
You must be signed in to change notification settings - Fork 3
update-readme #1849
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update-readme #1849
Conversation
6daa569 to
c88b18b
Compare
| for: 15m | ||
| labels: | ||
| severity: none | ||
| severity: notify |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure it's related to the "update readme" PR title 😅
More seriously, the goal of an alert with severity none is that we can spot easily that something is fishy when looking at alertmanager or alerts timeline dashboard, without having it unnecessarily send a page or a Slack notif.
Is there a specific need for this change?
If yes, please don't do it hidden in an unrelated PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, no, I got a bit too eager with my change. We use this severity: none but it was not defined in the readme nor in pint.
Then I figured it would be best that only inhibitions and heartbeats have the severity none as having it set to notify for this alert would not really change anything
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would make it send alerts to Slack, and make the alerts channels unreadable (yes, some of us read them 😁 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but then, what do we do with that? Should we make it a ticket alert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but then, what do we do with that? Should we make it a ticket alert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This specific alert?
Maybe just discard it.
tempo/update.sh
Outdated
| # This one fires for 2 days when a new cluster is created, | ||
| # so we don't want it to page until we run tempo all around. | ||
| rule="$(echo "$rule" \ | ||
| | sed 's/"severity": "page"/"severity": "none"/' \ | ||
| | sed 's/"severity": "page"/"severity": "notify"/' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we say we don't want it to page, I guess we don't want it to spam Slack either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, please check my other comment for the reasonning :)
|
Looks like there's 2 PRs in one:
|
|
I think it's actually 3, with the tempo alert change... I'll split this in 3 |
helm/prometheus-rules/templates/platform/atlas/alerting-rules/tempo-mixins.rules.yml
Outdated
Show resolved
Hide resolved
…tempo-mixins.rules.yml
|
@hervenicol it should be better now |
helm/prometheus-rules/templates/platform/atlas/alerting-rules/mimir.rules.yml
Outdated
Show resolved
Hide resolved
…mimir.rules.yml Co-authored-by: Hervé Nicol <[email protected]>
Before adding a new alerting rule into this repository you should consider creating an SLO rules instead.
SLO helps you both increase the quality of your monitoring and reduce the alert noise.
This PR updates and fixes the readme a bit + adds the new ticket alerts information.
It also cleans up some minor things
Checklist
oncall-kaas-cloudGitHub group).