This repository has been archived by the owner on Sep 22, 2024. It is now read-only.

Noisy alerts on client and server response time metrics #780

Open
AAkindele opened this issue May 27, 2021 · 0 comments
Labels
Bug Something isn't working PointDev

Comments

@AAkindele
Contributor

AAkindele commented May 27, 2021

Alerts are firing a lot. Initial investigation of the data suggests that the thresholds are set too low.
Example: HighClientResponseTimeWestUS2-cosmos has a threshold of 178ms. The data for the past month shows that the 98th percentile is around 178ms, so roughly 2% of measurements exceed the threshold.

The affected alerts are the HighClientResponseTime and HighServerResponseTime alerts for all 3 clusters.

Action Items:

  • Discuss as a team the criteria for identifying an acceptable threshold for these response time alerts.
  • Use these criteria to periodically evaluate whether the alert thresholds are still valid.
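
As a starting point for that discussion, here is a minimal Python sketch of one possible criterion: derive the threshold from a high percentile of historical response times plus some headroom, and re-check periodically how often the current threshold is exceeded. The function names, the 99.9th percentile default, and the 1.2 headroom factor are all assumptions for illustration, not values agreed on by the team or taken from the alert configuration.

```python
import math


def percentile(samples_ms, pct):
    """Nearest-rank percentile of a list of response times in milliseconds."""
    if not samples_ms:
        raise ValueError("samples_ms must not be empty")
    ordered = sorted(samples_ms)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def exceed_fraction(samples_ms, threshold_ms):
    """Fraction of samples strictly above the threshold."""
    if not samples_ms:
        raise ValueError("samples_ms must not be empty")
    return sum(1 for s in samples_ms if s > threshold_ms) / len(samples_ms)


def suggest_threshold(samples_ms, pct=99.9, headroom=1.2):
    """Hypothetical criterion: a high percentile scaled by a headroom factor."""
    return percentile(samples_ms, pct) * headroom


def threshold_still_valid(samples_ms, threshold_ms, max_exceed_fraction=0.001):
    """Periodic check: is the threshold exceeded no more often than the allowed budget?"""
    return exceed_fraction(samples_ms, threshold_ms) <= max_exceed_fraction
```

With a threshold sitting at the 98th percentile, as in the example above, `exceed_fraction` would come out near 0.02, which would fail `threshold_still_valid` under the assumed 0.1% budget.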
@AAkindele AAkindele added Bug Something isn't working PointDev labels May 27, 2021