# Investigation: Migrate the existing alerts to mimir alertmanager #3746
I did manage to deploy Mimir's Alertmanager and configure it in Grafana, but I have not yet been able to load alerts in it. Here are the steps taken so far:
**Mimir values diff**

```diff
--- values-mimir.original.yaml 2024-11-13 10:59:57.206365528 +0100
+++ values-mimir.yaml 2024-11-14 09:13:24.947664850 +0100
@@ -1,3 +1,4 @@
+USER-SUPPLIED VALUES:
hpa:
distributor:
enabled: true
@@ -36,6 +37,8 @@
enabled: true
image:
repository: gsoci.azurecr.io/giantswarm/mimir-continuous-test
+ alertmanager:
+ enabled: true
distributor:
replicas: 1
resources:
@@ -102,6 +105,36 @@
value: golem
- key: name
value: giantswarm-golem-mimir-ruler
+ - apiVersion: objectstorage.giantswarm.io/v1alpha1
+ kind: Bucket
+ metadata:
+ annotations:
+ meta.helm.sh/release-name: mimir
+ meta.helm.sh/release-namespace: mimir
+ labels:
+ app.kubernetes.io/instance: mimir-common
+ app.kubernetes.io/managed-by: Helm
+ app.kubernetes.io/name: mimir-common
+ application.giantswarm.io/team: atlas
+ name: giantswarm-golem-mimir-common
+ namespace: mimir
+ spec:
+ accessRole:
+ extraBucketNames:
+ - giantswarm-golem-mimir
+ roleName: giantswarm-golem-mimir
+ serviceAccountName: mimir
+ serviceAccountNamespace: mimir
+ expirationPolicy:
+ days: 100
+ name: giantswarm-golem-mimir-common
+ tags:
+ - key: app
+ value: mimir
+ - key: installation
+ value: golem
+ - key: name
+ value: giantswarm-golem-mimir-common
gateway:
autoscaling:
enabled: true
@@ -186,6 +219,7 @@
storage:
backend: s3
s3:
+ bucket_name: giantswarm-golem-mimir-common
endpoint: s3.eu-west-2.amazonaws.com
region: eu-west-2
distributor:
@@ -208,7 +242,7 @@
ruler_max_rule_groups_per_tenant: 0
ruler_max_rules_per_rule_group: 0
ruler:
- alertmanager_url: http://alertmanager-operated.monitoring:9093
+ alertmanager_url: "http://mimir-alertmanager.mimir.svc:8080/alertmanager"
ruler_storage:
s3:
      bucket_name: giantswarm-golem-mimir-ruler
```
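With values like the ones above applied, a quick way to check that the Alertmanager component actually came up is to look at its pods and logs. The namespace, label, and workload name below are assumptions based on the diff, not verified names from this installation:

```sh
# Assumed names: chart installed as "mimir" in the "mimir" namespace,
# Alertmanager pods labelled by the mimir-distributed chart.
kubectl -n mimir get pods -l app.kubernetes.io/component=alertmanager

# Check that Alertmanager started and found its S3 bucket
# (the statefulset name is an assumption).
kubectl -n mimir logs sts/mimir-alertmanager --tail=100
```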
**Grafana values diff**

```diff
--- values-grafana.original.yaml 2024-11-13 09:36:10.332740268 +0100
+++ values-grafana.yaml 2024-11-14 10:52:40.833635135 +0100
@@ -72,11 +72,14 @@
- name: Mimir Alertmanager
type: alertmanager
uid: mimir-alertmanager
- url: http://mimir-alertmanager.mimir.svc/alertmanager
+ url: http://mimir-alertmanager.mimir.svc:8080/alertmanager
access: proxy
jsonData:
handleGrafanaManagedAlerts: false
- implementation: mimir
+ implementation: prometheus
+ httpHeaderName1: X-Scope-OrgID
+ secureJsonData:
+ httpHeaderValue1: 1
kind: ConfigMap
metadata:
     annotations:
```
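To rule out connectivity or tenancy problems independently of Grafana, one option is to hit the Mimir Alertmanager API directly with the same `X-Scope-OrgID` header the datasource sends. This is a sketch assuming the service name, port, and tenant `1` from the diffs above:

```sh
# Port-forward the (assumed) mimir-alertmanager service locally
kubectl -n mimir port-forward svc/mimir-alertmanager 8080:8080 &

# Query the Alertmanager v2 status endpoint behind the /alertmanager prefix,
# sending the tenant header configured on the datasource
curl -H "X-Scope-OrgID: 1" http://localhost:8080/alertmanager/api/v2/status
```

If the tenant has no Alertmanager configuration and no fallback config is set, this endpoint typically returns an error rather than a status document, which would match the empty UI described below.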
We decided to use a dedicated service account for
... As well as the
However, we encountered 2 issues:
In order for us to be able to go on in that direction without using too many workarounds, we'll need to wait for the mimir helm chart's next release.
Do we need a custom bucket for this? I think we could use the ruler bucket, right?
The issue is the same: we need the next mimir release, because even the possibility of choosing the serviceAccount for the
I think it's better to have some data segregation, both at a logical level and from a security point of view.

Concerning the tests done on
Unfortunately, even though the pod runs flawlessly, no notification policies nor any contact points are displayed on
Did you check the datasource works? Maybe the Grafana logs can help?
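For the log check suggested above, something along these lines can surface datasource proxy errors; the Grafana deployment name and namespace here are assumptions:

```sh
# Deployment and namespace names are placeholders for this sketch
kubectl -n monitoring logs deploy/grafana --since=15m | grep -iE 'alertmanager|datasource'
```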
So after checking in the Grafana UI, the mimir-alertmanager datasource isn't working, so I manually created one that works. However, even with this working datasource, there are still no contact points or notification policies associated with it, and neither the mimir-alertmanager pod nor the Grafana one gives useful insight into why.
This is not going to work :D

`ruler:`

Now it makes sense that we have no contact points, as we currently do not have an alert template configured. See `alertmanager_fallback_config.yaml`:
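For reference, a fallback config is just a regular Alertmanager configuration that Mimir serves to tenants that have not uploaded their own. A minimal sketch (the receiver name and timings are placeholders, not the values actually used here):

```yaml
# Minimal Alertmanager configuration usable as a Mimir fallback config (sketch)
route:
  receiver: default-receiver
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
receivers:
  - name: default-receiver
```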
Yeah, I noticed that in the meantime and moved the
It's a default alertmanager config, so the one from the old alertmanager should work.
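To actually migrate the existing configuration per tenant, one option is `mimirtool`, which can validate and upload an Alertmanager config to Mimir's Alertmanager API. The address, tenant ID, and file name below are assumptions based on this thread:

```sh
# Validate the existing Alertmanager config locally (file name is a placeholder)
mimirtool alertmanager verify ./alertmanager.yaml

# Upload it for tenant "1" to the (assumed) in-cluster Alertmanager endpoint
mimirtool alertmanager load ./alertmanager.yaml \
  --address=http://mimir-alertmanager.mimir.svc:8080 \
  --id=1

# Read it back to confirm it was stored
mimirtool alertmanager get --address=http://mimir-alertmanager.mimir.svc:8080 --id=1
```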
Motivation
We already have a long set of alerts that are managed by the prometheus alertmanager. We need to make sure they will also work in the mimir alertmanager.
Todo
Outcome