What's wrong?
I am migrating our metrics platform from Prometheus to Mimir + Alloy (deployment).
Initially, after deploying Alloy into our EKS cluster, everything seemed to work great.
However, after about 30 minutes, I started seeing a lot of err-mimir-duplicate-label-names errors for random metrics and label names.
Additionally, some metrics have significantly different values compared to our existing Prometheus system's output.
Also, I found a similar issue #1006, so I tried two approaches to resolve my issue, but both failed (one of them involved the instance label, see #1009).
After these tests, I switched back to our existing Prometheus with remote_write into Mimir, and the metrics in Mimir have looked correct so far. Therefore, I believe Mimir's configuration is correct.
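For context, label-rewriting workarounds like the ones attempted are usually expressed in Alloy as a prometheus.relabel stage sitting between the operator components and remote_write. The following is only a minimal sketch of that wiring; the rule shown (dropping a hypothetical colliding "exported_status" label) is an assumption for illustration, not the exact workaround from #1006 or #1009:

// Hypothetical relabel stage: servicemonitors/podmonitors would forward here
// instead of directly to prometheus.remote_write.mimir.receiver.
prometheus.relabel "dedup_labels" {
  forward_to = [prometheus.remote_write.mimir.receiver]

  rule {
    // Drop a label that collides with a target label of the same name.
    // "exported_status" is a placeholder; the real colliding label would
    // need to be identified from the failing series first.
    action = "labeldrop"
    regex  = "exported_status"
  }
}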
Steps to reproduce
Deploy the alloy and mimir-distributed Helm charts into EKS.
Configure Alloy to scrape metrics from the Prometheus Operator CRDs (ServiceMonitors and PodMonitors).
After running for about 30 minutes, err-mimir-duplicate-label-names errors start appearing in the logs.
System information
EKS 1.29
Software version
Alloy v1.4.3 & Alloy v1.5.1 / Mimir 2.14.0
Configuration
alloy:
  configMap:
    # -- Create a new ConfigMap for the config file.
    create: true
    # -- Content to assign to the new ConfigMap. This is passed into `tpl` allowing for templating from values.
    content: |-
      logging {
        level  = "warn"
        format = "json"
      }
      prometheus.remote_write "mimir" {
        // Send metrics to a Mimir instance
        endpoint {
          url = "http://_http-metrics._tcp.mimir-gateway.metrics.svc.cluster.local/api/v1/push"
          queue_config {
            sample_age_limit = "5m"
          }
        }
      }
      // import the service monitors
      prometheus.operator.servicemonitors "services" {
        forward_to = [prometheus.remote_write.mimir.receiver]
        // this is the default scrape interval for all service monitors;
        // decreasing this value increases the load on the Mimir write path
        scrape {
          default_scrape_interval = "60s"
        }
        clustering {
          enabled = true
        }
      }
      // import the pod monitors
      prometheus.operator.podmonitors "pods" {
        forward_to = [prometheus.remote_write.mimir.receiver]
        // this is the default scrape interval for all pod monitors;
        // decreasing this value increases the load on the Mimir write path
        scrape {
          default_scrape_interval = "60s"
        }
        clustering {
          enabled = true
        }
      }
      // import the prometheus rules
      mimir.rules.kubernetes "rules" {
        address = "http://_http-metrics._tcp.mimir-gateway.metrics.svc.cluster.local/"
      }
  clustering:
    # -- Deploy Alloy in a cluster to allow for load distribution.
    enabled: true
  extraEnv:
    - name: "GOMEMLIMIT"
      value: "1.8GiB"
    - name: "GOGC"
      value: "95"
  resources:
    requests:
      cpu: "200m"
      memory: "3Gi"
    limits:
      cpu: "1"
      memory: "3Gi"
image:
  # -- Grafana Alloy image registry (defaults to docker.io)
  registry: "docker.io"
  # -- Grafana Alloy image repository.
  repository: grafana/alloy
  # -- (string) Grafana Alloy image tag. When empty, the Chart's appVersion is used.
  tag: v1.5.1
controller:
  # -- Type of controller to use for deploying Grafana Alloy in the cluster.
  # Must be one of 'daemonset', 'deployment', or 'statefulset'.
  type: 'deployment'
  # -- Number of pods to deploy. Ignored when controller.type is 'daemonset'.
  replicas: 4
  # -- PodDisruptionBudget configuration.
  podDisruptionBudget:
    # -- Whether to create a PodDisruptionBudget for the controller.
    enabled: true
    # -- Maximum number of pods that can be unavailable during a disruption.
    # Note: Only one of minAvailable or maxUnavailable should be set.
    maxUnavailable: 1
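As an aside, if a colliding label had to be handled before samples reach Mimir, an alternative to a separate relabel component is a write_relabel_config block on the remote_write endpoint itself. This is a hedged sketch based on the configuration above, not part of the original report; the "exported_status" regex is a placeholder:

prometheus.remote_write "mimir" {
  endpoint {
    url = "http://_http-metrics._tcp.mimir-gateway.metrics.svc.cluster.local/api/v1/push"

    // write_relabel_config rules run on every series just before it is sent,
    // so a conflicting label can be dropped here without touching the scrape setup.
    write_relabel_config {
      action = "labeldrop"
      regex  = "exported_status" // hypothetical label name, for illustration only
    }

    queue_config {
      sample_age_limit = "5m"
    }
  }
}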
Logs
server returned HTTP status 400 Bad Request: received a series with duplicate label name, label: 'status' series: 'nginx_ingress_controller_bytes_sent_sum{container="controller", controller_class="k8s.io/ingress-nginx", controller_namespace="ingress-nginx", controller_pod="ingress-nginx-service-controller-****", ' (err-mimir-duplicate-label-names)
server returned HTTP status 400 Bad Request: received a series with duplicate label name, label: 'zone' series: 'coredns_dns_request_size_bytes_count{container="node-cache", endpoint="metrics", instance="******:9253", job="node-local-dns", namespace="kube-system", pod="node-local-dns-***", proto="udp", ' (err-mimir-duplicate-label-names)