Describe the bug
There seems to be a network config preventing inbound and outbound communication among the pods, e.g. the workloads below won't come up because they can't reach dependent services.
Sampling the compactor logs:
level=error ts=2025-01-06T22:41:49.341622742Z caller=memcached_client.go:183 msg="error setting memcache servers to host" host=grafana-tempo-memcached err="lookup _memcached-client._tcp.grafana-tempo-memcached on 10.43.0.10:53: read udp 10.42.5.127:47106->10.43.0.10:53: read: connection refused"
level=warn ts=2025-01-06T22:42:54.346528774Z caller=memcached_client.go:257 msg="error updating memcache servers" err="lookup _memcached-client._tcp.grafana-tempo-memcached on 10.43.0.10:53: read udp 10.42.5.127:56758->10.43.0.10:53: read: connection refused"
level=warn ts=2025-01-06T22:43:54.346734867Z caller=memcached_client.go:257 msg="error updating memcache servers" err="lookup _memcached-client._tcp.grafana-tempo-memcached on 10.43.0.10:53: read udp 10.42.5.127:55790->10.43.0.10:53: read: connection refused"
level=error ts=2025-01-06T22:44:04.402619173Z caller=main.go:122 msg="error running Tempo" err="failed to init module services: error initialising module: store: failed to create store: unexpected error from ListObjects on tempo-trace: Get \"http://monitoring-minio.cattle-monitoring-system.svc.cluster.local:9000/tempo-trace/?delimiter=%2F&encoding-type=url&prefix=\": dial tcp: lookup monitoring-minio.cattle-monitoring-system.svc.cluster.local on 10.43.0.10:53: read udp 10.42.5.127:32852->10.43.0.10:53: read: connection refused"
10.43.0.10 is the IP of the kube-dns service.
CoreDNS is up, and none of the other non-Grafana charts deployed in the cluster have this issue, including Mimir from Bitnami.
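For reference, a minimal way to narrow down whether DNS is actually reachable from the failing pods (sketch only: it assumes the compactor Deployment is named grafana-tempo-compactor in cattle-monitoring-system and that its image ships nslookup):

# Check that the kube-dns Service has ready endpoints
kubectl -n kube-system get endpoints kube-dns
# Try the same lookup from inside one of the failing Tempo pods
kubectl -n cattle-monitoring-system exec deploy/grafana-tempo-compactor -- \
  nslookup monitoring-minio.cattle-monitoring-system.svc.cluster.local 10.43.0.10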
A curl to monitoring-minio from any other pod gives something similar to the output below, which shows a successful connection:
> curl -v http://monitoring-minio.cattle-monitoring-system.svc.cluster.local:9000/tempo-trace/?delimiter=%2F&encoding-type=url&prefix=
* Host monitoring-minio.cattle-monitoring-system.svc.cluster.local:9000 was resolved.
* IPv6: (none)
* IPv4: 10.43.89.23
* Trying 10.43.89.23:9000...
* Connected to monitoring-minio.cattle-monitoring-system.svc.cluster.local (10.43.89.23) port 9000
> GET /tempo-trace/?delimiter=%2F HTTP/1.1
> Host: monitoring-minio.cattle-monitoring-system.svc.cluster.local:9000
> User-Agent: curl/8.6.0
> Accept: */*
>
< HTTP/1.1 403 Forbidden
< ...truncated...
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied.</Message><BucketName>tempo-trace</BucketName><Resource>/tempo-trace/</Resource><RequestId>18183C29DDA7405B</RequestId><Hos
* Connection #0 to host monitoring-minio.cattle-monitoring-system.svc.cluster.local left intact
Even the pods that are active, like the gateway, cannot reach any service/pod and cannot be reached either, e.g. from within the gateway pod:
~ $ curl -v http://monitoring-minio.cattle-monitoring-system.svc.cluster.local:9000/tempo-trace/?delimiter=%2F&encoding-type=url&prefix=
~ $ /bin/sh: encoding-type=url: not found
* Could not resolve host: monitoring-minio.cattle-monitoring-system.svc.cluster.local
* Could not resolve host: monitoring-minio.cattle-monitoring-system.svc.cluster.local
* closing connection #0
curl: (6) Could not resolve host: monitoring-minio.cattle-monitoring-system.svc.cluster.local
[2]+ Done(127) encoding-type=url
[1]+ Done(6) curl -v http://monitoring-minio.cattle-monitoring-system.svc.cluster.local:9000/tempo-trace/?delimiter=%2F
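Side note: because the URL isn't quoted, the shell treats each "&" as a background operator and splits the command, which is where the "encoding-type=url: not found" and "Done(127)" lines above come from. Quoting the URL avoids that, though the real failure is still the host resolution:

# Same request with the URL quoted so the shell doesn't split it at "&"
curl -v 'http://monitoring-minio.cattle-monitoring-system.svc.cluster.local:9000/tempo-trace/?delimiter=%2F&encoding-type=url&prefix='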
Reaching memcached from a third-party pod:
> curl -v grafana-tempo-memcached.cattle-monitoring-system.svc.cluster.local:11211
* Host grafana-tempo-memcached.cattle-monitoring-system.svc.cluster.local:11211 was resolved.
* IPv6: (none)
* IPv4: 10.43.233.164
* Trying 10.43.233.164:11211...
* connect to 10.43.233.164 port 11211 from 10.42.2.17 port 42042 failed: Connection refused
* Failed to connect to grafana-tempo-memcached.cattle-monitoring-system.svc.cluster.local port 11211 after 12 ms: Couldn't connect to server
* Closing connection
curl: (7) Failed to connect to grafana-tempo-memcached.cattle-monitoring-system.svc.cluster.local port 11211 after 12 ms: Couldn't connect to server
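Since a connection refused on the Service IP suggests the lookup worked but nothing is accepting connections behind the Service, it may be worth checking whether the memcached Service has ready endpoints (sketch only; the label selector is a guess and may need adjusting to whatever the chart actually sets):

# Does the memcached Service have any ready endpoints behind it?
kubectl -n cattle-monitoring-system get endpoints grafana-tempo-memcached
# Are the memcached pods Running and Ready? (label selector is a guess)
kubectl -n cattle-monitoring-system get pods -l app.kubernetes.io/name=memcached -o wide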
Reaching Bitnami Mimir's memcached from a third-party pod, however, is not a problem, whereas Grafana's mimir-distributed also gave a connection refused error when I tried it.
> curl -v bitnami-mimir-memcachedchunks.cattle-monitoring-system.svc.cluster.local:11211
* Host bitnami-mimir-memcachedchunks.cattle-monitoring-system.svc.cluster.local:11211 was resolved.
* IPv6: (none)
* IPv4: 10.43.126.169
* Trying 10.43.126.169:11211...
* Connected to bitnami-mimir-memcachedchunks.cattle-monitoring-system.svc.cluster.local (10.43.126.169) port 11211
> GET / HTTP/1.1
> Host: bitnami-mimir-memcachedchunks.cattle-monitoring-system.svc.cluster.local:11211
> User-Agent: curl/8.6.0
> Accept: */*
>
* Empty reply from server
* Closing connection
curl: (52) Empty reply from server
I'll just focus on Tempo for this issue, as I already gave up on Mimir and (successfully) deployed the Bitnami chart instead.
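If it helps whoever looks at this: since the symptom is specific pods being unable to talk to anything rather than DNS being down cluster-wide, it might also be worth listing any NetworkPolicies in the namespace; a sketch of what I mean:

# Any NetworkPolicy selecting these pods could explain blocked pod-to-pod traffic
kubectl -n cattle-monitoring-system get networkpolicy
kubectl -n cattle-monitoring-system describe networkpolicy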
What is the expected behavior?
Application workloads come up and can reach each other.
What do you see instead?
Connection refused in pod-service communication.
Cluster Details
architecture: amd64
version: v1.31.3
provider: k3s
Chart Details
Helm Values