Ingesters not equally balanced across cluster #5329
Unanswered
DanielKhaikin asked this question in Q&A

I deployed Grafana Tempo 2.8.1 on Kubernetes using the latest chart, with 15 ingesters and a 30 GB memory limit per pod. For some reason the ingesters' memory usage is not balanced at all: some pods approach 30 GB and get OOMKilled, while others never cross 10 GB.

After reading up on the consistent hash ring used by the ingesters and distributors, I found that distributors talk to ingesters over gRPC, and gRPC load balancing is known to be problematic on Kubernetes. Is there an option to switch to otlphttp, or is there some other fix for this issue?

Replies: 1 comment

-
Ingest imbalance can be affected by many factors. Traces are sharded by tenant / trace ID using the consistent hash ring, so this is not a gRPC load-balancing issue. Off the top of my head, things that can contribute to imbalance are tenant ring sizes, long-running/large traces, and poorly distributed trace IDs. One option would be to increase the number of tokens an ingester puts in the ring, along the lines of the sketch below:
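A minimal sketch, assuming Tempo's standard dskit-based lifecycler configuration (the exact key location may differ between versions, so check the configuration docs for your release); the value 512 is purely illustrative:

```yaml
# Sketch of a Tempo config override: raise the ingester's token count.
ingester:
  lifecycler:
    num_tokens: 512   # default is 128; more tokens per ingester tends to
                      # smooth out the hash-ring distribution
```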
This likely has more to do with data write patterns than with token count, but it's worth a shot, I suppose. Also, we prefer more, smaller ingesters to fewer, larger ones. 30 GB ingesters are quite big; I'd scale out until you are under 10 GB each, for example as in the sketch below.
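Purely as an illustration, assuming the tempo-distributed Helm chart (the replica count and memory figures below are hypothetical, not values given in the reply):

```yaml
# values.yaml sketch: trade 15 x 30 GB ingesters for more, smaller ones.
ingester:
  replicas: 45
  resources:
    requests:
      memory: 7Gi
    limits:
      memory: 10Gi   # keep each ingester well under the previous 30 GB ceiling
```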