Ingesters not equally balanced across cluster #5329
Unanswered
DanielKhaikin asked this question in Q&A

I deployed Grafana Tempo 2.8.1 on Kubernetes using the latest chart, with 15 ingesters and a 30 GB memory limit per pod. For some reason the ingesters' memory usage is not balanced at all: some pods approach 30 GB and get OOMKilled, while others never cross 10 GB.

After reading up on the consistent hash ring used by the ingesters and distributors, I found that distributors talk to ingesters over gRPC, and gRPC load balancing is known to be problematic on Kubernetes. Is there an option to switch to otlphttp, or is there some other fix for this issue?

Replies: 1 comment

-
Ingest imbalance can be affected by many factors. Traces are sharded by tenant / trace ID using the consistent hash ring, so this is not a gRPC load-balancing issue. Off the top of my head, things that can contribute to imbalance are tenant ring sizes, long-running/large traces, and poorly distributed trace IDs. One option would be to increase the number of tokens an ingester puts in the ring, along the lines of the sketch below:
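A minimal sketch, assuming Tempo's standard dskit-based lifecycler configuration (the exact key location may differ between versions, so check the configuration docs for your release); the value 512 is purely illustrative:

```yaml
# Sketch of a Tempo config override: raise the ingester's token count.
ingester:
  lifecycler:
    num_tokens: 512   # default is 128; more tokens per ingester tends to
                      # smooth out the hash-ring distribution
```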
This likely has more to do with data write patterns than with token count, but it's worth a shot, I suppose. Also, we prefer more, smaller ingesters to fewer, larger ones. 30 GB ingesters are quite big; I'd scale out until you are under 10 GB each, for example as in the sketch below.
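Purely as an illustration, assuming the tempo-distributed Helm chart (the replica count and memory figures below are hypothetical, not values given in the reply):

```yaml
# values.yaml sketch: trade 15 x 30 GB ingesters for more, smaller ones.
ingester:
  replicas: 45
  resources:
    requests:
      memory: 7Gi
    limits:
      memory: 10Gi   # keep each ingester well under the previous 30 GB ceiling
```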