Compaction issues - tempo 2.7 - outstanding blocks #4781
-
Hello,

We are running Tempo in distributed mode with six compactors. Typically, on weekends, the compactors manage to reduce outstanding blocks to zero. However, over the past two weeks we've noticed that the compactors are consuming significantly more CPU, and some blocks take much longer to compact than the usual 3-4 minutes.
We are operating in a single-tenant setup with the following configuration:

```
```

I’d appreciate your guidance on how to approach this issue. Scaling up by adding more compactors seems like an easy solution, but I’d like to understand why CPU usage has become consistently high instead of being limited to specific timeframes during compaction, as it was before.
-
That's difficult to say. My initial guess is that something about your write pattern changed. The two things we've seen impact compactor resource usage the most are trace size and large, high-cardinality attributes. For trace size, consider setting max trace size limits on your tenants. For high-cardinality attributes, consider setting up dedicated columns, although I'm guessing you've already done that. Perhaps a write pattern changed and you need to re-up your dedicated columns?

Also, Tempo 2.7 added the ability to truncate span attributes to prevent the consumption of enormous attributes, and 2.8 will apply this setting to all scopes.

Finally, getting outstanding blocks to 0 is not necessary for a happy/functioning Tempo, but I do understand your curiosity given the change in behavior.
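For orientation, here is a rough sketch of where those knobs typically live in a Tempo config. The values are placeholders, and the exact key names and nesting (particularly the overrides layout and `max_span_attr_byte`) differ between Tempo versions, so verify them against the docs for your release:

```yaml
# Illustrative only: placeholder values, verify key names against your Tempo version.
overrides:
  # Legacy flat per-tenant overrides layout; newer releases also accept a nested
  # "defaults:" form with the same settings.
  max_bytes_per_trace: 20000000        # cap on total trace size in bytes (0 disables the limit)
  parquet_dedicated_columns:           # give large, high-cardinality attributes their own columns
    - name: http.url
      type: string
      scope: span

distributor:
  # Span-attribute truncation mentioned above (added in Tempo 2.7).
  max_span_attr_byte: 2048
```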
-
Hi Joe,

We have configured dedicated columns and set the maximum trace size to 50MB. I have also reduced `max_span_attr_byte` to 1024, hopefully keeping our spans more lightweight. Could you review our compactor configuration and suggest any adjustments to better optimize our single-tenant setup?
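(The compactor configuration itself isn't reproduced in this thread. As a reference point while reviewing it, these are the compaction settings that usually matter for the current Parquet backends; the values below are placeholders, not a tuned recommendation:)

```yaml
# Illustrative compactor block with placeholder values -- not a recommendation.
compactor:
  ring:
    kvstore:
      store: memberlist              # compactors coordinate and shard work via this ring
  compaction:
    block_retention: 336h            # how long blocks are kept before deletion
    compaction_window: 1h            # blocks within this time window are compacted together
    max_block_bytes: 107374182400    # upper bound on the size of a compacted block (~100 GiB)
    max_compaction_objects: 6000000  # upper bound on the number of traces in a compacted block
    # v2_* buffer settings apply only to the old v2 backend (see the reply further down).
```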
-
50MB is quite big. It's definitely possible that someone started sending more large traces than before, which is causing your issue. I'd review metrics like `tempo_distributor_spans_received_total`, `tempo_distributor_bytes_received_total`, `tempo_ingester_traces_created_total` and `tempo_ingester_bytes_received_total` to see if a write pattern changed.

The `v2_` settings don't matter; they apply to the old v2 backend only.
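If those metrics are scraped into Prometheus, a small set of recording rules can make the write-pattern comparison easier to eyeball over the two weeks in question. This is only a sketch: the rule names and the 5m window are arbitrary, and the last rule is just a rough proxy for average trace size at ingest:

```yaml
# Sketch of Prometheus recording rules over the metrics named above.
groups:
  - name: tempo-write-pattern
    rules:
      - record: tempo:distributor_spans_received:rate5m
        expr: sum(rate(tempo_distributor_spans_received_total[5m]))
      - record: tempo:distributor_bytes_received:rate5m
        expr: sum(rate(tempo_distributor_bytes_received_total[5m]))
      - record: tempo:ingester_traces_created:rate5m
        expr: sum(rate(tempo_ingester_traces_created_total[5m]))
      - record: tempo:ingester_bytes_received:rate5m
        expr: sum(rate(tempo_ingester_bytes_received_total[5m]))
      # Rough average bytes per newly created trace; a sustained rise here points
      # at larger traces rather than simply more traffic.
      - record: tempo:avg_bytes_per_trace:rate5m
        expr: >
          sum(rate(tempo_ingester_bytes_received_total[5m]))
          /
          sum(rate(tempo_ingester_traces_created_total[5m]))
```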
-
Thank you once again, Joe, for the useful input. I'll add another compactor and see how it goes.