Block size vs data.parquet size #5301
-
I would definitely recommend reducing this value. We used to create blocks this size with the v2 format, but parquet is far too costly to build blocks this large. Here are our settings:
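As a rough illustration of that kind of change (the values below are assumptions, not the settings referenced above; the key paths follow Tempo's standard compactor configuration):

```yaml
# Illustrative sketch only, not the settings referenced above.
compactor:
  compaction:
    max_block_bytes: 21474836480   # ~20GB target: well below the 100GB default (assumed value)
    compaction_window: 1h          # blocks within this window are compacted together
```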
How much data is that? 150 MB/s * 3600 s/h * 24 h/day * 30 days = 388,800,000 MB of retention; 388,800,000 MB / 1000 / 1000 ≈ 388 TB. 120TB sounds like a deal to me :). Is my math wrong? You are at a scale where Tempo can be difficult to operate. Feel free to ask questions. Tempo 3.0 will include an RF1/queue-based rearchitecture that should make your situation much better.
-
I also need some guidance on the compactor config. What we have:
Will this work for this case?
-
Can you check the ingester max_block_bytes setting? It defaults to 500MB, but this seems like it is configured for 100MB, maybe from an example or a default somewhere else? Flushing 100MB blocks is too small for that volume, and it is putting too much pressure on polling and compaction to clean up. Flushing larger blocks from the ingester is the most effective way to reduce work, since the ingester is upstream of those components.
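A minimal sketch of that ingester-side fix, assuming Tempo's standard ingester keys (524288000 bytes is the documented 500MB default; the values here are illustrative):

```yaml
# Sketch: restore the ingester block size to its ~500MB default instead of ~100MB.
ingester:
  max_block_bytes: 524288000    # ~500MB, the documented default
  max_block_duration: 30m       # flush a block after this long even if under the size cap
```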
-
Can you please help me understand how storage for Tempo works?
I'm ingesting around 150 MB/s, up to 200 MB/s, of traces.
I have block_retention set to 720h.
I'm currently using around 120TB of S3 storage.
Compactor max_block_bytes is set to the default of 100GB.
When I look at the S3 contents, each block dir holds more or less 100MB, and there are so many blocks in the bucket that I can't even list them in a reasonable time.
Is this expected, or is some other setting affecting compacted block size?
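For reference, the setup described above maps to something like the following Tempo config (a reconstruction sketch: the 720h and 100GB values come from this post, while the key paths are assumed from Tempo's standard configuration):

```yaml
# Reconstruction of the described setup, not a copy of the actual config.
compactor:
  compaction:
    block_retention: 720h            # 30 days of retention, as stated above
    max_block_bytes: 107374182400    # 100GB, the documented compactor default
storage:
  trace:
    backend: s3                      # traces kept in S3, per the post
```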
I just looked at the logs, and one of them says:
level=info ts=2025-06-18T08:22:53.731612612Z caller=compactor.go:277 msg="wrote compacted block" version=vParquet4 tenantID=prd blockID=d08298c3-3d5e-461d-a549-18e87b3b3653 startTime="2025-06-18 07:22:41 +0000 UTC" endTime="2025-06-18 07:27:14 +0000 UTC" totalObjects=373444 size=451379070 compactionLevel=1 encoding=none totalRecords=373444 bloomShardCount=5 footerSize=84639 replicationFactor=0 dedicatedColumns=[]
That size (451379070 bytes ≈ 430MB) is still way less than 100GB.