How does Tempo use Parquet on object storage now? #2128
-
Hello, I'm really curious about how this works, especially how it uses Parquet with object storage.
Q1. Does Tempo query trace data directly from object storage?
Q2. If Tempo uses local storage, do I need to allocate a large amount of storage in production?
Q3. Does the compactor periodically remove old locally downloaded files?
Replies: 2 comments 1 reply
-
Is there a document that describes the whole Tempo architecture?
-
When executing a query, Tempo pulls only the columns it needs from object storage. It does not download entire blocks locally. It also has to pull the Parquet footer for each block so it knows where each column is located.
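To make the footer-then-column access pattern concrete, here is a minimal sketch. It is not Tempo's code: `FakeObjectStore`, `build_block`, and `read_column` are hypothetical names, the object store is an in-memory byte string, and the footer is JSON as a stand-in (real Parquet footers are Thrift-encoded). The one thing it takes from the actual Parquet format is the file tail: footer bytes, then a 4-byte little-endian footer length, then the `PAR1` magic.

```python
import json
import struct

MAGIC = b"PAR1"  # Parquet files begin and end with these 4 bytes


class FakeObjectStore:
    """Hypothetical in-memory stand-in for an object store with ranged reads."""

    def __init__(self, blob):
        self._blob = blob
        self.bytes_fetched = 0  # total bytes returned by ranged reads

    def size(self):
        return len(self._blob)

    def get_range(self, offset, length):
        self.bytes_fetched += length
        return self._blob[offset:offset + length]


def build_block(columns):
    """Lay out column data, then footer, footer length, and magic.

    Assumption: the footer here is a JSON index, not real Thrift metadata.
    """
    body = bytearray(MAGIC)
    index = {}
    for name, data in columns.items():
        index[name] = {"offset": len(body), "length": len(data)}
        body += data
    footer = json.dumps(index).encode()
    return bytes(body) + footer + struct.pack("<I", len(footer)) + MAGIC


def read_column(store, name):
    """Fetch one column using three ranged reads: tail, footer, column."""
    size = store.size()
    tail = store.get_range(size - 8, 8)
    if tail[4:] != MAGIC:
        raise ValueError("missing PAR1 magic at end of file")
    (footer_len,) = struct.unpack("<I", tail[:4])
    footer = json.loads(store.get_range(size - 8 - footer_len, footer_len))
    loc = footer[name]
    return store.get_range(loc["offset"], loc["length"])


block = build_block({
    "trace_id": b"a" * 1000,
    "span_name": b"b" * 5000,
    "duration": b"c" * 2000,
})
store = FakeObjectStore(block)
duration = read_column(store, "duration")
print(f"fetched {store.bytes_fetched} of {store.size()} bytes")
```

Only the tail, the footer, and the requested column are read, so the bytes transferred stay well below the full block size.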
Only the ingesters require local storage. Upon receiving data, they write it to a set of local Parquet files, which they then flush to the backend.
Compactors take multiple input blocks from object storage and combine them together to create one output block. This serves 2 purposes:
The operations docs are mostly up to date. They won't answer all of your questions, but they cover the high level: https://grafana.com/docs/tempo/latest/operations/
Awesome :)! Please keep asking questions. I'm happy to answer.