Add tracing support to the compressor#7385
Draft
connortsui20 wants to merge 4 commits intodevelopfrom
Draft
Conversation
Contributor
Author
|
Here is an example of some information we can get from the tracing json output. This is looking at the trace for generating tpch SF1 data, which produces 6837 logs. This is an example of looking at all times the compressor chooses a scheme and the final result ends up being larger than the original array: And we can see that the estimator for FoR is very off for some reason: DetailsAnd then here is how much each scheme saved across all of TPC-H SF1 data. Details |
7f4b5d7 to
3f1411c
Compare
connortsui20
added a commit
that referenced
this pull request
Apr 13, 2026
## Summary Tracking issue: #7216 Makes the compressor types more robust (removes the possibility for invalid state), which additionally sets up adding tracing easier (draft at #7385) ## API Changes Changes some types: ```rust /// Closure type for [`DeferredEstimate::Callback`]. /// /// The compressor calls this with the same arguments it would pass to sampling. The closure must /// resolve directly to a terminal [`EstimateVerdict`]. #[rustfmt::skip] pub type EstimateFn = dyn FnOnce( &CascadingCompressor, &mut ArrayAndStats, CompressorContext, ) -> VortexResult<EstimateVerdict> + Send + Sync; /// The result of a [`Scheme`]'s compression ratio estimation. /// /// This type is returned by [`Scheme::expected_compression_ratio`] to tell the compressor how /// promising this scheme is for a given array without performing any expensive work. /// /// [`CompressionEstimate::Verdict`] means the scheme already knows the terminal answer. /// [`CompressionEstimate::Deferred`] means the compressor must do extra work before the scheme can /// produce a terminal answer. #[derive(Debug)] pub enum CompressionEstimate { /// The scheme already knows the terminal estimation verdict. Verdict(EstimateVerdict), /// The compressor must perform deferred work to resolve the terminal estimation verdict. Deferred(DeferredEstimate), } /// The terminal answer to a compression estimate request. #[derive(Debug)] pub enum EstimateVerdict { /// Do not use this scheme for this array. Skip, /// Always use this scheme, as it is definitively the best choice. /// /// Some examples include constant detection, decimal byte parts, and temporal decomposition. /// /// The compressor will select this scheme immediately without evaluating further candidates. /// Schemes that return `AlwaysUse` must be mutually exclusive per canonical type (enforced by /// [`Scheme::matches`]), otherwise the winner depends silently on registration order. /// /// [`Scheme::matches`]: crate::scheme::Scheme::matches AlwaysUse, /// The estimated compression ratio. This must be greater than `1.0` to be considered by the /// compressor, otherwise it is worse than the canonical encoding. Ratio(f64), } /// Deferred work that can resolve to a terminal [`EstimateVerdict`]. pub enum DeferredEstimate { /// The scheme cannot cheaply estimate its ratio, so the compressor should compress a small /// sample to determine effectiveness. Sample, /// A fallible estimation requiring a custom expensive computation. /// /// Use this only when the scheme needs to perform trial encoding or other costly checks to /// determine its compression ratio. The callback returns an [`EstimateVerdict`] directly, so /// it cannot request more sampling or another deferred callback. Callback(Box<EstimateFn>), } ``` This will make some changes that we want to make is the future easier as well (tracing, better decision making for what things to try, etc). ## Testing Some new tests Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
56bdc36 to
414149e
Compare
robert3005
reviewed
Apr 14, 2026
Contributor
robert3005
left a comment
There was a problem hiding this comment.
I think this is reasonable but will wait for it to not be a draft
Instrument the cascading compressor with composable `tracing` spans and events so users can see what the compressor is doing, compare estimated and actual compression ratios, time individual phases, and surface previously-silent "compressed but the output grew" decisions. Four targets let users select one aspect at a time via `RUST_LOG`: - `vortex_compressor::cascade` — top-level + `compress_child` spans - `vortex_compressor::select` — scheme eligibility, evaluation, winner, and short-circuit reasons - `vortex_compressor::estimate` — sampling span and sample.collected / sample.result events - `vortex_compressor::encode` — per-scheme encode span and the scheme.compress_result event with estimated vs actual ratio + accepted Spans are at `trace` level so `tracing-perfetto` / `tracing-timing` / `tracing-opentelemetry` only materialize them on demand. Events are at `debug` for outcomes so `RUST_LOG=vortex_compressor::encode=debug` produces one readable summary line per leaf. New `tests/tracing.rs` uses a custom capture layer (not `TestWriter`) to pin the names and stable fields of the emitted events so downstream observability tooling does not break under rename. Instrumentation lives entirely in the orchestration layer (compressor.rs + estimate.rs); individual scheme implementations are untouched. The existing unstructured calls in estimate.rs and the stale commented-out line in compressor.rs are removed. A new `# Observability` section in the crate docs carries the full target / span / event reference with `RUST_LOG` recipes. Signed-off-by: Claude <noreply@anthropic.com>
Instrument `BtrBlocksCompressor::compress` with a
`#[tracing::instrument]` on the `vortex_compressor::cascade` target so
downstream trace consumers (tracing-perfetto, tracing-opentelemetry)
get a distinct BtrBlocks entry frame nested above the generic
`CascadingCompressor::compress` pipeline span.
Also delete the stray `tracing::debug!("zigzag output: {}", ...)` line
in `schemes/integer.rs` — it predates the centralized
`scheme.compress_result` event and is now redundant.
Add a short `# Observability` section to the crate docs pointing at
`vortex_compressor`'s full reference, plus one recipe.
Signed-off-by: Claude <noreply@anthropic.com>
Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
414149e to
43e333a
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Tracking issue: #7216
We have very little observability into the compressor. When we are debugging, we don't really have any idea of what schemes the compressor is trying, how good or how bad estimates are, how reliable sampling is, how the cascading paths look, etc.
This change adds tracing support to
vortex-compressor. The compressor now emits structuredtracingspans and events across four composableRUST_LOGtargets (cascade,select,estimate,encode).The
scheme.compress_resultevent is the most important, which reports before/after bytes and estimated vs actual ratio, with a newshort_circuit { reason = "larger_output" }surfacing the previously-silent case where a chosen scheme produced a larger output than the canonical input.All instrumentation lives in the orchestration layer (none of the ~23 individual
Schemeimpls were touched), field names are stable sotracing-perfetto/opentelemetry/timingsubscribers work with no adapter code, and an integration test pins the event names against rename.I still need to figure out how to make this useful when the compressor generates a HUGE amount of logs (for example when it produces logs when generating TPC-H partition files).
Testing
Some basic integration testing for tracing.