Skip to content

Add benchmarks for dictionary path of new_group_values#22004

Open
Rich-T-kid wants to merge 2 commits intoapache:mainfrom
Rich-T-kid:rich-T-kid/dictionary-criterion-benchmarks
Open

Add benchmarks for dictionary path of new_group_values#22004
Rich-T-kid wants to merge 2 commits intoapache:mainfrom
Rich-T-kid:rich-T-kid/dictionary-criterion-benchmarks

Conversation

@Rich-T-kid
Copy link
Copy Markdown
Contributor

Which issue does this PR close?

benchmarks for #21765. Also related to #21860
The goal is to merge this PR and then rebase the branch on #21765 to contain these benchmarks, so that they can be run and compared to the original.

Rationale for this change

Originally this was included in #21765 but that PR is already very large. I decided to move it to its own separate PR

What changes are included in this PR?

Adds benchmarks for the dictionary encoding array path of new_group_values().

Are these changes tested?

n/a

Are there any user-facing changes?

no

@github-actions github-actions Bot added the physical-plan Changes to the physical-plan crate label May 3, 2026

for &size in &SIZES {
let mut cards = CARDS_RELATIVE.to_vec();
cards.push(size); // all-unique stress case
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For size == 8192, this adds 8192 again because CARDS_RELATIVE already has 8 * 1024. Criterion needs unique benchmark IDs, so this benchmark panics before it can run. Please dedupe cards or remove 8 * 1024 from CARDS_RELATIVE.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replaced 8*1024 with 1000

//! column. Each iteration is timed end-to-end: it constructs the
//! `Box<dyn GroupValues>` returned by `new_group_values`, runs `intern`
//! once (or N times), and then `emit(EmitTo::All)`.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This says GroupValues construction is timed, but new_group_values is inside the iter_batched_ref setup closure, so Criterion does not measure it, please update the comment to say the measured part is intern + emit.

@Rich-T-kid
Copy link
Copy Markdown
Contributor Author

@kumarUjjawal Just updated it, let me know if theres anything else I should change

@Rich-T-kid
Copy link
Copy Markdown
Contributor Author

@kumarUjjawal could you also take a look at #21860, similar PR. Thank you

@Rich-T-kid
Copy link
Copy Markdown
Contributor Author

@Dandandan this PR should resolve your comments about the lack of benchmarks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

physical-plan Changes to the physical-plan crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants