[branch-53] Fix duplicate group keys after hash aggregation spill (#20724) (#20858) #20918

Open
alamb wants to merge 1 commit into apache:branch-53 from alamb:alamb/backport_20858_branch-53

alamb commented Mar 12, 2026

…pache#20724)

When switching to streaming merge after spill, group_ordering is set to
Full but group_values is not recreated. The existing GroupValuesColumn<false>
uses vectorized_intern which can produce non-monotonic group indices,
violating GroupOrderingFull's assumption and causing duplicate groups
in the output.

Fix: recreate group_values with the correct streaming mode after
updating group_ordering in update_merged_stream().
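
To make the failure mode concrete, here is a toy model of it. This is a sketch under stated assumptions, not DataFusion's code: `intern`, `aggregate`, and `has_duplicate_keys` are hypothetical helpers, and "reverse order within a batch" merely stands in for any interner whose group indices are not monotonic in input order. The emit rule (flush every group except the highest-indexed one) is a simplified picture of what a fully ordered strategy assumes.

```rust
use std::collections::HashSet;

/// Toy "intern" step (hypothetical): maps each key in a batch to a group
/// index, creating new groups as needed. With `in_order == true`, new indices
/// follow row order; with `false`, they are assigned in reverse row order.
fn intern(groups: &mut Vec<(i64, u64)>, batch: &[i64], in_order: bool) -> Vec<usize> {
    // Collect keys that have no group yet, in first-appearance order.
    let mut new_keys: Vec<i64> = Vec::new();
    for &k in batch {
        if !groups.iter().any(|(g, _)| *g == k) && !new_keys.contains(&k) {
            new_keys.push(k);
        }
    }
    if !in_order {
        new_keys.reverse();
    }
    for k in new_keys {
        groups.push((k, 0));
    }
    batch
        .iter()
        .map(|k| groups.iter().position(|(g, _)| g == k).unwrap())
        .collect()
}

/// Counts rows per key over sorted batches. After each batch it emits every
/// group except the one with the highest index, on the assumption that only
/// that group can still receive rows.
fn aggregate(batches: &[Vec<i64>], in_order: bool) -> Vec<(i64, u64)> {
    let mut groups: Vec<(i64, u64)> = Vec::new();
    let mut output = Vec::new();
    for batch in batches {
        let indices = intern(&mut groups, batch, in_order);
        for idx in indices {
            groups[idx].1 += 1;
        }
        // Emit the first n - 1 groups and keep the assumed "current" group.
        let n = groups.len().saturating_sub(1);
        output.extend(groups.drain(..n));
    }
    // End of input: flush whatever is left.
    output.extend(groups.drain(..));
    output
}

/// Returns true if any key appears more than once in the emitted rows.
fn has_duplicate_keys(rows: &[(i64, u64)]) -> bool {
    let mut seen = HashSet::new();
    rows.iter().any(|(k, _)| !seen.insert(*k))
}

fn main() {
    // Sorted input; key 2 straddles the batch boundary.
    let batches = vec![vec![1, 1, 2], vec![2, 3]];
    let ok = aggregate(&batches, true);
    let bad = aggregate(&batches, false);
    println!("order-preserving: {:?}", ok);
    println!("non-monotonic:    {:?}", bad);
    println!(
        "duplicates: {} vs {}",
        has_duplicate_keys(&ok),
        has_duplicate_keys(&bad)
    );
}
```

In this model the order-preserving run yields one row per key, while the non-monotonic run emits key 2 before its rows in the second batch arrive and then creates it again, which is the kind of duplicate group the fix is meant to rule out.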

github-actions bot added the physical-plan label (Changes to the physical-plan crate) on Mar 12, 2026

alamb commented Mar 12, 2026

FYI @comphead

comphead commented

one more to go, I'm starting to think we rebased all the main back to branch-53 :)

comphead left a comment

Thanks @alamb
