[branch-52] Fix duplicate group keys after hash aggregation spill (#20724) (#20858) #20917

Merged
alamb merged 1 commit into apache:branch-52 from alamb:alamb/backport_20858_branch-52
Mar 13, 2026

@alamb commented Mar 12, 2026

…pache#20724)

When switching to streaming merge after spill, group_ordering is set to
Full but group_values is not recreated. The existing GroupValuesColumn<false>
uses vectorized_intern which can produce non-monotonic group indices,
violating GroupOrderingFull's assumption and causing duplicate groups
in the output.

Fix: recreate group_values with the correct streaming mode after
updating group_ordering in update_merged_stream().
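
To illustrate the failure mode described above, here is a minimal toy model, not DataFusion code. It assumes a simplified "full ordering" rule in which only the group with the highest index may still receive rows, so every lower-indexed group is emitted after each batch. The names `InternOrder` and `count_sorted` are hypothetical, and `Reversed` is only a stand-in for any interner that assigns group indices non-monotonically.

```rust
/// How the toy interner assigns indices to keys that are new in a batch.
/// (Hypothetical type, for illustration only.)
#[derive(Clone, Copy)]
enum InternOrder {
    /// New keys get indices in row order: monotonic for sorted input.
    RowOrder,
    /// New keys get indices in reverse row order: a stand-in for any
    /// non-monotonic index assignment.
    Reversed,
}

/// Computes COUNT(*) per key over key-sorted batches, emitting groups early
/// under the assumption that the highest-indexed group is the only one that
/// can still receive rows.
fn count_sorted(batches: &[Vec<&str>], order: InternOrder) -> Vec<(String, usize)> {
    // Position in this vector is the group index.
    let mut groups: Vec<(String, usize)> = Vec::new();
    let mut output = Vec::new();

    for batch in batches {
        // Collect keys that are not interned yet, in first-appearance order.
        let mut new_keys: Vec<&str> = Vec::new();
        for &key in batch {
            let known = groups.iter().any(|(k, _)| k == key);
            if !known && !new_keys.contains(&key) {
                new_keys.push(key);
            }
        }
        if let InternOrder::Reversed = order {
            new_keys.reverse();
        }
        for key in new_keys {
            groups.push((key.to_string(), 0));
        }

        // Accumulate the count for every row in the batch.
        for &key in batch {
            let idx = groups.iter().position(|(k, _)| k == key).unwrap();
            groups[idx].1 += 1;
        }

        // Full-ordering assumption: all groups before the last index are done.
        if groups.len() > 1 {
            let n = groups.len() - 1;
            output.extend(groups.drain(..n));
        }
    }

    // End of input: emit whatever is left.
    output.extend(groups.drain(..));
    output
}

fn main() {
    // Keys are sorted across batches; "b" spans the batch boundary.
    let batches = vec![vec!["a", "b"], vec!["b", "c"]];
    println!("{:?}", count_sorted(&batches, InternOrder::RowOrder));
    println!("{:?}", count_sorted(&batches, InternOrder::Reversed));
}
```

With `RowOrder`, each key appears once in the output with its full count. With `Reversed`, `b` receives index 0 in the first batch, is emitted while still in progress, and is then interned again as a new group in the second batch, so `b` shows up twice with partial counts. In this toy model, that is the same shape of duplication the fix avoids by recreating group_values in streaming mode.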

@github-actions bot added the physical-plan label (Changes to the physical-plan crate) on Mar 12, 2026
@comphead left a comment

Thanks @alamb

@alamb merged commit e5547e2 into apache:branch-52 on Mar 13, 2026
32 checks passed
@alamb deleted the alamb/backport_20858_branch-52 branch on March 13, 2026 18:25