Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HIVE-28347: Make a UDAF 'collect_set' work with complex types, even when map-side aggregation is disabled. #5323

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

JeongDaeKim
Copy link

@JeongDaeKim JeongDaeKim commented Jun 26, 2024

What changes were proposed in this pull request?

UDAFs, collect_set and collect_list, won't work with complex types, if map-side aggregation is disabled. This PR will fix this bug.

Why are the changes needed?

When users disable the map-side aggregation feature("hive.map.aggr), collect_set doesn't work with complex types, such as struct.

Does this PR introduce any user-facing change?

No.

Is the change a dependency upgrade?

No.

How was this patch tested?

Tested with qtest (TestMiniTezCliDriver)

@JeongDaeKim JeongDaeKim changed the title HIVE-28347 Make a UDAF 'collect_set' work with complex types, even when map-side aggregation is disabled. HIVE-28347: Make a UDAF 'collect_set' work with complex types, even when map-side aggregation is disabled. Jun 26, 2024
//no map aggregation.
inputOI = parameters[0];
return ObjectInspectorFactory.getStandardListObjectInspector(
ObjectInspectorUtils.getStandardObjectInspector(inputOI));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we merge this case with PARTIAL1?

if (m == Mode.PARTIAL1 || m == Mode.COMPLETE) {
  ...

@okumin
Copy link
Contributor

okumin commented Jun 26, 2024

@JeongDaeKim Taking a glance, your finding and fix seems to be very reasonable. Could you please regenerate the result file of udaf_collect_set_2.q?

@JeongDaeKim
Copy link
Author

@okumin Thank you for a quick review! 👍

Copy link

sonarcloud bot commented Jun 27, 2024

Copy link
Contributor

@okumin okumin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have done three things.
First, I checked that the evaluator is likely correctly implemented now.
Second, I checked that we have new regrettion tests. Nice work.
Third, I ran the original udaf_collect_set_2.q with set hive.map.aggr = true or false. Both generated the same result set.

Then, the changes here look to me. Note that you also need +1 from a committer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants