fix: coerce UUID to String in readable_metrics to avoid ClassCastException in Spark #13087

Open. Wants to merge 2 commits into base: main

@kadai0308 kadai0308 commented May 17, 2025

Spark expects all StringType fields to be castable to CharSequence, but Iceberg's
readable_metrics lower_bound/upper_bound may decode to java.util.UUID for UUID-typed
columns. This causes a runtime ClassCastException when Spark tries to read those
metrics as UTF8String.

This commit fixes the issue by converting UUID values to string when generating
readable metric values for Spark metadata tables.
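The failure mode and the coercion described above can be sketched in plain Java. This is only an illustration, not the code from this PR: the class `ReadableMetricCoercion` and the method `coerceForStringColumn` are hypothetical names, and the real change lives in Iceberg's metrics-reading code, which is not shown in this conversation.

```java
import java.util.UUID;

// Hypothetical helper for illustration only; not the actual Iceberg change.
class ReadableMetricCoercion {

  // Converts UUID values to their canonical string form so the result can be
  // treated as a CharSequence; every other value is returned unchanged.
  static Object coerceForStringColumn(Object value) {
    if (value instanceof UUID) {
      return value.toString();
    }
    return value;
  }

  public static void main(String[] args) {
    // A decoded lower/upper bound for a UUID-typed column.
    Object bound = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");

    try {
      // Roughly what a reader of a string-typed field does with the raw value.
      CharSequence broken = (CharSequence) bound;
      System.out.println(broken);
    } catch (ClassCastException e) {
      System.out.println("uncoerced cast failed: ClassCastException");
    }

    // After coercion the same cast succeeds.
    CharSequence ok = (CharSequence) coerceForStringColumn(bound);
    System.out.println(ok);
  }
}
```

The sketch passes non-UUID values through untouched, on the assumption that only UUID bounds need special handling for string-typed metric fields.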

Closes: #13077 (comment)

Contributor

@singhpk234 singhpk234 left a comment

Thank you for the fix, @kadai0308! Can you please add a unit test (UT) for this as well?

@kadai0308
Author

> thank you for the fix @kadai0308 ! can you please add an UT for this as well

updated! Please help me review the PR. Thx!

@Fokko Fokko self-requested a review May 20, 2025 20:21
Development

Successfully merging this pull request may close these issues.

Cannot cast java.util.UUID to java.lang.CharSequence
2 participants