-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support to read plain encoded INT96 timestamp from Parquet file #456
Conversation
cc: @zhouyuan |
@mskapilks Thank you for helping on this feature! Could you please help to make a pull request to Meta/Velox? We are now doing daily rebase, so if the pull request is merged by Meta/Velox then it will become available in Gluten very soon. CC: @rui-mo is also working on timestamp related enabling. thanks, -yuan |
@zhouyuan This PR builds upon the already work done in oap-project/velox to read dictionary encoded timestamp. PR add additional support to read plain encoded as well. In Meta velox timestamp reading from parquet is not supported yet, so could exactly make a PR on Meta/Velox. |
7e73536
to
976ed57
Compare
Hello @zhouyuan and @rui-mo , |
@mskapilks Thank you for the great work. Seems this PR reuses some existing change on timestamp reader. Let me explain the context here.
As we are lack of bandwidth right now, it will take longer time till we make time to continue this work. If @mskapilks is interested to promote this work in Velox, please feel free to open new PRs. If it is preferred to merge this PR in this repo, it also works for us. cc @zhli1142015 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to add a unit test in ParquetTableScanTest to guard the functionality. And please fix the code format. Thanks.
We are also looking to add INT64 support once this PR get merged. For those I think we can get traction in Meta/Velox. |
Added test. |
Indeed, the format issue is introduced in recent rebase. @zhztheplayer will fix it in the next rebase. thanks, -yuan |
4bab858
to
699ef8a
Compare
8a6ef2b
to
25a0160
Compare
699ef8a
to
3697daa
Compare
@mskapilks it's not related - the URL for hadoop binary changed. will be fixed in the next rebase. |
3697daa
to
e0cdf3d
Compare
e0cdf3d
to
d036eee
Compare
d036eee
to
f084224
Compare
@mskapilks thanks, looks good, I'll pick this in the next rebase |
The failure test DecimalUtilTest.divideWithRoundUp seems unrelated. Will rebase again |
Refactor Add tests
f084224
to
7c95e7b
Compare
@mskapilks Thanks for your continuous efforts on maintaining this PR.I also met CI failure due to |
@rui-mo Thanks for suggestion, updated the PR with fix |
@mskapilks already included this commit in the |
Current only INT96 dictionary encoded timestamp is supported this PR add support for plain encoding