feat: implement_ansi_eval_mode_arithmetic #2136
Conversation
Added test cases for addition, subtraction, and multiplication use cases. Given that Spark inherently converts decimal operands to doubles (and to decimals in the case of integral division in Comet), I am yet to make the code changes needed to test out those use cases.
Codecov Report

❌ Patch coverage is

```
@@            Coverage Diff             @@
##              main    #2136      +/-   ##
============================================
+ Coverage    56.12%   58.48%     +2.36%
- Complexity     976     1440       +464
============================================
  Files          119      146        +27
  Lines        11743    13505      +1762
  Branches      2251     2347        +96
============================================
+ Hits          6591     7899      +1308
- Misses        4012     4375       +363
- Partials      1140     1231        +91
```

View full report in Codecov by Sentry.
@andygrove, @kazuyukitanimura
I am inclined towards #1 given the effort-to-reward ratio.
@coderfender I agree. Do you plan to change
After some debugging, early exception raising (from the DataFusion custom ANSI kernel) seems to be the issue. Below is the exact test which is causing a failure in Spark 4.0.
@kazuyukitanimura, @andygrove I updated the exception / message to exactly match Spark's. However, it seems like the issue is with
Would love to know your thoughts and also catch up if I am missing something. Code to reproduce the above error:
Investigating the new test failures.
@mbutrovich I seem to recall that you had a solution for this kind of test failure.
@parthchandra, thank you for the review. Failing test logs: https://github.com/apache/datafusion-comet/actions/runs/17112443091/job/48536768203?pr=2136
Thank you
Force-pushed from 922d7fc to 1022595
I was able to comb through the failed tests and fixed the failing int check case match statement.
Summary of changes

Native side:
Scala side:

cc: @andygrove, @kazuyukitanimura
To clarify this a bit more, Spark performs lazy evaluation on the inputs to coalesce.
I filed #2231 to track this on the Comet side.
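To make the failure mode concrete, here is a minimal, self-contained Rust sketch (all names are hypothetical; this is not Comet/DataFusion code) contrasting eager and lazy evaluation of a coalesce-like expression: an eager ANSI kernel evaluates every argument up front, so `coalesce(1, 1/0)` fails with a division-by-zero error, while lazy, Spark-style evaluation never touches the failing argument and returns 1.

```rust
/// ANSI-style division used by both variants below: error instead of null
/// when the divisor is zero (message only approximates Spark's).
fn ansi_div(a: i32, b: i32) -> Result<i32, String> {
    if b == 0 {
        Err("[DIVIDE_BY_ZERO] Division by zero.".to_string())
    } else {
        Ok(a / b)
    }
}

fn lit_one() -> Result<Option<i32>, String> {
    Ok(Some(1))
}

fn one_div_zero() -> Result<Option<i32>, String> {
    ansi_div(1, 0).map(Some)
}

/// Eager: compute every argument first, then pick the first non-null value.
/// Any error in any argument fails the whole expression.
fn coalesce_eager(args: &[fn() -> Result<Option<i32>, String>]) -> Result<Option<i32>, String> {
    let mut values = Vec::new();
    for arg in args {
        values.push(arg()?); // 1/0 is evaluated here and raises
    }
    Ok(values.into_iter().flatten().next())
}

/// Lazy (Spark-like): stop at the first non-null value, so later arguments
/// are never evaluated and cannot raise.
fn coalesce_lazy(args: &[fn() -> Result<Option<i32>, String>]) -> Result<Option<i32>, String> {
    for arg in args {
        if let Some(v) = arg()? {
            return Ok(Some(v));
        }
    }
    Ok(None)
}

fn main() {
    // coalesce(1, 1/0)
    let args: &[fn() -> Result<Option<i32>, String>] = &[lit_one, one_div_zero];
    println!("eager: {:?}", coalesce_eager(args)); // Err("[DIVIDE_BY_ZERO] ...")
    println!("lazy:  {:?}", coalesce_lazy(args)); // Ok(Some(1))
}
```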
Force-pushed from d3ee468 to a740ee3
Rebased with the main branch after the lazy evaluation changes (Coalesce) were merged.
Force-pushed from 4d825d8 to 58a1bee
Resolved issues with failing tests caused by incorrect diff file generation.
Force-pushed from fa016e3 to 1eca257
@andygrove, @parthchandra I rebased the feature branch with the main branch and CI seems to be passing. Please review the changes once you get a chance.
```scala
}

test("ANSI support for divide (division by zero)") {
  // TODO : Support ANSI mode in Integral divide
```
Does this TODO still need to be addressed? Perhaps we need a link to another issue?
Thank you for the review. I already have changes ready for integral divide in #2421, which should be ready for review once this PR is merged to main. Given that this PR has been up for quite some time now, I didn't want to add more changes, to keep the focus scoped down to simple arithmetic functions only.
```diff
- if eval_mode == EvalMode::Try && data_type.is_integer() {
+ if [EvalMode::Try, EvalMode::Ansi].contains(&eval_mode)
+     && (data_type.is_integer()
+         || (data_type.is_floating() && op == DataFusionOperator::Divide))
```
Is the float/divide case covered by existing tests, or should a new test be added for this?
Thank you for the comment. All division in Comet is float or decimal division. I had to add this check to make sure that float operands are supported only when the operation is division, and the existing tests should cover this. I also have a draft PR ready for supporting integral division, #2421, which should be ready for review once this is merged :)
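For anyone reading along, here is a standalone sketch of the routing rule in the condition above, using simplified stand-in enums rather than the actual planner types: TRY and ANSI send integer arithmetic, plus float division specifically, through the checked kernel, while everything else stays on the default DataFusion arithmetic.

```rust
#[derive(Clone, Copy, PartialEq)]
enum EvalMode { Legacy, Try, Ansi }

#[derive(Clone, Copy, PartialEq)]
enum Op { Add, Subtract, Multiply, Divide }

#[derive(Clone, Copy, PartialEq)]
enum DataKind { Integer, Floating, Decimal }

/// Mirrors the condition in the diff above: route to the checked kernel only
/// for TRY/ANSI, and only for integer arithmetic or float division.
fn uses_checked_kernel(eval_mode: EvalMode, data_kind: DataKind, op: Op) -> bool {
    [EvalMode::Try, EvalMode::Ansi].contains(&eval_mode)
        && (data_kind == DataKind::Integer
            || (data_kind == DataKind::Floating && op == Op::Divide))
}

fn main() {
    // Integer arithmetic can overflow, so TRY/ANSI route it through the checked kernel.
    assert!(uses_checked_kernel(EvalMode::Ansi, DataKind::Integer, Op::Add));
    // Division is executed on floats/decimals, so float division is routed too,
    // letting division by zero raise under ANSI.
    assert!(uses_checked_kernel(EvalMode::Ansi, DataKind::Floating, Op::Divide));
    // Float add/subtract/multiply follow IEEE semantics and never raise,
    // so they stay on the default path.
    assert!(!uses_checked_kernel(EvalMode::Ansi, DataKind::Floating, Op::Multiply));
    // Decimal arithmetic is not handled by this particular condition.
    assert!(!uses_checked_kernel(EvalMode::Ansi, DataKind::Decimal, Op::Subtract));
    // Legacy mode never uses the checked kernel.
    assert!(!uses_checked_kernel(EvalMode::Legacy, DataKind::Integer, Op::Add));
    println!("routing checks passed");
}
```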
Force-pushed from 1d072c0 to 9603223
LGTM pending CI. Thanks for your patience @coderfender
Thank you @andygrove, CI passed and I will be raising a PR for the Integral Divide ANSI implementation to build off of this.
Which issue does this PR close?
Support ANSI mode for arithmetic operations in Spark #2137
Closes #2137.
Rationale for this change
Introduce ANSI support for arithmetic operations (for integer/float types).
Now that TRY eval mode is supported in Comet, this PR essentially tracks changes to take it a step further and support ANSI mode.
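As a rough illustration of that step (a standalone sketch with hypothetical names, not the actual Comet kernel code, and with an error message that only approximates Spark's), TRY mode turns an integer overflow into null, while ANSI mode turns it into an error:

```rust
enum EvalMode { Legacy, Try, Ansi }

/// i32 addition under the three Spark eval modes.
fn add_i32(a: i32, b: i32, eval_mode: EvalMode) -> Result<Option<i32>, String> {
    match a.checked_add(b) {
        Some(v) => Ok(Some(v)),
        None => match eval_mode {
            // TRY: swallow the overflow and produce null.
            EvalMode::Try => Ok(None),
            // ANSI: fail the query with a Spark-style overflow error.
            EvalMode::Ansi => Err(format!("[ARITHMETIC_OVERFLOW] {a} + {b} caused overflow.")),
            // Legacy: wrap around like Java int arithmetic.
            EvalMode::Legacy => Ok(Some(a.wrapping_add(b))),
        },
    }
}

fn main() {
    assert_eq!(add_i32(i32::MAX, 1, EvalMode::Try), Ok(None));
    assert_eq!(add_i32(i32::MAX, 1, EvalMode::Legacy), Ok(Some(i32::MIN)));
    assert!(add_i32(i32::MAX, 1, EvalMode::Ansi).is_err());
    println!("overflow checks passed");
}
```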
In order to achieve that, the following changes are made:

- `QueryPlanSerde` is now updated to not fall back to Spark when the eval mode is ANSI (for arithmetic operations only)
- `planner.rs` now respects the eval mode when creating a binary arithmetic function
- `checked_arithmetic.rs` now raises a Spark-equivalent exception on overflow / division by zero if ANSI mode is enabled (see the sketch below)

What changes are included in this PR?
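As a rough illustration of the new division-by-zero behavior from the list above (again a standalone sketch with hypothetical names, not the actual `checked_arithmetic.rs` code, and an approximation of Spark's error message): legacy and TRY produce null, while ANSI raises.

```rust
enum EvalMode { Legacy, Try, Ansi }

/// Spark-style `/` semantics when the divisor is zero: null in legacy/TRY,
/// an error in ANSI.
fn div_f64(a: f64, b: f64, eval_mode: EvalMode) -> Result<Option<f64>, String> {
    if b == 0.0 {
        return match eval_mode {
            EvalMode::Legacy | EvalMode::Try => Ok(None),
            EvalMode::Ansi => Err("[DIVIDE_BY_ZERO] Division by zero. Use try_divide to \
                                   tolerate divisor being 0 and return NULL instead."
                .to_string()),
        };
    }
    Ok(Some(a / b))
}

fn main() {
    assert_eq!(div_f64(1.0, 0.0, EvalMode::Legacy), Ok(None));
    assert_eq!(div_f64(1.0, 0.0, EvalMode::Try), Ok(None));
    assert!(div_f64(1.0, 0.0, EvalMode::Ansi).is_err());
    assert_eq!(div_f64(1.0, 2.0, EvalMode::Ansi), Ok(Some(0.5)));
    println!("divide-by-zero checks passed");
}
```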
How are these changes tested?
Note that these changes do not introduce support for the Integral Divide and Remainder expressions. I will be filing issues to track them separately and work on them once these changes are merged.