andygrove
Member

@andygrove andygrove commented Aug 21, 2025

Which issue does this PR close?

Closes #1974

Part of #313

Rationale for this change

Rather than falling back to Spark for all queries when ANSI mode is enabled, fall back only when the query uses expressions whose behavior is affected by ANSI mode.

What changes are included in this PR?

  • Remove the COMET_ANSI_MODE_ENABLED config and all references to it in the Spark diffs, since it is no longer required
  • Add expression-specific ANSI fallback logic for all expressions that we have implemented that have an ANSI mode:
    • arithmetic expressions (Add, Subtract, Multiply, Divide, IntegralDivide, and Remainder)
    • aggregate expressions (Avg and Sum)
    • Round
    • Cast
  • Update golden files for Spark 4
    • We now specify spark.comet.expression.allowIncompatible so that sum and avg run natively. This had the side effect of allowing more operators to run natively than before.
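
As an illustration of the expression-specific fallback described in the list above, here is a minimal, self-contained sketch. It does not use Comet's or Spark's actual classes: the `Expr` hierarchy, `AnsiFallback`, `ansiSensitive`, and `needsFallback` are hypothetical names made up for this example, and only a few of the listed expressions are modeled. The conditions the PR actually applies to each expression may be narrower than this.

```scala
// Hypothetical, simplified expression tree; NOT Spark's or Comet's real classes.
sealed trait Expr { def children: Seq[Expr] }
final case class Col(name: String) extends Expr { def children: Seq[Expr] = Nil }
final case class Upper(child: Expr) extends Expr { def children: Seq[Expr] = Seq(child) }
final case class Add(left: Expr, right: Expr) extends Expr { def children: Seq[Expr] = Seq(left, right) }
final case class Cast(child: Expr, toType: String) extends Expr { def children: Seq[Expr] = Seq(child) }
final case class Sum(child: Expr) extends Expr { def children: Seq[Expr] = Seq(child) }

object AnsiFallback {
  // Assumption for this sketch: these node types change behavior under ANSI mode.
  def ansiSensitive(e: Expr): Boolean = e match {
    case _: Add | _: Cast | _: Sum => true
    case _ => false
  }

  // Fall back only if ANSI mode is on AND some node in the tree is ANSI-sensitive.
  def needsFallback(e: Expr, ansiEnabled: Boolean): Boolean =
    ansiEnabled && (ansiSensitive(e) || e.children.exists(needsFallback(_, ansiEnabled)))
}
```

With this shape, a query that only uses `Upper(Col("s"))` would not fall back under ANSI mode, while one containing a `Cast`, `Add`, or `Sum` anywhere in its tree would. With ANSI mode disabled, nothing falls back on ANSI grounds.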

How are these changes tested?

@codecov-commenter

codecov-commenter commented Aug 21, 2025

Codecov Report

❌ Patch coverage is 31.81818% with 30 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.97%. Comparing base (f09f8af) to head (16099a1).
⚠️ Report is 417 commits behind head on main.

Files with missing lines                                 Patch %   Lines
...main/scala/org/apache/comet/serde/aggregates.scala    27.27%    12 Missing and 4 partials ⚠️
...main/scala/org/apache/comet/serde/arithmetic.scala    38.09%    6 Missing and 7 partials ⚠️
...scala/org/apache/comet/expressions/CometCast.scala    0.00%     0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2211      +/-   ##
============================================
+ Coverage     56.12%   57.97%   +1.84%     
- Complexity      976     1275     +299     
============================================
  Files           119      143      +24     
  Lines         11743    13312    +1569     
  Branches       2251     2375     +124     
============================================
+ Hits           6591     7717    +1126     
- Misses         4012     4344     +332     
- Partials       1140     1251     +111     

@andygrove andygrove changed the title ignore: testing in CI feat: Improved fallback mechanism for ANSI mode [WIP] Aug 21, 2025
@andygrove andygrove changed the title feat: Improved fallback mechanism for ANSI mode [WIP] feat: Improve fallback mechanism for ANSI mode [WIP] Aug 21, 2025
@andygrove andygrove changed the title feat: Improve fallback mechanism for ANSI mode [WIP] feat: Improve fallback mechanism for ANSI mode Aug 22, 2025
@coderfender
Contributor

This is great, @andygrove. Adding a comment to link PR #2136, which was created to implement ANSI mode for arithmetic operations.

@andygrove andygrove marked this pull request as ready for review August 26, 2025 00:17
@andygrove andygrove marked this pull request as draft August 26, 2025 00:52
@andygrove andygrove marked this pull request as ready for review August 26, 2025 21:36
+ if (!isCometEnabled) {
+   // Comet's error message does not include the original SQL query
+   // https://github.com/apache/datafusion-comet/issues/2215
+   assert(msg.contains(query))
Contributor

Was this passing when ANSI was not enabled?
If so, is there a way to check whether ANSI is enabled and skip the assertion only when both Comet and ANSI are enabled?

Member Author

We were previously falling back to Spark for this test. This Spark test is specifically for testing ANSI errors from cast.

Member Author

I don't recall now why we were falling back to Spark, so I will take another look.

Member Author

For Spark 3.x, we previously fell back to Spark for this test because we do not run the Spark SQL tests for 3.x with ENABLE_COMET_ANSI_MODE=true, and therefore spark.comet.ansi.enabled was not enabled.

@andygrove andygrove marked this pull request as draft August 27, 2025 00:36
@andygrove andygrove marked this pull request as ready for review August 27, 2025 14:14
Contributor

@mbutrovich mbutrovich left a comment

Thanks @andygrove. I think we should have a bigger discussion about defaults and tuning suggestions for ANSI mode.

@mbutrovich mbutrovich merged commit f69739b into apache:main Aug 27, 2025
98 checks passed
Development

Successfully merging this pull request may close these issues.

Remove COMET_ANSI_MODE_ENABLED config

5 participants