Skip to content

[fix](fe) Fix PullUpPredicates losing NULL rows in INTERSECT/EXCEPT#62299

Open
yujun777 wants to merge 1 commit intoapache:masterfrom
yujun777:fix/pullup-predicate-null-handling
Open

[fix](fe) Fix PullUpPredicates losing NULL rows in INTERSECT/EXCEPT#62299
yujun777 wants to merge 1 commit intoapache:masterfrom
yujun777:fix/pullup-predicate-null-handling

Conversation

@yujun777
Copy link
Copy Markdown
Contributor

@yujun777 yujun777 commented Apr 9, 2026

What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

introduced by #39450

PullUpPredicates.getFiltersFromUnionConstExprs() incorrectly removes NullLiteral from UNION ALL constant expressions without compensating with OR IS NULL, producing invalid pull-up predicates.

For example, SELECT 1 UNION ALL SELECT 3 UNION ALL SELECT NULL generates the pull-up predicate n IN (1, 3), which is NOT true for the NULL row (NULL IN (1, 3) evaluates to NULL, not TRUE).

When InferPredicates pushes this predicate from one INTERSECT child to another, the IN filter incorrectly eliminates NULL rows. Since INTERSECT treats NULL=NULL per SQL standard, this causes incorrect query results with missing NULLs.

Root Cause: In getFiltersFromUnionConstExprs(), line 467:

options.removeIf(option -> option instanceof NullLiteral);

removes NULL but does not add OR IS NULL to compensate.

Fix: When NULL constants are present, generate n IN (...) OR n IS NULL instead of just n IN (...), making the predicate valid for all rows including NULL.

Reproducing Query:

WITH
  tbl0(n) AS (SELECT 1 UNION ALL SELECT 3 UNION ALL SELECT NULL),
  tbl1(n) AS (SELECT 2 UNION ALL SELECT NULL UNION ALL SELECT 1)
(
  SELECT (n * 2) AS n FROM tbl0
  INTERSECT
  SELECT (n * 2) AS n FROM tbl1
);
-- Expected: {2, NULL}  Actual before fix: {2}

Release note

Fixed a bug where INTERSECT and EXCEPT queries could incorrectly drop NULL rows when the optimizer inferred and pushed down predicates from UNION ALL constant expressions.

Check List (For Author)

  • Test: Regression test
  • Behavior changed: No
  • Does this need documentation: No

### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:
PullUpPredicates.getFiltersFromUnionConstExprs() incorrectly removes
NullLiteral from UNION ALL constant expressions without compensating
with OR IS NULL, producing invalid pull-up predicates.

For example, SELECT 1 UNION ALL SELECT 3 UNION ALL SELECT NULL
generates the pull-up predicate n IN (1, 3), which is NOT true for
the NULL row (NULL IN (1, 3) evaluates to NULL, not TRUE).

When InferPredicates pushes this predicate from one INTERSECT child
to another, the IN filter incorrectly eliminates NULL rows. Since
INTERSECT treats NULL=NULL per SQL standard, this causes incorrect
query results with missing NULLs.

Fix: when NULL constants are present, generate n IN (...) OR n IS NULL
instead of just n IN (...), making the predicate valid for all rows
including NULL.

### Release note

Fixed a bug where INTERSECT and EXCEPT queries could incorrectly
drop NULL rows when the optimizer inferred and pushed down predicates
from UNION ALL constant expressions.

### Check List (For Author)

- Test: Regression test
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@yujun777
Copy link
Copy Markdown
Contributor Author

yujun777 commented Apr 9, 2026

run buildall

@yujun777
Copy link
Copy Markdown
Contributor Author

yujun777 commented Apr 9, 2026

/review

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 9, 2026

No issues found in this review.

Critical checkpoints:

  • Goal and correctness: The change addresses the reported unsound pull-up predicate for constant-only UNION ALL NULL rows, and the added regression coverage exercises the reported INTERSECT/EXCEPT behavior plus several NULL variants.
  • Scope/minimality: The code change is small and focused to PullUpPredicates.getFiltersFromUnionConstExprs() with matching regression tests.
  • Concurrency: Not applicable; this FE optimizer rewrite path is single-threaded here and introduces no new shared-state or locking behavior.
  • Lifecycle/static init: Not applicable; no lifecycle or static initialization changes.
  • Configuration: None added.
  • Compatibility: No FE/BE protocol, storage format, or rolling-upgrade compatibility impact.
  • Parallel code paths: I checked the surrounding InferPredicates and set-operation pull-up flow; the new predicate shape is consumed consistently in the affected path.
  • Special conditions: The new NULL handling condition is necessary and matches SQL three-valued logic for valid pulled-up predicates.
  • Test coverage: Regression coverage is good for the reported bug and nearby NULL cases. Residual gap: there is still no dedicated case for a mixed LogicalUnion containing both regular children and constantExprsList, though I did not identify a correctness bug in that path from this patch.
  • Observability: No new observability needed for this optimizer-only fix.
  • Transaction/persistence/data-write concerns: Not applicable.
  • Performance: No material concern; the added predicate construction is trivial and stays off hot execution paths.
  • Other issues: None found.

Overall opinion: acceptable as-is based on the reviewed diff and surrounding optimizer flow.

@hello-stephen
Copy link
Copy Markdown
Contributor

FE UT Coverage Report

Increment line coverage 50.00% (5/10) 🎉
Increment coverage report
Complete coverage report

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 10, 2026
@github-actions
Copy link
Copy Markdown
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Copy Markdown
Contributor

PR approved by anyone and no changes requested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/4.0.x dev/4.1.x reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants