
branch-4.1: [fix](nereids) clamp the merged limit of MERGE_TOP_N by the parent offset #64306 #64353

Open
github-actions[bot] wants to merge 1 commit into branch-4.1 from auto-pick-64306-branch-4.1

Conversation

@github-actions

Cherry-picked from #64306

…fset (#64306)

`MergeTopNs` (the `MERGE_TOP_N` rewrite rule) merges a parent `TopN` into its child `TopN` when their order keys are compatible. When the parent `TopN` carries a non-zero `OFFSET`, the merged limit was computed as `min(parent.limit, child.limit)`, which ignores that the parent offset consumes rows from the child's output. The merged `TopN` therefore keeps too many rows and the query returns a wrong result.

Example:

```sql
SELECT * FROM (SELECT k, v FROM t ORDER BY k LIMIT 5) s ORDER BY k LIMIT 3 OFFSET 4;
```

The inner `ORDER BY k LIMIT 5` yields 5 rows; the outer `LIMIT 3 OFFSET 4` skips 4 of them, so only 1 row should remain. Before this PR the rule merged the two `TopN` into `OFFSET 4 LIMIT 3` (instead of `OFFSET 4 LIMIT 1`), so it returned 3 rows.
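
The row counts in this example can be checked with a small simulation. This is only a sketch under stated assumptions: `NestedTopNDemo` and `topN` are hypothetical names, not Doris code, and the table `t` is assumed to hold the keys 1 through 10 so that the merged plan has enough rows to over-return.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class; not part of Doris.
class NestedTopNDemo {
    // Models "ORDER BY k LIMIT limit OFFSET offset" over a list of keys.
    static List<Integer> topN(List<Integer> rows, long offset, long limit) {
        return rows.stream().sorted().skip(offset).limit(limit).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assumed table contents: k = 1..10.
        List<Integer> t = IntStream.rangeClosed(1, 10).boxed().collect(Collectors.toList());

        // Reference semantics: inner LIMIT 5, then outer LIMIT 3 OFFSET 4.
        List<Integer> nested = topN(topN(t, 0, 5), 4, 3);
        // Merged plan before the fix: OFFSET 4 LIMIT 3 applied directly to t.
        List<Integer> buggy = topN(t, 4, 3);
        // Merged plan after the fix: OFFSET 4 LIMIT 1 applied directly to t.
        List<Integer> fixed = topN(t, 4, 1);

        System.out.println(nested); // [5]
        System.out.println(buggy);  // [5, 6, 7]
        System.out.println(fixed);  // [5]
    }
}
```

Only the `OFFSET 4 LIMIT 1` plan matches the nested query; the `OFFSET 4 LIMIT 3` plan reads past the fifth row that the inner `LIMIT 5` should have cut off.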

Fix: clamp the merged limit by `max(child.limit - parent.offset, 0)`, the same semantics already used by `MergeLimits.mergeLimit` for consecutive limits. The bug only triggers when the outer `TopN` has a non-zero offset (offset = 0 makes both formulas equal).
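
The two formulas can be compared side by side. The sketch below is an illustration, not the actual Doris source: the class and method names are hypothetical, and combining the clamp with `parent.limit` through `min` is an assumed reading of the description above that is consistent with the `OFFSET 4 LIMIT 1` example.

```java
// Hypothetical sketch; names and signatures are assumptions, not Doris APIs.
class MergedTopNLimitSketch {
    // Formula before the fix: ignores the parent offset entirely.
    static long buggyMergedLimit(long parentLimit, long parentOffset, long childLimit) {
        return Math.min(parentLimit, childLimit);
    }

    // Formula after the fix (assumed shape): the parent can see at most
    // childLimit - parentOffset rows, and never fewer than zero.
    static long fixedMergedLimit(long parentLimit, long parentOffset, long childLimit) {
        return Math.min(parentLimit, Math.max(childLimit - parentOffset, 0L));
    }

    public static void main(String[] args) {
        // Example from the description: inner LIMIT 5, outer LIMIT 3 OFFSET 4.
        System.out.println(buggyMergedLimit(3, 4, 5)); // 3
        System.out.println(fixedMergedLimit(3, 4, 5)); // 1
        // With a zero parent offset both formulas agree.
        System.out.println(buggyMergedLimit(3, 0, 5)); // 3
        System.out.println(fixedMergedLimit(3, 0, 5)); // 3
        // A parent offset larger than the child limit leaves no rows.
        System.out.println(fixedMergedLimit(3, 7, 5)); // 0
    }
}
```

The `Math.max(..., 0L)` guard matters when the parent offset exceeds the child limit; without it the merged limit would go negative.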

The existing unit test `MergeTopNsTest.testOffset` asserted the buggy value (`limit == 10`, while the correct value is `9`); this PR corrects that assertion as well.

### Release note

Fix the wrong result produced by the `MERGE_TOP_N` optimization when an outer `ORDER BY ... LIMIT` carries a non-zero `OFFSET` over an inner `ORDER BY ... LIMIT`.

github-actions bot requested a review from yiguolei as a code owner June 10, 2026 05:11

@hello-stephen

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hello-stephen

run buildall

@hello-stephen

FE UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report

@hello-stephen

FE Regression Coverage Report

Increment line coverage 0.88% (1/114) 🎉
Increment coverage report
Complete coverage report
