Skip to content

[fix](be) Keep workload group paused queries blocked when reserve still fails#62285

Merged
yiguolei merged 2 commits intoapache:masterfrom
mrhhsg:fix_spill_wg_fail
Apr 12, 2026
Merged

[fix](be) Keep workload group paused queries blocked when reserve still fails#62285
yiguolei merged 2 commits intoapache:masterfrom
mrhhsg:fix_spill_wg_fail

Conversation

@mrhhsg
Copy link
Copy Markdown
Member

@mrhhsg mrhhsg commented Apr 9, 2026

What problem does this PR solve?

Issue Number: None

Related PR: None

Problem Summary: Keep queries paused after WORKLOAD_GROUP_MEMORY_EXCEEDED when retrying the same reserve would still exceed the workload group high watermark, and add a manager unit test that exercises a real reserve failure path.

Release note

None

Check List (For Author)

  • Test: Unit Test
    • Incrementally compiled , , and
    • Full was not completed because linking triggers a large incremental rebuild in this worktree
  • Behavior changed: Yes (queries that still cannot satisfy the same workload group reserve stay paused instead of being resumed immediately)
  • Does this need documentation: No

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@mrhhsg mrhhsg requested a review from Copilot April 9, 2026 09:31
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR updates workload-group paused query handling so that queries paused due to WORKLOAD_GROUP_MEMORY_EXCEEDED remain paused when retrying the same reserve would still exceed the workload group high watermark, and adds a unit test that exercises a real reserve failure path.

Changes:

  • Update WorkloadGroupMgr::handle_single_query_ to re-check high watermark including the requested reserve size before resuming a paused query.
  • Extend WorkloadGroup::check_mem_used to optionally account for a prospective reserve size.
  • Add a new unit test that triggers a real try_reserve failure and verifies the query remains paused.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
be/test/runtime/workload_group/workload_group_manager_test.cpp Adds a reserve-failure test and parameterizes helpers to cover the “keep paused” behavior.
be/src/runtime/workload_group/workload_group_manager.cpp Changes resume/keep-paused decision logic for WORKLOAD_GROUP_MEMORY_EXCEEDED using high watermark checks.
be/src/runtime/workload_group/workload_group.h Adds an overload to include reserved bytes in watermark checks.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

CountDownLatch latch(1);
size_t check_times = 300;
while (--check_times > 0) {
while (check_times-- > 0) {
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while (check_times-- > 0) causes an unsigned underflow when check_times is 0 (the value wraps even though the loop body won’t run). Consider rewriting to avoid wraparound (e.g., while (check_times > 0) { --check_times; ... } or a for loop) to keep the intent clearer and avoid surprising state changes.

Suggested change
while (check_times-- > 0) {
while (check_times > 0) {
--check_times;

Copilot uses AI. Check for mistakes.
Comment on lines +239 to +287
auto st = thread_context.thread_mem_tracker_mgr->try_reserve(2048);
ASSERT_TRUE(st.ok()) << st.to_string();
ASSERT_EQ(query_context->resource_ctx()->memory_context()->current_memory_bytes(), 2048);
ASSERT_EQ(query_context->resource_ctx()->memory_context()->reserved_consumption(), 2048);
ASSERT_LT(query_context->resource_ctx()->memory_context()->current_memory_bytes(),
query_context->resource_ctx()->memory_context()->mem_limit());
ASSERT_LT(query_context->resource_ctx()->memory_context()->current_memory_bytes() + 1024,
query_context->resource_ctx()->memory_context()->mem_limit());

bool exceed_low_watermark = false;
bool exceed_high_watermark = false;
wg->check_mem_used(&exceed_low_watermark, &exceed_high_watermark);
ASSERT_FALSE(exceed_low_watermark);
ASSERT_FALSE(exceed_high_watermark);

st = thread_context.thread_mem_tracker_mgr->try_reserve(1024);
ASSERT_TRUE(st.is<ErrorCode::WORKLOAD_GROUP_MEMORY_EXCEEDED>()) << st.to_string();
ASSERT_FALSE(st.is<ErrorCode::QUERY_MEMORY_EXCEEDED>()) << st.to_string();
ASSERT_EQ(query_context->resource_ctx()->memory_context()->current_memory_bytes(), 2048);
ASSERT_EQ(query_context->resource_ctx()->memory_context()->reserved_consumption(), 2048);

_wg_manager->add_paused_query(query_context->resource_ctx(), 1024, st);
ASSERT_FALSE(query_context->get_memory_sufficient_dependency()->ready());
{
std::unique_lock<std::mutex> lock(_wg_manager->_paused_queries_lock);
ASSERT_EQ(_wg_manager->_paused_queries_list[wg].size(), 1)
<< "paused queue should not be empty";
}

_run_checking_loop(wg, 3);

ASSERT_FALSE(query_context->get_memory_sufficient_dependency()->ready());
ASSERT_TRUE(
query_context->resource_ctx()->task_controller()->paused_reason().is<
ErrorCode::WORKLOAD_GROUP_MEMORY_EXCEEDED>());
ASSERT_FALSE(query_context->is_cancelled());
{
std::unique_lock<std::mutex> lock(_wg_manager->_paused_queries_lock);
ASSERT_EQ(_wg_manager->_paused_queries_list[wg].size(), 1)
<< "paused queue should keep the query";
}

{
std::unique_lock<std::mutex> lock(_wg_manager->_paused_queries_lock);
_wg_manager->_paused_queries_list.erase(wg);
}
query_context->set_memory_sufficient(true);
thread_context.thread_mem_tracker_mgr->shrink_reserved();
thread_context.detach_task();
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test mutates _paused_queries_list directly for cleanup and relies on reaching the bottom of the test to call shrink_reserved() / detach_task(). Any earlier ASSERT_* failure will skip this cleanup and can leak paused-query state and thread attachment into subsequent tests, causing order-dependent failures. Prefer a scope guard/RAII cleanup (or a test-only helper API on the manager) so _paused_queries_list cleanup and detach_task() always happen.

Suggested change
auto st = thread_context.thread_mem_tracker_mgr->try_reserve(2048);
ASSERT_TRUE(st.ok()) << st.to_string();
ASSERT_EQ(query_context->resource_ctx()->memory_context()->current_memory_bytes(), 2048);
ASSERT_EQ(query_context->resource_ctx()->memory_context()->reserved_consumption(), 2048);
ASSERT_LT(query_context->resource_ctx()->memory_context()->current_memory_bytes(),
query_context->resource_ctx()->memory_context()->mem_limit());
ASSERT_LT(query_context->resource_ctx()->memory_context()->current_memory_bytes() + 1024,
query_context->resource_ctx()->memory_context()->mem_limit());
bool exceed_low_watermark = false;
bool exceed_high_watermark = false;
wg->check_mem_used(&exceed_low_watermark, &exceed_high_watermark);
ASSERT_FALSE(exceed_low_watermark);
ASSERT_FALSE(exceed_high_watermark);
st = thread_context.thread_mem_tracker_mgr->try_reserve(1024);
ASSERT_TRUE(st.is<ErrorCode::WORKLOAD_GROUP_MEMORY_EXCEEDED>()) << st.to_string();
ASSERT_FALSE(st.is<ErrorCode::QUERY_MEMORY_EXCEEDED>()) << st.to_string();
ASSERT_EQ(query_context->resource_ctx()->memory_context()->current_memory_bytes(), 2048);
ASSERT_EQ(query_context->resource_ctx()->memory_context()->reserved_consumption(), 2048);
_wg_manager->add_paused_query(query_context->resource_ctx(), 1024, st);
ASSERT_FALSE(query_context->get_memory_sufficient_dependency()->ready());
{
std::unique_lock<std::mutex> lock(_wg_manager->_paused_queries_lock);
ASSERT_EQ(_wg_manager->_paused_queries_list[wg].size(), 1)
<< "paused queue should not be empty";
}
_run_checking_loop(wg, 3);
ASSERT_FALSE(query_context->get_memory_sufficient_dependency()->ready());
ASSERT_TRUE(
query_context->resource_ctx()->task_controller()->paused_reason().is<
ErrorCode::WORKLOAD_GROUP_MEMORY_EXCEEDED>());
ASSERT_FALSE(query_context->is_cancelled());
{
std::unique_lock<std::mutex> lock(_wg_manager->_paused_queries_lock);
ASSERT_EQ(_wg_manager->_paused_queries_list[wg].size(), 1)
<< "paused queue should keep the query";
}
{
std::unique_lock<std::mutex> lock(_wg_manager->_paused_queries_lock);
_wg_manager->_paused_queries_list.erase(wg);
}
query_context->set_memory_sufficient(true);
thread_context.thread_mem_tracker_mgr->shrink_reserved();
thread_context.detach_task();
{
using WorkloadGroupPtr = decltype(wg);
using QueryContextPtr = decltype(query_context);
struct ScopedTestCleanup {
WorkloadGroupMgr* wg_manager;
WorkloadGroupPtr workload_group;
QueryContextPtr query_ctx;
ThreadContext& thread_ctx;
~ScopedTestCleanup() {
{
std::unique_lock<std::mutex> lock(wg_manager->_paused_queries_lock);
wg_manager->_paused_queries_list.erase(workload_group);
}
query_ctx->set_memory_sufficient(true);
thread_ctx.thread_mem_tracker_mgr->shrink_reserved();
thread_ctx.detach_task();
}
} cleanup {_wg_manager.get(), wg, query_context, thread_context};
auto st = thread_context.thread_mem_tracker_mgr->try_reserve(2048);
ASSERT_TRUE(st.ok()) << st.to_string();
ASSERT_EQ(query_context->resource_ctx()->memory_context()->current_memory_bytes(), 2048);
ASSERT_EQ(query_context->resource_ctx()->memory_context()->reserved_consumption(), 2048);
ASSERT_LT(query_context->resource_ctx()->memory_context()->current_memory_bytes(),
query_context->resource_ctx()->memory_context()->mem_limit());
ASSERT_LT(query_context->resource_ctx()->memory_context()->current_memory_bytes() + 1024,
query_context->resource_ctx()->memory_context()->mem_limit());
bool exceed_low_watermark = false;
bool exceed_high_watermark = false;
wg->check_mem_used(&exceed_low_watermark, &exceed_high_watermark);
ASSERT_FALSE(exceed_low_watermark);
ASSERT_FALSE(exceed_high_watermark);
st = thread_context.thread_mem_tracker_mgr->try_reserve(1024);
ASSERT_TRUE(st.is<ErrorCode::WORKLOAD_GROUP_MEMORY_EXCEEDED>()) << st.to_string();
ASSERT_FALSE(st.is<ErrorCode::QUERY_MEMORY_EXCEEDED>()) << st.to_string();
ASSERT_EQ(query_context->resource_ctx()->memory_context()->current_memory_bytes(), 2048);
ASSERT_EQ(query_context->resource_ctx()->memory_context()->reserved_consumption(), 2048);
_wg_manager->add_paused_query(query_context->resource_ctx(), 1024, st);
ASSERT_FALSE(query_context->get_memory_sufficient_dependency()->ready());
{
std::unique_lock<std::mutex> lock(_wg_manager->_paused_queries_lock);
ASSERT_EQ(_wg_manager->_paused_queries_list[wg].size(), 1)
<< "paused queue should not be empty";
}
_run_checking_loop(wg, 3);
ASSERT_FALSE(query_context->get_memory_sufficient_dependency()->ready());
ASSERT_TRUE(
query_context->resource_ctx()->task_controller()->paused_reason().is<
ErrorCode::WORKLOAD_GROUP_MEMORY_EXCEEDED>());
ASSERT_FALSE(query_context->is_cancelled());
{
std::unique_lock<std::mutex> lock(_wg_manager->_paused_queries_lock);
ASSERT_EQ(_wg_manager->_paused_queries_list[wg].size(), 1)
<< "paused queue should keep the query";
}
}

Copilot uses AI. Check for mistakes.
return false;
} else {
Status error_status = Status::MemoryLimitExceeded(
"Query {} workload group memory is exceeded"
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the memory_usage > limit branch, the error is constructed with the message "workload group memory is exceeded", but the branch condition indicates the query/requestor has exceeded its own limit (memory_usage > limit). This makes the cancellation reason misleading for users and logs. Consider updating the message (and/or error type) to reflect query memory limit exceeded (optionally including usage/limit values), reserving the workload-group message for the watermark/limit path.

Suggested change
"Query {} workload group memory is exceeded"
"Query {} memory limit is exceeded"

Copilot uses AI. Check for mistakes.
Comment on lines +717 to +720
bool exceed_low_watermark = false;
bool exceed_high_watermark = false;
wg->check_mem_used(size_to_reserve, &exceed_low_watermark, &exceed_high_watermark);
if (!exceed_high_watermark) {
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description currently contains duplicated template sections and placeholder references (e.g., close #xxx, Related PR: #xxx) which conflict with the earlier stated summary and make the intended linkage/closure ambiguous. Please clean up the PR description so it reflects the actual issue/PR references and the single intended problem statement.

Copilot uses AI. Check for mistakes.
mrhhsg added a commit to mrhhsg/doris that referenced this pull request Apr 9, 2026
### What problem does this PR solve?

Issue Number: None

Related PR: apache#62285

Problem Summary: Address review feedback for the workload group paused-query change by avoiding unsigned wraparound in the test helper loop, making the reserve-failure test cleanup exception-safe, and clarifying the cancel message when the query itself exceeds its memory limit during workload group pressure handling.

### Release note

None

### Check List (For Author)

- Test: No need to test (small follow-up on test cleanup and cancellation message; attempted incremental build but the sandbox hit a ccache temporary-directory permission issue)
- Behavior changed: Yes (the cancellation message now reports query memory limit exceeded instead of workload group memory exceeded in that branch)
- Does this need documentation: No
@mrhhsg
Copy link
Copy Markdown
Member Author

mrhhsg commented Apr 9, 2026

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

BE UT Coverage Report

Increment line coverage 100.00% (17/17) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.99% (20123/37972)
Line Coverage 36.54% (189171/517651)
Region Coverage 32.81% (146903/447675)
Branch Coverage 33.92% (64260/189459)

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (17/17) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.58% (27362/37187)
Line Coverage 57.23% (295357/516069)
Region Coverage 54.53% (246390/451808)
Branch Coverage 56.11% (106638/190041)

jacktengg
jacktengg previously approved these changes Apr 10, 2026
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 10, 2026
@github-actions
Copy link
Copy Markdown
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Copy Markdown
Contributor

PR approved by anyone and no changes requested.

yiguolei
yiguolei previously approved these changes Apr 10, 2026
mrhhsg added a commit to mrhhsg/doris that referenced this pull request Apr 10, 2026
Issue Number: None

Related PR: apache#62285

Problem Summary: Address review feedback for the workload group paused-query change by avoiding unsigned wraparound in the test helper loop, making the reserve-failure test cleanup exception-safe, and clarifying the cancel message when the query itself exceeds its memory limit during workload group pressure handling.

None

- Test: No need to test (small follow-up on test cleanup and cancellation message; attempted incremental build but the sandbox hit a ccache temporary-directory permission issue)
- Behavior changed: Yes (the cancellation message now reports query memory limit exceeded instead of workload group memory exceeded in that branch)
- Does this need documentation: No
@mrhhsg mrhhsg dismissed stale reviews from yiguolei and jacktengg via f9a62eb April 10, 2026 03:58
@mrhhsg mrhhsg force-pushed the fix_spill_wg_fail branch from e1ecfae to f9a62eb Compare April 10, 2026 03:58
@mrhhsg
Copy link
Copy Markdown
Member Author

mrhhsg commented Apr 10, 2026

run buildall

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Apr 10, 2026
@hello-stephen
Copy link
Copy Markdown
Contributor

BE UT Coverage Report

Increment line coverage 100.00% (17/17) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.01% (20130/37974)
Line Coverage 36.56% (189251/517688)
Region Coverage 32.82% (146948/447692)
Branch Coverage 33.92% (64266/189471)

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (17/17) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.68% (27399/37189)
Line Coverage 57.31% (295804/516106)
Region Coverage 54.39% (245749/451825)
Branch Coverage 56.19% (106783/190053)

mrhhsg added 2 commits April 10, 2026 15:42
…ll fails

Issue Number: None

Related PR: None

Problem Summary: Keep queries paused after WORKLOAD_GROUP_MEMORY_EXCEEDED when retrying the same reserve would still exceed the workload group high watermark, and add a manager unit test that exercises a real reserve failure path.

None

- Test: Unit Test
    - Incrementally compiled , , and
    - Full  was not completed because linking  triggers a large incremental rebuild in this worktree
- Behavior changed: Yes (queries that still cannot satisfy the same workload group reserve stay paused instead of being resumed immediately)
- Does this need documentation: No
Issue Number: None

Related PR: apache#62285

Problem Summary: Address review feedback for the workload group paused-query change by avoiding unsigned wraparound in the test helper loop, making the reserve-failure test cleanup exception-safe, and clarifying the cancel message when the query itself exceeds its memory limit during workload group pressure handling.

None

- Test: No need to test (small follow-up on test cleanup and cancellation message; attempted incremental build but the sandbox hit a ccache temporary-directory permission issue)
- Behavior changed: Yes (the cancellation message now reports query memory limit exceeded instead of workload group memory exceeded in that branch)
- Does this need documentation: No
@mrhhsg mrhhsg force-pushed the fix_spill_wg_fail branch from f9a62eb to 52975b5 Compare April 10, 2026 07:48
@mrhhsg
Copy link
Copy Markdown
Member Author

mrhhsg commented Apr 10, 2026

run buildall

1 similar comment
@mrhhsg
Copy link
Copy Markdown
Member Author

mrhhsg commented Apr 10, 2026

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

BE UT Coverage Report

Increment line coverage 100.00% (17/17) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.02% (20140/37985)
Line Coverage 36.57% (189358/517840)
Region Coverage 32.83% (147014/447820)
Branch Coverage 33.95% (64344/189523)

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (17/17) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.65% (27396/37199)
Line Coverage 57.31% (295890/516256)
Region Coverage 54.44% (246038/451952)
Branch Coverage 56.18% (106809/190105)

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 11, 2026
@github-actions
Copy link
Copy Markdown
Contributor

PR approved by at least one committer and no changes requested.

@yiguolei yiguolei merged commit 61034e3 into apache:master Apr 12, 2026
29 of 31 checks passed
github-actions bot pushed a commit that referenced this pull request Apr 12, 2026
…ll fails (#62285)

### What problem does this PR solve?

Issue Number: None

Related PR: None

Problem Summary: Keep queries paused after
WORKLOAD_GROUP_MEMORY_EXCEEDED when retrying the same reserve would
still exceed the workload group high watermark, and add a manager unit
test that exercises a real reserve failure path.

### Release note

None

### Check List (For Author)

- Test: Unit Test
    - Incrementally compiled , , and
- Full was not completed because linking triggers a large incremental
rebuild in this worktree
- Behavior changed: Yes (queries that still cannot satisfy the same
workload group reserve stay paused instead of being resumed immediately)
- Does this need documentation: No

### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
- [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
- [ ] Yes. <!-- Add document PR link here. eg:
apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
@mrhhsg mrhhsg deleted the fix_spill_wg_fail branch April 12, 2026 09:14
yiguolei pushed a commit that referenced this pull request Apr 12, 2026
… reserve still fails #62285 (#62407)

Cherry-picked from #62285

Co-authored-by: Jerry Hu <hushenggang@selectdb.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/4.1.0-merged reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants