[Bug](profile) move watcher.stop() into locked code block #62683
BiteTheDDDDt wants to merge 1 commit into apache:branch-4.0
Conversation
cherry-picked from apache#56462 (master commit 83c7020).

Without taking `_task_lock`, `_watcher.stop()` in `Dependency::set_ready()` races with `start_watcher()` called inside `is_blocked_by()`: the latter may observe `_ready == false` (it acquires `_task_lock` and reads `_ready` before `set_ready()` can flip it) and re-start the stopwatch right after `set_ready()` stopped it. After the race, nothing stops the watcher again, so the `watcher_elapse_time()` reported in `close()` reflects the operator's entire lifetime instead of the actual blocked duration. This inflates the `WaitForDependency[*]Time` and `WaitForRuntimeFilter` counters, which in production were observed at ~12s while real per-RF wait times were only tens of milliseconds.

Fix: stop the watcher inside the `_task_lock` critical section so it is strictly mutually exclusive with `start_watcher()`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
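The locked ordering this backport establishes can be sketched as below. This is a simplified stand-alone model, not the actual Doris classes: the names `_task_lock`, `_watcher`, `set_ready()`, `is_blocked_by()`, and `watcher_elapse_time()` follow the PR, but the stopwatch and everything else are illustrative.

```cpp
#include <chrono>
#include <mutex>

// Stand-in for the blocked-time stopwatch (hypothetical, for illustration).
class StopWatch {
public:
    void start() {
        if (!_running) { _start = Clock::now(); _running = true; }
    }
    void stop() {
        if (_running) { _elapsed += Clock::now() - _start; _running = false; }
    }
    // Like watcher_elapse_time(): includes the in-flight interval if running.
    std::chrono::nanoseconds elapsed() const {
        auto e = _elapsed;
        if (_running) e += Clock::now() - _start;
        return e;
    }
private:
    using Clock = std::chrono::steady_clock;
    Clock::time_point _start{};
    std::chrono::nanoseconds _elapsed{0};
    bool _running = false;
};

class Dependency {
public:
    // After the fix: stop() happens inside the same critical section that
    // publishes _ready, so it cannot interleave with start_watcher() in
    // is_blocked_by() below.
    void set_ready() {
        std::lock_guard<std::mutex> lk(_task_lock);
        _watcher.stop();
        _ready = true;
    }
    // is_blocked_by(): check _ready and (re)start the watcher atomically.
    bool is_blocked_by() {
        std::lock_guard<std::mutex> lk(_task_lock);
        if (_ready) return false;
        _watcher.start();  // can no longer race with the stop() above
        return true;
    }
    std::chrono::nanoseconds watcher_elapse_time() const {
        return _watcher.elapsed();
    }
private:
    std::mutex _task_lock;
    StopWatch _watcher;
    bool _ready = false;
};
```

With the pre-fix ordering (stop() before taking the lock), `is_blocked_by()` could observe `_ready == false` after the stop, restart the watcher, and nothing would ever stop it again.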
run buildall
Pull request overview
Backports the fix from #56462 to prevent a race where a dependency watcher can be re-started between stop() and publishing _ready = true, inflating profile wait-time counters.
Changes:
- Move `_watcher.stop()` under `_task_lock` in `Dependency::set_ready()`.
- Ensure the watcher stop happens before setting `_ready = true` while holding the mutex.
PR approved by at least one committer and no changes requested.

PR approved by anyone and no changes requested.
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #56462
Problem Summary:
Backport of #56462 to this branch.
`Dependency::set_ready()` previously called `_watcher.stop()` outside `_task_lock`. A concurrent `is_blocked_by()` (which acquires `_task_lock`, checks `_ready`, and calls `start_watcher()`) could re-start the watcher right after `stop()` but before `_ready = true` was published; nothing stops it again. As a result, `watcher_elapse_time()` accumulates wall-clock time from the first block until the operator closes, making the `WaitForDependency`/`WaitForRuntimeFilter` profile counters appear hugely inflated (e.g. ~12s on a 20s query while each individual RF `WaitTime` is only tens of ms).

Move `_watcher.stop()` inside the `_task_lock` block, before setting `_ready = true`, matching the master fix.

Release note
None
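The pre-fix interleaving described in the problem summary can be forced deterministically by splitting the old `set_ready()` into its two unsynchronized halves. This is a hypothetical minimal model, not the actual Doris code: the watcher is reduced to a boolean flag, and all names besides `is_blocked_by` are invented for the demonstration.

```cpp
// Hypothetical model of the pre-fix race: the two halves of the old
// set_ready() are exposed separately so the bad interleaving can be forced.
struct RacyDependency {
    bool ready = false;
    bool watcher_running = true;  // started when the task first blocked

    // Pre-fix set_ready(), step 1: stop the watcher OUTSIDE the lock.
    void set_ready_stop_watcher() { watcher_running = false; }
    // Pre-fix set_ready(), step 2: publish readiness under the lock.
    void set_ready_publish() { ready = true; }

    // is_blocked_by(): sees ready == false and restarts the watcher.
    bool is_blocked_by() {
        if (ready) return false;
        watcher_running = true;  // the racing re-start
        return true;
    }
};

// Force the interleaving stop -> is_blocked_by() -> publish. Returns true
// when the watcher is left running forever, which is the inflated
// WaitForDependency / WaitForRuntimeFilter symptom the PR fixes.
inline bool race_leaves_watcher_running() {
    RacyDependency dep;
    dep.set_ready_stop_watcher();  // set_ready() stops the watcher...
    dep.is_blocked_by();           // ...racer observes ready==false, restarts it
    dep.set_ready_publish();       // readiness is published too late
    return dep.watcher_running;
}
```

Moving the stop under the lock makes steps 1 and 2 a single critical section, so `is_blocked_by()` can only run entirely before (watcher restarted, then stopped again) or entirely after (sees `ready == true`, never restarts).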
Check List (For Author)