[fix](group commit) fix memory leak of pending InsertLoadJob in group commit http stream #62282

Open
sollhui wants to merge 1 commit into apache:master from sollhui:fix_job_leak

sollhui commented Apr 9, 2026

fix #62118

Problem

When a user runs an HTTP Stream Load with group_commit=async_mode, the BE
internally creates group commit batches via GroupCommitTable::_create_group_commit_load,
which sends a streamLoadPut RPC to FE with a SQL statement like:

INSERT INTO doris_internal_table_id(table_id) WITH LABEL group_commit_xxx
SELECT * FROM group_commit("table_id"="...")

FE processes this via httpStreamPutImpl, which calls initPlan() and
registers an InsertLoadJob in LoadManager with PENDING state.

However, this job never transitions to FINISHED/CANCELLED because:

  1. httpStreamPutImpl only generates the plan and returns it to BE; it
    never calls executeSingleInsert().
  2. The actual data commit is done by BE via loadTxnCommit RPC, which
    bypasses LoadManager entirely.
  3. isExpired() returns false for non-completed jobs, so these PENDING
    jobs are never cleaned up.
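
The three conditions above can be illustrated with a toy model. This is a minimal sketch and not Doris code: `ToyLoadManager`, `ToyJob`, `finish`, and `removeExpiredJobs` are hypothetical names, and the "completed and older than a retention window" rule is an assumption about what expiry checks beyond the non-completed case described above.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (hypothetical names, not the Doris classes) of a job registry that only evicts completed jobs. */
final class ToyLoadManager {
    enum State { PENDING, FINISHED, CANCELLED }

    static final class ToyJob {
        State state = State.PENDING;
        long finishTimeMs = -1;

        // Mirrors the behavior described above: a non-completed job is never expired.
        boolean isExpired(long nowMs, long retentionMs) {
            boolean completed = state == State.FINISHED || state == State.CANCELLED;
            return completed && nowMs - finishTimeMs > retentionMs;
        }
    }

    private final Map<Long, ToyJob> jobs = new HashMap<>();

    // Every registered job starts in PENDING.
    void addLoadJob(long jobId) {
        jobs.put(jobId, new ToyJob());
    }

    // Only a path that goes through the manager can complete a job.
    void finish(long jobId, long nowMs) {
        ToyJob job = jobs.get(jobId);
        job.state = State.FINISHED;
        job.finishTimeMs = nowMs;
    }

    // Periodic cleanup: removes expired jobs, so PENDING jobs always survive.
    void removeExpiredJobs(long nowMs, long retentionMs) {
        jobs.values().removeIf(job -> job.isExpired(nowMs, retentionMs));
    }

    int size() {
        return jobs.size();
    }
}
```

In this model, a job that is registered but never finished or cancelled through the manager stays in the map after every cleanup pass, which is the shape of the leak described above.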

This causes a steady memory leak in LoadManager. The issue was introduced
by #56412 and was partially fixed by #56852 (which only handled the case
where ctx.isGroupCommit() == true). The internal group commit RPC does
not set group_commit_mode, so ctx.isGroupCommit() remains false and
the fix in #56852 does not apply.

Fix

In InsertIntoTableCommand.selectInsertExecutorFactory,
detect when the INSERT source contains a group_commit(...) TVF and set
jobId = -1, preventing addLoadJob from being called.
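
The idea of the fix can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: `GroupCommitJobIdSketch`, `Node`, `containsGroupCommitTvf`, `chooseJobId`, and `shouldRegisterLoadJob` are hypothetical names, and the plan is modeled as a simple tree of named nodes rather than the real planner classes. Only the `-1` sentinel and the "skip addLoadJob" effect come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch (hypothetical names) of skipping load-job registration for internal group commit inserts. */
final class GroupCommitJobIdSketch {
    // Sentinel from the PR description: jobId = -1 means no load job is registered.
    static final long NO_JOB = -1L;

    /** Hypothetical stand-in for a plan node; only the name and children matter here. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    // Depth-first search for a group_commit(...) table-valued function in the INSERT source.
    static boolean containsGroupCommitTvf(Node node) {
        if ("group_commit".equalsIgnoreCase(node.name)) {
            return true;
        }
        for (Node child : node.children) {
            if (containsGroupCommitTvf(child)) {
                return true;
            }
        }
        return false;
    }

    // Internal group commit inserts get the sentinel; everything else keeps its real job id.
    static long chooseJobId(Node insertSource, long nextJobId) {
        return containsGroupCommitTvf(insertSource) ? NO_JOB : nextJobId;
    }

    // The caller registers a load job only when the id is not the sentinel.
    static boolean shouldRegisterLoadJob(long jobId) {
        return jobId != NO_JOB;
    }
}
```

Detecting the TVF in the plan, rather than relying on ctx.isGroupCommit(), covers the internal RPC path, which does not set group_commit_mode.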

Thearas commented Apr 9, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

sollhui commented Apr 9, 2026

run buildall

github-actions bot added the approved label Apr 9, 2026

github-actions bot commented Apr 9, 2026

PR approved by at least one committer and no changes requested.

github-actions bot commented Apr 9, 2026

PR approved by anyone and no changes requested.

hello-stephen commented

FE UT Coverage Report

Increment line coverage 0.00% (0/3) 🎉
Increment coverage report
Complete coverage report

Labels

approved, dev/4.0.x, dev/4.1.x, p0_b, reviewed

Development

Successfully merging this pull request may close these issues.

[Bug] HTTP StreamLoad with GroupCommit in 4.0.3 may cause memory leak.

4 participants