branch-4.1: [fix](audit) Mark internal query failures as ERR in audit log #62996

Open
yujun777 wants to merge 1 commit into apache:branch-4.1 from yujun777:pick-pr-62908-branch-4.1

@yujun777 yujun777 commented May 5, 2026

Proposed changes

Backport PR #62908 to branch-4.1 and adapt the FE unit test for the branch-4.1 test stack.

The test uses the existing JMockit style on branch-4.1 and sets/removes the ConnectContext thread-local so the mocked planner failure is reached reliably.
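
The set/remove pattern mentioned above can be sketched roughly as follows. This is a minimal illustration, not the actual test: `ThreadLocalContextSketch`, `Context`, `CURRENT`, and `runWithContext` are hypothetical stand-ins, and the real ConnectContext API may differ. It only shows the idea of installing the thread-local before the code under test runs and clearing it in a `finally` block so it does not leak into other tests on the same thread.

```java
import java.util.function.Supplier;

// Hypothetical stand-ins for illustration; not the Doris ConnectContext API.
public class ThreadLocalContextSketch {
    static class Context {
        final String name;
        Context(String name) { this.name = name; }
    }

    // Per-thread slot that the code under test reads from.
    static final ThreadLocal<Context> CURRENT = new ThreadLocal<>();

    // Installs ctx for the duration of body, then always clears it.
    static <T> T runWithContext(Context ctx, Supplier<T> body) {
        CURRENT.set(ctx);
        try {
            return body.get();
        } finally {
            CURRENT.remove();
        }
    }

    // Stand-in for code that takes a different path when no context is installed.
    static String codeUnderTest() {
        Context ctx = CURRENT.get();
        return ctx == null ? "no-context" : "ran-with-" + ctx.name;
    }

    public static void main(String[] args) {
        System.out.println(runWithContext(new Context("ut"), ThreadLocalContextSketch::codeUnderTest));
        System.out.println(codeUnderTest()); // the thread-local has been removed by now
    }
}
```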

Test

  • bash ./run-fe-ut.sh --run org.apache.doris.qe.StmtExecutorInternalQueryTest

### What problem does this PR solve?

Issue Number: https://jira.selectdb-in.cc/browse/CIR-20019

Problem Summary:
When an internal query inside StmtExecutor.executeInternalQuery() failed
(for example, the column-statistics gather SQL that ANALYZE issues
against a user table when the underlying tablet hits a BE IO error),
the audit_log entry recorded:

  state=OK | error_code=0 | error_message=<empty> | return_rows=0

This is misleading: the gather query actually failed, but the audit
log makes it look like it succeeded with zero rows. ANALYZE itself
still surfaces the failure to the user, but the per-internal-query
audit entries hide the root cause, complicating triage.

Root cause: executeInternalQuery() wraps the inner work in
try { ... } finally { AuditLogHelper.logAuditLog(context, ...) }, but
the inner catch handlers only re-throw the exception and never set
ConnectContext state to ERR. The default OK state is therefore what
gets logged.
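
The control flow behind this can be reproduced in isolation. The sketch below is an assumption-laden miniature, not the Doris code: `AuditStateBugSketch`, `State`, `Context`, and the `audit` list are hypothetical stand-ins for StmtExecutor, the ConnectContext state, and the audit log. It shows that a `finally` block logging the context state records OK when the handlers only re-throw.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; not the real Doris classes.
public class AuditStateBugSketch {
    enum State { OK, ERR }

    static class Context {
        State state = State.OK; // default state before any handler touches it
    }

    static void executeInternalQuery(Context ctx, boolean fail, List<String> audit) {
        try {
            try {
                if (fail) {
                    throw new RuntimeException("IO_ERROR while reading tablet");
                }
            } catch (RuntimeException e) {
                // Re-throws without recording the failure on the context.
                throw e;
            }
        } finally {
            // Runs on both paths and logs whatever state the context holds.
            audit.add("state=" + ctx.state);
        }
    }

    public static void main(String[] args) {
        List<String> audit = new ArrayList<>();
        try {
            executeInternalQuery(new Context(), true, audit);
        } catch (RuntimeException e) {
            System.out.println("caller saw: " + e.getMessage());
        }
        System.out.println(audit); // the failed call is still logged as OK
    }
}
```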

Fix: add an outer catch (Exception e) around the inner try that, when
state has not already been moved to ERR, records ERR_INTERNAL_ERROR
together with the message (falling back to root-cause message when
the exception message is empty), then re-throws so callers behave as
before. The setNereids/setIsQuery/setInternal flags are also moved
above the parse step so audit entries for parse/plan failures still
carry the right metadata.
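
The shape of the fix can be sketched in the same miniature form. Again, this is only an illustration under stated assumptions: `AuditStateFixSketch`, `State`, `Context`, `describe`, and the `audit` list are hypothetical, and the real change sets ERR_INTERNAL_ERROR on the ConnectContext rather than a plain enum. The sketch covers the three points described above: an outer catch that records ERR only if it is not already set, a fallback to the root-cause message when the exception message is empty, and metadata flags set before any step that can throw.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; not the real Doris classes.
public class AuditStateFixSketch {
    enum State { OK, ERR }

    static class Context {
        State state = State.OK;
        String errorMessage = "";
        boolean internal;
    }

    // Prefer the exception's own message; fall back to the root cause's message.
    static String describe(Throwable t) {
        String msg = t.getMessage();
        if (msg != null && !msg.isEmpty()) {
            return msg;
        }
        Throwable root = t;
        while (root.getCause() != null) {
            root = root.getCause();
        }
        return String.valueOf(root.getMessage());
    }

    static void executeInternalQuery(Context ctx, RuntimeException failure, List<String> audit) {
        ctx.internal = true; // metadata is set before any step that can throw
        try {
            try {
                if (failure != null) {
                    throw failure;
                }
            } catch (RuntimeException e) {
                throw e; // the inner handler still only re-throws
            }
        } catch (RuntimeException e) {
            if (ctx.state != State.ERR) { // keep an earlier, more specific error
                ctx.state = State.ERR;
                ctx.errorMessage = describe(e);
            }
            throw e; // callers behave as before
        } finally {
            audit.add("state=" + ctx.state + " | internal=" + ctx.internal
                    + " | error_message=" + ctx.errorMessage);
        }
    }

    public static void main(String[] args) {
        List<String> audit = new ArrayList<>();
        RuntimeException failure =
                new RuntimeException("", new IllegalStateException("disk read failed"));
        try {
            executeInternalQuery(new Context(), failure, audit);
        } catch (RuntimeException e) {
            System.out.println("caller still sees the exception");
        }
        System.out.println(audit.get(0));
    }
}
```

The outer catch runs before the `finally` block, so the audit entry is written after the state has been moved to ERR.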

### Release note

Internal query failures are now correctly recorded as ERR in
audit_log instead of misleadingly showing OK with empty error info.

### Check List (For Author)

- Test:
    - Unit Test: StmtExecutorInternalQueryTest#testExecuteInternalQuerySetsErrorStateOnFailure
    - Regression test: fault_injection_p0/test_audit_log_internal_query_failure
    - Manual test: reproduced on a local cluster with the
      LocalFileReader::read_at_impl.io_error debug point. Before the
      fix, audit_log shows state=OK with an empty error_message; after
      the fix, it shows state=ERR, error_code=1815, and the full
      IO_ERROR description.
- Behavior changed: Yes (audit_log entries for failed internal
  queries now show ERR; previously they showed OK).
- Does this need documentation: No.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@yujun777 yujun777 requested a review from yiguolei as a code owner May 5, 2026 06:31
@hello-stephen

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@yujun777 yujun777 changed the title from "[fix](audit) Mark internal query failures as ERR in audit log" to "branch-4.1: [fix](audit) Mark internal query failures as ERR in audit log" May 5, 2026
yujun777 commented May 5, 2026

run buildall

@github-actions github-actions Bot added the "approved" label (Indicates a PR has been approved by one committer.) May 6, 2026
github-actions Bot commented May 6, 2026

PR approved by at least one committer and no changes requested.

github-actions Bot commented May 6, 2026

PR approved by anyone and no changes requested.

yiguolei commented May 6, 2026

run buildall

Labels

approved (Indicates a PR has been approved by one committer.), reviewed
