test: stabilize flaky TestSubmitBatchEnqueuesTableModeJob#69657
test: stabilize flaky TestSubmitBatchEnqueuesTableModeJob#69657flaky-claw wants to merge 1 commit into
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
📝 WalkthroughWalkthroughThe test helper ChangesTest environment stabilization
Estimated code review effort: 1 (Trivial) | ~3 minutes Suggested labels: Suggested reviewers: Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)level=error msg="Running error: context loading failed: failed to load packages: failed to load packages: failed to load with go/packages: context deadline exceeded" Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #69657 +/- ##
================================================
- Coverage 76.3268% 74.4815% -1.8453%
================================================
Files 2041 2060 +19
Lines 560588 580446 +19858
================================================
+ Hits 427879 432325 +4446
- Misses 131808 147228 +15420
+ Partials 901 893 -8
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/ddl/jobsubmit/submit_test.go (1)
72-72: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd a comment explaining why the campaign is cancelled.
The intent behind
dom.DDL().OwnerManager().CampaignCancel()(preventing owner-driven scheduler side effects so the test observesSubmitBatch's queued rows) isn't obvious from the call alone.As per coding guidelines: "Comments SHOULD explain non-obvious intent, constraints, invariants, concurrency guarantees... SHOULD NOT restate what the code already makes clear."
📝 Proposed comment
store, dom := testkit.CreateMockStoreAndDomain(t, mockstore.WithStoreType(mockstore.EmbedUnistore)) + // Cancel the DDL owner campaign so the background scheduler doesn't race with + // SubmitBatch and process jobs before the test can observe the queued rows. dom.DDL().OwnerManager().CampaignCancel()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/ddl/jobsubmit/submit_test.go` at line 72, Add a brief comment near dom.DDL().OwnerManager().CampaignCancel() explaining that the campaign is cancelled to suppress owner-driven scheduler side effects so the test can observe SubmitBatch’s queued rows deterministically. Keep the comment focused on intent and the test invariant, and place it in submit_test.go around the CampaignCancel call so future readers understand why this setup is required.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/ddl/jobsubmit/submit_test.go`:
- Around line 71-72: The test setup in CreateMockStoreAndDomain with
EmbedUnistore still allows the DDL owner loop to start before
OwnerManager().CampaignCancel() runs, so the shutdown is only best-effort.
Update the submit_test setup to use a helper or testkit path that prevents
auto-campaigning/owner startup before the test begins, or otherwise ensure DDL
ownership is disabled before CreateMockStoreAndDomain returns, so the job
submission test runs in full isolation.
---
Nitpick comments:
In `@pkg/ddl/jobsubmit/submit_test.go`:
- Line 72: Add a brief comment near dom.DDL().OwnerManager().CampaignCancel()
explaining that the campaign is cancelled to suppress owner-driven scheduler
side effects so the test can observe SubmitBatch’s queued rows
deterministically. Keep the comment focused on intent and the test invariant,
and place it in submit_test.go around the CampaignCancel call so future readers
understand why this setup is required.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: cbd661c8-62de-44a9-bbe5-e0be62f912f7
📒 Files selected for processing (1)
pkg/ddl/jobsubmit/submit_test.go
| store, dom := testkit.CreateMockStoreAndDomain(t, mockstore.WithStoreType(mockstore.EmbedUnistore)) | ||
| dom.DDL().OwnerManager().CampaignCancel() |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check whether CreateMockStoreAndDomain/bootstrap offers an option to skip owner campaigning
rg -n -B3 -A15 'func CreateMockStoreAndDomain' pkg/testkit/mockstore.go
rg -n -B3 -A20 'func bootstrap\(' pkg/testkit/mockstore.go
rg -n 'DisableAutoCampaign|SkipCampaign|WithoutCampaign' -g '*.go'Repository: pingcap/tidb
Length of output: 3232
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the owner manager cancellation and campaign startup path.
files=(
pkg/ddl/jobsubmit/submit_test.go
pkg/ddl/owner/manager.go
pkg/domain/domain.go
pkg/session/bootstrap.go
)
for f in "${files[@]}"; do
echo "### $f"
wc -l "$f"
done
echo "### submit_test excerpt"
sed -n '1,140p' pkg/ddl/jobsubmit/submit_test.go
echo "### owner manager excerpt"
sed -n '1,260p' pkg/ddl/owner/manager.go
echo "### session bootstrap excerpt"
sed -n '1,240p' pkg/session/bootstrap.go
echo "### domain bootstrap excerpt"
sed -n '1,240p' pkg/domain/domain.goRepository: pingcap/tidb
Length of output: 309
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find the actual owner manager file and inspect the test helper + campaign/cancel implementation.
echo "### candidate files"
fd -a 'manager.go' pkg | sed -n '1,40p'
fd -a 'bootstrap.go' pkg | sed -n '1,40p'
fd -a 'submit_test.go' pkg/ddl/jobsubmit | sed -n '1,20p'
echo "### submit_test outline"
ast-grep outline pkg/ddl/jobsubmit/submit_test.go --view expanded | sed -n '1,220p'
mgr="$(fd -a 'manager.go' pkg | grep '/owner/manager.go$' || true)"
if [ -n "$mgr" ]; then
echo "### owner manager outline: $mgr"
ast-grep outline "$mgr" --view expanded | sed -n '1,260p'
echo "### owner manager cancel/campaign snippets"
rg -n -A30 -B8 'func \(.*CampaignCancel|func \(.*Campaign|BreakCampaignLoop|closeSession|campaignLoop|CampaignOwner' "$mgr"
fi
echo "### bootstrap helpers"
for f in pkg/testkit/mockstore.go pkg/session/bootstrap.go pkg/domain/domain.go; do
if [ -f "$f" ]; then
echo "## $f"
rg -n -A20 -B8 'func CreateMockStoreAndDomain|func bootstrap|BootstrapSession|campaign|owner' "$f"
fi
doneRepository: pingcap/tidb
Length of output: 50368
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the mock owner manager path and the DDL owner initialization path for unistore.
echo "### owner mock manager"
rg -n -A80 -B20 'func NewMockManager|type mockManager|CampaignOwner|CampaignCancel|BreakCampaignLoop|IsOwner|Close' pkg/owner
echo "### DDL owner initialization"
rg -n -A80 -B20 'OwnerManager\(\)|NewOwnerManager\(|SetOwnerManager|CampaignOwner\(' pkg/ddl pkg/domain pkg/session | sed -n '1,260p'
echo "### submit_test newSubmitTestEnv"
sed -n '70,90p' pkg/ddl/jobsubmit/submit_test.goRepository: pingcap/tidb
Length of output: 50369
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Read the exact test helper and the DDL owner initialization points that matter here.
sed -n '68,84p' pkg/ddl/jobsubmit/submit_test.go
echo "### DDL owner-related campaign calls"
rg -n -A3 -B3 'CampaignOwner\(\)|CampaignCancel\(\)|OwnerManager\(\)' pkg/ddl pkg/domain pkg/session pkg/testkit/mockstore.go | sed -n '1,240p'
echo "### mock store type branch"
rg -n -A20 -B10 'EmbedUnistore|WithStoreType|CreateMockStoreAndDomain\(' pkg/testkit/mockstore.go pkg/ddl/jobsubmit/submit_test.go | sed -n '1,220p'Repository: pingcap/tidb
Length of output: 25085
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect where the DDL owner campaign is started during bootstrap/session creation.
rg -n -A30 -B12 'EnableDDL\(|DisableDDL\(|CampaignOwner\(\)|CampaignCancel\(\)|StartOwnerManager|NewDDL\(' pkg/session pkg/domain pkg/ddl | sed -n '1,260p'
echo "### bootstrap session location"
rg -n -A120 -B20 'func BootstrapSession|func BootstrapSession4DistExecution|bootstrap\(s sessionapi.Session\)' pkg/session/bootstrap.go | sed -n '1,260p'Repository: pingcap/tidb
Length of output: 20365
Owner shutdown here is still best-effort. CreateMockStoreAndDomain(...EmbedUnistore) already starts the DDL owner loop before this returns, so there’s still a small window where background DDL work can run before CampaignCancel() takes effect. If the test needs full isolation, a helper that skips auto-campaigning would be safer.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/ddl/jobsubmit/submit_test.go` around lines 71 - 72, The test setup in
CreateMockStoreAndDomain with EmbedUnistore still allows the DDL owner loop to
start before OwnerManager().CampaignCancel() runs, so the shutdown is only
best-effort. Update the submit_test setup to use a helper or testkit path that
prevents auto-campaigning/owner startup before the test begins, or otherwise
ensure DDL ownership is disabled before CreateMockStoreAndDomain returns, so the
job submission test runs in full isolation.
|
@flaky-claw: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What problem does this PR solve?
Issue Number: close #69609
Problem Summary:
Flaky test
TestSubmitBatchEnqueuesTableModeJobinpkg/ddl/jobsubmitintermittently fails, so this PR stabilizes that path.What changed and how does it work?
Root Cause
Live mock DDL ownership raced with a queue-inspection test that inserts fabricated DDL jobs.
Fix
Cancelling DDL ownership makes the submit tests observe SubmitBatch’s queued rows instead of scheduler side effects.
Verification
Spec:
pkg/ddl/jobsubmit :: TestSubmitBatchEnqueuesTableModeJobtidb.issue_scoped.v2BASELINE_ONLYGO_TEST_WITH_TAGSintest, deadlockbaseline_onlyObserved result:
Required flaky case was not skipped.
target-flaky passed.
package-failpoint passed.
Gate checklist:
Commands:
go test -json -tags=intest,deadlock ./pkg/ddl/jobsubmit -run '^TestSubmitBatchEnqueuesTableModeJob$' -count=1./tools/check/failpoint-go-test.sh pkg/ddl/jobsubmit -count=1Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Fixes #69609
Summary by CodeRabbit