dxf,ddl: fix dxf add index region batch restriction for region continuity#69627
dxf,ddl: fix dxf add index region batch restriction for region continuity#69627YangKeao wants to merge 6 commits into
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
📝 WalkthroughWalkthroughThis PR removes retry-based region scanning and continuity checks from DXF backfill planning, switches both planners to direct region-cache reads with StartKey-based deduplication, and adds a unit test for the shared deduplication helper. ChangesRegion planning simplification
Estimated code review effort: 3 (Moderate) | ~25 minutes Suggested reviewers: Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (3)
pkg/ddl/backfilling_dist_scheduler_internal_test.go (1)
23-47: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winConsider covering the early-return edge cases.
The helper's
len(regionMetas) < 2early-return branch (empty/single-element slices) and the no-duplicates case aren't exercised here; only the consecutive-duplicates path is tested.♻️ Optional additional cases
func TestDeduplicateSortedRegionMetasByStartKey(t *testing.T) { + require.Empty(t, deduplicateSortedRegionMetasByStartKey([]regionMetaForTest{})) + require.Equal(t, []regionMetaForTest{{startKey: []byte("a")}}, + deduplicateSortedRegionMetasByStartKey([]regionMetaForTest{{startKey: []byte("a")}})) + regions := []regionMetaForTest{ {startKey: []byte("a")}, {startKey: []byte("a")}, {startKey: []byte("b")}, {startKey: []byte("b")}, {startKey: []byte("d")}, }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/ddl/backfilling_dist_scheduler_internal_test.go` around lines 23 - 47, The current test for deduplicateSortedRegionMetasByStartKey only covers consecutive duplicates, so add cases for the len(regionMetas) < 2 early return and for an already-unique sorted slice. Extend TestDeduplicateSortedRegionMetasByStartKey in backfilling_dist_scheduler_internal_test.go to verify empty and single-element inputs are returned unchanged, and add a no-duplicates case that confirms the slice is preserved when no deduplication is needed.pkg/ddl/backfilling_dist_scheduler.go (2)
336-386: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚖️ Poor tradeoffOptional: extract the shared load/sort/dedup/batch-boundary logic.
generatePlanForPhysicalTableandgenMergeTempPlanForOneIndexnow duplicate the same sequence (load regions → empty check → sort byStartKey→ dedup → computeregionBatch→ derive per-batch[start,end)boundaries). Only the terminalBackfillSubTaskMetaconstruction differs. A helper returning the batch boundary pairs (e.g.[]struct{ start, end []byte }) would remove the duplication while keeping each caller's meta assembly. Deferrable.Also applies to: 875-921
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/ddl/backfilling_dist_scheduler.go` around lines 336 - 386, The region loading, sorting, deduping, batch calculation, and boundary derivation logic is duplicated between generatePlanForPhysicalTable and genMergeTempPlanForOneIndex. Extract that shared flow into a helper that returns per-batch start/end boundaries (or equivalent batch descriptors), then have both callers reuse it and keep only their BackfillSubTaskMeta-specific assembly separate.
336-340: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick winThread
ctxinto the backoffer at both call sites.
tikv.NewBackofferWithVars(context.Background(), 20000, nil)detachesLoadRegionsInKeyRangefrom the DDL job’s cancellation path, so a cancelled job can still keep backing off/retrying. Usectxhere and at the matching index path.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/ddl/backfilling_dist_scheduler.go` around lines 336 - 340, Thread the DDL context into the backoffer instead of using context.Background() in BackfillingDistScheduler’s region lookup path. Update the LoadRegionsInKeyRange call in the backfilling scheduler to build the backoffer with ctx, and apply the same change at the matching index-path call site so both respect job cancellation and deadlines.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@pkg/ddl/backfilling_dist_scheduler_internal_test.go`:
- Around line 23-47: The current test for deduplicateSortedRegionMetasByStartKey
only covers consecutive duplicates, so add cases for the len(regionMetas) < 2
early return and for an already-unique sorted slice. Extend
TestDeduplicateSortedRegionMetasByStartKey in
backfilling_dist_scheduler_internal_test.go to verify empty and single-element
inputs are returned unchanged, and add a no-duplicates case that confirms the
slice is preserved when no deduplication is needed.
In `@pkg/ddl/backfilling_dist_scheduler.go`:
- Around line 336-386: The region loading, sorting, deduping, batch calculation,
and boundary derivation logic is duplicated between generatePlanForPhysicalTable
and genMergeTempPlanForOneIndex. Extract that shared flow into a helper that
returns per-batch start/end boundaries (or equivalent batch descriptors), then
have both callers reuse it and keep only their BackfillSubTaskMeta-specific
assembly separate.
- Around line 336-340: Thread the DDL context into the backoffer instead of
using context.Background() in BackfillingDistScheduler’s region lookup path.
Update the LoadRegionsInKeyRange call in the backfilling scheduler to build the
backoffer with ctx, and apply the same change at the matching index-path call
site so both respect job cancellation and deadlines.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d05df9ff-8d40-42ea-87ed-efedf644e2de
📒 Files selected for processing (2)
pkg/ddl/backfilling_dist_scheduler.gopkg/ddl/backfilling_dist_scheduler_internal_test.go
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #69627 +/- ##
================================================
- Coverage 76.3234% 75.9194% -0.4041%
================================================
Files 2041 2077 +36
Lines 560728 578711 +17983
================================================
+ Hits 427967 439354 +11387
- Misses 131860 137374 +5514
- Partials 901 1983 +1082
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
i think the right way is to find out why region boundary is non continues and make sure the region cache be synced with pd to avoid such issue, instead of tolerating such NON continue region bound, this can potentially be a fatal correctness issue |
|
/hold |
What problem does this PR solve?
Issue Number: close #69626
Problem Summary:
When generating DXF backfill subtasks for
ADD INDEX, TiDB used region metadata to split the table key range into batches and required adjacent regions to be exactly continuous.However, region metadata can change during region merge/split or scheduling.
RegionCache.LoadRegionsInKeyRangecan also load regions in multiple batches, and region layout changes between batches can make the returned region metadata non-continuous or contain duplicate start keys. Because the region information is only used for rough batching, requiring exact region continuity can unnecessarily block DDL execution.What changed and how does it work?
This PR changes DXF backfill range planning to build subtask ranges from sorted region start keys instead of relying on region end-key continuity.
Specifically:
This keeps generated subtask ranges continuous and non-overlapping while avoiding unnecessary DDL failures caused by transient region metadata changes.
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit