Skip to content

[bugfix] Fix instance partition import rebalance for COMPLETED segments #18635

Open
shauryachats wants to merge 3 commits into apache:master from shauryachats:fix/instance-partition-import-rebalance-bug

@shauryachats
Collaborator

Summary

Tables that import instance partitions from another table via instancePartitionsMap, without a separate instanceAssignmentConfigMap entry for COMPLETED, were rebalanced incorrectly: their COMPLETED segments were never relocated.

TableRebalancer.getInstancePartitionsMap() loads COMPLETED instance partitions only when InstanceAssignmentConfigUtils.shouldRelocateCompletedSegments() returns true. That method looked only at instanceAssignmentConfigMap and tenant tag overrides, and ignored imported partitions in instancePartitionsMap.
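
The gap can be illustrated with a simplified model. The sketch below is not Pinot code: `TableConfigModel`, `relocateBeforeFix`, and `relocateAfterFix` are hypothetical stand-ins that only mirror the inputs named above (assignment config entries, tenant tag override, imported partitions), and the partition names are made up. The first predicate ignores imported partitions; the second adds the imported-partitions check this PR describes.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

final class RelocationCheckSketch {
  enum PartitionsType { OFFLINE, CONSUMING, COMPLETED }

  /** Hypothetical, stripped-down stand-in for the relevant parts of a table config. */
  static final class TableConfigModel {
    // Types that have an explicit instanceAssignmentConfigMap entry.
    final Set<PartitionsType> assignmentConfigTypes = EnumSet.noneOf(PartitionsType.class);
    // Types imported from another table via instancePartitionsMap (type -> partitions name).
    final Map<PartitionsType, String> importedPartitions = new EnumMap<>(PartitionsType.class);
    // Whether a tenant tag override requests relocation of completed segments.
    boolean completedTagOverride;
  }

  /** Models the old check: imported partitions are not consulted. */
  static boolean relocateBeforeFix(TableConfigModel config) {
    return config.assignmentConfigTypes.contains(PartitionsType.COMPLETED)
        || config.completedTagOverride;
  }

  /** Models the fixed check: imported COMPLETED partitions also trigger relocation. */
  static boolean relocateAfterFix(TableConfigModel config) {
    return relocateBeforeFix(config)
        || config.importedPartitions.containsKey(PartitionsType.COMPLETED);
  }

  /** A table that only imports partitions and has no assignment config or tag override. */
  static TableConfigModel importOnlyTable() {
    TableConfigModel config = new TableConfigModel();
    config.importedPartitions.put(PartitionsType.CONSUMING, "sourceTable_CONSUMING");
    config.importedPartitions.put(PartitionsType.COMPLETED, "sourceTable_COMPLETED");
    return config;
  }

  public static void main(String[] args) {
    TableConfigModel config = importOnlyTable();
    System.out.println("before fix: " + relocateBeforeFix(config)); // prints "before fix: false"
    System.out.println("after fix: " + relocateAfterFix(config));   // prints "after fix: true"
  }
}
```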

As a result, rebalance passed only CONSUMING instance partitions into segment assignment. Completed segments stayed pinned to one server per stream partition instead of spreading across all servers in each instance partition under ReplicaGroupSegmentAssignmentStrategy (e.g. 16 servers used instead of 64 in the production layout).

The fix adds a check for InstancePartitionsUtils.hasPreConfiguredInstancePartitions(tableConfig, COMPLETED), so imported COMPLETED partitions trigger relocation the same way explicitly configured assignment does.
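
To make the 16-versus-64 effect concrete, here is a toy model of the two placements. It assumes a layout of 16 stream partitions with 4 servers per instance partition, which is only one way to arrive at the totals quoted above; the PR does not state the actual layout. The round-robin spread is a stand-in for illustration, not the logic of ReplicaGroupSegmentAssignmentStrategy, and all names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

final class CompletedSpreadSketch {
  /** Hypothetical server naming: "p<partition>-s<index>". */
  static String server(int partition, int index) {
    return "p" + partition + "-s" + index;
  }

  /** Without COMPLETED instance partitions: every segment stays on one server per partition. */
  static Set<String> pinned(int partitions, int segmentsPerPartition) {
    Set<String> used = new HashSet<>();
    for (int p = 0; p < partitions; p++) {
      for (int seg = 0; seg < segmentsPerPartition; seg++) {
        used.add(server(p, 0)); // always the same server in this model
      }
    }
    return used;
  }

  /** With COMPLETED instance partitions: segments rotate over all servers of the partition. */
  static Set<String> spread(int partitions, int serversPerPartition, int segmentsPerPartition) {
    Set<String> used = new HashSet<>();
    for (int p = 0; p < partitions; p++) {
      for (int seg = 0; seg < segmentsPerPartition; seg++) {
        used.add(server(p, seg % serversPerPartition)); // simple round-robin stand-in
      }
    }
    return used;
  }

  public static void main(String[] args) {
    // Assumed layout: 16 partitions, 4 servers each, 8 completed segments per partition.
    System.out.println("pinned: " + pinned(16, 8).size());    // prints "pinned: 16"
    System.out.println("spread: " + spread(16, 4, 8).size()); // prints "spread: 64"
  }
}
```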

Test

RealtimeReplicaGroupSegmentAssignmentTest#testImportedInstancePartitionsWithMultipleServersPerPartition mirrors the TableRebalancer path: it builds the rebalance instancePartitionsMap via shouldRelocateCompletedSegments(tableConfig) rather than always injecting COMPLETED instance partitions. It asserts the pre-fix behavior when COMPLETED instance partitions are omitted, and asserts that with the fix all servers receive completed segments (including bootstrap rebalance).
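
The gating the test relies on can be sketched as follows. This is an illustration only: the helper's signature, the enum, and the string placeholders for instance partitions are assumptions, not the code in the PR. The point is that COMPLETED enters the map only when the relocation check says so, which is why a wrong check silently drops it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RebalanceMapSketch {
  enum PartitionsType { CONSUMING, COMPLETED }

  /**
   * Hypothetical helper: CONSUMING partitions are always present, COMPLETED
   * partitions are added only when the relocation check returned true.
   */
  static Map<PartitionsType, String> buildRebalanceInstancePartitionsMap(boolean relocateCompleted) {
    Map<PartitionsType, String> map = new LinkedHashMap<>();
    map.put(PartitionsType.CONSUMING, "consumingPartitions");
    if (relocateCompleted) {
      map.put(PartitionsType.COMPLETED, "completedPartitions");
    }
    return map;
  }

  public static void main(String[] args) {
    // Pre-fix path for an import-only table: the check returned false.
    System.out.println(buildRebalanceInstancePartitionsMap(false).keySet());
    // Fixed path: the check returns true, so COMPLETED is included.
    System.out.println(buildRebalanceInstancePartitionsMap(true).keySet());
  }
}
```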

shauryachats and others added 3 commits May 29, 2026 03:08
When a table imports instance partitions from another table via
instancePartitionsMap, shouldRelocateCompletedSegments() only checked
instanceAssignmentConfigMap and missed the pre-configured COMPLETED
instance partitions. This caused COMPLETED segments to never be
relocated during rebalance, leaving them pinned to a single server
per partition instead of being distributed by the
ReplicaGroupSegmentAssignmentStrategy.

The fix adds a check for pre-configured instance partitions
(hasPreConfiguredInstancePartitions) so that imported COMPLETED
instance partitions are properly loaded and used during rebalance.

Co-authored-by: Cursor <cursoragent@cursor.com>

Exercise shouldRelocateCompletedSegments and buildRebalanceInstancePartitionsMap
instead of passing COMPLETED IPs directly to rebalanceTable; assert buggy path
when COMPLETED IPs are omitted.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@codecov-commenter

codecov-commenter commented May 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 64.45%. Comparing base (aee5a1f) to head (b88c463).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##             master   #18635     +/-   ##
===========================================
  Coverage     64.45%   64.45%             
- Complexity     1137     1282    +145     
===========================================
  Files          3337     3352     +15     
  Lines        206067   207172   +1105     
  Branches      32127    32349    +222     
===========================================
+ Hits         132816   133536    +720     
- Misses        62599    62902    +303     
- Partials      10652    10734     +82     
| Flag | Coverage Δ |
|---|---|
| custom-integration1 | 100.00% <ø> (ø) |
| integration | 100.00% <ø> (ø) |
| integration1 | 100.00% <ø> (ø) |
| integration2 | 0.00% <ø> (ø) |
| java-21 | 64.45% <100.00%> (+<0.01%) ⬆️ |
| temurin | 64.45% <100.00%> (+<0.01%) ⬆️ |
| unittests | 64.45% <100.00%> (+<0.01%) ⬆️ |
| unittests1 | 56.81% <0.00%> (-0.02%) ⬇️ |
| unittests2 | 37.15% <100.00%> (+0.17%) ⬆️ |

Flags with carried forward coverage won't be shown.


@shauryachats shauryachats requested a review from xiangfu0 May 29, 2026 21:02
@shauryachats shauryachats added the bug Something is not working as expected label May 29, 2026