PCSM-226. Clone data with inconsistent index #151
Pull request overview
This PR adds a test case to verify that the clone operation handles data with inconsistent indexes correctly in a sharded MongoDB environment. The test creates a scenario where an index build partially fails due to incompatible document structure (an array value where the text index expects a scalar), resulting in an inconsistent index state.
- Adds a new test file for sharded index scenarios
- Creates a test that intentionally triggers an inconsistent index state by inserting a document with an array field after other documents, then attempting to create a compound text index
- Verifies that the clone phase can handle collections with such inconsistent indexes
An `IsIndexNotFound` error occurred.
    try:
        t.source["init_test_db"].text_collection.create_index([("a.b", 1), ("words", "text")])
        assert False, "Index build should fail due array in doc for text index"
    except Exception:
        pass
Suggested change:
    - try:
    -     t.source["init_test_db"].text_collection.create_index([("a.b", 1), ("words", "text")])
    -     assert False, "Index build should fail due array in doc for text index"
    - except Exception:
    -     pass
    + with pytest.raises(pymongo.errors.OperationFailure, match="text index"):
    +     t.source["init_test_db"].text_collection.create_index([("a.b", 1), ("words", "text")])
Something like this ought to work. The assert False is... strange IMO.
Well, it may be your first time seeing it, but it is common to write it like that in our QA e2e tests.
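One detail worth checking in the thread above: `AssertionError` is a subclass of `Exception`, so an `assert False` placed inside a `try` block that ends in `except Exception: pass` is swallowed together with the failure the test expects. The sketch below illustrates this with plain Python. `FakeOperationFailure` and `create_index` are hypothetical stand-ins (not pymongo or the project's test helpers), used only so the example runs without a database.

```python
class FakeOperationFailure(Exception):
    """Hypothetical stand-in for pymongo.errors.OperationFailure."""


def create_index(should_fail: bool) -> None:
    """Hypothetical stand-in for the index build under test."""
    if should_fail:
        raise FakeOperationFailure("text index build failed")


def check_with_assert_false(should_fail: bool) -> bool:
    """Pattern from the diff. Returns True when the check 'passes'."""
    try:
        create_index(should_fail)
        assert False, "Index build should fail"
    except Exception:
        # AssertionError derives from Exception, so it is swallowed here too.
        pass
    return True


def check_with_else(should_fail: bool) -> bool:
    """Narrow except plus else: only the expected failure counts as a pass."""
    try:
        create_index(should_fail)
    except FakeOperationFailure:
        return True
    else:
        # Reached only when the index build unexpectedly succeeded.
        return False
```

With these stand-ins, `check_with_assert_false(False)` still reports a pass even though the index build succeeded, while `check_with_else(False)` reports a failure. `pytest.raises`, as suggested in the review, gives the same guarantee as the second variant.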
    	golangci-lint run

    pcsm-run: build
    	./bin/pcsm --source=$(SOURCE) --target=$(TARGET) --log-level=debug --reset-state
`--reset-state` is ineffectual here.
It has no effect unless the replication is started via `--start`.
No, it does. If you run `pcsm --reset-state`, PCSM starts and resets the state before it begins accepting commands such as `start`.
It doesn't matter whether you pass `--start` or not: the root `pcsm` command simply resets the state (that is, it cleans up the `percona_clustersync_mongodb` database that PCSM maintains on the target for its state).
Found it, you're right 👍 Thanks!
Problem
When cloning a sharded collection with an inconsistent index (an index that exists on some shards but not others), PCSM copies the inconsistent index as a regular one. This causes the subsequent data copy to fail, because the documents that prevented index creation on one shard will also fail to insert on the target once the index exists there.
Solution
Detect inconsistent indexes by using the `$indexStats` aggregation to count index occurrences across shards. An index is considered inconsistent if it exists on fewer shards than the `_id_` index. These inconsistent indexes are then skipped during the clone operation. We compare with `_id_` rather than against the number of shards in the cluster because a collection doesn't need to be on all the shards all the time, depending on the chunk distribution. Also, `_id_` cannot be inconsistent.
We check consistency like this because `checkMetadataConsistency` is not present in MongoDB 6.0. Also, this is a one-time operation before the clone starts, so it won't impact performance.
Also added Makefile helpers:
- `make pcsm-start SOURCE="mongodb://src-mongos:27017" TARGET="mongodb://tgt-mongos:29017"` - builds PCSM and runs it with the `--start` flag to immediately start the replication
- `make pcsm-run SOURCE="mongodb://src-mongos:27017" TARGET="mongodb://tgt-mongos:29017"` - builds PCSM and runs it
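The detection rule described under Solution can be sketched in a few lines of Python. This is an illustration of the counting logic only, not the PCSM implementation (which is written in Go). It assumes the input is a list of `$indexStats` result documents obtained through mongos, each carrying the index `name` and the `shard` it was reported from. The function name `find_inconsistent_indexes`, the index names, and the shard names are made up for the example.

```python
from collections import defaultdict


def find_inconsistent_indexes(index_stats):
    """Return names of indexes present on fewer shards than `_id_`."""
    # Collect the set of shards on which each index was reported.
    shards_by_index = defaultdict(set)
    for doc in index_stats:
        shards_by_index[doc["name"]].add(doc["shard"])

    # `_id_` exists on every shard that holds the collection, so its shard
    # count is the baseline instead of the cluster-wide shard count.
    baseline = len(shards_by_index.get("_id_", ()))

    return sorted(
        name
        for name, shards in shards_by_index.items()
        if len(shards) < baseline
    )


# Hypothetical sample: `b_text` was built on rs0 only.
sample_stats = [
    {"name": "_id_", "shard": "rs0"},
    {"name": "_id_", "shard": "rs1"},
    {"name": "a_1", "shard": "rs0"},
    {"name": "a_1", "shard": "rs1"},
    {"name": "b_text", "shard": "rs0"},
]

print(find_inconsistent_indexes(sample_stats))  # ['b_text']
```

With pymongo, the input could come from something like `list(collection.aggregate([{"$indexStats": {}}]))` run against mongos. The indexes returned by the function are the ones the clone would skip.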