fix: handle case where PD cluster has no leader #538
/cc @ekexium @andylokandy
Signed-off-by: MananShukla7 <shuklamanan8@gmail.com>
Problem

In `try_connect_leader`, the previous leader is accessed with `.unwrap()`, as `previous.leader.as_ref().unwrap()`.

This panics when the PD cluster temporarily has no leader, a situation that occurs during a rolling restart of PD pods in a Kubernetes environment. During leader election, `GetMembersResponse.leader` can be `None`, causing the client to crash rather than return a recoverable error.

Fix
Replace `.unwrap()` with `.ok_or_else()` to return a descriptive `Err` instead of panicking. This is consistent with the style already used in this file.

Before: `previous.leader.as_ref().unwrap()`

After: `previous.leader.as_ref().ok_or_else(...)`

The same pattern is applied to the `resp.leader` lookup later in the function, replacing the existing `ok_or_else` error message with a clearer one.

Reproduction
Trigger a rolling restart of PD pods while the TiKV client is connected. The client panics at `previous.leader.as_ref().unwrap()` when the new `GetMembersResponse` arrives with `leader: None` during leader election.

Checklist

- `cargo fmt --all`
- `cargo clippy --all-targets --all-features -- -D warnings`
- `cargo test`
- `git commit -s`
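
For reviewers who want to see the shape of the change in isolation, here is a minimal, self-contained sketch of the `.unwrap()` to `.ok_or_else()` conversion described under Fix. It is not the code from this PR: `Member`, `GetMembersResponse`, `Error::NoLeader`, the helper `previous_leader`, and the error message are hypothetical stand-ins, and the real types and error variants in the client may differ.

```rust
// Hypothetical stand-in for a PD member entry (assumption, not the real protobuf type).
#[derive(Debug, Clone, PartialEq)]
struct Member {
    name: String,
}

// Hypothetical stand-in for the PD members response; `leader` is `None`
// while the cluster is electing a new leader.
#[derive(Debug, Clone)]
struct GetMembersResponse {
    leader: Option<Member>,
}

// Hypothetical error type; the client's actual error enum is not shown in this PR text.
#[derive(Debug, PartialEq)]
enum Error {
    NoLeader(String),
}

// Returns the leader recorded in `previous`, or a recoverable error
// instead of panicking when no leader is known.
fn previous_leader(previous: &GetMembersResponse) -> Result<&Member, Error> {
    previous.leader.as_ref().ok_or_else(|| {
        Error::NoLeader("PD cluster has no leader (election may be in progress)".to_owned())
    })
}
```

With this shape, a caller can propagate the error with `?` and retry once the election completes, rather than having the whole client process abort.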