Description
try_connect_leader in src/pd/cluster.rs calls .unwrap() on previous.leader without guarding against the None case:
let previous_leader = previous.leader.as_ref().unwrap();
This causes the client to panic during a rolling restart of PD pods in a Kubernetes environment, when GetMembersResponse.leader is temporarily None while a new leader is being elected. The panic is unrecoverable and takes down the client process entirely.
Steps to reproduce
- Run a TiKV cluster on Kubernetes with multiple PD pods
- Connect the Rust client while the cluster is healthy
- Trigger a rolling restart of PD pods (kubectl rollout restart)
- During the leader election window, the client receives a GetMembersResponse with leader: None
- Client panics at previous.leader.as_ref().unwrap()
Panic output
thread 'tokio-runtime-worker' panicked at 'called `Option::unwrap()` on a `None` value'
src/pd/cluster.rs:310
Expected behaviour
The client should return a recoverable Err and allow the caller to retry, rather than panicking and crashing the process.
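To illustrate what "allow the caller to retry" could look like, here is a minimal sketch of a caller-side retry loop. It is not taken from tikv-client: the Error enum, the retry_with_backoff helper, and its parameters are hypothetical names used only for illustration, and the real client is async, whereas this sketch is synchronous for brevity.

```rust
use std::thread::sleep;
use std::time::Duration;

// Hypothetical error type standing in for the client's own error enum.
#[derive(Debug, PartialEq)]
enum Error {
    Internal(String),
}

// Hypothetical helper: calls `op` until it succeeds or `max_attempts` is
// reached, sleeping a linearly growing delay between attempts.
// At least one attempt is always made.
fn retry_with_backoff<T>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: impl FnMut() -> Result<T, Error>,
) -> Result<T, Error> {
    let mut attempt = 0;
    loop {
        attempt += 1;
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: hand the last error back to the caller.
            Err(e) if attempt >= max_attempts => return Err(e),
            // A transient failure such as "no leader yet": wait and try again.
            Err(_) => sleep(base_delay * attempt),
        }
    }
}
```

With a recoverable Err, a caller can ride out the leader election window this way. With a panic, no such loop gets a chance to run.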
Suggested fix
Replace .unwrap() with .ok_or_else(), consistent with the style already used elsewhere in this file:
let previous_leader = previous
.leader
.as_ref()
.ok_or_else(|| internal_err!("PD cluster has no leader"))?;
A fix is available in PR #538.
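For readers who want to see the behavioural difference in isolation, the following self-contained sketch mirrors the suggested change. The Member, GetMembersResponse, and Error types here are simplified, hypothetical stand-ins for the client's generated protobuf and error types, and a plain enum variant is used in place of the internal_err! macro.

```rust
// Hypothetical, simplified stand-ins for the real types.
#[derive(Debug, Clone, PartialEq)]
struct Member {
    name: String,
}

#[derive(Debug, Default)]
struct GetMembersResponse {
    // None while PD is between leaders, e.g. during an election.
    leader: Option<Member>,
}

#[derive(Debug, PartialEq)]
enum Error {
    Internal(String),
}

// Returns the previous leader, or a recoverable error when none is known.
// With `.unwrap()` in place of `.ok_or_else(..)`, the None case would panic.
fn previous_leader(previous: &GetMembersResponse) -> Result<&Member, Error> {
    previous
        .leader
        .as_ref()
        .ok_or_else(|| Error::Internal("PD cluster has no leader".to_owned()))
}
```

Calling previous_leader on a response with leader: None yields Err(Error::Internal(..)) instead of aborting the thread, so the caller decides whether to retry or give up.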
Environment
- Client: tikv-client (master)
- Deployment: Kubernetes with multiple PD pods
- Trigger: PD pod rolling restart / leader election