Fix unconnected Curator client handling when registry check is false#16329
Open
Zhengcy05 wants to merge 2 commits into
Open
Fix unconnected Curator client handling when registry check is false#16329Zhengcy05 wants to merge 2 commits into
Zhengcy05 wants to merge 2 commits into
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #16186
What is the purpose of the change?
This PR fixes an issue in the multi-ZooKeeper registry scenario when one registry is unavailable and configured with
check=false.Based on the issue report, in the
dubbo 3.3branch, if multiple ZooKeeper registries are configured and one of them is not started withcheck=false,Curator5ZookeeperClientmay keep an unconnected Curator client alive afterblockUntilConnected(...)returnsfalse. As a result, the following registry workflow still treats that client as usable and continues trying to create nodes on it. This can cause repeated failures in node creation, significantly slow down application startup, and eventually lead to startup failure.From the call path, the problem has two parts:
Curator5ZookeeperClientdoes not fail fast enough on connection failureIf
blockUntilConnected(...)returnsfalse, the ZooKeeper client has not been successfully established. This should always be treated as a client creation failure, and the underlying Curator client should be closed immediately so it cannot be reused later.The
check=falsesemantic should only control whether the upper layer continues to throw when registry creation fails. It should not change whether the low-level ZooKeeper client itself is considered successfully created.ListenerRegistryWrapperis not null-safe when the upper layer returns anullregistryIn
AbstractRegistryFactory#getRegistry(), when registry creation fails andcheck=false, Dubbo logs a warning and returnsnull.However,
RegistryFactoryWrapperstill wraps that result, and methods such asListenerRegistryWrapper#getUrl()previously dereferenced the delegate directly. This could later trigger aNullPointerExceptionin paths likeRegistryDirectory#subscribe().To address this, this PR includes two parts:
Fix the connection failure handling in
Curator5ZookeeperClientblockUntilConnected(...) == false, always close the client first and then throw an exception.checkin this low-level failure path.Add null-safe handling to
ListenerRegistryWrapperURL.getUrl()whenregistry == null.falsefromisAvailable()whenregistry == null.destroy()a safe no-op whenregistry == null.unsubscribe(),isServiceDiscovery(), andlookup().With these changes:
check=falsebehavior at the registry factory layer remains unchanged;nullregistry can still be safely wrapped without introducing extra NPEs.Tests
This PR adds focused regression coverage for both the low-level client behavior and the upper-layer
check=falseflow:Curator5ZookeeperClientTestblockUntilConnected(...)fails, constructingCurator5ZookeeperClientthrowsIllegalStateExceptionfor bothcheck=trueandcheck=false;client.close()is invoked on connection failure.AbstractRegistryFactoryTestcreateRegistry(url)fails andcheck=false,getRegistry(url)returnsnull, preserving the existing upper-layer behavior.ListenerRegistryWrapperTestgetUrl(),isAvailable(),destroy(),lookup(),subscribe(), andunsubscribe()are safe whenregistry == null;null registrycase.RegistryFactoryWrapperTestRegistryFactory#getRegistry(url)returnsnull, the wrapper still returns the original URL safely and allows subscription-related calls without throwing exceptions.Verification command used:
mvn -pl dubbo-remoting/dubbo-remoting-zookeeper-curator5,dubbo-registry/dubbo-registry-api \ -Dtest=Curator5ZookeeperClientTest,ListenerRegistryWrapperTest,RegistryFactoryWrapperTest,AbstractRegistryFactoryTest \ -Dsurefire.failIfNoSpecifiedTests=false test -qChecklist