-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xdsclient: introduce pool to manage multiple xDS clients with same bootstrap content #7898
base: master
Are you sure you want to change the base?
xdsclient: introduce pool to manage multiple xDS clients with same bootstrap content #7898
Conversation
202847e
to
b561f1c
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #7898 +/- ##
==========================================
+ Coverage 81.84% 81.97% +0.12%
==========================================
Files 377 378 +1
Lines 38120 38216 +96
==========================================
+ Hits 31201 31326 +125
+ Misses 5603 5578 -25
+ Partials 1316 1312 -4
|
23023e9
to
12cb8a7
Compare
12cb8a7
to
8787929
Compare
8787929
to
aa7e498
Compare
var ( | ||
// DefaultPool is the default pool for xds clients. It is created at the | ||
// init time. | ||
DefaultPool *Pool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the reason to export this DefaultPool
variable? Was it to make testing easier? I see most reference of this variable is within the xdsclient
package, with only one reference in xds/server_options.go
.
Would it be possible to test via Public APIs instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah server options have to set and get config for testing. I think we can provide the set and get method for default pool but then its equivalent of exporting a variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @danielzhaotongliu. We don't have to make this public. We should change the dial option and the server option to create a pool with the bootstrap config that was passed to it, and use that pool to create the xDS client. This is specified in the proposal in #7592 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just coming back to this comment. I think we would have to export the default pool from the xdsclient
package or use the internal
package trick, because we want the xDS resolver and the xDS server to be able to use the default pool when they want to create xDS clients in non-test environments.
The internal
trick won't work once we move the xDS client out into a separate repo (as part of the generic xDS client work). So, we are only left with the option of exporting this.
Do you both have other ideas?
@@ -46,7 +46,8 @@ const Scheme = "xds" | |||
|
|||
// newBuilderWithConfigForTesting creates a new xds resolver builder using a | |||
// specific xds bootstrap config, so tests can use multiple xds clients in | |||
// different ClientConns at the same time. | |||
// different ClientConns at the same time. It creates the new xds client int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo/nit: "It creates a new xds client in the default pool with the provided config."
xds/internal/xdsclient/pool.go
Outdated
// expected to invoke once they are done using the client. It is safe for the | ||
// caller to invoke this close function multiple times. | ||
func (p *Pool) NewClient(name string) (XDSClient, func(), error) { | ||
config, err := bootstrap.GetConfiguration() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the reason to not use the Pool.config
field xDS bootstrap configuration as that was the config passed in/parsed/created during the NewPool
function above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NewClient is for the prod path which actually reads the bootstrap config from env vars and will only deal with default pool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the use case where users ONLY set bootstrap config from the SetFallbackBootstrapConfig
API and do not set env var? For this case, when creating the xDS client, would it be created via this method? Would they not require p.config
instead since that is the field set from the implementation below (line 133-143)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method must use the config
field inside the Pool
. There is no point in reading the bootstrap configuration from the env vars or the fallback bootstrap configuration. In fact, once we do that, we don't have a reason to have Pool.NewClientForTesting
. Tests could as well use pool.NewClient
.
What do you think?
NewClient is for the prod path which actually reads the bootstrap config from env vars and will only deal with default pool
The bootstrap env vars should be read when the default pool is created at init time. And if bootstrap env vars are not set, the default pool should have a nil
config and if NewClient
is called without calling SetFallbackBootstrapConfig
, i.e. config
is nil
when NewClient
is called, then NewClient
should fail with a useful error message.
xds/internal/xdsclient/pool.go
Outdated
if err != nil { | ||
return nil, nil, fmt.Errorf("xds: failed to get xDS bootstrap config: %v", err) | ||
} | ||
return newRefCounted(name, p, config, defaultWatchExpiryTimeout, defaultIdleChannelExpiryTimeout, backoff.DefaultExponential.Backoff) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my own understanding, would there be a need now/or in the future for users or anyone to customize the expiry timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only for testing. We usually provide the testing knobs set this values via methods but only for testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With #7924, we no longer have the idle channel expiry timeout. We only have the watch expiry timeout. And so far, we haven't had any real usecase where a user wants to customize that expiry timeout.
xds/internal/xdsclient/pool.go
Outdated
if opts.Contents != nil { | ||
config, err := bootstrap.NewConfigForTesting(opts.Contents) | ||
if err != nil { | ||
p.mu.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my understanding, where is this p.mu.Lock()
called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah good catch. That was supposed to be removed. I removed it.
2ab7ba2
to
6e822a3
Compare
4e62279
to
43b394d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might not be familiar enough with the Github testing process for gRPC Go, have we tested this end-to-end, or would the current tests suffice? I recall there are (PSM) interop tests, are those run post PR submission or manually triggered?
Are there any performance implications of this change?
xds/internal/xdsclient/pool.go
Outdated
// expected to invoke once they are done using the client. It is safe for the | ||
// caller to invoke this close function multiple times. | ||
func (p *Pool) NewClient(name string) (XDSClient, func(), error) { | ||
config, err := bootstrap.GetConfiguration() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the use case where users ONLY set bootstrap config from the SetFallbackBootstrapConfig
API and do not set env var? For this case, when creating the xDS client, would it be created via this method? Would they not require p.config
instead since that is the field set from the implementation below (line 133-143)?
xds/internal/xdsclient/pool.go
Outdated
) | ||
|
||
var ( | ||
// DefaultPool is the default pool for xds clients. It is created at the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Here and everywhere in this PR, please replace occurrences of xds client(s)
with xDS client(s)
. We've been trying to be consistent with xDS client(s)
recently. So, let's stick to that. Thanks.
xds/internal/xdsclient/pool.go
Outdated
DefaultPool *Pool | ||
) | ||
|
||
// Pool represents pool of xds clients. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This docstring should mention that the pool of clients share the same bootstrap configuration.
mu sync.Mutex | ||
clients map[string]*clientRefCounted | ||
config *bootstrap.Config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: mu
should ideally only have to guard clients
. But here, we need it to guard config
as well since SetFallbackBootstrapConfig
writes to config
. Maybe a comment here to mention that?
var ( | ||
// DefaultPool is the default pool for xds clients. It is created at the | ||
// init time. | ||
DefaultPool *Pool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @danielzhaotongliu. We don't have to make this public. We should change the dial option and the server option to create a pool with the bootstrap config that was passed to it, and use that pool to create the xDS client. This is specified in the proposal in #7592 (comment)
xds/internal/xdsclient/pool.go
Outdated
} | ||
|
||
// NewPool creates a new xds client pool with the given bootstrap contents. | ||
func NewPool(bootstrapContents []byte) (*Pool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nicer to have this accept a *bootstrap.Config
(which could optionally be nil
) rather than having to parse the configuration here. With that change, this API will return a single value, *Pool
, instead of two.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we change this to accept a bootstrap.Config
, we should also export bootstrap.newConfigFromContents
. This would allow non-test usages for NewPool
where the caller would use the exported bootstrap.newConfigFromContents
to parse the JSON config and pass it to this function.
@@ -360,7 +360,7 @@ func (s) TestHandleListenerResponseFromManagementServer(t *testing.T) { | |||
// involving receipt of an RDS response from the management server. The test | |||
// verifies that the internal state of the xDS client (parsed resource and | |||
// metadata) matches expectations. | |||
func (s) TestHandleRouteConfigResponseFromManagementServer(t *testing.T) { | |||
func TestHandleRouteConfigResponseFromManagementServer(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the s
receiver being removed from this test?
WatchExpiryTimeout: defaultTestWatchExpiryTimeout, | ||
Contents: bc, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests should be using a pool to create an xDS client in. In fact, once we delete these NewXxx
functions which don't operate on a pool, the tests will be left with no other choice.
xds/server.go
Outdated
@@ -93,12 +93,11 @@ func NewGRPCServer(opts ...grpc.ServerOption) (*GRPCServer, error) { | |||
// simplifies the code by eliminating the need for a mutex to protect the | |||
// xdsC and xdsClientClose fields. | |||
newXDSClient := newXDSClient |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of having this newXDSClient
var which can be overridden by tests to customize xDS client creation, the xDS server (and possibly the xDS resolver) should contain an xdsClientPool
as a global var which would be set to the default pool at init
time and can be customized by tests.
@@ -121,14 +121,18 @@ func (s) TestNewServer_Success(t *testing.T) { | |||
desc: "without_xds_creds", | |||
serverOpts: []grpc.ServerOption{ | |||
grpc.Creds(insecure.NewCredentials()), | |||
BootstrapContentsForTesting(generateBootstrapContents(t, uuid.NewString(), nonExistentManagementServer)), | |||
func() grpc.ServerOption { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this func
here?
// matched route configuration. After it transitions back into not ready | ||
// (through an explicit LDS Resource Not Found), previously running RPC's | ||
// should be gracefully closed and still work, and new RPC's should fail. | ||
func (s) TestServingModeChanges_MultipleServers(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to test serving mode changes here. All we need to do here is to ensure that we can create two servers with two different bootstrap configurations. This would also mean that the two servers will request two different LDS resources from the management server, because the LDS resource name depends on the port on which the server is listening.
Then we could have two gRPC clients that talk to each of these servers and we ensure that RPCs work. We need to ensure that a gRPC client is actually talking to the server that it intends to talk to. This can be done with the Peer
call option.
ee9dcb0
to
6d704a9
Compare
6d704a9
to
8cade2a
Compare
4b295c9
to
9588254
Compare
9588254
to
325def2
Compare
Fixes: #7592
RELEASE NOTES: