fix: prevent xdsIR updates from overwriting RateLimit configs from other xdsIR #3771
Conversation
Force-pushed 577b96f to b716627
Codecov Report
Attention: Patch coverage is

@@            Coverage Diff             @@
##             main    #3771      +/-   ##
==========================================
+ Coverage   68.97%   69.10%   +0.13%
==========================================
  Files         176      176
  Lines       21731    21737       +6
==========================================
+ Hits        14989    15022      +33
+ Misses       5662     5628      -34
- Partials     1080     1087       +7
…her xdsIR
Signed-off-by: Kensei Nakada <[email protected]>
Force-pushed b716627 to 992efb8
/retest
Thanks for picking this one up @sanposhiho!
@arkodg Addressed your points.
Force-pushed 061a071 to e6fbf2b
Signed-off-by: Kensei Nakada <[email protected]>
Force-pushed e6fbf2b to e27bef7
Signed-off-by: Kensei Nakada <[email protected]>
Force-pushed e27bef7 to 118e965
Signed-off-by: Kensei Nakada <[email protected]>
Force-pushed 35345de to 63146e5
func (r *Runner) subscribeAndTranslate(ctx context.Context) {
	// rateLimitConfigsCache is a cache of the rate limit configs, keyed by the xdsIR key.
	rateLimitConfigsCache := map[string][]cachetype.Resource{}
Is there a way to not maintain this cache here, and instead create an API like rCache := r.getSnapshot(ctx)? That would let us drop one copy in memory.
Let me try; that does look like a better idea, depending on how much complexity it adds.
The change would look like this: sanposhiho@a9f0a19
So, either way, we have to keep a cache that tracks which update.Key generates which rate limit configurations. (That cache would be smaller than the current cache of entire configs, which is one good point.)
The cache is required because otherwise we wouldn't know which part of the snapshot we have to update. (A message.Update that comes from watchable.Snapshot doesn't carry the previous state of the IR.)
So I'm not very motivated to proceed that way; to me it looks like only a small gain (less memory usage) in exchange for a certain downside (complexity).
Hey @sanposhiho, thanks for writing up the quick diff. What I was suggesting was a little different: instead of holding the entire cache on the stack via rateLimitConfigsCache, which incurs a constant memory hit, can we retrieve the cache snapshot using an API like
https://github.com/envoyproxy/go-control-plane/blob/1da4500d00e270d803caefbe0c20e4d3d162e586/pkg/cache/v3/snapshot.go#L124
every time we get a new message, and then update it (add or delete keys based on the type of update)?
Sorry, what's the difference from my diff sanposhiho@a9f0a19? It already fetches the previous snapshot with GetSnapshot -> GetResources and tries to rebuild a new snapshot from there.
My point is that, either way, we need the keyToRateLimitCfg cache to know which ir.Key generated which part of the snapshot, so that we can calculate a correct snapshot after an update.
I missed that 🙈
Let's go ahead with your diff to keep the code complexity low; we can revisit this if/when we hit a memory bottleneck.
LGTM, thanks!
What type of PR is this?
What this PR does / why we need it:
This PR prevents xdsIR updates from overwriting RateLimit configs generated from other xdsIRs.
When there are multiple gateway classes, the gateway API runner calls
r.XdsIR.Store
several times with different keys.
gateway/internal/gatewayapi/runner/runner.go, line 81 in d784c32
But the ratelimit config runner creates the ratelimit config from scratch every time it receives an XdsIR update.
gateway/internal/globalratelimit/runner/runner.go, lines 114 to 136 in d784c32
As a result, an update to one XdsIR could overwrite the rate limit configs generated from other XdsIRs in the ratelimit config runner.
Which issue(s) this PR fixes:
Fixes #3678