graceful-restart: Prefix not advertised to non-gr peer #2596
Comments
ping |
@fujita - Any chance you could take a look at this one? @rkojedzinszky has done quite a bit of work tracking down this seeming inconsistency with GoBGP's handling of GracefulRestart when GoBGP is configured for GracefulRestart, but the peer is not (see cloudnativelabs/kube-router#1389 for the full history and context). This is causing them quite a bit of a headache as anytime kube-router is restarted they lose all routes until they also restart their peer. |
Is this a regression (did an older version of GoBGP work)? |
I don't know, as I was using kube-router only. I suspect a change in kube-router triggered this bug. Earlier, a bug in kube-router masked or changed the behavior so that everything seemed fine, but it turns out that the change applied in kube-router is correct (cloudnativelabs/kube-router#1327). |
Elaborating just a bit in this thread (obviously all work can be found in the PR that @rkojedzinszky linked)... Previously kube-router used to announce graceful restart for the Some BGP peer implementations were fine ignoring an AFI setup for a family that wasn't used, but for others it caused breakage unless users configured an AFI group that they weren't using and usually weren't capable of using. However, this also seems to have worked around a bug (?) in GoBGP that @rkojedzinszky documented in the description of this issue up above. |
Hmm, seems that I can't reproduce this. btw, in the sample code, why |
@fujita I have created my sample code based on kube-router behavior. I will need some time to pick this up again, I'll report as soon as I can. |
I could reproduce it again. Use the attached code, and make sure to use the following frr configuration:
Pay attention to the
However, if I run clear ip bgp on frr, the session gets established again, and the prefixes get advertised immediately:
This behavior is consistent with the Cisco IOS implementation too; we have never had issues with frr-cisco setups with GR enabled on one end and disabled on the other. |
I can reproduce the issue. The following change fixed the issue.
|
I created my own example code based on kube-router, where LocalRestarting is also set to true. I think there is a good reason why it is done that way; however, I am not really familiar with the depths of the BGP/GoBGP FSM. Can you please comment on it? |
@rkojedzinszky This was unfortunately way before my time with kube-router. It was introduced with the graceful-restart feature here: https://github.com/cloudnativelabs/kube-router/pull/220/files#diff-d0dae64b5424393b01d606bfcabfa3e8fd82d2c466eaeb28187a4da833105273R504 and no description was given as to why it was added. @fujita Thanks for tracking this down! We really appreciate your time and effort! I found a bit of code documentation about this feature here: https://github.com/osrg/gobgp/blob/master/internal/pkg/config/bgp_configs.go#L4246-L4252 However, I'm still not certain that I understand the implications of setting or not setting it and without understanding it better I feel hesitant to change a setting that has been in the graceful restart implementation since day 1 with kube-router. Is there any chance that you would be willing to clarify what the option does and what the impact to it being set to true or false is? |
@fujita Also please note that in the sample code if you enable IPv6 address family too, and follow the same sequence of starting the speakers, then as commented in the code, a timer will fire soon, and advertisements will begin to be sent. This is with |
should not be set in your case. I can't explain the behavior with IPv6 enabled. I need to investigate. |
@fujita Thanks again for continuing to pursue this with us. You mentioned that you didn't think that First off, to ensure that I understand what this flag does correctly from your documentation, is it correct to say that it lets its peers know that it is recovering from a restart by setting the restart bit, and that this is the same procedure that is outlined by RFC4724 here:
If so, then I'm not sure how to proceed with this setting when it comes to kube-router as a whole. You see, kube-router is most frequently run as a pod that can start and stop at any time. Forwarding state is preserved by the fact that kube-router writes out the routing table to the host's network. However, other than that, kube-router does not preserve state across reboots, so it isn't possible for it to reliably know whether it is starting for the first time, or if it is recovering from a previous run where it had previously established graceful-restart enabled BGP sessions. However, in a typical system, the latter is more common than the former. According to the documentation that you linked it says (
So, my understanding of all of this put together then, is:
I would assume that sometime later, after route selection is complete, these routes would then come back, but there would be a period of service outage from the routes being withdrawn, correct? |
@aauren Thanks, great explanation. We are using this feature of kube-router for inter-node disruption-free upgrades; however, for our external connections, we don't want to use GR. This way we expect that if a node goes down for any reason (even just for an upgrade), traffic instantly gets rerouted to another node. |
@fujita can we move forward somehow? |
@fujita ping |
I'm interested in this one too. Currently we lose our BGP routes every time kubespray restarts the kube-router containers |
@rkojedzinszky I understand that you want graceful restarts for BGP sessions between nodes and want it disabled for BGP sessions to your upstream router. In theory, IF you did not need graceful restarts for inter-node BGP sessions, would removing the --bgp-graceful-restart resolve this issue that you're seeing on sessions to the upstream routers? |
Yes, indeed. If GR is disabled in gobgp, everything works fine, prefixes get advertised as expected to upstream routers, in any scenario. |
Seems like the way I tried was bad. Even with v3.26.0, the issue was still there. |
@YutaroHayakawa I will take a look in the next days, thanks! |
Scratching my comment above. Seems like the way I tried was bad. Even with v3.26.0, the issue was still there. |
To pull-in the recent fix (osrg/gobgp#2803) for the issue (osrg/gobgp#2596). Fixes: cilium#32886 Signed-off-by: Yutaro Hayakawa <[email protected]>
History: cloudnativelabs/kube-router#1389
I have a gobgp instance configured for graceful restart, and an external peer without graceful restart capability. The symptom is that when gobgp is started, the bgp session gets established; however, no prefixes get advertised to the neighbor. If I then restart the peer, prefixes get advertised.
I wrote sample code which demonstrates this.
gobgp.zip
Note that graceful restart is configured for ipv4 only. Then, when bgp is established, I get the following log entries:
Then, all address families receive EOR (https://github.com/osrg/gobgp/blob/master/pkg/server/server.go#L1699). But when proceeding, it turns out that p.isGracefulRestartEnabled() returns false, thus not sending out updates.
Kube-router until v1.5.1 configured graceful restart for both ipv4 and ipv6. That made the deferral-timeout trigger, and then a different code path sent out advertisements, however only after the timeout.
Once the peer is restarted, the whole FSM knows that the peer is not configured for graceful restart, and it gets advertisements as soon as the session is up.
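To make the expected behavior concrete, here is a small sketch of the deferral rule as I read RFC 4724, section 4.1: a restarting speaker waits for End-of-RIB only from peers that advertised the graceful restart capability and did not set the Restart State bit themselves, and it stops waiting when the deferral timer expires. All names here (`peerState`, `mustWaitFor`, `canSelectRoutes`) are hypothetical and exist only for this illustration; this is not GoBGP's actual code path.

```go
package main

import "fmt"

// peerState is a hypothetical summary of what a restarting speaker
// knows about one established peer.
type peerState struct {
	name          string
	advertisedGR  bool // peer sent the Graceful Restart capability in OPEN
	restartBitSet bool // peer set the Restart State bit itself
	eorReceived   bool // End-of-RIB marker seen from this peer
}

// mustWaitFor reports whether route selection still has to wait for p.
// Peers without the GR capability, and peers that are restarting
// themselves, are excluded from the wait set.
func mustWaitFor(p peerState) bool {
	if !p.advertisedGR || p.restartBitSet {
		return false
	}
	return !p.eorReceived
}

// canSelectRoutes reports whether the restarting speaker may run route
// selection and start advertising.
func canSelectRoutes(peers []peerState, deferralTimerExpired bool) bool {
	if deferralTimerExpired {
		return true
	}
	for _, p := range peers {
		if mustWaitFor(p) {
			return false
		}
	}
	return true
}

func main() {
	// The setup from this issue: the only peer has no GR capability.
	nonGR := []peerState{{name: "frr", advertisedGR: false}}
	fmt.Println(canSelectRoutes(nonGR, false))

	// A GR-capable peer that has not sent End-of-RIB yet.
	grPeer := []peerState{{name: "node-b", advertisedGR: true}}
	fmt.Println(canSelectRoutes(grPeer, false))
	fmt.Println(canSelectRoutes(grPeer, true))
}
```

Under this model a peer without the GR capability never holds back advertisements, which matches what is seen after the peer is restarted.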
For the peer a simple frr configuration is enough: