Return existing reservation if podRef matches #296

Open
wants to merge 1 commit into base: master
Conversation

@xagent003 (Contributor) commented Jan 18, 2023

#291

As described in that issue, the Pod can enter a state where its container has died or been killed. An IP reservation for this Pod already exists, but for whatever reason the CNI DEL was missed. Whereabouts should be able to work around this by tying a reservation to the podRef rather than the container ID.

On restart or after an upgrade of our stack, kubelet detects no sandbox container for the Pod and tries to start it back up, sending an ADD to the CNI/IPAM plugin. If the IP pool is not full, whereabouts allocates a new IP for this Pod; the pool then holds two IPs allocated to the same podRef under different container IDs, and we leak IPs.

If the IP pool is full, the ADD fails because there are no more IPs in the range, even though a reservation for the same podRef already exists. So why can't it just be reused?

ip-reconciler does not help in this case because the Pod is still running and kubelet is stuck in a container retry loop; the graceful cleanup was missed because the container died out of band and/or kubelet was down.

The only way out of this state is to manually delete the IP reservation in both the IPPool and overlappingranges resources, or to apply this fix, which returns the existing reservation when the podRef matches.
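
For illustration, the idea behind the fix is a podRef lookup before handing out a fresh address. Below is a minimal, self-contained sketch of that logic; the `IPReservation` struct and `findReservationByPodRef` helper are simplified stand-ins, not the actual whereabouts types or the exact code in this PR:

```go
package main

import "fmt"

// IPReservation is a simplified stand-in for whereabouts' reservation record.
type IPReservation struct {
	IP          string
	ContainerID string
	PodRef      string
}

// findReservationByPodRef returns an already-reserved IP for the given podRef,
// if one exists, instead of allocating a fresh address. A missed CNI DEL leaves
// a stale reservation behind; a later ADD for the same Pod should reuse it
// rather than leak a second IP.
func findReservationByPodRef(reservations []IPReservation, podRef string) (IPReservation, bool) {
	for _, r := range reservations {
		if r.PodRef == podRef {
			return r, true
		}
	}
	return IPReservation{}, false
}

func main() {
	pool := []IPReservation{
		{IP: "10.0.0.5", ContainerID: "dead-container-id", PodRef: "default/my-pod"},
	}
	if r, ok := findReservationByPodRef(pool, "default/my-pod"); ok {
		fmt.Printf("reusing %s for %s\n", r.IP, r.PodRef)
	}
}
```

In the actual allocation path the reuse happens inside the iteration over the pool, as the diff excerpt quoted in the review below shows.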

@nicklesimba (Collaborator) left a comment


Mostly looks good, and thanks for the detailed breakdown in the PR description. I just have one question; I'll also try to bring this up to others for further review.

Comment on lines +215 to +219
if reserved[i.String()] != "" {
	if reserved[i.String()] != podRef {
		continue
	}
	logging.Debugf("Found existing reservation %v with matching podRef %s", i.String(), podRef)

What is the case where the reserved map is nonempty but doesn't contain the podRef? In other words, when will the "continue" get hit in the code?

@s1061123 (Member) left a comment


Thank you for the PR. The diff makes sense to me. Could you please add a unit test in allocate_test.go for the allocate.go changes, if you don't mind?
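
For what it's worth, here is a rough sketch of what such a test could assert, written against a simplified stand-in rather than the real allocate.go API (the `reservation` type and `reuseByPodRef` helper are hypothetical; an actual test should call the existing allocation function and fixtures in allocate_test.go):

```go
package allocate_test

import "testing"

// reservation and reuseByPodRef are hypothetical, simplified stand-ins for the
// types and logic touched by this PR; a real test should exercise the actual
// allocation function in allocate.go.
type reservation struct {
	ip     string
	podRef string
}

func reuseByPodRef(pool []reservation, podRef string) (string, bool) {
	for _, r := range pool {
		if r.podRef == podRef {
			return r.ip, true
		}
	}
	return "", false
}

func TestReusesExistingReservationForSamePodRef(t *testing.T) {
	pool := []reservation{{ip: "10.0.0.5", podRef: "default/my-pod"}}

	// An ADD for the same podRef should get the already-reserved IP back.
	ip, ok := reuseByPodRef(pool, "default/my-pod")
	if !ok || ip != "10.0.0.5" {
		t.Fatalf("expected 10.0.0.5 to be reused for default/my-pod, got %q (found=%v)", ip, ok)
	}

	// A different podRef must not steal the existing reservation.
	if _, ok := reuseByPodRef(pool, "default/other-pod"); ok {
		t.Fatal("expected no reservation to be reused for a different podRef")
	}
}
```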

@moradiyaashish

Until the fix is available, is there any workaround to recover from this situation?

@maiqueb (Collaborator) commented Aug 3, 2023

@xagent003 hey o/

This slipped me by; could you rebase? Do you need help sorting out the unit tests?

@caribbeantiger (Contributor)

> @xagent003 hey o/
>
> This slipped me by; could you rebase? Do you need help sorting out the unit tests?

If @xagent003 doesn't mind, I can open a new merge request with this fix, as we are running into this issue often while testing high-availability and node-crash scenarios.

@maiqueb is that okay?

@maiqueb (Collaborator) commented Aug 29, 2023

> > @xagent003 hey o/
> > This slipped me by; could you rebase? Do you need help sorting out the unit tests?
>
> If @xagent003 doesn't mind, I can open a new merge request with this fix, as we are running into this issue often while testing high-availability and node-crash scenarios.
>
> @maiqueb is that okay?

Please do, we'll review it. Please take into account @s1061123's request in #296 (review).

Thanks for offering!

@Ritwik037

@dougbtv do you have any plan to merge this PR? We are facing this issue with version 0.6.2 as well. @xagent003, can you please confirm which whereabouts version you made these changes against, and whether it is compatible with Kubernetes 1.25?

@Ritwik037

@maiqueb we are facing an issue where, during a rolling upgrade of Kubernetes, whenever a Pod shifts to a new worker node it gets stuck because whereabouts is not able to assign it an IP. We tried the latest version, 0.6.2, but it didn't make any difference.
