Drifted NodeClaims with a TerminationGracePeriod are not considered for disruption #1702
This issue is currently awaiting triage. If Karpenter contributors determine that this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
For anyone experiencing similar symptoms without TerminationGracePeriod, check whether you might be encountering the VolumeAttachment problem described in this issue.
Adding correspondence from Slack. Two potential influencing factors here:
I'm guessing you're more impacted by #2 based on the events I see. This is likely a difference from v0.37: we didn't have TGP then, and we didn't enqueue nodes for deletion that had do-not-disrupt or PDB-blocked pods in the first place, so the likelihood that you have indefinitely draining nodes is higher now. Not to mention that we now also block eviction on do-not-disrupt, so the average drain time might be higher than in v0.37. If I had to guess, the best way to fix this would be for us to solve our preferences story (#666) and add a PreferNoSchedule taint for drifted nodes. We discussed adding that taint, but we didn't have a consistent story around how consolidation, disruption, and provisioning could all align so that we don't get any flapping issues.
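For illustration, a minimal sketch of what such a soft taint on a drifted node could look like. The taint key karpenter.sh/drifted is hypothetical, not an existing Karpenter API; PreferNoSchedule makes the scheduler avoid the node for new pods without hard-blocking them:

```yaml
# Hypothetical sketch: a soft taint on a drifted node so the scheduler
# prefers other nodes for new pods without hard-blocking scheduling.
# The key "karpenter.sh/drifted" is illustrative, not a real Karpenter taint.
apiVersion: v1
kind: Node
metadata:
  name: drifted-node-example
spec:
  taints:
    - key: karpenter.sh/drifted
      value: "true"
      effect: PreferNoSchedule
```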
More correspondence from Slack (some of which is already included above):
@wmgroot any thoughts here? Did you get a chance to validate what I was saying?
@cnmcavoy identified a bug in Karpenter's logic that tracks nodes marked for deletion. There's an error case that can fail to unmark a marked node, resulting in disruption budgets being exhausted while no progress can be made. We've got a patch that we've been testing for the last week and plan to open a PR for soon. We think that TGP is not directly related to this problem, but it was exacerbating the issue, since nodes in a terminating state take up space in the disruption budget while they're pending termination.
Ultimately, the bug was introduced in some of our patched code that addresses issues with single-node and multi-node consolidation. We plan to work further with the maintainers on improvements to consolidation to avoid the need to run a forked version of Karpenter. After addressing the bug in our patch, we have seen our disruption frequencies and cluster scale return to pre-v1 levels. We'll re-open this issue or create a new one if we notice anything else amiss with TGP and drift disruption.
Description
Observed Behavior:
NodeClaims enter a Drifted state, but fail to become disrupted.
No VolumeAttachments are involved in this issue.
The node reports that disruption is blocked due to a pending pod, but there are no pending pods in the cluster, and the node in question is tainted so that only a single do-not-disrupt pod can schedule there as a test case (see the sketch below).
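For reference, a minimal sketch of the kind of test-case pod described above, assuming a dedicated taint on the node under test; the taint key test-case and the pod name are illustrative, not taken from the report:

```yaml
# Sketch of the test case: a do-not-disrupt pod that tolerates a
# dedicated taint, so it is the only pod able to land on the tainted node.
# The taint key "test-case" and the pod name are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: do-not-disrupt-test
  annotations:
    karpenter.sh/do-not-disrupt: "true"
spec:
  tolerations:
    - key: test-case
      operator: Exists
      effect: NoSchedule
  containers:
    - name: pause
      image: registry.k8s.io/pause:3.9
```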
Expected Behavior:
Nodes with a TerminationGracePeriod set that host do-not-disrupt or PDB-blocked pods can be disrupted due to NodeClaim drift and are eventually drained. A new NodeClaim is created immediately once disruption of the old NodeClaim begins.
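For context, a minimal sketch of how terminationGracePeriod is configured on a NodePool under the Karpenter v1 API; the nodeClassRef values assume the AWS provider, and the duration and names are illustrative:

```yaml
# Minimal NodePool sketch with terminationGracePeriod, which bounds drain
# time by forcibly terminating the node (overriding do-not-disrupt and PDBs)
# once the period elapses. nodeClassRef values assume the AWS provider.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: example
spec:
  template:
    spec:
      terminationGracePeriod: 30m
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
```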
Reproduction Steps (Please include YAML):
Versions:
- Chart Version: 1.0.1
- Kubernetes Version (kubectl version): 1.29