[EKS] [request]: Systemd upgrade to > v239 for enabling node graceful shutdown #2057

Closed
shivaprasad-balaji opened this issue Jun 27, 2023 · 12 comments
Labels
EKS Managed Nodes, EKS (Amazon Elastic Kubernetes Service), Proposed (Community submitted issue)

Comments

@shivaprasad-balaji

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request
What do you want us to build?
Upgrade the systemd version in the AWS EKS optimized AMI to a version later than v239.

Which service(s) is this request for?
This is for EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
What outcome are you trying to achieve, ultimately, and why is it hard/impossible to do right now? What is the impact of not having this problem solved? The more details you can provide, the better we'll be able to understand and solve the problem.

We were looking into configuring graceful shutdown for Kubernetes nodes. The feature is enabled by default from Kubernetes version 1.21 (https://kubernetes.io/blog/2021/04/21/graceful-node-shutdown-beta/). Our clusters are running v1.24.

After enabling it and configuring ShutdownGracePeriod and ShutdownGracePeriodCriticalPods through the kubelet configuration options, we see that graceful shutdown is not working as expected. When Karpenter (which we use for cluster scaling) detects that a node is empty, it terminates the node, and the node is terminated immediately without any grace period.

We looked into the issue and found a few references which indicate there is a problem with the systemd version on the node. We use the AWS EKS optimized Linux AMI for the nodes, and the systemd version there is v219.
As per the links below, it seems this is fixed after v239 of systemd.

  1. GracefulNodeShutdown not work kubernetes/kubernetes#107043 (comment)
  2. feat: add support for shutdownGracePeriod and shutdownGracePeriodCriticalPods kubernetes-sigs/karpenter#248 (comment)
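For anyone who wants to confirm which systemd version their own node runs, here is a minimal sketch. It assumes the first line of `systemctl --version` has the form `systemd <number>`; the helper name `systemd_major_version` is made up for illustration.

```shell
# Print the systemd major version from `systemctl --version` output on stdin.
# Assumption: the first line looks like "systemd 219"; later lines list
# build features and are ignored.
systemd_major_version() {
  awk 'NR == 1 { print $2 }'
}

# Hypothetical usage on a node:
#   systemctl --version | systemd_major_version
```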

Are you currently working around this issue?
How are you currently solving this problem?
No workaround is known at this point.

Additional context
Anything else we should know?

Attachments
If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)

@shivaprasad-balaji shivaprasad-balaji added the Proposed Community submitted issue label Jun 27, 2023
@mikestef9 mikestef9 added EKS Amazon Elastic Kubernetes Service EKS Managed Nodes EKS Managed Nodes labels Jun 27, 2023
@sayap

sayap commented Sep 21, 2023

All these comments about "a known BUG in upstream Systemd" seem to be quite misleading. After spending some time trying to get this to work on Amazon Linux 2, I found the culprit that prevents systemd from doing anything:

$ cat /usr/lib/systemd/logind.conf.d/acpid.conf
# Disable logind's handling of ACPI events when acpid is installed.
[Login]
HandlePowerKey=ignore
HandleSuspendKey=ignore
HandleHibernateKey=ignore
HandleLidSwitch=ignore
HandleLidSwitchDocked=ignore

So we just need a yum remove acpid followed by a systemctl restart systemd-logind, and then systemd-inhibit will work fine on Amazon Linux 2.

I am not sure how well the graceful node shutdown feature works yet, but any issue there should be blamed on kubelet, not systemd.
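A quick way to check whether a node has a drop-in like the one above is sketched below. The helper name `find_powerkey_ignore` is hypothetical, and it only looks for the `HandlePowerKey=ignore` line shown in the acpid.conf output.

```shell
# List logind drop-in files in the given directory that set
# HandlePowerKey=ignore. Prints nothing if the directory is missing,
# empty, or has no matching file.
find_powerkey_ignore() {
  dropin_dir="$1"
  [ -d "$dropin_dir" ] || return 0
  grep -l '^HandlePowerKey=ignore' "$dropin_dir"/*.conf 2>/dev/null || true
}

# Hypothetical usage on a node:
#   find_powerkey_ignore /usr/lib/systemd/logind.conf.d
#   find_powerkey_ignore /etc/systemd/logind.conf.d
```

No output means no drop-in in that directory disables logind's power key handling.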

@aberle

aberle commented Oct 19, 2023

> So we just need a yum remove acpid followed by a systemctl restart systemd-logind, and then systemd-inhibit will work fine on Amazon Linux 2.

@sayap I am having trouble getting inhibitors to work with Amazon Linux 2 and found this thread. I was going to try this suggestion, but this package is not installed on my EKS worker node (and my /usr/lib/systemd/logind.conf.d/ directory is empty). When I reboot the instance, it shuts down immediately and does not wait for any inhibitors that are listed with systemd-inhibit --list.

What did you do to troubleshoot getting this to work on Amazon Linux 2 so that I might try out similar steps?

@shivaprasad-balaji
Author

I followed https://www.skouf.com/posts/enabling-graceful-node-shutdown-on-eks-in-kubernetes-1-21/ and added the following to the user data of the EC2 instance and was able to get graceful shutdown to work.

mkdir -p /etc/systemd/logind.conf.d
echo "[Login]" > /etc/systemd/logind.conf.d/50-max-delay.conf
echo "InhibitDelayMaxSec=360" >> /etc/systemd/logind.conf.d/50-max-delay.conf
systemctl restart systemd-logind

As @sayap has mentioned, it is probably not related to systemd. I am not enough of a Linux expert to know which component is causing this, though.

@aberle

aberle commented Oct 19, 2023

I followed those steps as well and graceful shutdown is not working for me, so I'm wondering what I'm missing. When I reboot the instance, the inhibitor does not block the shutdown, which executes right away.

@sayap

sayap commented Oct 20, 2023

@shivaprasad-balaji Indeed we need to follow https://www.skouf.com/posts/enabling-graceful-node-shutdown-on-eks-in-kubernetes-1-21/, to avoid triggering the if-block in https://github.com/kubernetes/kubernetes/blob/v1.23.17/pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux.go#L182-L202, which would fail with error:

"Failed to start node shutdown manager" err="failed reading InhibitDelayMaxUSec property from logind: Message recipient disconnected from message bus without replying"

because ReloadLogindConf would restart systemd-logind and break the connection.
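
As far as I can tell, that if-block is only entered when the requested shutdownGracePeriod is larger than logind's current InhibitDelayMaxUSec. A rough sketch of the same comparison in shell follows; the function name `inhibit_delay_covers` is made up, and the `busctl` call in the usage comment assumes `busctl` is installed on the node.

```shell
# Succeeds when logind's inhibit delay covers the kubelet grace period,
# i.e. when kubelet should not need to override the logind config.
# $1: property value as printed by busctl, e.g. "t 360000000" (microseconds)
# $2: shutdownGracePeriod in whole seconds
inhibit_delay_covers() {
  delay_usec="${1#t }"              # strip the D-Bus type prefix "t "
  grace_usec=$(( $2 * 1000000 ))    # seconds to microseconds
  [ "$delay_usec" -ge "$grace_usec" ]
}

# Hypothetical usage on a node:
#   current="$(busctl get-property org.freedesktop.login1 \
#     /org/freedesktop/login1 org.freedesktop.login1.Manager InhibitDelayMaxUSec)"
#   inhibit_delay_covers "$current" 60 && echo "no override needed"
```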

@aberle You are right, the EKS worker node doesn't come with acpid, so my finding above was just a red herring, but it can probably explain why the k8s developer misattributed this as a systemd bug (kubernetes/kubernetes#107043 (comment)).

Anyway, if it still doesn't work after following the blog post, can you check:

  • the output of systemd-inhibit --list as root
  • log lines from systemd-logind that say Powering off...
  • log lines from kubelet that start with Shutdown manager

Note that the configured shutdown grace period is just the upper bound. As soon as kubelet has successfully terminated all the pods, it will remove the inhibitor and allow the system to shut down.
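
The three checks above could be collected roughly like this. It is only a sketch: the helper names are invented, and the journalctl unit names (`systemd-logind`, `kubelet`) are assumptions about how the services are named on the node.

```shell
# Filters read log text on stdin, so they also work on saved log files.
# Each prints only the matching lines and succeeds even with no match.
logind_poweroff_lines() {
  grep -F 'Powering off' || true
}

kubelet_shutdown_manager_lines() {
  grep -F 'Shutdown manager' || true
}

# Hypothetical usage on a node, as root:
#   systemd-inhibit --list
#   journalctl -u systemd-logind --no-pager | logind_poweroff_lines
#   journalctl -u kubelet --no-pager | kubelet_shutdown_manager_lines
```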

@sayap

sayap commented Nov 7, 2023

Just realized that the sed & grep commands from https://www.skouf.com/posts/enabling-graceful-node-shutdown-on-eks-in-kubernetes-1-21/ don't work anymore in newer worker AMIs, due to awslabs/amazon-eks-ami@30ccd211b67. The echo command added by that commit doesn't have double quotes, causing kubelet-config.json to collapse into a single line.

We can replace the sed & grep with a jq one-liner:

echo "$(jq '.shutdownGracePeriod="60s" | .shutdownGracePeriodCriticalPods="20s"' /etc/kubernetes/kubelet/kubelet-config.json)" > /etc/kubernetes/kubelet/kubelet-config.json
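
One caveat with the one-liner: if jq fails for any reason, the redirect still runs and leaves the config file empty. A slightly more defensive sketch writes to a temporary file first and only overwrites the config when jq succeeds. The function name `set_shutdown_grace` is made up for illustration.

```shell
# Set both graceful shutdown fields in a kubelet config JSON file.
# $1: path to the config, $2: shutdownGracePeriod,
# $3: shutdownGracePeriodCriticalPods
# The original file is left untouched if jq fails.
set_shutdown_grace() {
  sg_config="$1"
  sg_tmp="$(mktemp)" || return 1
  if jq --arg g "$2" --arg c "$3" \
      '.shutdownGracePeriod=$g | .shutdownGracePeriodCriticalPods=$c' \
      "$sg_config" > "$sg_tmp"; then
    # cat instead of mv keeps the original file's owner and mode.
    cat "$sg_tmp" > "$sg_config"
    rm -f "$sg_tmp"
  else
    rm -f "$sg_tmp"
    return 1
  fi
}

# Hypothetical usage in user data:
#   set_shutdown_grace /etc/kubernetes/kubelet/kubelet-config.json 60s 20s
```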

@youwalther65

@shivaprasad-balaji I wonder why we need graceful node shutdown at all for Karpenter. The Karpenter Termination Controller calls the Kubernetes Eviction API and properly drains the node before terminating it via the cloud provider. For "Instance Terminating Events" related to Spot instances, Karpenter needs to be integrated with the built-in Node Termination Handling, which ensures proper draining as well.

@sayap

sayap commented Dec 14, 2023

Out of all the options, I think only graceful node shutdown can gracefully terminate the normal pods first, and then gracefully terminate the daemonset pods.

@shivaprasad-balaji
Author

@youwalther65: As far as I understand and have observed, Karpenter drains all the non-daemonset pods on a node and then terminates the instance. However, the node terminates immediately after the cloud provider API is called, and there is no inhibitor or wait that gives the daemonset pods time to shut down gracefully.

Setting up Kubernetes graceful shutdown adds an inhibitor to the node, so that the daemonsets are given sufficient time to shut down gracefully.

@youwalther65

youwalther65 commented Apr 9, 2024

@shivaprasad-balaji I agree now, I wasn't completely aware of that. But there are still other problems in the kubelet itself where even graceful node shutdown won't help, see Graceful node shutdown doesn't wait for volume teardown #115148.
But coming back to the systemd version: there is a stuck PR which got renamed by replacing 1.29 with 1.30, [WIP] Enable graceful node shutdown for 1.30+ #1544.
@sayap My recent EKS optimized 1.29 AMI already runs systemd 219. To get the inhibitor lock working on AL2, I successfully followed the EBS CSI FAQ here.

@cartermckinnon
Member

I think this should be closed. systemd isn't a problem (and it won't be updated on AL2 anyway). We should track this in the AMI's repository: awslabs/amazon-eks-ami#1544

@stevehipwell

@mikestef9 @cartermckinnon I'm not sure closing this issue without a replacement (#1651 is currently tagged for MNGs) to track the availability of graceful node shutdown in EKS is in the interest of the community. Could we either get EKS documentation on where/when graceful shutdown is supported, reopen this issue (maybe updated), or open a new issue to track graceful node shutdown?
