EKS: Cluster Deletion Fails #32395
Comments
@hakenmt Good afternoon. Thanks for opening the issue. Could you please confirm whether the issue is reproducible using the latest version of CDK? Also, could you check the permissions of
Thanks,
This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.
It looks like the cluster custom resource has an explicit DependsOn for the policy:
So this shouldn't fail (but it obviously did), and I can't reproduce it easily. However, the policy and the specified resource don't appear to match. From the error log:
Where the resource in the policy statement is:
And the custom resource is defined as:
So it's unclear whether the construct builds the resources using the cluster name provided while the onEvent function uses a different resource name. I've deployed this resource/stack hundreds of times using the same templates, so it's also unclear why this would happen.
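One way to check the suspected mismatch is to test whether the resource named in the access-denied message is actually covered by the resource pattern in the policy statement. The following is a minimal sketch, not the CDK's or IAM's own logic: `arn_pattern_matches` is a hypothetical helper that only handles the `*` and `?` wildcards, and both ARNs are invented for illustration rather than taken from the logs in this issue.

```python
import re


def arn_pattern_matches(pattern: str, arn: str) -> bool:
    """Return True if an IAM-style resource pattern covers the given ARN.

    Only '*' (any run of characters) and '?' (exactly one character)
    are treated as wildcards; policy variables are not handled.
    """
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, arn) is not None


# Hypothetical ARNs, for illustration only (not from the issue's logs).
policy_resource = "arn:aws:eks:us-east-1:111122223333:cluster/my-cluster"
denied_resource = "arn:aws:eks:us-east-1:111122223333:cluster/my-cluster-a1b2c3"

# The exact-name statement does not cover a differently named cluster.
print(arn_pattern_matches(policy_resource, denied_resource))  # False

# A trailing wildcard on the same prefix would cover it.
print(arn_pattern_matches(policy_resource + "*", denied_resource))  # True
```

Running this against the real policy statement and the ARN from the 403 message would show whether the failure is a naming mismatch or the policy being absent at the time of the call.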
CC @pahud for visibility
Hi. Yes, this is possible and very similar to #31032. The Cluster resource in aws-eks is currently implemented using a CustomResource, and when a property update fails, the rollback might also fail in some edge cases. We will continue to investigate this. I am not sure whether we will be able to get it fixed, as the team is working on a new aws-eks-alpha module.
Describe the bug
A CFN stack containing an EKS cluster failed and attempted to roll back. The OnEventHandler custom resource that's responsible for handling cluster deletion failed to delete the resource with a permissions error. From the CW logs:
However, this doesn't happen every time, and I am wondering whether there is a hidden race condition during a stack rollback where the permissions policy may get deleted before the function and its assigned role are deleted.
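To make the suspected race concrete: CloudFormation deletes resources in reverse dependency order, so the policy is only guaranteed to outlive the cluster custom resource if that resource depends on the policy, directly or transitively. The sketch below checks that property on a dependency map. The logical IDs (`ClusterResource`, `HandlerFunction`, `HandlerPolicy`, `HandlerRole`) are hypothetical stand-ins, not the names the aws-eks construct generates, and the sketch does not claim this is what happened in the failing stack.

```python
def depends_transitively(depends_on: dict, resource: str, target: str) -> bool:
    """Return True if `resource` depends on `target`, directly or indirectly.

    When this holds, `resource` is deleted before `target`, so `target`
    still exists while `resource` is being deleted.
    """
    stack = list(depends_on.get(resource, ()))
    seen = set()
    while stack:
        current = stack.pop()
        if current == target:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(depends_on.get(current, ()))
    return False


# Hypothetical logical IDs; each key lists the resources it depends on.
with_dependency = {
    "ClusterResource": {"HandlerFunction", "HandlerPolicy"},
    "HandlerFunction": {"HandlerRole"},
    "HandlerPolicy": {"HandlerRole"},
}
without_dependency = {
    "ClusterResource": {"HandlerFunction"},
    "HandlerFunction": {"HandlerRole"},
    "HandlerPolicy": {"HandlerRole"},
}

# Policy is guaranteed to exist while the cluster resource is deleted.
print(depends_transitively(with_dependency, "ClusterResource", "HandlerPolicy"))  # True

# No ordering guarantee: the policy may be removed first, causing a 403.
print(depends_transitively(without_dependency, "ClusterResource", "HandlerPolicy"))  # False
```

Applying the same check to the dependency edges in the synthesized template would show whether the ordering is guaranteed there, or whether the transient 403 has to come from somewhere else.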
Regression Issue
Last Known Working CDK Version
No response
Expected Behavior
I would expect the automatically created custom resource and IAM role to have the appropriate permissions.
Current Behavior
Sometimes, cluster deletion fails with a 403 error.
Reproduction Steps
I don't have specific reproduction steps since the behavior is transient. This is basically the cluster resource definition:
Possible Solution
No response
Additional Information/Context
No response
CDK CLI Version
2.164.1
Framework Version
No response
Node.js Version
20
OS
darwin
Language
.NET
Language Version
No response
Other information
No response