EKS Addon install missing AWS_DEFAULT_REGION #1476

Open
philnichol opened this issue Oct 15, 2024 · 8 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@philnichol

/kind bug

Thanks in advance for looking into this, and thanks for maintaining this great project :)

What happened?
When I install the EKS Addon (tested via Terraform or the AWS console) with deleteAccessPointRootDir = true, IRSA configured, and access to IMDS restricted, deleting a PVC produces these errors in the logs and the PVC never gets deleted:

E1015 08:53:11.829540       1 mount_linux.go:231] Mount failed: exit status 1
Mounting command: mount
Mounting arguments: -t efs -o tls,iam fs-XXXXXXXXXXXXXXXXXXX /var/lib/csi/pv/fsap-XXXXXXXXXXXXX
Output: Error retrieving region. Please set the "region" parameter in the efs-utils configuration file.

What you expected to happen?
I expect the EKS Addon to work out of the box.

How to reproduce it (as minimally and precisely as possible)?
This assumes you've restricted access to IMDS from your pods (by setting a hop limit). Docs here.

  • Install the efs-csi-driver EKS Addon on a cluster with deleteAccessPointRootDir = true, with an IRSA service account

  • Tail the logs in a separate terminal: kubectl logs deployment/efs-csi-controller -f -n kube-system

  • Create a storageClass, PVC and pod (dynamic provisioning)

# test.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: test
parameters:
  basePath: /test
  directoryPerms: "775"
  ensureUniqueDirectory: "false"
  fileSystemId: fs-XXXXXXX
  gid: "65534"
  provisioningMode: efs-ap
  subPathPattern: /
  uid: "65534"
provisioner: efs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:
- tls
- iam
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: test
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/instance: test
  template:
    metadata:
      labels:
        app.kubernetes.io/instance: test
    spec:
      containers:
      - image: registry.k8s.io/pause:3.9
        name: test
        resources:
          requests:
            cpu: 20m
            memory: 2Mi
        volumeMounts:
        - mountPath: /test
          name: test
      volumes:
      - name: test
        persistentVolumeClaim:
          claimName: test
  • kubectl apply -f test.yaml
  • kubectl delete -f test.yaml
  • see the logs for efs-csi-controller

Anything else we need to know?:
This happens because, when the driver is installed via the EKS Addon, the efs-plugin container already has the AWS_REGION environment variable set.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: efs-csi-controller
  namespace: kube-system
  resourceVersion: "8596255"
  uid: 09438d06-c1b8-4765-89f6-e696c648d19f
spec:
  template:
    spec:
      containers:
      - name: efs-plugin
        env:
        - name: CSI_ENDPOINT
          value: unix:///var/lib/csi/sockets/pluginproxy/csi.sock
        - name: AWS_REGION
          value: ap-southeast-2
        - name: CSI_NODE_NAME

Because of how IRSA works, if the container already has an AWS_REGION variable, IRSA doesn't add the AWS_DEFAULT_REGION variable, which the container needs in order to determine its region without calling out to IMDS. At a glance, this doesn't look like it would affect people installing via Helm or kustomize.
This should be simple to fix, either:

  • Remove that environment variable from the container. IRSA adds it back anyway, although I guess this could break things for people not using IRSA?
  • Add the AWS_DEFAULT_REGION variable explicitly as well.
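
The second option could look roughly like the fragment below. This is only a sketch based on the Deployment excerpt above, not the manifest the Addon actually ships: the region value is copied from that excerpt, and the position of the new entry in the env list is an assumption.

```yaml
# Illustrative fragment only; not the released EKS Addon manifest.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: efs-csi-controller
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: efs-plugin
        env:
        - name: AWS_REGION
          value: ap-southeast-2
        # Added explicitly so the region can be resolved without an IMDS call.
        # The value is assumed to match AWS_REGION.
        - name: AWS_DEFAULT_REGION
          value: ap-southeast-2
```

Since the Deployment is managed by the EKS Addon, a manual edit like this may be reverted on the next Addon update, so it is probably best treated as a way to confirm the diagnosis rather than a lasting fix.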

Could possibly relate to:

Environment

  • Kubernetes version (use kubectl version):
kubectl version
Client Version: v1.30.4
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.4-eks-a737599 
  • Driver version: v2.0.7-eksbuild.1

Please also attach debug logs to help us better diagnose

  • Instructions to gather debug logs can be found here
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Oct 15, 2024
@rayl15

rayl15 commented Oct 28, 2024

/assign

@rayl15

rayl15 commented Oct 29, 2024

Hi @philnichol , I was able to reproduce this issue in v2.0.7-eksbuild.1, but it no longer occurs in v2.0.8-eksbuild.1. Should we proceed with marking this issue as resolved, or is there anything else you'd like to add?

@philnichol
Author

Hey @rayl15 thanks for looking into this! If it's fixed then I'll mark it as resolved.

@mskanth972
Contributor

mskanth972 commented Nov 5, 2024

I am able to reproduce this in v2.0.8 as well, with an AL2023 node group. I am working on a fix and will let you know once it is out.

@mskanth972
Contributor

mskanth972 commented Nov 5, 2024

Hi @philnichol, are you able to resolve this with v2.0.8?

@philnichol
Author

@mskanth972 I can see the issue is still present on 2.0.8, so I'll reopen this.

@philnichol philnichol reopened this Nov 6, 2024
@mskanth972
Contributor

I added AWS_DEFAULT_REGION to the controller and can see the root directory being deleted successfully. I will release this soon. There seem to be block days in the EKS pipeline, so I cannot estimate the exact release date for now, but it will surely be released by the end of this month.
Thanks for raising this, @philnichol, and thanks @rayl15 for spending time on it.

@philnichol
Author

That's great, thanks @mskanth972 for the quick fix and @rayl15 for taking a look!
