
[BUG] v0.24.3+ cannot reference extra files for metadata server auth #1083

Open
0x1a8510f2 opened this issue Aug 10, 2024 · 5 comments
Labels: kind/bug (Something isn't working)

@0x1a8510f2

What happened:

After upgrading to v0.24.5 and restarting my pods, they all failed to start with the following error:

MountVolume.SetUp failed for volume "pvc-6cad2b80-d7b8-4250-b61e-6998b756c96a" : rpc error: code = Internal desc = Could not mount juicefs:
2024/08/10 00:33:50.422388 juicefs[139] <INFO>: Meta address: rediss://default:****@10.250.5.130:6380/1?tls-cert-file=/tls/client.crt&tls-key-file=/tls/client.key&tls-ca-cert-file=/tls/ca.crt [interface.go:504]
2024/08/10 00:33:50.422927 juicefs[139] <FATAL>: Meta rediss://default:****@10.250.5.130:6380/1?tls-cert-file=/tls/client.crt&tls-key-file=/tls/client.key&tls-ca-cert-file=/tls/ca.crt is not available: get certificate error certFile:/tls/client.crt keyFile:/tls/client.key error:open /tls/client.crt: no such file or directory [interface.go:516] : exit status 1

I had followed https://juicefs.com/docs/csi/guide/pv/#mount-pod-extra-files to include TLS certs in the mount containers for authentication with a Redis metadata server, and this worked fine on v0.24.2, which I was on previously. Downgrading to v0.24.3 did not fix the issue, but downgrading fully back to v0.24.2 did.
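For context, the extra-files mechanism from that guide works by listing secrets and their mount paths in the `configs` key of the volume credentials secret. A minimal sketch of that shape (names and the redacted credential are placeholders, not values from this cluster):

```yaml
# Hedged sketch of the `configs` format from the linked guide.
# "my-tls-secret" is a placeholder secret holding ca.crt/client.crt/client.key;
# the value maps secret name -> mount path inside the mount pod.
apiVersion: v1
kind: Secret
metadata:
  name: juicefs-secret
type: Opaque
stringData:
  name: myjfs
  configs: "{my-tls-secret: /tls}"
  metaurl: "rediss://default:password@redis-host:6380/1?tls-cert-file=/tls/client.crt&tls-key-file=/tls/client.key&tls-ca-cert-file=/tls/ca.crt"
```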

What you expected to happen:

Pods should continue to be able to mount volumes after upgrading.

How to reproduce it (as minimally and precisely as possible):

Set up a storage class pointing at a Redis metadata server with TLS. Use the aforementioned guide to mount the TLS certs in the mount container. Notice that this works on v0.24.2 but not on later versions.

Anything else we need to know?

Environment:

  • JuiceFS CSI Driver version (which image tag did your CSI Driver use): v0.24.2 -> v0.24.5
  • Kubernetes version (e.g. kubectl version): v1.31.0-rc.1
  • Object storage (cloud provider and region): SFTP
  • Metadata engine info (version, cloud provider managed or self maintained): KeyDB self-hosted
  • Network connectivity (JuiceFS to metadata engine, JuiceFS to object storage): LAN
  • Others: N/A
@0x1a8510f2 0x1a8510f2 added the kind/bug Something isn't working label Aug 10, 2024
@0x1a8510f2
Author

FWIW, here is the Terraform I'm using for the storage class etc.:

resource "kubernetes_secret" "jfs-secret-storage-one" {
  metadata {
    name      = "jfs-secret-storage-one"
    namespace = "default"
    labels = {
      "juicefs.com/validate-secret" : "true"
    }
  }
  type = "Opaque"
  data = {
    "configs" : "{${kubernetes_secret.jfs-secret-storage-one-tls.metadata[0].name}: /tls}"
    "name" : "storage-one"
    "metaurl" : "rediss://default:*****@10.250.5.130:6380/1?tls-cert-file=/tls/client.crt&tls-key-file=/tls/client.key&tls-ca-cert-file=/tls/ca.crt"
    "storage" : "sftp"
    "bucket" : "10.250.5.130:juicefs/"
    "access-key" : "juicefs"
    "secret-key" : "*******"
  }
}

resource "kubernetes_secret" "jfs-secret-storage-one-tls" {
  metadata {
    name      = "jfs-secret-storage-one-tls"
    namespace = "kube-system"
  }
  type = "Opaque"
  binary_data = {
    "ca.crt" : "${filebase64("${path.module}/../storage/slashkeydb/tls/ca.crt")}"
    "client.key" : "${filebase64("${path.module}/../storage/slashkeydb/tls/client.key")}"
    "client.crt" : "${filebase64("${path.module}/../storage/slashkeydb/tls/client.crt")}"
  }
}

resource "kubernetes_storage_class" "jfs-storage-one" {
  metadata {
    name = "jfs-storage-one"
  }
  storage_provisioner = "csi.juicefs.com"
  reclaim_policy      = "Retain"
  parameters = {
    "csi.storage.k8s.io/provisioner-secret-name" : kubernetes_secret.jfs-secret-storage-one.metadata[0].name
    "csi.storage.k8s.io/provisioner-secret-namespace" : kubernetes_secret.jfs-secret-storage-one.metadata[0].namespace
    "csi.storage.k8s.io/node-publish-secret-name" : kubernetes_secret.jfs-secret-storage-one.metadata[0].name
    "csi.storage.k8s.io/node-publish-secret-namespace" : kubernetes_secret.jfs-secret-storage-one.metadata[0].namespace
    "pathPattern" : "$${.pvc.namespace}.$${.pvc.name}"
  }
}

@zxh326 zxh326 self-assigned this Aug 12, 2024
@zxh326
Member

zxh326 commented Aug 12, 2024

I could not reproduce this on v0.24.5.

Can you confirm whether the mount pod contains the secret volume jfs-secret-storage-one-tls?
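One way to confirm this (a hedged sketch; the pod name is a placeholder, and this assumes the mount pod runs in kube-system as in the default deployment):

```shell
# List the secret-backed volumes of the mount pod and check for the TLS secret.
# Replace <mount-pod-name> with your actual juicefs-* mount pod.
kubectl -n kube-system get pod <mount-pod-name> \
  -o jsonpath='{.spec.volumes[*].secret.secretName}'
# The output should include jfs-secret-storage-one-tls if the configs
# mapping was applied.
```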

@0x1a8510f2
Author

0x1a8510f2 commented Aug 18, 2024

Hi @zxh326, sorry about the delay.

It seems like everything is attached as it's supposed to be. Here are some Lens screenshots from after the upgrade. Curiously, the error appears to originate in the juicefs-csi-node-h5mq4 pod, even though juicefs-talos-4hl-o41-pvc-a03811a9-0a10-4d72-a801-21f29bebd6f0-poykub is the one mounting the certificates (in /tls), and the latter seems to connect fine per its logs.

Screenshots

[Five Lens screenshots attached, showing the pod volumes and logs]

@zxh326
Member

zxh326 commented Aug 19, 2024

csi-node itself will not mount /tls according to the config in the secret, so some auth/status commands will fail (they are exec'd in the csi-node container).

We will take a look at how to solve the problem, thx.

For now, you can edit the DaemonSet to mount the secret in csi-node as a workaround.
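A hedged sketch of that workaround as a JSON patch (the DaemonSet name, container index, and mount path assume a default juicefs-csi-node deployment; adjust to your cluster):

```shell
# Add the TLS secret as a volume on the csi-node DaemonSet and mount it at
# /tls in the first container, so auth/status commands run there can read
# the certs. Container index 0 is an assumption; verify which container is
# the JuiceFS plugin in your deployment before applying.
kubectl -n kube-system patch daemonset juicefs-csi-node --type='json' -p='[
  {"op": "add", "path": "/spec/template/spec/volumes/-",
   "value": {"name": "jfs-tls", "secret": {"secretName": "jfs-secret-storage-one-tls"}}},
  {"op": "add", "path": "/spec/template/spec/containers/0/volumeMounts/-",
   "value": {"name": "jfs-tls", "mountPath": "/tls"}}
]'
```

Note that a manual DaemonSet edit like this may be reverted by the Helm chart or operator on the next upgrade.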

@0x1a8510f2
Author

Okay, great! I can confirm that editing the DaemonSet to mount the secret has worked for now, but I'm hoping to see a proper fix in an upcoming release. Thank you!
