failed to get sandbox runtime: no runtime for "spin" is configured (vanilla Kubernetes) #165

Closed
hotspoons opened this issue Oct 11, 2023 · 4 comments
Labels: help wanted (Extra attention is needed)


@hotspoons

I have the spin shim installed on my worker nodes:
[screenshot: the spin shim binary present on the worker node]
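For reference, the shim binary lives on each worker node roughly like this (the exact path and binary name are illustrative here, since they depend on how the shim was installed, and they have to match whatever the containerd config points at):

```sh
# Confirm the spin shim binary is present and executable on the worker node
# (path and name below are illustrative; adjust to your install)
ls -l /usr/local/bin/containerd-shim-spin-v1
which containerd-shim-spin-v1
```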

And the spin containerd plugin configured on my worker nodes:
[screenshot: /etc/containerd/config.toml with the spin runtime configured]
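The relevant stanza in /etc/containerd/config.toml looks roughly like the following; the exact runtime_type string depends on the shim version installed, so treat this as a sketch rather than a copy of my config:

```toml
# /etc/containerd/config.toml - register a "spin" runtime with the CRI plugin
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.spin]
  runtime_type = "io.containerd.spin.v1"
```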

The runtime class configured on my cluster:
[screenshot: RuntimeClass wasmtime-spin-v1]
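It maps the class name used by the deployment below to the spin handler registered in containerd; roughly:

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: wasmtime-spin-v1
handler: spin   # must match the runtime name in containerd's config.toml
```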

This is the deployment config for the hello world app I was trying to get working:

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"name":"wasm-spin","namespace":"default"},"spec":{"replicas":3,"selector":{"matchLabels":{"app":"wasm-spin"}},"template":{"metadata":{"labels":{"app":"wasm-spin"}},"spec":{"containers":[{"image":"ghcr.io/deislabs/containerd-wasm-shims/examples/spin-rust-hello:latest","name":"testwasm"}],"runtimeClassName":"wasmtime-spin-v1"}}}}
  creationTimestamp: "2023-10-11T01:42:09Z"
  generation: 1
  name: wasm-spin
  namespace: default
  resourceVersion: "75957"
  uid: 3104ba94-3ceb-496c-b7b3-23e6472500f3
spec:
  progressDeadlineSeconds: 600
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: wasm-spin
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: wasm-spin
    spec:
      containers:
      - image: ghcr.io/deislabs/containerd-wasm-shims/examples/spin-rust-hello:latest
        imagePullPolicy: Always
        name: testwasm
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      runtimeClassName: wasmtime-spin-v1
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  conditions:
  - lastTransitionTime: "2023-10-11T01:42:09Z"
    lastUpdateTime: "2023-10-11T01:42:09Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2023-10-11T01:42:09Z"
    lastUpdateTime: "2023-10-11T01:56:03Z"
    message: ReplicaSet "wasm-spin-58db6df759" is progressing.
    reason: ReplicaSetUpdated
    status: "True"
    type: Progressing
  observedGeneration: 1
  replicas: 3
  unavailableReplicas: 3
  updatedReplicas: 3

But this is what I get when trying to deploy any of the pods:

[screenshot: pod events showing: failed to get sandbox runtime: no runtime for "spin" is configured]
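The error matches the issue title and shows up in the pods' events, which can be pulled with something like:

```sh
# Inspect the failing pods' events for the sandbox runtime error
kubectl describe pod -l app=wasm-spin | grep -i -A 2 failed
# ... failed to get sandbox runtime: no runtime for "spin" is configured
```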

Did I miss something? I tried both the Fermyon Helm chart and the binaries from here, with the same result. Any help is appreciated. Thanks!

@Mossaka
Member

Mossaka commented Oct 21, 2023

Sorry for the late reply. Given the screenshots, I am not entirely sure what the issue is, so here are a few questions from my side to aid debugging:

  1. Could you check whether the shim binary is actually on the PATH of the worker nodes, where containerd can find it?
  2. Have you restarted containerd after changing its config.toml?
  3. Can you also check the containerd logs for anything interesting? Please paste it here if you find something. (A rough sketch of these checks is below.)
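On a systemd-based node, those checks could look something like this (the shim binary name and the journalctl unit name are assumptions about your install):

```sh
# 1. Is the shim binary somewhere containerd can find it?
which containerd-shim-spin-v1

# 2. Restart containerd after editing /etc/containerd/config.toml
sudo systemctl restart containerd

# 3. Scan the containerd logs for shim/runtime errors
sudo journalctl -u containerd --since "1 hour ago" | grep -i -e spin -e shim -e runtime
```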

@Mossaka added the help wanted label on Oct 21, 2023
@hotspoons
Author

> Sorry for the late reply. Given the screenshots, I am not entirely sure what the issue is, so here are a few questions from my side to aid debugging:
>
>   1. Could you check whether the shim binary is actually on the PATH of the worker nodes, where containerd can find it?
>   2. Have you restarted containerd after changing its config.toml?
>   3. Can you also check the containerd logs for anything interesting? Please paste it here if you find something.

Thank you for the help! I dug into the containerd logs as you recommended, and that sent me down a rabbit hole that ended on the NVIDIA GPU operator GitHub issues page, where I found this issue. Ultimately, following the Configure cgroups section from this guide made the wasm shim work for my configuration.
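If I recall correctly, the change boils down to switching containerd's runc runtime to the systemd cgroup driver and then restarting containerd, roughly like this (check the linked guide for the authoritative steps; this is just a sketch from memory):

```toml
# /etc/containerd/config.toml - use the systemd cgroup driver for runc
# (sketch of the usual EL-family cgroup fix; the linked guide is authoritative)
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
```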

Since this issue will be present on any out-of-the-box Kubernetes cluster running an Enterprise Linux variant (e.g. Red Hat, CentOS, Rocky, Alma), I would be happy to open a PR adding a warning note to your documentation; hopefully it will save some frustration for whoever comes after me. Thank you!

@Mossaka
Member

Mossaka commented Oct 23, 2023

> I would be happy to open a PR adding a warning note to your documentation

That would be great! I am extremely happy to hear that you were able to figure out the issue and get it working! 🥳

@chokosabe

Hi @hotspoons, I think I've run into the same issue on Rocky Linux 9. Was the cgroups configuration the only thing you changed? I applied the changes to no effect.

Thanks in advance.

A.
