1. Quick Debug Information

2. Issue or feature description

The gpu-operator pod goes into CrashLoopBackOff.

Logs:

```
...
{"level":"info","ts":1701345521.9651258,"logger":"controllers.ClusterPolicy","msg":"ClusterPolicy step completed","state:":"state-sandbox-device-plugin","status":"disabled"}
{"level":"info","ts":1701345522.0023656,"logger":"controllers.ClusterPolicy","msg":"Kata Manager disabled, deleting all Kata RuntimeClasses"}
{"level":"info","ts":1701345522.0023997,"logger":"controllers.ClusterPolicy","msg":"ClusterPolicy step completed","state:":"state-kata-manager","status":"disabled"}
{"level":"info","ts":1701345522.0334861,"logger":"controllers.ClusterPolicy","msg":"ClusterPolicy step completed","state:":"state-cc-manager","status":"disabled"}
{"level":"info","ts":1701345522.0335908,"logger":"controllers.ClusterPolicy","msg":"ClusterPolicy is ready as all resources have been successfully reconciled"}
{"level":"error","ts":1701345527.9915156,"logger":"controller-runtime.source.EventHandler","msg":"failed to get informer from cache","error":"failed to get API group resources: unable to retrieve the complete list of server APIs: nvidia.com/v1alpha1: the server could not find the requested resource"}
{"level":"error","ts":1701345537.9906301,"logger":"controller-runtime.source.EventHandler","msg":"failed to get informer from cache","error":"failed to get API group resources: unable to retrieve the complete list of server APIs: nvidia.com/v1alpha1: the server could not find the requested resource"}
{"level":"error","ts":1701345547.9917858,"logger":"controller-runtime.source.EventHandler","msg":"failed to get informer from cache","error":"failed to get API group resources: unable to retrieve the complete list of server APIs: nvidia.com/v1alpha1: the server could not find the requested resource"}
{"level":"error","ts":1701345557.9914362,"logger":"controller-runtime.source.EventHandler","msg":"failed to get informer from cache","error":"failed to get API group resources: unable to retrieve the complete list of server APIs: nvidia.com/v1alpha1: the server could not find the requested resource"}
{"level":"error","ts":1701345567.990562,"logger":"controller-runtime.source.EventHandler","msg":"failed to get informer from cache","error":"failed to get API group resources: unable to retrieve the complete list of server APIs: nvidia.com/v1alpha1: the server could not find the requested resource"}
{"level":"error","ts":1701345577.991085,"logger":"controller-runtime.source.EventHandler","msg":"failed to get informer from cache","error":"failed to get API group resources: unable to retrieve the complete list of server APIs: nvidia.com/v1alpha1: the server could not find the requested resource"}
{"level":"error","ts":1701345587.9907615,"logger":"controller-runtime.source.EventHandler","msg":"failed to get informer from cache","error":"failed to get API group resources: unable to retrieve the complete list of server APIs: nvidia.com/v1alpha1: the server could not find the requested resource"}
{"level":"error","ts":1701345597.9910758,"logger":"controller-runtime.source.EventHandler","msg":"failed to get informer from cache","error":"failed to get API group resources: unable to retrieve the complete list of server APIs: nvidia.com/v1alpha1: the server could not find the requested resource"}
```
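The repeating informer error indicates the API server is not serving the nvidia.com/v1alpha1 group, which usually means the CRDs the operator watches under that version (e.g. NVIDIADriver) are not installed in the cluster. A quick check with standard kubectl commands (assuming cluster-admin access; not part of the original report):

```
# Check whether the nvidia.com API group is served at all;
# an empty result means the operator's informer can never sync.
kubectl api-resources --api-group=nvidia.com

# List the nvidia.com CRDs currently installed (e.g. NVIDIADriver).
kubectl get crds -o name | grep 'nvidia.com'
```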
3. Steps to reproduce the issue
Just install the Helm chart.
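For reference, a minimal sketch of the documented chart installation with default values (the release name and namespace below are illustrative, since the report does not specify them):

```
# Add the NVIDIA Helm repository and install the GPU Operator.
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update

# Release name and namespace are assumptions, not from the report.
helm install gpu-operator nvidia/gpu-operator \
  --namespace gpu-operator --create-namespace
```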