
E2E tests with GPU enabled #144

Open
asm582 opened this issue Oct 7, 2024 · 0 comments

The KinD cluster created inside the e2e test suite cannot load the GPU libraries needed to inject GPUs. Below is a snippet from the daemonset logs:

2024-10-07T14:31:10Z    INFO    Starting workers        {"controller": "InstaSliceDaemonSet", "controllerGroup": "inference.codeflare.dev", "controllerKind": "Instaslice", "worker count": 1}
2024-10-07T14:31:10Z    ERROR   error discovering GPUs  {"error": "ERROR_LIBRARY_NOT_FOUND"}

We need to figure out the correct KinD configuration. The current KinD config used for deployment is:

		fmt.Println("Setting up Kind cluster")
		kindConfig := `
apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
- role: control-plane
  image: kindest/node:v1.30.0@sha256:047357ac0cfea04663786a612ba1eaba9702bef25227a794b52890dd8bcd692e
  # required for GPU workaround
  extraMounts:
  - hostPath: /dev/null
    containerPath: /var/run/nvidia-container-devices/all
`