
How to configure nvidia-container-runtime to expose only certain GPUs from the host to docker #836

Open
lordofire opened this issue Dec 16, 2024 · 0 comments


lordofire commented Dec 16, 2024

Hi there,

In our use case, we have one Kubernetes node (custom-designed hardware) with 3 GPUs: 2 used for container workloads and 1 used for display. In the current setup, all three GPUs are exposed to containers by default. I would like to know how to configure nvidia-container-runtime and Docker so that only 2 GPUs are exposed by default to any pod scheduled on this node. Specifically:

  1. When the nvidia-device-plugin advertises `nvidia.com/gpu` in the node's capacity, it should report 2 instead of 3.
  2. When a pod sets `NVIDIA_VISIBLE_DEVICES=all` on this node, it should see only 2 GPUs instead of 3.

Note that we cannot simply remove that third GPU from the node, since it is still needed for non-container GPU workloads (display).
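For context, here is a sketch of how GPU visibility can be limited per container today with `NVIDIA_VISIBLE_DEVICES` or Docker's `--gpus` flag. The GPU UUIDs below are placeholders, and the assumption that indices 0 and 1 are the compute GPUs would need to be verified on the actual node; this also only covers a single `docker run`, not the node-wide default that this issue is asking about.

```shell
# List GPUs with their UUIDs; verify which two are the compute GPUs.
nvidia-smi -L

# Expose only the two compute GPUs to a container by UUID
# (UUIDs are more stable than indices across reboots).
# GPU-xxxxxxxx / GPU-yyyyyyyy are placeholders for real UUIDs.
docker run --rm \
  -e NVIDIA_VISIBLE_DEVICES=GPU-xxxxxxxx,GPU-yyyyyyyy \
  nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

# Equivalent using Docker's native --gpus flag with device indices,
# assuming indices 0 and 1 are the compute GPUs:
docker run --rm --gpus '"device=0,1"' \
  nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```

The open question is how to make this restriction the default for all pods on the node, so that `NVIDIA_VISIBLE_DEVICES=all` resolves to only the two compute GPUs and the device plugin reports a capacity of 2.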

I have searched this repo as well as elsewhere online, but have not found a good solution so far. Thanks in advance for the help.
Jianan.
