NVIDIA iGPU passthrough Support #12525
thanks, your summary sounds correct to me :)

From the NVIDIA Container Toolkit side, our goal is to move to CDI as the mechanism to define what is required to allow access to a named device or resource. In the case of OCI-compliant runtimes, the container edits in a CDI spec have a well-defined mapping to OCI Runtime spec modifications. Note that CDI spec generation is separate from CDI spec consumption: for generation, the NVIDIA Container Toolkit includes an `nvidia-ctk cdi generate` command. The attached image shows a workflow for the generation and consumption of CDI specifications in the context of OCI-compliant runtimes.

In the context of LXD (or other non-OCI-compliant runtimes), what would be required to allow for the injection of NVIDIA devices, including those associated with iGPUs, is support for reading a CDI spec associated with a particular device and applying the required modifications to the container. Note that this would also enable the injection of CDI devices from other vendors that support the specification.
Hello LXD team! I'm from Partner Engineering at Canonical and I'm working on NVIDIA's Tegra line of devices. These are ARM64 devices with an integrated GPU (iGPU) and sometimes an optional discrete GPU (dGPU). We would like to use LXD/LXC with iGPU passthrough for device testing. LXD already supports NVIDIA dGPUs via the `nvidia.runtime=true` flag, but iGPU passthrough is not supported at the moment.

I've done some initial investigation into how this support could be added, and it seems that LXD hands off most of the mounting control to `libnvidia-container`. The call stack as I understand it is as follows:

1. `nvidia.runtime=true` is set; LXD's `driver_lxc.go` does misc checks on the `nvidia` hook and sets the `NVIDIA_VISIBLE_DEVICES` environment variable
2. LXC's `conf.c` runs the shell script `/usr/share/lxc/hooks/nvidia` (in tree `lxc/hooks/nvidia`)
3. The hook invokes the `nvidia-container-cli` program (part of `libnvidia-container`), which performs the mounts (via `NVML`)
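As a rough illustration of the hand-off in the last step, the sketch below assembles the kind of `nvidia-container-cli configure` command line the LXC `nvidia` hook ends up running. This is not the hook's actual code: the exact flag set is an assumption for illustration (the real hook derives its arguments from the LXC environment), and the command is only printed here, not executed.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// buildConfigureArgs sketches (hypothetically) the nvidia-container-cli
// invocation performed by the LXC nvidia hook: given the device list from
// NVIDIA_VISIBLE_DEVICES, the container rootfs, and the container's PID,
// it assembles the argv. The real hook's flags may differ.
func buildConfigureArgs(devices, rootfs string, pid int) []string {
	return []string{
		"nvidia-container-cli",
		"configure",
		"--no-cgroups",           // assumption: LXC manages cgroups itself
		"--device=" + devices,    // e.g. "all" or a comma-separated list
		"--compute", "--utility", // driver capabilities to expose
		fmt.Sprintf("--pid=%d", pid),
		rootfs,
	}
}

func main() {
	devices := os.Getenv("NVIDIA_VISIBLE_DEVICES") // set earlier by LXD
	if devices == "" {
		devices = "all"
	}
	// Hypothetical rootfs path and PID, for illustration only.
	args := buildConfigureArgs(devices, "/var/lib/lxc/c1/rootfs", 1234)
	fmt.Println(strings.Join(args, " "))
}
```

This is the layer that the proposal below would replace: instead of shelling out to `nvidia-container-cli`, the runtime would read a CDI spec and apply the edits itself.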
While doing this investigation, NVIDIA informed me that `libnvidia-container` (providing `nvidia-container-cli`) is in the process of being deprecated (public link) and that the NVIDIA Container Toolkit (providing the `nvidia-ctk` command) is the way forward. I'll also note that NVIDIA is open to supporting this work from their side 🙂

As far as I understand it, the overall scope of work would be to replace `nvidia-container-cli` with `nvidia-ctk`, plus any transitive work that follows from that.