This repository has been archived by the owner on Feb 8, 2021. It is now read-only.

passthrough support in runv #680

Open
telala opened this issue May 15, 2018 · 7 comments

Comments

@telala commented May 15, 2018

I am working on a program that needs to pass a GPU through to a runv container.
These are my steps:

  1. add "-device", "vfio-pci,host=0000:08:00.0,id=gpu_0,bus=pci.0,addr=0xf" in amd_64.go
  2. start a runv container
  3. in the container run command: insmod nvidia.ko insmod nvidia-modeset.ko insmod nvidia-uvm.ko insmod nvidia-drm.ko
  4. I can get the following dmesg from the container:
    [ 222.610227] nvidia: loading out-of-tree module taints kernel.
    [ 222.610854] nvidia: module license 'NVIDIA' taints kernel.
    [ 222.611461] Disabling lock debugging due to kernel taint
    [ 222.625106] nvidia-nvlink: Nvlink Core is being initialized, major device number 240
    [ 222.656048] chenxg: load driver:nvidia
    [ 222.656435] chenxg: gpu driver loaded
    [ 222.656839] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 390.46 Fri Mar 16 22:24:50 PDT 2018 (using threaded interrupts)
    [ 233.260423] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 239
    [ 239.616160] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 390.46 Fri Mar 16 21:46:30 PDT 2018
    [ 246.169710] [drm] [nvidia-drm] [GPU ID 0x0000000f] Loading driver
    [ 246.170349] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:00:0f.0 on minor 0

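For context on step 1: before qemu can claim the device with vfio-pci, the GPU has to be detached from any host driver and bound to vfio-pci. A minimal host-side sketch (not part of the original steps; the PCI address matches the one above, and it assumes the vfio-pci module is already loaded):

    # Find the vendor:device ID of the GPU (prints something like "10de:xxxx").
    lspci -n -s 0000:08:00.0

    # Detach the GPU from whatever host driver currently owns it.
    echo 0000:08:00.0 > /sys/bus/pci/devices/0000:08:00.0/driver/unbind

    # Let vfio-pci claim devices with that vendor:device ID
    # (replace "10de xxxx" with the real IDs printed by lspci above).
    echo 10de xxxx > /sys/bus/pci/drivers/vfio-pci/new_id
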
I compared these logs with those from a host that has an NVIDIA GPU installed; they are exactly the same.

One issue I suspect is that I ran insmod nvidia.ko inside the container.
Maybe I should insmod nvidia.ko in hyperstart instead. I tried to insmod nvidia.ko from the main function in hyperstart, but there is no insmod command there.
Then I copied the insmod binary into hyperstart and got another error:
/insmod: error while loading shared libraries: liblzma.so.5: cannot open shared object file: No such file or directory
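
For reference, a sketch of one way to enumerate the shared libraries a dynamically linked insmod needs, so they can be copied into the hyperstart initrd next to the binary (a statically linked busybox insmod would avoid the problem entirely; the initrd path below is a placeholder):

    # List the shared libraries insmod is linked against (liblzma.so.5 here).
    ldd $(which insmod)

    # Copy each one next to the binary; the source path is a typical
    # location and differs per distro, and ./initrd-rootfs stands in for
    # wherever the hyperstart initrd is unpacked.
    cp /lib/x86_64-linux-gnu/liblzma.so.5 ./initrd-rootfs/lib/
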
Can you give me some advice? Thanks very much:)

@telala (Author) commented May 15, 2018

@bergwolf From the messages above I can see minor 0 for the GPU device.
Now I can run insmod nvidia.ko in hyperstart, but there are still no /dev/nvidia0 or /dev/nvidiactl nodes.

@bergwolf (Member) commented

@telala There's no difference between calling insmod from a container and calling it from hyperstart.

I'm not sure how nvidia creates /dev/nvidia0 and /dev/nvidiactl. Does the nvidia driver package install some udev rules?

Since you can see minor 0, you should be able to call mknod to create the device. But that only represents one device (either nvidia0 or nvidiactl), and I'm not sure how to create the other one. Can you run ls -l /dev | grep nvidia on your host and paste the results here?

@telala (Author) commented May 15, 2018

localhost# ls -l /dev | grep nvidia
crw-rw-rw- 1 root root 195, 0 May 15 20:36 nvidia0
crw-rw-rw- 1 root root 195, 1 May 15 20:36 nvidia1
crw-rw-rw- 1 root root 195, 255 May 15 20:36 nvidiactl
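
Given those major/minor numbers, the nodes can also be created by hand; a sketch that mirrors the listing above:

    # Character devices on major 195: minors 0 and 1 are the two GPUs,
    # minor 255 is the control device.
    mknod -m 666 /dev/nvidia0   c 195 0
    mknod -m 666 /dev/nvidia1   c 195 1
    mknod -m 666 /dev/nvidiactl c 195 255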

I just tried to install the NVIDIA driver on my host again and found that the /dev/nvidia0 and /dev/nvidiactl nodes were not created when the driver was installed.
When I ran the nvidia-smi command to test the driver, the /dev/nvidia0 and /dev/nvidiactl nodes were created.
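
For reference, the driver package also ships a setuid helper, nvidia-modprobe, which the user-space tools invoke to create these nodes on demand; running it by hand should have the same effect (a hedged example, using minor 0 as in the listing above):

    # Ask the helper to create the device file for GPU minor number 0.
    nvidia-modprobe -c 0
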
A lot of libraries were installed along with the NVIDIA driver on the host. Maybe I should copy all the NVIDIA files from the host into hyperstart. What do you think? @bergwolf

@bergwolf (Member) commented

@telala I think you can first try copying these files into your container and see if it works from there. They likely do not need to live inside hyperstart.

@gnawux (Member) commented May 15, 2018

@bergwolf does hyperstart need to share some device files under /dev/ with the container?

@bergwolf (Member) commented

@gnawux hyperstart shares the same devtmpfs superblock with containers. Any device that hyperstart sees under /dev is visible to the containers as well.
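
A quick way to observe this, assuming shell access to both contexts:

    # From the hyperstart context:
    mknod /dev/nvidia0 c 195 0

    # From inside the container the node is already there, because devtmpfs
    # is a single shared instance rather than a per-mount copy.
    ls -l /dev/nvidia0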

@telala (Author) commented May 16, 2018

@bergwolf @gnawux I added all the user-level NVIDIA files to a container image. Now I can run nvidia-smi in the container, and /dev/nvidia0 and /dev/nvidiactl are created as well. A sketch of the files involved is below.
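
A sketch of what "user-level NVIDIA files" typically means for driver 390.46; the exact paths are distro-dependent, and rootfs/ stands in for the container image root:

    # The management tool plus the libraries it loads at runtime: NVML and
    # the CUDA user-space driver. Version-suffixed names match driver 390.46.
    cp /usr/bin/nvidia-smi               rootfs/usr/bin/
    cp /usr/lib64/libnvidia-ml.so.390.46 rootfs/usr/lib64/
    cp /usr/lib64/libcuda.so.390.46      rootfs/usr/lib64/

    # Recreate the SONAME symlinks and linker cache inside the image root.
    ldconfig -r rootfs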
