Plex NVIDIA GPU Passthrough SCALE 21.02

SimoneF

Explorer
Joined
Feb 9, 2019
Messages
59
Hi, I have just updated to 21.02 and tried to set up a Plex container with GPU passthrough via the WebUI. I have two GPUs in my system; I tried to pass my GTX 1070 (which is detected on the Debian host) to the Plex container, but HW transcoding just doesn't work.
Any help?

Thanks.
Simone
 

Kris Moore

SVP of Engineering
Administrator
Moderator
iXsystems
Joined
Nov 12, 2015
Messages
1,471
We'll need more details than that... Inside the VM, does the NVIDIA card show up? Do the drivers attach to it properly?
 

SimoneF

Explorer
Joined
Feb 9, 2019
Messages
59
We'll need more details than that... Inside the VM, does the NVIDIA card show up? Do the drivers attach to it properly?

It's inside the Plex container, not a VM. This is my first time using Docker and GPU passthrough at all, so I'll need some guidance to debug the problem.
What command should I run inside the container shell to verify that the drivers are attached?
Cheers
 

dalnew

Dabbler
Joined
Dec 9, 2020
Messages
26
In your Docker container, if you run
nvidia-smi

do you see your card come up? Also, can you run:
ls -la /dev/nvidia*

The last thing to check would be the Plex logs... In the console, when you try to transcode something, do you see errors like:
[Attached screenshot: Plex console log showing hardware transcoding errors]
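
(If you'd rather grep than eyeball the console, the transcoder decisions also land in the main server log; the path below assumes the app's config is mounted at /config, as with the official image:)

Code:
# inside the container: look for hardware transcoding lines in the Plex server log
grep -i "hardware transcoding" \
  "/config/Library/Application Support/Plex Media Server/Logs/Plex Media Server.log"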


Also, @Kris Moore, how hard would it be to modify the Plex app on TrueNAS to enable other configuration options, like host path volume mapping? Right now it's restricted to data, config, and transcode, but I have my directories mapped in from multiple places, so just having /data doesn't really work for me. I also noticed that GPU passthrough isn't available as an option in the custom "Launch Docker Image" form, meaning I can't really use that either.
 

Kris Moore

SVP of Engineering
Administrator
Moderator
iXsystems
Joined
Nov 12, 2015
Messages
1,471
Also, @Kris Moore, how hard would it be to modify the Plex app on TrueNAS to enable other configuration options, like host path volume mapping? Right now it's restricted to data, config, and transcode, but I have my directories mapped in from multiple places, so just having /data doesn't really work for me. I also noticed that GPU passthrough isn't available as an option in the custom "Launch Docker Image" form, meaning I can't really use that either.

Other host paths may be possible. Please put a ticket into jira.ixsystems.com so we can keep track of that request and get it on the dev team's radar.

As for the GPU options, we know those weren't showing up all the time. If you click "Edit" on the container after it's deployed, they should be at the bottom of the edit page.
 

Kris Moore

SVP of Engineering
Administrator
Moderator
iXsystems
Joined
Nov 12, 2015
Messages
1,471
its Inside the Plex container, not a VM. First time using docker AND using GPU passthrough at all so I’ll need some guidance to better debug the problem.
What command should I exec inside the container shell to verify if the driver are attached?
Cheers

Ah, since it's a container, that changes things. Try this:

# apt update
# apt install nvidia-cuda-dev nvidia-cuda-toolkit

Those packages are about 6 GB, so they will take a while. Once they're done, reboot.

After the reboot, click "Edit" on your Plex container. At the bottom, do you see some options for enabling the NVIDIA GPU now?
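
(A quick sanity check after the reboot, using generic Linux commands rather than anything SCALE-specific, to confirm the driver came back up on the host:)

Code:
lspci -k | grep -iA 3 nvidia    # should show "Kernel driver in use: nvidia"
nvidia-smi                      # should list your GTX 1070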
 

SimoneF

Explorer
Joined
Feb 9, 2019
Messages
59
Ah, since it's a container, that changes things. Try this:

# apt update
# apt install nvidia-cuda-dev nvidia-cuda-toolkit

Those packages are about 6 GB, so they will take a while. Once they're done, reboot.

After the reboot, click "Edit" on your Plex container. At the bottom, do you see some options for enabling the NVIDIA GPU now?

Done, but after the reboot k3s seems to be dead:

Code:
● k3s.service - Lightweight Kubernetes
     Loaded: loaded (/lib/systemd/system/k3s.service; enabled; vendor preset: disabled)
     Active: activating (auto-restart) (Result: exit-code) since Thu 2021-02-18 17:58:22 CET; 3s ago
       Docs: https://k3s.io
    Process: 21150 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
    Process: 21151 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
    Process: 21152 ExecStart=/usr/local/bin/k3s server --flannel-backend=none --disable=traefik,metrics-server,local-storage --disable-kube-proxy --disable-network-policy --disable-cloud-controller --node-name=ix>
   Main PID: 21152 (code=exited, status=255/EXCEPTION)
      Tasks: 0
     Memory: 816.0K
     CGroup: /system.slice/k3s.service

Feb 18 17:58:22 truenas.local systemd[1]: k3s.service: Main process exited, code=exited, status=255/EXCEPTION
Feb 18 17:58:22 truenas.local systemd[1]: k3s.service: Failed with result 'exit-code'.
Feb 18 17:58:22 truenas.local systemd[1]: k3s.service: Unit process 21540 (zfs) remains running after unit stopped.
Feb 18 17:58:22 truenas.local systemd[1]: Failed to start Lightweight Kubernetes.
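
(For reference, the status above is just systemctl output; journalctl gives more of the failure detail:)

Code:
systemctl status k3s                  # service state, as pasted above
journalctl -u k3s -n 50 --no-pager    # last 50 log lines from the failed start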

 

SimoneF

Explorer
Joined
Feb 9, 2019
Messages
59
In your Docker container, if you run
nvidia-smi

do you see your card come up? Also, can you run:
ls -la /dev/nvidia*

The last thing to check would be the Plex logs... In the console, when you try to transcode something, do you see errors like:
[Attached screenshot: Plex console log showing hardware transcoding errors]

Also, @Kris Moore, how hard would it be to modify the Plex app on TrueNAS to enable other configuration options, like host path volume mapping? Right now it's restricted to data, config, and transcode, but I have my directories mapped in from multiple places, so just having /data doesn't really work for me. I also noticed that GPU passthrough isn't available as an option in the custom "Launch Docker Image" form, meaning I can't really use that either.

EDIT: I deleted the ix-applications dataset and let SCALE recreate it. I only had one container, so it was easy.

I have installed nvidia-cuda-dev and nvidia-cuda-toolkit.

Inside the container, if I run nvidia-smi I get:

Code:
OCI runtime exec failed: exec failed: container_linux.go:370: starting container process caused: exec: "/bin/bash nvidia-smi": stat /bin/bash nvidia-smi: no such file or directory: unknown
command terminated with exit code 126
 


For ls -la /dev/nvidia*

Code:
OCI runtime exec failed: exec failed: container_linux.go:370: starting container process caused: exec: "/bin/bash ls -la /dev/nvidia*": stat /bin/bash ls -la /dev/nvidia*: no such file or directory: unknown
command terminated with exit code 126


Of course, if I run those commands on the host, everything is OK:

Code:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:02:00.0 Off |                  N/A |
|  0%   20C    P8     8W / 151W |      2MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+


Code:
truenas# ls -la /dev/nvidia*
crw-rw-rw- 1 root root 195, 254 Feb 18 18:28 /dev/nvidia-modeset
crw-rw-rw- 1 root root 239,   0 Feb 18 18:28 /dev/nvidia-uvm
crw-rw-rw- 1 root root 239,   1 Feb 18 18:28 /dev/nvidia-uvm-tools
crw-rw-rw- 1 root root 195,   0 Feb 18 18:24 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Feb 18 18:24 /dev/nvidiactl

/dev/nvidia-caps:
total 0
drwxr-xr-x  2 root root     80 Feb 18 18:24 .
drwxr-xr-x 21 root root   4580 Feb 18 18:28 ..
cr--------  1 root root 247, 1 Feb 18 18:24 nvidia-cap1
cr--r--r--  1 root root 247, 2 Feb 18 18:24 nvidia-cap2


The GPU is allocated to the container, of course.
 

Kris Moore

SVP of Engineering
Administrator
Moderator
iXsystems
Joined
Nov 12, 2015
Messages
1,471
Cool! So I think the issue is that the quoted commands are being run as a single string. Can you launch a shell on the container, change the command to '/bin/sh', and then manually run nvidia-smi at the resulting shell prompt?
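
If it's easier from the host side, something like this should get you the same shell (the pod and namespace names below are placeholders; use whatever your system actually shows):

Code:
# find the Plex pod; the namespace is generated per app release
k3s kubectl get pods --all-namespaces | grep -i plex
# open an interactive shell inside it, then run nvidia-smi at the prompt
k3s kubectl exec -it <plex-pod-name> -n <namespace> -- /bin/sh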
 

waqarahmed

iXsystems
iXsystems
Joined
Aug 28, 2019
Messages
136
@SimoneF, about the k3s issue: it would be nice if we could get a debug of the system. If you run into it again, can you please create an issue at jira.ixsystems.com and attach a debug of the system?
Also, I haven't tried it, but I think you need to set another environment variable, "NVIDIA_DRIVER_CAPABILITIES", to "all", and then it should work for the Plex application. We have an issue open for this: https://jira.ixsystems.com/browse/NAS-109192.
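
For context, that variable tells the NVIDIA container runtime which driver capabilities (compute, video, utility, ...) to expose inside the container. Roughly the docker-run equivalent of what the app form sets, purely as an illustration since the SCALE UI handles this for you:

Code:
# illustrative only: on SCALE, set this under the app's "Environment Variables"
docker run -d --gpus all \
  -e NVIDIA_DRIVER_CAPABILITIES=all \
  plexinc/pms-docker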
 

SimoneF

Explorer
Joined
Feb 9, 2019
Messages
59
@SimoneF, about the k3s issue: it would be nice if we could get a debug of the system. If you run into it again, can you please create an issue at jira.ixsystems.com and attach a debug of the system?
Also, I haven't tried it, but I think you need to set another environment variable, "NVIDIA_DRIVER_CAPABILITIES", to "all", and then it should work for the Plex application. We have an issue open for this: https://jira.ixsystems.com/browse/NAS-109192.
This was the missing piece!
Everything is working now. Awesome, thanks!
 

waqarahmed

iXsystems
iXsystems
Joined
Aug 28, 2019
Messages
136
@dalnew, the API supports adding a GPU to custom Docker container creation; it just wasn't being exposed in the UI. There is a PR for that UI change as well, and we should have it in the nightlies soon.
 

G8One2

Patron
Joined
Jan 2, 2017
Messages
248
I could never understand the hype or appeal of GPU hardware transcoding. I have no GPU, and my system transcodes just fine without it.
 

SimoneF

Explorer
Joined
Feb 9, 2019
Messages
59
I could never understand the hype or appeal of GPU hardware transcoding. I have no GPU, and my system transcodes just fine without it.

Power consumption and performance.

Try transcoding multiple streams of x265 4K 10-bit ~50 GB films at the same time with HDR tone mapping, and then you'll get why the "hype" around GPU hardware transcoding exists.
 

G8One2

Patron
Joined
Jan 2, 2017
Messages
248
Power consumption and performance.

Try transcoding multiple streams of x265 4K 10-bit ~50 GB films at the same time with HDR tone mapping, and then you'll get why the "hype" around GPU hardware transcoding exists.

Interesting... My system does that just fine with multiple x265 10-bit 4K streams, transcoding them down to 1080p or 720p depending on user settings. Why not just keep a 1080p version if you're going to transcode the 4K content anyway?
 

SimoneF

Explorer
Joined
Feb 9, 2019
Messages
59
Interesting... My system does that just fine with multiple x265 10-bit 4K streams, transcoding them down to 1080p or 720p depending on user settings. Why not just keep a 1080p version if you're going to transcode the 4K content anyway?
Because I'm not the only user.
With HDR tone mapping, two streams nearly saturate a 14-core E5-2697 v3. What happens if one of the other VMs needs CPU? The streams start to lag, and all the other VMs see a significant drop in performance.
If all the transcoding load is on the GPU, then the other VMs aren't affected performance-wise.
 

G8One2

Patron
Joined
Jan 2, 2017
Messages
248
Because I'm not the only user.
With HDR tone mapping, two streams nearly saturate a 14-core E5-2697 v3. What happens if one of the other VMs needs CPU? The streams start to lag, and all the other VMs see a significant drop in performance.
If all the transcoding load is on the GPU, then the other VMs aren't affected performance-wise.

I guess that makes sense. I don't run VMs; it's strictly a media server.
 

oumpa31

Patron
Joined
Apr 7, 2015
Messages
253
OK, how do I tell if the transcoding is being done by the GPU? I installed the Plex app, went to the shell, and ran apt update, apt install nvidia-cuda-dev nvidia-cuda-toolkit, and apt upgrade. Everything installed properly.
Then I stopped the app, added the environment variable NAME: NVIDIA_DRIVER_CAPABILITIES, Value: all, and set the GPU resource (nvidia.com/gpu) to allocate 1 nvidia.com/gpu.
My P4000 shows up when I run nvidia-smi:
Code:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P4000        Off  | 00000000:01:00.0 Off |                  N/A |
| 46%   32C    P8     9W / 105W |      0MiB /  8117MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

and when I run ls -la /dev/nvidia* I get:

Code:
crw-rw-rw- 1 root root 195, 254 Mar 11 21:23 /dev/nvidia-modeset
crw-rw-rw- 1 root root 195,   0 Mar 11 21:23 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Mar 11 21:23 /dev/nvidiactl
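
From what I've read, the simplest check is to start a stream that forces a transcode and then watch the nvidia-smi process table; the Plex dashboard should also mark the stream with "(hw)":

Code:
# watch GPU activity while a transcode is playing
watch -n 1 nvidia-smi
# a ".../Plex Transcoder" entry in the Processes table with nonzero
# memory usage means the GPU is doing the work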
 