Nvidia Tesla P4 + Scale 21.06 K3s not recognizing GPU

dasaint

Cadet
Joined
Jan 26, 2021
Messages
8
Hello,

I hope someone can help me out here. Maybe this is my own stupidity, but I wanted to get clarification on this.

I have an X9SRI-3F motherboard with an E5-2660 v2 CPU. The board has an onboard VGA GPU (Matrox Electronics MGA G200eW), which is set as the priority device in the BIOS. I have added the Tesla P4 to the system, and nvidia-smi picks up the card with no problem, but kubectl is not recognizing the GPU. What could I be missing here? Is it because I don't have 2x Tesla P4s? I saw it said I needed 2 GPUs, which I do have, just not similar ones. I picked the Tesla P4 because it has 2x NVENC encoders and, let's be honest, I got it for a heck of a good deal!

My end goal is to use the Tesla P4 as a transcoder for Plex containers. Is there some output I'm missing that might help troubleshoot this? Please forgive me, I'm kind of a n00b with the new SCALE OS, so any help would be greatly appreciated.

Output of nvidia-smi
Code:
truenas# nvidia-smi
Sat Jul 24 13:05:18 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.73.01    Driver Version: 460.73.01    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P4            Off  | 00000000:06:00.0 Off |                    0 |
| N/A   31C    P8     7W /  75W |      0MiB /  7611MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                              
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+


Output of k3s kubectl describe nodes
Code:
truenas# k3s kubectl describe nodes
Name:               ix-truenas
Roles:              control-plane,master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ix-truenas
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/control-plane=true
                    node-role.kubernetes.io/master=true
                    openebs.io/nodeid=ix-truenas
                    openebs.io/nodename=ix-truenas
Annotations:        csi.volume.kubernetes.io/nodeid: {"zfs.csi.openebs.io":"ix-truenas"}
                    k3s.io/node-args:
                      ["server","--flannel-backend","none","--disable","traefik,metrics-server,local-storage","--disable-kube-proxy","--disable-network-policy",...
                    k3s.io/node-config-hash: BIUIMAIT5RSP6VDWUFNWJISDQUIHGP2EW7PO65M4ARMY7EHXKKGA====
                    k3s.io/node-env: {"K3S_DATA_DIR":"/mnt/Test-Pool/ix-applications/k3s/data/1fda8eac79455ae721508123989e095a50c209cf7965df5630549292f7916941"}
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sat, 24 Jul 2021 11:53:53 -0700
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  ix-truenas
  AcquireTime:     <unset>
  RenewTime:       Sat, 24 Jul 2021 13:08:29 -0700
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Sat, 24 Jul 2021 13:08:09 -0700   Sat, 24 Jul 2021 11:53:51 -0700   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Sat, 24 Jul 2021 13:08:09 -0700   Sat, 24 Jul 2021 11:53:51 -0700   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Sat, 24 Jul 2021 13:08:09 -0700   Sat, 24 Jul 2021 11:53:51 -0700   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Sat, 24 Jul 2021 13:08:09 -0700   Sat, 24 Jul 2021 13:03:06 -0700   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  192.168.10.196
  Hostname:    ix-truenas
Capacity:
  cpu:                16
  ephemeral-storage:  43673984Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             65827544Ki
  pods:               110
Allocatable:
  cpu:                16
  ephemeral-storage:  42486051602
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             65827544Ki
  pods:               110
System Info:
  Machine ID:                 740a67cc6e0240219481b6a240ee3837
  System UUID:                00000000-0000-0000-0000-0cc47a496124
  Boot ID:                    4331e4d6-8526-4ed9-b268-6ca88e0db3ce
  Kernel Version:             5.10.42+truenas
  OS Image:                   Debian GNU/Linux 11 (bullseye)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://20.10.6
  Kubelet Version:            v1.21.0-k3s1
  Kube-Proxy Version:         v1.21.0-k3s1
PodCIDR:                      172.16.0.0/16
PodCIDRs:                     172.16.0.0/16
Non-terminated Pods:          (4 in total)
  Namespace                   Name                         CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                         ------------  ----------  ---------------  -------------  ---
  kube-system                 openebs-zfs-node-mdfp8       0 (0%)        0 (0%)      0 (0%)           0 (0%)         74m
  kube-system                 coredns-7448499f4d-x5bmn     100m (0%)     0 (0%)      70Mi (0%)        170Mi (0%)     74m
  kube-system                 openebs-zfs-controller-0     0 (0%)        0 (0%)      0 (0%)           0 (0%)         74m
  ix-testplex                 testplex-7dd95c96b8-7644j    0 (0%)        0 (0%)      0 (0%)           0 (0%)         42m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests   Limits
  --------           --------   ------
  cpu                100m (0%)  0 (0%)
  memory             70Mi (0%)  170Mi (0%)
  ephemeral-storage  0 (0%)     0 (0%)
  hugepages-1Gi      0 (0%)     0 (0%)
  hugepages-2Mi      0 (0%)     0 (0%)
Events:
  Type     Reason                   Age                From     Message
  ----     ------                   ----               ----     -------
  Normal   Starting                 74m                kubelet  Starting kubelet.
  Normal   NodeHasSufficientMemory  74m (x2 over 74m)  kubelet  Node ix-truenas status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    74m (x2 over 74m)  kubelet  Node ix-truenas status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     74m (x2 over 74m)  kubelet  Node ix-truenas status is now: NodeHasSufficientPID
  Normal   NodeAllocatableEnforced  74m                kubelet  Updated Node Allocatable limit across pods
  Normal   NodeReady                74m                kubelet  Node ix-truenas status is now: NodeReady
  Normal   Starting                 17m                kubelet  Starting kubelet.
  Normal   NodeHasSufficientMemory  17m                kubelet  Node ix-truenas status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    17m                kubelet  Node ix-truenas status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     17m                kubelet  Node ix-truenas status is now: NodeHasSufficientPID
  Normal   NodeAllocatableEnforced  17m                kubelet  Updated Node Allocatable limit across pods
  Normal   NodeNotReady             17m                kubelet  Node ix-truenas status is now: NodeNotReady
  Warning  Rebooted                 17m                kubelet  Node ix-truenas has been rebooted, boot id: c1f9007c-2dd8-497e-911a-446ea15d12b6
  Normal   NodeReady                17m                kubelet  Node ix-truenas status is now: NodeReady
  Normal   Starting                 5m44s              kubelet  Starting kubelet.
  Normal   NodeHasSufficientMemory  5m44s              kubelet  Node ix-truenas status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    5m44s              kubelet  Node ix-truenas status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     5m44s              kubelet  Node ix-truenas status is now: NodeHasSufficientPID
  Normal   NodeAllocatableEnforced  5m44s              kubelet  Updated Node Allocatable limit across pods
  Warning  Rebooted                 5m42s              kubelet  Node ix-truenas has been rebooted, boot id: 4331e4d6-8526-4ed9-b268-6ca88e0db3ce
  Normal   NodeNotReady             5m41s              kubelet  Node ix-truenas status is now: NodeNotReady
  Normal   NodeReady                5m31s              kubelet  Node ix-truenas status is now: NodeReady
truenas# 
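
One thing worth checking against the output above: the nvidia.com/gpu resource is normally advertised to the kubelet by the NVIDIA device plugin, and the "Non-terminated Pods" list shows no such pod. Below is a minimal sketch of that check. It assumes the plugin pod's name contains "nvidia-device-plugin" (the upstream default; the exact name on SCALE is an assumption), and `device_plugin_status` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read a pod listing on stdin and report whether any
# line mentions an NVIDIA device plugin pod.
# Assumption: the pod name contains "nvidia-device-plugin" (upstream default).
device_plugin_status() {
    if grep -q 'nvidia-device-plugin'; then
        echo "device plugin pod found"
    else
        echo "no device plugin pod listed"
    fi
}
```

It could be fed with something like `k3s kubectl get pods -n kube-system | device_plugin_status`. If no plugin pod is listed, the node has nothing to report the GPU to Kubernetes, even though nvidia-smi works on the host.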
 

dasaint

Cadet
Joined
Jan 26, 2021
Messages
8
So just an update: I got Docker working with the Tesla P4 and Plex transcodes. I still cannot get K3s working correctly with it, but I'm still working on it. I had a lot of streams rolling with the transcodes and it was pretty tight (see pics attached).

For those interested, the docker run command is:

sudo docker run \
--name plex \
--net=host \
-p 32400:32400 \
-e PUID=0 \
-e PGID=0 \
-e TZ="US/Central" \
-e VERSION=latest \
-e PLEX_CLAIM="claim-o4beDGGztYzKaBmkeQxH" \
-v /mnt/Test-Pool/Plex-Config:/config \
-v /mnt/Test-Pool/Transcode:/transcode \
-v /mnt/Test-Pool/Videos:/data \
-e NVIDIA_VISIBLE_DEVICES="all" \
-e NVIDIA_DRIVER_CAPABILITIES="compute,video,utility" \
--restart unless-stopped \
--gpus all \
--runtime=nvidia \
linuxserver/plex
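
To confirm that a container started this way actually sees the card, one option is to list the GPUs from inside it. This is a minimal sketch. It assumes the container is named plex as in the command above and that nvidia-smi is available inside it (the "utility" driver capability should provide it). `count_gpus` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: count the GPUs in `nvidia-smi -L` style output read
# from stdin. Each GPU appears as a line such as
# "GPU 0: Tesla P4 (UUID: GPU-...)".
count_gpus() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function's exit status clean.
    grep -c '^GPU [0-9][0-9]*:' || true
}
```

Usage would look like `sudo docker exec plex nvidia-smi -L | count_gpus`. A result of 0 would suggest the runtime did not pass the GPU through to the container.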

I also followed the instructions for upgrading the NVIDIA drivers and then did the following..
 

Attachments

  • docker2.jpg (123.8 KB)
  • docker.jpg (202.7 KB)

chesterx

Cadet
Joined
Dec 6, 2021
Messages
1
Any updates on using the official k3s Plex App? I don't know why my
Code:
k3s kubectl describe nodes
command keeps showing that there is 1 GPU under Capacity but 0 under Allocatable:

Code:
Capacity:
  cpu:                24
  ephemeral-storage:  64503936Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32797532Ki
  nvidia.com/gpu:     1
  pods:               110
Allocatable:
  cpu:                24
  ephemeral-storage:  62749428892
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32797532Ki
  nvidia.com/gpu:     0
  pods:               110
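
A capacity of 1 with an allocatable of 0 is easier to spot when the two values are pulled out side by side. Below is a minimal sketch that reads the describe output and prints both. It assumes a single-node setup (with several nodes, the last node's values win), and `gpu_resources` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `k3s kubectl describe nodes` output on stdin and
# print the nvidia.com/gpu value from the Capacity and Allocatable sections.
# A section without the key is reported as "missing".
gpu_resources() {
    awk '
        /^Capacity:/    { section = "capacity";    next }
        /^Allocatable:/ { section = "allocatable"; next }
        /^[^ ]/         { section = "" }   # any other top-level key ends the section
        section != "" && $1 == "nvidia.com/gpu:" { found[section] = $2 }
        END {
            cap   = ("capacity"    in found) ? found["capacity"]    : "missing"
            alloc = ("allocatable" in found) ? found["allocatable"] : "missing"
            printf "capacity=%s allocatable=%s\n", cap, alloc
        }
    '
}
```

Usage would look like `k3s kubectl describe nodes | gpu_resources`. For the output posted here it would print capacity=1 allocatable=0. For a node that does not advertise the GPU at all, both values would be reported as missing.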
 

ClassicGOD

Contributor
Joined
Jul 28, 2011
Messages
145
Any updates on using the official k3s Plex App?
Just a stab in the dark: did you go to Apps > Settings > Advanced and check 'Enable GPU Support'?
 

DozerD42

Cadet
Joined
Oct 30, 2022
Messages
4
Thank you for the updates!

Do you have the Tesla P4 in the PCIe 3.0 x16, x8, or the PCIe 2.0 x4 slot on the X9SRi-3F mobo?

The A2SDi-H-TF board I have has one PCIe 3.0 x4 slot, and I am wondering whether buying a Tesla P4 to use with TrueNAS SCALE is going to be worth it and work well for Plex transcoding.
 