No GPU available in Applications

zmalqp

Cadet
Joined
Nov 5, 2022
Messages
2
Hi,
I'm having an issue with GPU support in TrueNAS SCALE: none of the applications can see or use the GPU (RTX 3060).


Here is some information:

I'm running TrueNAS-SCALE-22.02.4.

The GPU is not isolated:
1667664581579.png


Application GPU menu:
1667664142814.png


This does not work:
1667664193602.png


Command "systemctl status systemd-modules-load.service"

Code:
Load Kernel Modules
     Loaded: loaded (/lib/systemd/system/systemd-modules-load.service; static)
     Active: active (exited) since Wed 2022-11-02 02:23:56 CET; 3 days ago
       Docs: man:systemd-modules-load.service(8)
             man:modules-load.d(5)
   Main PID: 3267 (code=exited, status=0/SUCCESS)
      Tasks: 0 (limit: 18439)
     Memory: 0B
        CPU: 0
     CGroup: /system.slice/systemd-modules-load.service

Warning: journal has been rotated since unit was started, output may be incomplete.


Command "nvidia-smi" (everything looks OK)

Code:
Sat Nov  5 19:04:34 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:B3:00.0 Off |                  N/A |
| 35%   36C    P0    40W / 170W |      0MiB / 12053MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+


Command "lspci | grep NVIDIA"

Code:
0000:02:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT 610] (rev a1)
0000:02:00.1 Audio device: NVIDIA Corporation GF119 HDMI Audio Controller (rev a1)
0000:b3:00.0 VGA compatible controller: NVIDIA Corporation Device 2504 (rev a1)
0000:b3:00.1 Audio device: NVIDIA Corporation Device 228e (rev a1)
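Comparing the two outputs above: lspci lists two NVIDIA VGA controllers (02:00.0 and b3:00.0), while nvidia-smi reports only the one at B3:00.0. A minimal sketch for pulling the VGA slots out of lspci text so they can be compared against nvidia-smi; the function name `nvidia_vga_slots` is just an illustrative helper, and the sample is pasted from the output above:

```shell
# Hypothetical helper: print the PCI slot of every NVIDIA VGA controller
# found in lspci output read from stdin (audio functions are skipped).
nvidia_vga_slots() {
  awk '/VGA compatible controller: NVIDIA/ { print $1 }'
}

# Sample copied from the lspci output in this post.
sample='0000:02:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT 610] (rev a1)
0000:02:00.1 Audio device: NVIDIA Corporation GF119 HDMI Audio Controller (rev a1)
0000:b3:00.0 VGA compatible controller: NVIDIA Corporation Device 2504 (rev a1)
0000:b3:00.1 Audio device: NVIDIA Corporation Device 228e (rev a1)'

slots=$(printf '%s\n' "$sample" | nvidia_vga_slots | paste -sd' ' -)
echo "$slots"
```

On the live system this would be run as `lspci | nvidia_vga_slots`.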


Command "k3s kubectl describe nodes" (note the "nvidia.com/gpu: 0" lines)
Code:
Name:               ix-truenas
Roles:              control-plane,master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    egress.k3s.io/cluster=true
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ix-truenas
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/control-plane=true
                    node-role.kubernetes.io/master=true
                    openebs.io/nodeid=ix-truenas
                    openebs.io/nodename=ix-truenas
Annotations:        csi.volume.kubernetes.io/nodeid: {"zfs.csi.openebs.io":"ix-truenas"}
                    k3s.io/node-args:
                      ["server","--cluster-cidr","172.16.0.0/16","--cluster-dns","172.17.0.10","--data-dir","/mnt/NCC-1647/ix-applications/k3s","--kube-apiserve...
                    k3s.io/node-config-hash: 4WLIOEEHXIUWUFC3B5ZWAWEEBA4CTYN7DE5CWRAURRH3F2SBQIXA====
                    k3s.io/node-env: {"K3S_DATA_DIR":"/mnt/NCC-1647/ix-applications/k3s/data/c7ce83f20ceffcbc9abbeb57698113658d664374455f25106875b3393927f7ce"}
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 05 Oct 2022 15:15:21 +0200
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  ix-truenas
  AcquireTime:     <unset>
  RenewTime:       Sat, 05 Nov 2022 19:07:10 +0100
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Sat, 05 Nov 2022 19:04:06 +0100   Thu, 27 Oct 2022 13:15:25 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Sat, 05 Nov 2022 19:04:06 +0100   Thu, 27 Oct 2022 13:15:25 +0200   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Sat, 05 Nov 2022 19:04:06 +0100   Thu, 27 Oct 2022 13:15:25 +0200   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Sat, 05 Nov 2022 19:04:06 +0100   Wed, 02 Nov 2022 02:25:28 +0100   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  192.168.2.200
  Hostname:    ix-truenas
Capacity:
  cpu:                12
  ephemeral-storage:  420037504Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16069172Ki
  nvidia.com/gpu:     0
  pods:               250
Allocatable:
  cpu:                12
  ephemeral-storage:  408612483571
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             16069172Ki
  nvidia.com/gpu:     0
  pods:               250
System Info:
  Machine ID:                    9847b502f1aa4eafa45e91b5b05ed754
  System UUID:                   03d502e0-045e-05ad-0c06-6b0700080009
  Boot ID:                       c2a74b26-ed1e-4380-8a06-e262c7835569
  Kernel Version:                5.10.142+truenas
  OS Image:                      Debian GNU/Linux 11 (bullseye)
  Operating System:              linux
  Architecture:                  amd64
  Container Runtime Version:     docker://Unknown
  Kubelet Version:               v1.23.5+k3s-fbfa51e5-dirty
  Kube-Proxy Version:            v1.23.5+k3s-fbfa51e5-dirty
PodCIDR:                         172.16.0.0/16
PodCIDRs:                        172.16.0.0/16
Non-terminated Pods:             (34 in total)
  Namespace                      Name                                          CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                      ----                                          ------------  ----------  ---------------  -------------  ---
  kube-system                    coredns-d76bd69b-rpsgh                        100m (0%)     0 (0%)      70Mi (0%)        170Mi (1%)     3d16h
  ix-traefik                     traefik-778bbdbc64-c579h                      10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  kube-system                    nvidia-device-plugin-daemonset-4dv5n          0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d16h
  ix-k8s-gateway                 k8s-gateway-74b5579c44-tglkf                  10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  kube-system                    svclb-satisfactory-query-eab207c0-tp84g       0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d16h
  ix-traefik                     svclb-traefik-wf7vv                           0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d16h
  ix-gittea                      svclb-gittea-gitea-ssh-fkp69                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d16h
  ix-pihole                      svclb-pihole-dns-tcp-6xwkv                    0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d16h
  kube-system                    svclb-satisfactory-c62002de-7fhvt             0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d16h
  ix-k8s-gateway                 svclb-k8s-gateway-22gz7                       0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d16h
  kube-system                    svclb-satisfactory-beacon-7ea4217b-7kzhh      0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d16h
  ix-k8s-gateway                 k8s-gateway-74b5579c44-sp2t9                  10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  ix-home-assistant              svclb-home-assistant-4pp4t                    0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d16h
  ix-traefik                     svclb-traefik-tcp-m6gfz                       0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d16h
  ix-gittea                      gittea-memcached-77589bccdc-xd84m             10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  ix-esphome                     svclb-esphome-cvhp5                           0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d16h
  kube-system                    openebs-zfs-node-h5mz2                        0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d16h
  ix-pihole                      svclb-pihole-zqjx5                            0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d16h
  ix-pihole                      svclb-pihole-dns-42lfk                        0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d16h
  kube-system                    openebs-zfs-controller-0                      0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d16h
  ix-kavita                      kavita-659d87895c-wf6p9                       10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  ix-gittea                      gittea-postgresql-0                           10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  ix-pihole                      pihole-86bbffddc7-n9wnq                       10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  ix-esphome                     esphome-67797c88f9-9pc98                      10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  ix-nextcloud                   nextcloud-redis-0                             10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  ix-onlyoffice-document-server  onlyoffice-document-server-postgresql-0       10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  ix-onlyoffice-document-server  onlyoffice-document-server-redis-0            10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  ix-home-assistant              home-assistant-postgresql-0                   10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  ix-gittea                      gittea-gitea-86d47567c7-z5j6k                 10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  ix-nextcloud                   nextcloud-postgresql-0                        10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  ix-onlyoffice-document-server  onlyoffice-document-server-f958bff5f-j5x7x    10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  ix-home-assistant              home-assistant-7bf7dc5f59-hnl9d               10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  ix-nextcloud                   nextcloud-6f55649b69-ncn87                    10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      3d16h
  ix-jellyfin                    jellyfin-7596f64f8d-zxk2g                     10m (0%)      4 (33%)     50Mi (0%)        8Gi (52%)      2d1h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                280m (2%)   72 (600%)
  memory             970Mi (6%)  147626Mi (940%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-1Gi      0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)
  nvidia.com/gpu     0           0
Events:              <none>
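Since the node output above is long, here is a rough sketch for extracting just the GPU counts from it. The helper name `gpu_counts` is made up for illustration, and the sample is a trimmed copy of the Capacity/Allocatable sections above:

```shell
# Hypothetical helper: print the value of every "nvidia.com/gpu:" line
# (Capacity and Allocatable) in `kubectl describe nodes` text on stdin.
gpu_counts() {
  awk '$1 == "nvidia.com/gpu:" { print $2 }'
}

# Trimmed sample from the describe output in this post.
sample='Capacity:
  cpu:                12
  nvidia.com/gpu:     0
  pods:               250
Allocatable:
  cpu:                12
  nvidia.com/gpu:     0
  pods:               250'

result=$(printf '%s\n' "$sample" | gpu_counts | paste -sd' ' -)
echo "$result"
```

On the live system this would be `k3s kubectl describe nodes | gpu_counts`. A count of 0 means no GPU has been registered with the node; the logs of the device plugin pod listed above (`k3s kubectl -n kube-system logs nvidia-device-plugin-daemonset-4dv5n`) might show why, though I have not confirmed that here.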


Does anyone have any idea how I can fix this issue?

Thanks for the support :wink:
 

zmalqp

In the meantime, I've tried a few things:

Running
Code:
modprobe nvidia-current-uvm && /usr/bin/nvidia-modprobe -c0 -u


Updating to Bluefin (beta). This update also broke my Samba configuration, so I rolled back.

Changing the primary display GPU in the BIOS

But none of the above has solved the problem o_O
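To see whether the modprobe / nvidia-modprobe step above actually created the device files, a check along these lines could help. This is only a sketch: the helper name `missing_nvidia_nodes` is made up, and the node names (`nvidia0`, `nvidiactl`, `nvidia-uvm`) are the usual ones for a single-GPU setup, assumed rather than verified on this system:

```shell
# Hypothetical helper: list which of the usual NVIDIA device nodes are
# missing under the given directory (defaults to /dev).
missing_nvidia_nodes() {
  dev_dir=${1:-/dev}
  for node in nvidia0 nvidiactl nvidia-uvm; do
    [ -e "$dev_dir/$node" ] || echo "$node"
  done
}

# Prints nothing if all three nodes exist.
missing_nvidia_nodes /dev
```

If this prints nothing after running the commands, the nodes exist on the host and the problem is more likely on the Kubernetes side.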
 