zmalqp
Hi,
I'm having an issue with GPU support in TrueNAS SCALE: none of the applications can use or see the GPU (RTX 3060).
Here is some information:
I'm running TrueNAS-SCALE-22.02.4
The GPU is not isolated:
Application GPU menu:
This does not work
Command "systemctl status systemd-modules-load.service"
Code:
Load Kernel Modules
Loaded: loaded (/lib/systemd/system/systemd-modules-load.service; static)
Active: active (exited) since Wed 2022-11-02 02:23:56 CET; 3 days ago
Docs: man:systemd-modules-load.service(8)
man:modules-load.d(5)
Main PID: 3267 (code=exited, status=0/SUCCESS)
Tasks: 0 (limit: 18439)
Memory: 0B
CPU: 0
CGroup: /system.slice/systemd-modules-load.service
Warning: journal has been rotated since unit was started, output may be incomplete.
Command "nvidia-smi" (everything looks OK)
Code:
Sat Nov 5 19:04:34 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:B3:00.0 Off |                  N/A |
| 35%   36C    P0    40W / 170W |      0MiB / 12053MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Command "lspci | grep NVIDIA"
Code:
0000:02:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT 610] (rev a1)
0000:02:00.1 Audio device: NVIDIA Corporation GF119 HDMI Audio Controller (rev a1)
0000:b3:00.0 VGA compatible controller: NVIDIA Corporation Device 2504 (rev a1)
0000:b3:00.1 Audio device: NVIDIA Corporation Device 228e (rev a1)
Command "k3s kubectl describe nodes" (note the "nvidia.com/gpu: 0" lines)
Code:
Name: ix-truenas
Roles: control-plane,master
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
egress.k3s.io/cluster=true
kubernetes.io/arch=amd64
kubernetes.io/hostname=ix-truenas
kubernetes.io/os=linux
node-role.kubernetes.io/control-plane=true
node-role.kubernetes.io/master=true
openebs.io/nodeid=ix-truenas
openebs.io/nodename=ix-truenas
Annotations: csi.volume.kubernetes.io/nodeid: {"zfs.csi.openebs.io":"ix-truenas"}
k3s.io/node-args:
["server","--cluster-cidr","172.16.0.0/16","--cluster-dns","172.17.0.10","--data-dir","/mnt/NCC-1647/ix-applications/k3s","--kube-apiserve...
k3s.io/node-config-hash: 4WLIOEEHXIUWUFC3B5ZWAWEEBA4CTYN7DE5CWRAURRH3F2SBQIXA====
k3s.io/node-env: {"K3S_DATA_DIR":"/mnt/NCC-1647/ix-applications/k3s/data/c7ce83f20ceffcbc9abbeb57698113658d664374455f25106875b3393927f7ce"}
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Wed, 05 Oct 2022 15:15:21 +0200
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: ix-truenas
AcquireTime: <unset>
RenewTime: Sat, 05 Nov 2022 19:07:10 +0100
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Sat, 05 Nov 2022 19:04:06 +0100 Thu, 27 Oct 2022 13:15:25 +0200 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Sat, 05 Nov 2022 19:04:06 +0100 Thu, 27 Oct 2022 13:15:25 +0200 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Sat, 05 Nov 2022 19:04:06 +0100 Thu, 27 Oct 2022 13:15:25 +0200 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Sat, 05 Nov 2022 19:04:06 +0100 Wed, 02 Nov 2022 02:25:28 +0100 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 192.168.2.200
Hostname: ix-truenas
Capacity:
cpu: 12
ephemeral-storage: 420037504Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16069172Ki
nvidia.com/gpu: 0
pods: 250
Allocatable:
cpu: 12
ephemeral-storage: 408612483571
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16069172Ki
nvidia.com/gpu: 0
pods: 250
System Info:
Machine ID: 9847b502f1aa4eafa45e91b5b05ed754
System UUID: 03d502e0-045e-05ad-0c06-6b0700080009
Boot ID: c2a74b26-ed1e-4380-8a06-e262c7835569
Kernel Version: 5.10.142+truenas
OS Image: Debian GNU/Linux 11 (bullseye)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://Unknown
Kubelet Version: v1.23.5+k3s-fbfa51e5-dirty
Kube-Proxy Version: v1.23.5+k3s-fbfa51e5-dirty
PodCIDR: 172.16.0.0/16
PodCIDRs: 172.16.0.0/16
Non-terminated Pods: (34 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
kube-system coredns-d76bd69b-rpsgh 100m (0%) 0 (0%) 70Mi (0%) 170Mi (1%) 3d16h
ix-traefik traefik-778bbdbc64-c579h 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
kube-system nvidia-device-plugin-daemonset-4dv5n 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d16h
ix-k8s-gateway k8s-gateway-74b5579c44-tglkf 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
kube-system svclb-satisfactory-query-eab207c0-tp84g 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d16h
ix-traefik svclb-traefik-wf7vv 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d16h
ix-gittea svclb-gittea-gitea-ssh-fkp69 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d16h
ix-pihole svclb-pihole-dns-tcp-6xwkv 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d16h
kube-system svclb-satisfactory-c62002de-7fhvt 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d16h
ix-k8s-gateway svclb-k8s-gateway-22gz7 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d16h
kube-system svclb-satisfactory-beacon-7ea4217b-7kzhh 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d16h
ix-k8s-gateway k8s-gateway-74b5579c44-sp2t9 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
ix-home-assistant svclb-home-assistant-4pp4t 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d16h
ix-traefik svclb-traefik-tcp-m6gfz 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d16h
ix-gittea gittea-memcached-77589bccdc-xd84m 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
ix-esphome svclb-esphome-cvhp5 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d16h
kube-system openebs-zfs-node-h5mz2 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d16h
ix-pihole svclb-pihole-zqjx5 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d16h
ix-pihole svclb-pihole-dns-42lfk 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d16h
kube-system openebs-zfs-controller-0 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d16h
ix-kavita kavita-659d87895c-wf6p9 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
ix-gittea gittea-postgresql-0 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
ix-pihole pihole-86bbffddc7-n9wnq 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
ix-esphome esphome-67797c88f9-9pc98 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
ix-nextcloud nextcloud-redis-0 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
ix-onlyoffice-document-server onlyoffice-document-server-postgresql-0 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
ix-onlyoffice-document-server onlyoffice-document-server-redis-0 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
ix-home-assistant home-assistant-postgresql-0 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
ix-gittea gittea-gitea-86d47567c7-z5j6k 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
ix-nextcloud nextcloud-postgresql-0 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
ix-onlyoffice-document-server onlyoffice-document-server-f958bff5f-j5x7x 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
ix-home-assistant home-assistant-7bf7dc5f59-hnl9d 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
ix-nextcloud nextcloud-6f55649b69-ncn87 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 3d16h
ix-jellyfin jellyfin-7596f64f8d-zxk2g 10m (0%) 4 (33%) 50Mi (0%) 8Gi (52%) 2d1h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 280m (2%) 72 (600%)
memory 970Mi (6%) 147626Mi (940%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
nvidia.com/gpu 0 0
Events: <none>
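Since the describe output is long, here is a minimal sketch for pulling out only the GPU counts. The helper name gpu_counts is hypothetical (it is not part of TrueNAS or k3s); it only assumes the "Capacity:", "Allocatable:" and "System Info:" section labels seen in the output above.
```shell
#!/bin/sh
# Hypothetical helper: reads "kubectl describe nodes" text on stdin and
# prints the nvidia.com/gpu value from the Capacity and Allocatable sections.
gpu_counts() {
    awk '
        # Track which section of the node description we are in.
        $1 == "Capacity:"                 { section = "Capacity" }
        $1 == "Allocatable:"              { section = "Allocatable" }
        $1 == "System" && $2 == "Info:"   { section = "" }
        # Only lines of the form "nvidia.com/gpu: N" inside a tracked section.
        section != "" && $1 == "nvidia.com/gpu:" { print section "=" $2 }
    '
}

# Usage on the host (assumes k3s is on PATH, as in the command above):
#   k3s kubectl describe nodes | gpu_counts
```
For the output above this prints Capacity=0 and Allocatable=0, which is the problem in short: the node advertises no GPU to the applications even though nvidia-smi can see the card.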
Does anyone have any idea how I can fix this issue?
Thanks for the support