allan.tatter
Cadet
- Joined: Nov 10, 2023
- Messages: 3
I have a fresh install of TrueNAS SCALE on my old PC. I have the following hardware:
- CPU: Intel Core i5-8400 (with integrated graphics)
- Motherboard: GIGABYTE H370M DS3H
- Graphics card: AMD Radeon HD 7970/8970 OEM / R9 280X
However, when I try to deploy any custom app (any public image from Docker Hub), the state stays at "Deploying" and never reaches "Running". In the custom app's Details section, the following Related Kubernetes Event keeps recurring:
Code:
Allocate failed due to no healthy devices present; cannot allocate unhealthy devices amd.com/gpu, which is unexpected
With kubectl I see the following:
Code:
$ kubectl -n ix-whoami4 get pods
NAME READY STATUS RESTARTS AGE
whoami4-ix-chart-749c8ff779-mbrqp 0/1 UnexpectedAdmissionError 0 91s
whoami4-ix-chart-749c8ff779-hf824 0/1 UnexpectedAdmissionError 0 90s
whoami4-ix-chart-749c8ff779-htw88 0/1 UnexpectedAdmissionError 0 89s
whoami4-ix-chart-749c8ff779-lghjc 0/1 UnexpectedAdmissionError 0 89s
whoami4-ix-chart-749c8ff779-9zmz7 0/1 UnexpectedAdmissionError 0 87s
whoami4-ix-chart-749c8ff779-p6g2v 0/1 UnexpectedAdmissionError 0 85s
whoami4-ix-chart-749c8ff779-jnh9m 0/1 Pending 0 84s
Code:
$ kubectl -n ix-whoami4 describe pods
Name: whoami4-ix-chart-749c8ff779-mbrqp
Namespace: ix-whoami4
Priority: 0
Node: ix-truenas/
Start Time: Sun, 12 Nov 2023 19:41:04 +0200
Labels: app.kubernetes.io/instance=whoami4
app.kubernetes.io/name=ix-chart
pod-template-hash=749c8ff779
Annotations: rollme: dcArL
Status: Failed
Reason: UnexpectedAdmissionError
Message: Pod was rejected: Allocate failed due to no healthy devices present; cannot allocate unhealthy devices amd.com/gpu, which is unexpected
IP:
IPs: <none>
Controlled By: ReplicaSet/whoami4-ix-chart-749c8ff779
Containers:
ix-chart:
Image: traefik/whoami:latest
Port: 80/TCP
Host Port: 0/TCP
Limits:
amd.com/gpu: 0
gpu.intel.com/i915: 0
nvidia.com/gpu: 0
Requests:
amd.com/gpu: 0
gpu.intel.com/i915: 0
nvidia.com/gpu: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4xrh6 (ro)
Volumes:
kube-api-access-4xrh6:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 103s default-scheduler Successfully assigned ix-whoami4/whoami4-ix-chart-749c8ff779-mbrqp to ix-truenas
Warning UnexpectedAdmissionError 103s kubelet Allocate failed due to no healthy devices present; cannot allocate unhealthy devices amd.com/gpu, which is unexpected
The nginx-proxy-manager app from the Discover Apps section deployed successfully, probably because its pod has no `amd.com/gpu` entry under Limits and Requests.
Code:
$ kubectl -n ix-nginx-proxy-manager get pods
NAME READY STATUS RESTARTS AGE
nginx-proxy-manager-747c57ddf4-qnvfk 1/1 Running 0 119s
Code:
$ kubectl -n ix-nginx-proxy-manager describe pods nginx-proxy-manager-747c57ddf4-qnvfk
Name: nginx-proxy-manager-747c57ddf4-qnvfk
Namespace: ix-nginx-proxy-manager
Priority: 0
Node: ix-truenas/192.168.1.10
Start Time: Sun, 12 Nov 2023 20:32:25 +0200
Labels: app=nginx-proxy-manager-1.0.18
app.kubernetes.io/instance=nginx-proxy-manager
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=nginx-proxy-manager
app.kubernetes.io/version=2.10.4
helm-revision=1
helm.sh/chart=nginx-proxy-manager-1.0.18
pod-template-hash=747c57ddf4
pod.name=npm
release=nginx-proxy-manager
Annotations: k8s.v1.cni.cncf.io/network-status:
[{
"name": "ix-net",
"interface": "eth0",
"ips": [
"172.16.0.161"
],
"mac": "3e:a4:04:04:81:b6",
"default": true,
"dns": {},
"gateway": [
"172.16.0.1"
]
}]
rollme: uVBno
Status: Running
IP: 172.16.0.161
IPs:
IP: 172.16.0.161
Controlled By: ReplicaSet/nginx-proxy-manager-747c57ddf4
Containers:
nginx-proxy-manager:
Container ID: containerd://0528182aff475d4963bb237f5e6fb2708a9850bb5fb62a268533bb21f249d5e2
Image: jc21/nginx-proxy-manager:2.10.4
Image ID: docker.io/jc21/nginx-proxy-manager@sha256:e1000dd653d193ac70cb3635c27333b0183a11f987e2b1c6043589d9d948bc0f
Ports: 80/TCP, 443/TCP, 81/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
State: Running
Started: Sun, 12 Nov 2023 20:32:26 +0200
Ready: True
Restart Count: 0
Limits:
cpu: 4
memory: 8Gi
Requests:
cpu: 10m
memory: 50Mi
Liveness: exec [/bin/check-health] delay=10s timeout=5s period=10s #success=1 #failure=5
Readiness: exec [/bin/check-health] delay=10s timeout=5s period=10s #success=2 #failure=5
Startup: exec [/bin/check-health] delay=30s timeout=2s period=5s #success=1 #failure=120
Environment:
TZ: Europe/Tallinn
UMASK: 002
UMASK_SET: 002
NVIDIA_VISIBLE_DEVICES: void
PUID: 1000
USER_ID: 1000
UID: 1000
PGID: 1000
GROUP_ID: 1000
GID: 1000
DB_SQLITE_FILE: /data/database.sqlite
DISABLE_IPV6: true
Mounts:
/data from data (rw)
/etc/letsencrypt from certs (rw)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
certs:
Type: HostPath (bare host directory volume)
Path: /mnt/pool-2/ix-applications/releases/nginx-proxy-manager/volumes/ix_volumes/certs
HostPathType:
data:
Type: HostPath (bare host directory volume)
Path: /mnt/pool-2/ix-applications/releases/nginx-proxy-manager/volumes/ix_volumes/data
HostPathType:
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 86s default-scheduler Successfully assigned ix-nginx-proxy-manager/nginx-proxy-manager-747c57ddf4-qnvfk to ix-truenas
Normal AddedInterface 86s multus Add eth0 [172.16.0.161/16] from ix-net
Normal Pulled 86s kubelet Container image "jc21/nginx-proxy-manager:2.10.4" already present on machine
Normal Created 85s kubelet Created container nginx-proxy-manager
Normal Started 85s kubelet Started container nginx-proxy-manager
Warning Unhealthy 46s (x2 over 51s) kubelet Startup probe failed: NOT OK
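To compare the two pods without reading through the full `describe` output, the GPU resource lines can be filtered out on their own. This is only a minimal sketch, assuming a POSIX `awk` is available on the host; `gpu_resources` is a helper name made up for illustration and is not part of kubectl.

```shell
# Hypothetical helper: keep only the GPU extended-resource lines
# (amd.com/gpu, gpu.intel.com/i915, nvidia.com/gpu) from
# `kubectl describe pods` output read on stdin.
gpu_resources() {
  # $1 is the resource name with its trailing colon, $2 is the quantity.
  awk '$1 ~ /^(amd\.com\/gpu|gpu\.intel\.com\/i915|nvidia\.com\/gpu):$/ { print $1, $2 }'
}
```

It would be used as `kubectl -n ix-whoami4 describe pods | gpu_resources` and `kubectl -n ix-nginx-proxy-manager describe pods | gpu_resources`. A pod that carries none of these resource names produces no output at all.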
Additional debugging info:
Code:
$ lspci
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Tahiti XT [Radeon HD 7970/8970 OEM / R9 280X]
Code:
$ lsmod
gpu_sched 53248 1 amdgpu
drm_buddy 20480 2 amdgpu,i915
drm_display_helper 184320 3 amdgpu,radeon,i915
drm_ttm_helper 16384 2 amdgpu,radeon
i2c_algo_bit 16384 3 amdgpu,radeon,i915
video 65536 3 amdgpu,radeon,i915
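Since the `lsmod` output lists both `amdgpu` and `radeon`, it may also be worth checking which of the two is actually bound to the card. This is a rough sketch, assuming `lspci -k` prints a `Kernel driver in use:` line for the device; `driver_in_use` is a made-up helper name used only for illustration.

```shell
# Hypothetical helper: print the driver name from the
# "Kernel driver in use:" line of `lspci -k` output read on stdin.
driver_in_use() {
  # Split each line on ": " so that $2 is the driver name.
  awk -F': ' '/Kernel driver in use:/ { print $2 }'
}
```

It would be used as `lspci -k -s 01:00.0 | driver_in_use`, where `01:00.0` is the slot shown in the `lspci` output above.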
Do I need to do anything special for the external (discrete) GPU on a fresh install of TrueNAS SCALE, such as installing drivers? I have another machine with different hardware and only integrated graphics, and it has no such issues.