Today after a restart, none of my apps will start. k3s reports the node as Ready, but it still carries a `node.kubernetes.io/not-ready` taint, and the logs look like containers are trying to start but cannot be reached. I've browsed these forums for a few hours and couldn't find any relevant posts.
I have tried restarting multiple times, restoring from a recent config backup, and unsetting and re-setting the app pool, but nothing works, and I'm at a loss for what to try next. Sometimes it shows App Services as started, but then spams the logs with "failed to connect" during the container health checks. The behavior is not consistent. Here is what I'm seeing:
Code:
Failed to start kubernetes cluster for Applications: [EFAULT] Kube-router routes not applied as timed out waiting for pods to execute
Code:
root@truenas[~]# k3s kubectl get nodes
NAME STATUS ROLES AGE VERSION
ix-truenas Ready control-plane,master 406d v1.26.6+k3s-e18037a7-dirty
Code:
root@truenas[~]# k3s kubectl describe node ix-truenas
Name: ix-truenas
Roles: control-plane,master
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=ix-truenas
kubernetes.io/os=linux
node-role.kubernetes.io/control-plane=true
node-role.kubernetes.io/master=true
openebs.io/nodeid=ix-truenas
openebs.io/nodename=ix-truenas
Annotations: csi.volume.kubernetes.io/nodeid: {"zfs.csi.openebs.io":"ix-truenas"}
k3s.io/node-args:
["server","--cluster-cidr","172.16.0.0/16","--cluster-dns","172.17.0.10","--data-dir","/mnt/tank/ix-applications/k3s","--disable","metrics...
k3s.io/node-config-hash: 5VMKXMDJNBI2D5KDF52SDV37V4ZY2EIOJXZDRLUULQXDPJX5RA4Q====
k3s.io/node-env: {"K3S_DATA_DIR":"/mnt/tank/ix-applications/k3s/data/6c243f7cbf543e01911aa24f7651922820ca56e79179e8fd215a3e4381aceecf"}
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Tue, 29 Nov 2022 22:25:48 -0500
Taints: node.kubernetes.io/not-ready:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: ix-truenas
AcquireTime: <unset>
RenewTime: Wed, 10 Jan 2024 22:08:12 -0500
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Wed, 10 Jan 2024 22:08:13 -0500 Mon, 13 Nov 2023 22:16:40 -0500 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Wed, 10 Jan 2024 22:08:13 -0500 Mon, 13 Nov 2023 22:16:40 -0500 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Wed, 10 Jan 2024 22:08:13 -0500 Mon, 13 Nov 2023 22:16:40 -0500 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Wed, 10 Jan 2024 22:08:13 -0500 Wed, 10 Jan 2024 22:08:13 -0500 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 192.168.1.30
Hostname: ix-truenas
Capacity:
cpu: 12
ephemeral-storage: 27854154Mi
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 65761356Ki
nvidia.com/gpu: 0
pods: 250
Allocatable:
cpu: 12
ephemeral-storage: 27746837493708
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 65761356Ki
nvidia.com/gpu: 0
pods: 250
System Info:
Machine ID: 3226bac618d148519c61c31b083dc929
System UUID: af59a1a8-6f8d-0000-0000-000000000000
Boot ID: e4291094-7048-4ac4-8d8c-595cf703dcc2
Kernel Version: 6.1.63-production+truenas
OS Image: Debian GNU/Linux 12 (bookworm)
Operating System: linux
Architecture: amd64
Container Runtime Version: containerd://Unknown
Kubelet Version: v1.26.6+k3s-e18037a7-dirty
Kube-Proxy Version: v1.26.6+k3s-e18037a7-dirty
PodCIDR: 172.16.0.0/16
PodCIDRs: 172.16.0.0/16
Non-terminated Pods: (26 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
kube-system nvidia-device-plugin-daemonset-7skx6 0 (0%) 0 (0%) 0 (0%) 0 (0%) 34m
kube-system csi-nfs-controller-7b74694749-c2dwh 40m (0%) 0 (0%) 80Mi (0%) 900Mi (1%) 34m
kube-system openebs-zfs-node-74wn8 0 (0%) 0 (0%) 0 (0%) 0 (0%) 34m
cert-manager cert-manager-8444f6f86b-bxfww 0 (0%) 0 (0%) 0 (0%) 0 (0%) 34m
ix-cloudflared cloudflared-5d8bc8d5cd-cnjlg 10m (0%) 4 (33%) 50Mi (0%) 8Gi (12%) 34m
ix-requestrr requestrr-5b94d84495-7s9ql 10m (0%) 4 (33%) 50Mi (0%) 8Gi (12%) 34m
metallb-system speaker-cc9v8 0 (0%) 0 (0%) 0 (0%) 0 (0%) 34m
cnpg-system cnpg-controller-manager-5d74bc79fb-rtq5z 100m (0%) 100m (0%) 100Mi (0%) 200Mi (0%) 34m
kube-system openebs-zfs-controller-0 0 (0%) 0 (0%) 0 (0%) 0 (0%) 34m
prometheus-operator prometheus-operator-5dcffb7cb8-vvtdw 100m (0%) 200m (1%) 100Mi (0%) 200Mi (0%) 34m
ix-jackett jackett-bd7f48b58-zcc2q 20m (0%) 8 (66%) 100Mi (0%) 16Gi (25%) 34m
ix-radarr radarr-74588c7f96-nxd96 10m (0%) 4 (33%) 50Mi (0%) 8Gi (12%) 34m
kube-system coredns-59b4f5bbd5-9t7td 100m (0%) 0 (0%) 70Mi (0%) 170Mi (0%) 34m
cert-manager cert-manager-webhook-545bd5d7d8-zlcf7 0 (0%) 0 (0%) 0 (0%) 0 (0%) 34m
ix-qbittorrent qbittorrent-b9686749d-mds8f 20m (0%) 8 (66%) 100Mi (0%) 16Gi (25%) 34m
kube-system csi-nfs-node-xr5r8 30m (0%) 0 (0%) 60Mi (0%) 500Mi (0%) 17m
kube-system csi-smb-controller-7fbbb8fb6f-dvwxb 30m (0%) 2 (16%) 60Mi (0%) 600Mi (0%) 34m
ix-wyoming-piper wyoming-piper-custom-app-7fbbc78649-qbk45 10m (0%) 4 (33%) 50Mi (0%) 8Gi (12%) 34m
kube-system snapshot-controller-546868dfb4-fngtf 10m (0%) 0 (0%) 20Mi (0%) 300Mi (0%) 34m
ix-plex plex-897c9965b-8gz4b 10m (0%) 12 (100%) 50Mi (0%) 8Gi (12%) 34m
kube-system svclb-dizquetv-bb5710f6-56xls 0 (0%) 0 (0%) 0 (0%) 0 (0%) 34m
kube-system svclb-wyoming-whisper-custom-app-c1cb0c8d-v28lp 0 (0%) 0 (0%) 0 (0%) 0 (0%) 34m
cert-manager cert-manager-cainjector-ffb4747bb-hbgcn 0 (0%) 0 (0%) 0 (0%) 0 (0%) 34m
kube-system svclb-frigate-12-custom-app-3a50a40b-2b8bl 0 (0%) 0 (0%) 0 (0%) 0 (0%) 34m
kube-system svclb-plex-955ab32e-z57fq 0 (0%) 0 (0%) 0 (0%) 0 (0%) 34m
kube-system svclb-wyoming-piper-custom-app-6374b442-7xgdf 0 (0%) 0 (0%) 0 (0%) 0 (0%) 34m
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 500m (4%) 46300m (385%)
memory 940Mi (1%) 76598Mi (119%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
nvidia.com/gpu 0 0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal NodeNotReady 159m kubelet Node ix-truenas status is now: NodeNotReady
Normal Starting 159m kubelet Starting kubelet.
Warning InvalidDiskCapacity 159m kubelet invalid capacity 0 on image filesystem
Normal NodeAllocatableEnforced 159m kubelet Updated Node Allocatable limit across pods
Normal NodeHasSufficientMemory 159m kubelet Node ix-truenas status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 159m kubelet Node ix-truenas status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 159m kubelet Node ix-truenas status is now: NodeHasSufficientPID
Normal NodePasswordValidationComplete 159m k3s-supervisor Deferred node password secret validation complete
Normal RegisteredNode 159m node-controller Node ix-truenas event: Registered Node ix-truenas in Controller
Warning Rebooted 44m (x675 over 159m) kubelet Node ix-truenas has been rebooted, boot id: e4f4f164-f984-4dbe-9073-dfc8c6f74123
Normal NodeHasSufficientPID 34m kubelet Node ix-truenas status is now: NodeHasSufficientPID
Normal Starting 34m kubelet Starting kubelet.
Warning InvalidDiskCapacity 34m kubelet invalid capacity 0 on image filesystem
Normal NodeAllocatableEnforced 34m kubelet Updated Node Allocatable limit across pods
Normal NodeHasSufficientMemory 34m kubelet Node ix-truenas status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 34m kubelet Node ix-truenas status is now: NodeHasNoDiskPressure
Normal NodePasswordValidationComplete 34m k3s-supervisor Deferred node password secret validation complete
Normal RegisteredNode 34m node-controller Node ix-truenas event: Registered Node ix-truenas in Controller
Warning Rebooted 24m (x87 over 34m) kubelet Node ix-truenas has been rebooted, boot id: 0b8e8bd5-0216-4fe3-a25a-c9b6987c96ee
Normal NodeHasNoDiskPressure 17m kubelet Node ix-truenas status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 17m kubelet Node ix-truenas status is now: NodeHasSufficientPID
Normal Starting 17m kubelet Starting kubelet.
Warning InvalidDiskCapacity 17m kubelet invalid capacity 0 on image filesystem
Normal NodeAllocatableEnforced 17m kubelet Updated Node Allocatable limit across pods
Normal NodeHasSufficientMemory 17m kubelet Node ix-truenas status is now: NodeHasSufficientMemory
Normal NodeNotReady 17m kubelet Node ix-truenas status is now: NodeNotReady
Normal NodePasswordValidationComplete 17m k3s-supervisor Deferred node password secret validation complete
Normal RegisteredNode 17m node-controller Node ix-truenas event: Registered Node ix-truenas in Controller
Warning Rebooted 7m44s (x60 over 17m) kubelet Node ix-truenas has been rebooted, boot id: 42e259d9-909a-4479-87c9-d007ab5c42a2
Normal Starting 95s kubelet Starting kubelet.
Warning InvalidDiskCapacity 95s kubelet invalid capacity 0 on image filesystem
Normal NodeAllocatableEnforced 95s kubelet Updated Node Allocatable limit across pods
Normal NodeHasSufficientMemory 95s kubelet Node ix-truenas status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 95s kubelet Node ix-truenas status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 95s kubelet Node ix-truenas status is now: NodeHasSufficientPID
Normal NodeReady 94s kubelet Node ix-truenas status is now: NodeReady
Normal NodePasswordValidationComplete 91s k3s-supervisor Deferred node password secret validation complete
Normal RegisteredNode 85s node-controller Node ix-truenas event: Registered Node ix-truenas in Controller
Warning Rebooted 85s (x18 over 95s) kubelet Node ix-truenas has been rebooted, boot id: e4291094-7048-4ac4-8d8c-595cf703dcc2
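For anyone comparing against their own system, here is a rough sketch of how the lines I found most telling can be pulled out of the describe output above. The helper names `node_summary` and `boot_ids` are made up for illustration and are not part of k3s or TrueNAS. They only filter text from stdin and change nothing on the cluster, and `grep -o` assumes GNU or BSD grep.

Code:
```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of k3s or TrueNAS).
# Both read `kubectl describe node` output on stdin and only filter text.

# Print the Taints line and the Ready condition row, so a "Ready" node
# that still carries a not-ready taint is easy to spot.
node_summary() {
  grep -E '^[[:space:]]*(Taints:|Ready[[:space:]])'
}

# Print each distinct boot ID mentioned in the "Rebooted" events.
# More than one ID means events from several boots are being shown.
boot_ids() {
  grep -oE 'boot id: [0-9a-f-]+' | sort -u
}

# Usage (run on the TrueNAS host):
#   k3s kubectl describe node ix-truenas | node_summary
#   k3s kubectl describe node ix-truenas | boot_ids
```

On the output above, the first helper would show the `not-ready` taint next to `Ready True`, and the second would list four different boot IDs.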