K3S Failing to Start

disconsented

Cadet
Joined
Oct 30, 2023
Messages
1
Hi Folks,

I've been running into the following alert, which I believe is preventing me from using K3S:

Code:
Failed to start kubernetes cluster for Applications: [EFAULT] Kube-router routes not applied as timed out waiting for pods to execute


Trying to install any app results in the following:

Code:
chart.release.create
Error: [EFAULT] Failed to install chart release: Error: INSTALLATION FAILED: failed pre-install: timed out waiting for the condition


Reading through the journalctl logs, I see the following repeated ad nauseam (with varying IDs):

Code:
 
Oct 30 20:34:34 truenas k3s[805789]: E1030 20:34:34.482964  805789 authentication.go:63] "Unable to authenticate the request" err="[invalid bearer token, square/go-jose: error in cryptographic primitive]"
Oct 30 20:34:34 truenas k3s[805789]: 2023-10-30T20:34:34+13:00 [error] Multus: [kube-system/openebs-zfs-controller-0/]: error getting pod: Unauthorized
Oct 30 20:34:34 truenas k3s[805789]: 2023-10-30T20:34:34+13:00 [error] Multus: GetPod failed: Multus: [kube-system/openebs-zfs-controller-0/]: error getting pod: Unauthorized, but continue to delete
Oct 30 20:34:34 truenas k3s[805789]: 2023-10-30T20:34:34+13:00 [error] Multus: failed to get the cached delegates file: open /var/lib/cni/multus/22e0c4aae201f7b95e078515790b66938f9b4c2a387611a854ed1e25e7948860: no such file or directory, cannot properly delete


I'm not sure whether this is a single issue or a cluster of issues. Searching around online, I've seen it commonly suggested that the cause is either DNS or NTP. timedatectl reports what I'd expect it to, so the clock looks fine, unless some token has a lifetime of only seconds.
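
One way to test the token-lifetime idea is to decode a token's time claims and compare them with the local clock. The following is a minimal Python sketch, assuming you have a copy of the bearer token in question (inside a pod, Kubernetes normally mounts the service account token at /var/run/secrets/kubernetes.io/serviceaccount/token). The function names are made up for illustration. It only reads the `iat`, `nbf` and `exp` claims and does not verify the signature, so it cannot confirm or rule out a signing-key mismatch, which the "error in cryptographic primitive" text may instead be pointing at.

```python
import base64
import json
import time
from typing import List, Optional


def decode_jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload = parts[1]
    # JWTs use URL-safe base64 with the '=' padding stripped; restore it.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def check_token_times(claims: dict, now: Optional[float] = None) -> List[str]:
    """List time-related problems with the claims, relative to `now` (epoch seconds)."""
    now = time.time() if now is None else now
    problems = []
    if "nbf" in claims and now < claims["nbf"]:
        problems.append("token is not valid yet (local clock behind?)")
    if "iat" in claims and now < claims["iat"]:
        problems.append("token was issued in the future (local clock behind?)")
    if "exp" in claims and now >= claims["exp"]:
        problems.append("token has expired")
    return problems
```

An empty list from `check_token_times(decode_jwt_claims(token))` would suggest the clock and token lifetime are not the problem, which would make NTP a less likely culprit.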

Attached are the K3S logs from journalctl. I'm running TrueNAS-SCALE-22.12.4.2.
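
To see whether the log holds one repeated error or several distinct ones, the lines can be grouped after stripping the parts that change on every repeat. This is a rough Python sketch, assuming the log has the same shape as the excerpt above (journald prefix, optional klog header or ISO timestamp, long hex IDs). The regular expressions and function names are my own guesses based on those four lines, not anything K3S provides.

```python
import re
from collections import Counter

# journald prefix, e.g. "Oct 30 20:34:34 truenas k3s[805789]: "
JOURNAL_PREFIX = re.compile(r"^\w{3} +\d+ [\d:]{8} \S+ k3s\[\d+\]: ")
# klog header, e.g. "E1030 20:34:34.482964  805789 "
KLOG_HEADER = re.compile(r"^[IWEF]\d{4} [\d:.]+ +\d+ ")
# ISO timestamp as written by Multus, e.g. "2023-10-30T20:34:34+13:00 "
ISO_STAMP = re.compile(r"^\d{4}-\d\d-\d\dT[\d:]+[+-]\d\d:\d\d ")
# Long hex strings such as container or sandbox IDs
HEX_ID = re.compile(r"\b[0-9a-f]{12,}\b")


def normalise(line: str) -> str:
    """Strip timestamps and PIDs and mask hex IDs so repeats compare equal."""
    msg = JOURNAL_PREFIX.sub("", line.strip())
    msg = KLOG_HEADER.sub("", msg)
    msg = ISO_STAMP.sub("", msg)
    return HEX_ID.sub("<id>", msg)


def summarise(lines) -> Counter:
    """Count how often each normalised message occurs, skipping blank lines."""
    return Counter(normalise(line) for line in lines if line.strip())
```

For example, with the journal saved to a file (the name `k3s.log` is hypothetical), `summarise(open("k3s.log")).most_common(10)` lists the ten most frequent messages. A handful of entries would point to a small set of root causes rather than many unrelated failures.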

Any ideas on what to poke at next?
 

mesquitafmr

Cadet
Joined
Nov 25, 2023
Messages
3
Hey, I had the same problem and was able to fix it. I'm not a programmer, so I'll just repost what I wrote on Reddit (literal and empirical). I hope it works for you.

"(Running TrueNAS Scale Cobia 23.10.0.1)

I had a problem with the k3s cluster not working and couldn't find a solution on the web. The error was: "Failed to start kubernetes cluster for Applications: [EFAULT] Kube-router routes not applied as timed out waiting for pods to execute"

If you have any problem with k3s, first check whether it is a time (NTP server) or DNS problem, because I have had those before as well.

If neither of those fixes it, then before trying to reinstall the image, check your snapshots, look for a k3s one, and roll back to it. The apps service may work again.

The tricky part is what happens after. If it is an old snapshot (as mine was), it will roll back to the older set of installed apps and older app versions, but don't worry. The app folders are still there, so you just have to wait a bit for it to detect them, then delete the apps that you no longer use and install the ones that you have today. All the app data is in the app folders, so installing an app again redeploys the one you had. For apps that are at an older version in the snapshot, k3s will detect the current folders, update that info, and work again.

Last bit: when I rolled back and saw the older apps asking to update, I thought that I had fu***d it up and would have to reconfigure everything, so I tried to update them all. That didn't work, but it then detected the current app versions in the folders. If it doesn't detect the current apps after a while, trying an update may force detection and get it working."
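
For anyone who would rather find the k3s snapshots from a shell than from the UI, here is a small Python sketch that lists them newest first. It assumes the relevant snapshots have "k3s" somewhere in their name (on SCALE the apps data usually sits under the pool's ix-applications dataset, but check your own layout) and that the `zfs` command is available to the user running it. The function names are made up. It only lists snapshots; the rollback itself is left as a manual step because rolling back discards everything written after the snapshot.

```python
import subprocess
from datetime import datetime, timezone


def parse_snapshots(listing: str, needle: str = "k3s"):
    """Parse `zfs list -H -p -t snapshot -o name,creation` output.

    Returns (name, creation time) pairs, newest first, for every
    snapshot whose name contains `needle`.
    """
    found = []
    for row in listing.splitlines():
        if not row.strip():
            continue
        # -H gives tab-separated columns; -p gives creation as epoch seconds.
        name, created = row.split("\t")
        if needle in name:
            when = datetime.fromtimestamp(int(created), tz=timezone.utc)
            found.append((name, when))
    return sorted(found, key=lambda snap: snap[1], reverse=True)


def list_k3s_snapshots():
    """Run zfs (read-only) and return the matching snapshots."""
    result = subprocess.run(
        ["zfs", "list", "-H", "-p", "-t", "snapshot", "-o", "name,creation"],
        check=True, capture_output=True, text=True,
    )
    return parse_snapshots(result.stdout)
```

Calling `list_k3s_snapshots()` on the NAS returns the candidates with their creation times, which makes it easier to judge how far back a rollback would go before committing to it.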