App Service fails to look up bridge interface during boot after upgrade to TNS 23.10.1

Joined
Feb 6, 2024
Messages
2
I had the same problem. Originally, I thought this was because of the two IP addresses on the br0 interface, as I posted here: https://www.truenas.com/community/threads/kubernetes-service-is-not-running.107377/post-797886

Since, according to this post, the problem is caused by the k3s service starting before the bridge interface is up, I created an override for the k3s service in /etc/systemd/system/k3s.service.d/override.conf:

Code:
[Unit]
After=network-online.target


This was not sufficient, and in my case I actually had to specify the bridge interface (br0):

Code:
[Unit]
After=network.target
Requires=br0.device


I still had to do the Toggle GPU thing, because the interface does not reflect the actual status of the k3s service.

After the reboot, k3s is running without issues.
This solution is not working.

I added the directory `/etc/systemd/system/k3s.service.d`, which did not exist before.
I created the file `/etc/systemd/system/k3s.service.d/override.conf` with the following content:

```
[Unit]
After=network.target
Requires=br0.device
```

Then I ran the following commands:

```
systemctl daemon-reload
systemctl restart k3s
```

I rebooted and still got the same result.
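One possible reason the override above has no effect: in systemd, `Requires=` only adds a dependency and does not impose any start ordering, and network interfaces are typically exposed as device units named `sys-subsystem-net-devices-<iface>.device` rather than `<iface>.device`. The sketch below renders a drop-in that both pulls in and orders after that device unit. The helper names `net_device_unit` and `render_override` are hypothetical, the name construction assumes an interface name that needs no systemd escaping, and whether this helps on TrueNAS SCALE is untested; later posts in this thread suggest that ordering alone is not enough.

```shell
#!/bin/sh
# Hypothetical helpers: build a k3s drop-in that both pulls in and orders
# after the device unit of a network interface.

# Device unit name for an interface. Assumes the name needs no systemd
# escaping (true for names such as br0 or enp2s0, not for names with "-").
net_device_unit() {
    printf 'sys-subsystem-net-devices-%s.device\n' "$1"
}

# Print the drop-in content for the given interface to stdout.
# Wants= is used instead of Requires= so that a failing device unit
# does not also fail the k3s start job.
render_override() {
    unit=$(net_device_unit "$1")
    printf '[Unit]\nWants=%s\nAfter=network-online.target %s\n' "$unit" "$unit"
}
```

A possible use would be `render_override br0 > /etc/systemd/system/k3s.service.d/override.conf` followed by `systemctl daemon-reload`; `systemctl show -p After k3s` then shows whether the ordering was picked up.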
 
Joined
Feb 6, 2024
Messages
2
So it appears that, for now, running the Kubernetes distribution with a bridge network adapter is not a reliable option until the issue is resolved.

The bridge adapter is needed so that traffic can be routed between the TrueNAS host and VMs, and between the host and apps.
 

jovermier

Cadet
Joined
Feb 8, 2024
Messages
1
I was able to get my apps running by setting the Isolated GPU Devices. I selected the Intel Arc A380 and left the integrated AMD graphics device unselected. The A380 should be supported in kernel 6.1. Seems like there are still issues.
 

GyulaMasa

Dabbler
Joined
Aug 6, 2023
Messages
18
All this makes me more and more angry.
I installed 22.12.3.3 on my remote backup system and it worked really well.
So I decided to migrate my previously working Linux-based ZFS file server to TrueNAS.
Ever since, I have been running into one annoying, unsolved bug after another.
Of course, returning to my Linux-based solution is no longer possible, since I was silly enough to "upgrade" my array, so the Linux ZFS implementation cannot import it anymore.
 

fa2k

Dabbler
Joined
Jan 9, 2022
Messages
34
The bug seems to mention that the bridge interface is improperly added to "the database". A comment proposes a solution to remove the bridge interface and recreate it.
https://ixsystems.atlassian.net/browse/NAS-125932
As MainCranium asked, I'm also curious whether anyone was able to fix the issue by following those steps.

Regardless of whether that works, I'm worried that removing my main interface br0 will mess up a lot of things, like all VMs with configured NICs and other configuration. Is anyone here familiar enough with the database to suggest a way to manually remove br0 from the database? I know this can come with huge risks, so please reiterate that if anyone does post how to do it. For my part, I'm comfortable risking a rebuild of the NAS if there's a chance I can skip removing br0 and messing with all the network config.
 

GyulaMasa

Dabbler
Joined
Aug 6, 2023
Messages
18
For me, it looks like it works.
I fiddled with it in the evening and nothing helped.
But when I returned in the morning, all apps were running, seemingly normally.
Tomorrow or on Friday I will check whether some of the apps actually work or are just sitting there as decoration.
 

fa2k

Dabbler
Joined
Jan 9, 2022
Messages
34
I bit the bullet and tried to delete the bridge [edit to add: no pain at all - all my VMs etc kept the config after I added it back later]. It didn't solve the problem, I still get the error about starting Kubernetes service on boot.

Procedure tested and failed:
* Configure an alternative non-bridge interface
* Configure Kubernetes to use new interface
* Delete main interface br0
* Reboot
* Configure br0 again. It has two NICs and two IPv4 addresses in my setup.
* Configure Kubernetes to use br0 again
* Reboot
* Kubernetes fails to start
 

fa2k

Dabbler
Joined
Jan 9, 2022
Messages
34
Thanks for the report GyulaMasa - good to hear it seems to work. I have one more idea for me - I can probably live without a bridge, so I can try to change to a bare interface. But can't have more downtime now, will test it later.
 

phyco1991

Cadet
Joined
Feb 22, 2024
Messages
1
waterfox_2024-02-22_23-27-09.png
This still doesn't appear to have been resolved in 23.10.2 unfortunately :(

I tried the same steps as above prior to upgrading as well...
 

fa2k

Dabbler
Joined
Jan 9, 2022
Messages
34
I took some more downtime and removed the bridge (also updated). It did not fix the problem! I still get the error on reboot.
This bothers me so much because I need an app to remote in, unlock the encrypted volumes, and start other services in case there's a power outage. It seems I just have to switch to a VM or another host for that.
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
I finally tried updating to Cobia (.2), thinking all the key bugs were fixed, and then ran into this one. Because bridging does not work after the update, I need to delete the bridge: the suggested solution was to delete the bridge interface and then re-add it. I can't delete it, so I can't fix the issue, and otherwise the machine does not come up normally afterwards.

Trying to delete the bridge gives this:


I guess I can start with a clean config, but that is way more work than I wanted to do this weekend, so I'm reverting for now. But it still needs to be fixed.
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
I took some time and added a NIC so I could create a second bridge, which let me remove the original bridge. Then I recreated the original bridge, deleted the temporary bridge, and removed the NIC. In other words, I deleted and re-created the bridge. And sure enough, after a reboot it still fails. Recreating the bridge did not help in my case.

As a workaround, I added CLI commands to my TrueNAS startup script to change the GPU setting as noted earlier in this topic. So at least reboots work now.
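The post does not show the commands used. As a rough sketch of what such a startup script might contain, the helpers below toggle GPU support off and on through the middleware client. The use of `kubernetes.update` and the `configure_gpus` field name are assumptions that are not confirmed anywhere in this thread, and the function names and the `MIDCLT` variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical startup-script helpers; the "configure_gpus" field name and the
# use of kubernetes.update are assumptions, not taken from the thread.

# JSON payload for the GPU support setting (argument: true or false).
gpu_payload() {
    printf '{"configure_gpus": %s}\n' "$1"
}

# Turn GPU support off and back on so the middleware reconfigures k3s.
# MIDCLT can be overridden for dry runs, e.g. MIDCLT=echo.
toggle_gpu_support() {
    "${MIDCLT:-midclt}" call -job kubernetes.update "$(gpu_payload false)" &&
    "${MIDCLT:-midclt}" call -job kubernetes.update "$(gpu_payload true)"
}
```

Running `MIDCLT=echo toggle_gpu_support` prints the two calls without executing them, which is a cheap way to check the payloads before wiring the function into a startup script.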
 

solit

Cadet
Joined
Feb 25, 2024
Messages
4
Just a reminder to anyone struggling with this: the Apps page does not seem to automatically update the Kubernetes status or the status of deployed apps in the web UI (on my instance of TrueNAS-SCALE-23.10.2).
After fiddling with the bridge setting and (after reading the post above) deactivating GPU support, nothing seemed to have changed. But after forcing a web UI reload (CTRL + R), it showed that this config change had apparently solved my problem and all apps were deploying as desired. Hopefully this helps someone else as well :)
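Since the web UI can lag behind, it may help to ask systemd directly from a shell. The sketch below wraps `systemctl is-active`; the function name `service_state` and the `SYSTEMCTL` override are hypothetical, and it assumes the unit is called `k3s`, as in the override discussed earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: ask systemd for the state of the k3s unit instead of
# relying on the Apps page. SYSTEMCTL can be overridden for dry runs.
# Prints the state reported by systemd (for example "active" or "failed").
service_state() {
    "${SYSTEMCTL:-systemctl}" is-active "${1:-k3s}" || true
}
```

Calling `service_state` with no argument checks `k3s`; once the unit reports as active, `k3s kubectl get pods -A` lists the pods behind the deployed apps.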
 

abackhaus

Cadet
Joined
Feb 8, 2024
Messages
1
I'm having the same issue with TrueNAS-SCALE-23.10.2 without a bridge configured.
Every reboot ends with:

Failed to configure kubernetes cluster for Applications: Unable to lookup configured interfaces: enp2s0

Unsetting the pool and then adding it again temporarily solves the issue, as does enabling/disabling GPU support. But at the next reboot the issue is back again ...

This is really annoying, and the issue has existed for quite a while now.
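The error message suggests that the interface is not yet visible when the Kubernetes cluster is configured at boot, although that is an assumption and not something this thread confirms. A startup script applying one of the workarounds above could first wait for the interface to show up in sysfs. This is a minimal sketch; `wait_for_iface` is a hypothetical helper, and `SYSFS_NET` exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: wait until a network interface shows up in sysfs.
# SYSFS_NET is only overridable for testing; it defaults to /sys/class/net.

# wait_for_iface NAME [TIMEOUT_SECONDS]
# Returns 0 once the interface exists, 1 if the timeout expires first.
wait_for_iface() {
    iface=$1
    remaining=${2:-60}
    while [ ! -e "${SYSFS_NET:-/sys/class/net}/$iface" ]; do
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

For example, `wait_for_iface enp2s0 120 || echo "enp2s0 never appeared"` polls once per second for up to two minutes before giving up.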
 

GyulaMasa

Dabbler
Joined
Aug 6, 2023
Messages
18
Yes, now it works.
I added the additional config file for k3s that forces it to wait until br0 is up, but the apps still showed as unavailable after a restart.
Unticking and re-ticking the "support GPU" checkbox really helps to re-initialize k3s. But you need to wait for it. As I mentioned, it took some time to recover, once the whole night; I think it was showing as not available for an hour.
 

GyulaMasa

Dabbler
Joined
Aug 6, 2023
Messages
18
OK, just some statistics:
My currently installed Apps:
- Home Assistant (stuck in a "Deploying" state forever, homepage not reachable)
- Home Assistant configurator (same as above)
- NetbootXYZ (shows "Running", and the webpage is also available)
- Octopi (shows "Running", and the webpage is available)
- Plex (shows "Running", but the webpage times out)
- Webnut (shows "Running", and the webpage is also up)
- wg-easy (WireGuard frontend) (shows "Running", and the webpage is available)

Conclusion:
- This is a fresh install of TrueNAS SCALE on this new system.
- Home Assistant is not essential; I only installed it to play with it and to buy compatible devices later.
- Plex is not working for some reason, but I have a lot of other problems with it too...
- All other apps seem to work without a problem.
- (I have not been able to create a VPN for weeks now, but that must be my fault! And it does not belong in this topic.)

I followed all the steps from the k3s override comment quoted at the top of this thread (the `After=network.target` / `Requires=br0.device` override in `/etc/systemd/system/k3s.service.d/override.conf`, followed by the GPU toggle), though I am not sure whether it helped or whether I just got lucky.
 

bingo1105

Cadet
Joined
May 4, 2022
Messages
8
I couldn't help myself and tried the suggested fix, but deleting the bridge / rebooting / reconfiguring the bridge definitely doesn't resolve the problem. App service still fails at startup, unable to find bridge interface.
 

bingo1105

Cadet
Joined
May 4, 2022
Messages
8
Is there even an active bug report on this? The only bug report links in this thread are closed, and I can't find anything in Jira suggesting that this is an issue under active review. This seems like a fairly significant shortcoming given that bridge interfaces are a necessity in many environments.
 

PyCoder

Dabbler
Joined
Nov 5, 2019
Messages
30
But they closed it with a "hotfix" for 23.10.1.1, which doesn't work.

I'm on TrueNAS-SCALE-23.10.2 and the issue still exists, as others have confirmed here.
 