metallb broken?

Gnome

Explorer
Joined
Aug 18, 2011
Messages
87
Hi folks

I'm trying to get metallb working on my NAS.
[Attached screenshots: 1695649152829.png, 1695649191005.png]

Code:
root@nas:~# k3s kubectl get svc -A
NAMESPACE     NAME                          TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                        AGE
default       kubernetes                    ClusterIP      172.17.0.1      <none>        443/TCP                        22d
kube-system   kube-dns                      ClusterIP      172.17.0.10     <none>        53/UDP,53/TCP,9153/TCP         22d
kube-system   prometheus-operator-kubelet   ClusterIP      None            <none>        10250/TCP,10255/TCP,4194/TCP   21d
ix-metallb    metallb-controllermon         ClusterIP      None            <none>        7472/TCP                       40m
ix-metallb    metallb                       ClusterIP      172.17.30.125   <none>        443/TCP                        40m
ix-metallb    metallb-memberlist            ClusterIP      172.17.14.22    <none>        7946/TCP,7946/UDP              40m
ix-metallb    metallb-speakermon            ClusterIP      None            <none>        7473/TCP                       40m
ix-sonarr     sonarr                        LoadBalancer   172.17.1.148    10.0.0.231    8989/TCP                       39m


When I try to ping 10.0.0.231 from any computer on the network, this is what I see on my NAS:
Code:
root@nas:~# tcpdump -i enp3s0f1 arp
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp3s0f1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
15:41:07.400768 ARP, Request who-has 10.0.0.231 tell router.<domain-name>, length 46
15:41:08.407610 ARP, Request who-has 10.0.0.231 tell router.<domain-name>, length 46
15:41:09.425621 ARP, Request who-has 10.0.0.231 tell router.<domain-name>, length 46
...


And this is what I see on the pfSense (router):
Code:
# tcpdump -i ix0 arp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ix0, link-type EN10MB (Ethernet), capture size 262144 bytes
15:44:00.927762 ARP, Request who-has 10.0.0.231 tell router.<domain-name>, length 28
15:44:01.932942 ARP, Request who-has 10.0.0.231 tell router.<domain-name>, length 28



Effectively it seems like the L2 advertisement is not working. The logs from metallb aren't particularly useful, or perhaps I don't understand how to read them.
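
To make a capture like the ones above easier to read, the ARP lines can be tallied per IP. The following is only a sketch: `arp_summary` is a hypothetical helper name, and it assumes tcpdump's usual text format ("Request who-has <ip> tell ..." and "Reply <ip> is-at <mac>"). If requests keep arriving and replies stay at 0, nothing on the host is answering for that address.

```shell
# Count ARP requests and replies for one IP in tcpdump text read from stdin.
# Hypothetical helper; assumes tcpdump's default one-line ARP output.
arp_summary() {
  ip="$1"
  awk -v ip="$ip" '
    index($0, "Request who-has " ip " ") { req++ }   # someone asks for the IP
    index($0, "Reply " ip " is-at")      { rep++ }   # someone answers for the IP
    END { printf "requests=%d replies=%d\n", req, rep }
  '
}

# Example use on the NAS (interface name and IP taken from the post above):
#   tcpdump -l -n -c 20 -i enp3s0f1 arp | arp_summary 10.0.0.231
```

Here `-n` turns off name resolution, `-l` line-buffers the output, and `-c 20` stops the capture after 20 packets so the summary gets printed.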

I've seen a couple of regulars here talk about using metallb, so I'm hoping someone can point out some obvious mistake I've made.
 

Gnome

Quote:
there's 2 apps, metallb and metallb-config, to choose from. You need both.
See https://truecharts.org/charts/enterprise/metallb-config/setup-guide/ for a setup guide.

I did indeed follow that guide; the screenshots I posted are from the metallb-config package.

I'm at the point where I'd say metallb isn't working. BGP mode is out because TrueNAS is running kube-router in BGP mode, so port 179 is already in use and metallb cannot bind to it.
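
Whether something on the host really holds the BGP port can be checked from the list of listening sockets. This is a minimal sketch, not a verified diagnosis of this system: `listeners_on_port` is a hypothetical helper that reads `ss -H -ltn` output on stdin and assumes the fourth column is "Local Address:Port".

```shell
# Print the local addresses of listening TCP sockets on a given port.
# Reads `ss -H -ltn` output on stdin; column 4 is "Local Address:Port".
listeners_on_port() {
  port="$1"
  awk -v port="$port" '{
    n = split($4, parts, ":")          # the port is the last ":"-separated field
    if (parts[n] == port) print $4
  }'
}

# Example use on the NAS:
#   ss -H -ltn | listeners_on_port 179
```

An empty result would mean nothing is listening on that port. Running `ss -ltnp` as root additionally shows which process owns each socket.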

L2 advertisement mode doesn't appear to work either, because TrueNAS is not responding to ARP requests. Or at least no ARP replies are making it out of the network port, as the tcpdump captures above show.

As a bit of extra info, I've not only stopped and started everything multiple times, I've even started over completely. No matter what I did or in what order (from scratch, on an existing install, etc.), it simply will not respond to the ARP requests (see the who-has requests above).
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Did you disable the Integrated Load Balancer in Apps | Advanced Settings?
 

Gnome

I did. I'm fairly certain this output also confirms that it worked as expected:

Code:
root@nas:~# k3s kubectl get svc -A
NAMESPACE     NAME                          TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                        AGE
<snip>
ix-sonarr     sonarr                        LoadBalancer   172.17.1.148    10.0.0.231    8989/TCP                       39m


The external IP there matches the range in my configuration. I've got the full output of kubectl in my original post.
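
For longer service lists, the LoadBalancer rows can be pulled out of the `kubectl get svc -A` table with a small filter. This is a sketch only: `lb_services` is a hypothetical helper and assumes the default column order shown above (TYPE third, EXTERNAL-IP fifth). Note that an address in this column only shows that one was allocated to the service. Answering ARP for it in L2 mode is the job of the MetalLB speaker, so an assigned IP does not by itself prove the announcement works.

```shell
# List LoadBalancer services and their external IPs.
# Reads `kubectl get svc -A` table output on stdin (default columns).
lb_services() {
  awk '$3 == "LoadBalancer" { print $1 "/" $2, $5 }'
}

# Example use on the NAS:
#   k3s kubectl get svc -A | lb_services
```

A service that has not been given an address shows `<pending>` in the second field instead of an IP.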
 

sretalla

I have some distant memory of having to add aliases for the IPs (I don't use MetalLB, but tested it once a long time ago).

It's not in the docs, so I agree it should not be needed, but as you mentioned, it seems all the ports required to advertise are occupied by the host, so giving the host the addresses manually may get around that.
 

Gnome

I'm starting to think that MetalLB is not the way to go.
I didn't realise I could add an additional IP to my interface.

Using that piece of information, my thinking is:
  • Add 2 IPs to the network interface
  • Use IP1 exclusively for TrueNAS (bind GUI to IP1)
  • Use IP2 exclusively for Apps (bind Apps to IP2)
Then use a reverse proxy on IP2 and set up entries in my DNS Resolver pointing to that IP for the apps behind that reverse proxy.

Honestly, I can get away with having a separate IP for my Apps. The whole issue for me is having the NAS and Apps on a single IP, because then you need all kinds of unnatural hacks to make ports 80 and 443 available.

I categorically do not want my TrueNAS behind an App reverse proxy, because if that app is down, you suddenly need to use a non-standard port to reach the UI (a big no-no for me).

This has happened to me a couple of times over the last few days and it is really frustrating. I can't imagine any serious server admin runs a system like that.

Thanks for this info, let me try this out.
 

Gnome

I'm concerned about this issue and don't know what the outcome will be.
I got it working, but I've opted not to use MetalLB.
The whole L2 mechanism is just very icky if you care about security and visibility into what's happening on your network.
The L3/BGP mechanism doesn't work because Kubernetes is already running its own BGP on that port (179) and TrueNAS doesn't provide a way to turn that off.

---

I used the mechanism I suggested above: I'm running my UI on 192.168.0.250 and my "Apps" on 192.168.0.240.
I'm using "Traefik" as the reverse proxy.

The UI on 192.168.0.250 uses ports 80 and 443.
Traefik on 192.168.0.240 also uses ports 80 and 443.
I'm using ClusterIP (Do not expose ports) for everything except Unifi (required for it to work) and qBittorrent (TCP/UDP port for torrents).

This allows me to avoid the absolutely trash advice of running the UI on a non-standard port.
It's trash advice because if your Apps stop working, your UI is no longer accessible except through non-standard ports.

As an aside, this required that I first bind the UI to its IP and only then install Traefik.
Done the other way around, the UI complains about Traefik running on port 80/443 regardless of whether it is bound to another IP ¯\_(ツ)_/¯
 

Flachzange

Dabbler
Joined
Jul 10, 2022
Messages
16
I can confirm the issue with metallb. On Bluefin it helped to activate and then deactivate the internal load balancer. Yesterday I had to update to Cobia and nothing worked again. It drove me nuts. Surprisingly, ACTIVATING the internal load balancer now works for me. After a restart I also had to restart the metallb app. This can easily be observed by running a continuous ping against one of the IP addresses.

My issue is: I have containers that need to be accessible on their native ports, and I believe they do not work through Traefik. At least it would be similarly painful.
 

sretalla

The Metallb app instructions pretty clearly tell you that you can't use the built-in load balancer at the same time and must disable it...
 

Flachzange

Maybe I was not clear enough on this, sorry. I know that the internal load balancer must be deactivated, but as described above, after the update to Cobia it only works (as before and as expected) when the internal load balancer is activated. Deactivating the internal load balancer results in the situation described by @Gnome.
 

sretalla

OK, good feedback to give to the folks over at TrueCharts then... unfortunately they don't look in the forums here.
 

Flachzange

I know, and I will follow up with them on Discord. I was just happy to get it sorted for now.
 