Pods cannot connect after enabling SR-IOV on TrueNAS SCALE.

accmo (Cadet, joined Apr 28, 2023):
I installed TrueNAS SCALE as a virtual machine on Proxmox VE 8.0.
Network card: Intel X520 (82599ES)
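For context, creating the VFs on the Proxmox host looks roughly like this (a sketch; the PF interface name ens1f0 and the VF count are assumptions, not my exact commands):

root@pve:~# echo 4 > /sys/class/net/ens1f0/device/sriov_numvfs   # create 4 VFs on the X520 PF
root@pve:~# lspci -nn | grep "Virtual Function"                  # confirm the 82599 VFs appeared

One VF is then passed through to the TrueNAS VM with PCI passthrough (a hostpci entry in the VM config).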

root@truenas[~]# lspci -nn | grep Eth
01:00.0 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
root@truenas[~]# ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 172.17.0.1 netmask 255.255.0.0 broadcast 172.17.255.255
ether 02:42:6f:83:37:9a txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

enp1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.4.85 netmask 255.255.255.0 broadcast 192.168.4.255
inet6 2408:8207:263b:b4d1:7884:9dff:fe96:5e61 prefixlen 64 scopeid 0x0<global>
inet6 fe80::7884:9dff:fe96:5e61 prefixlen 64 scopeid 0x20<link>
ether 7a:84:9d:96:5e:61 txqueuelen 1000 (Ethernet)
RX packets 389703 bytes 459705201 (438.4 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 130060 bytes 15352257 (14.6 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

kube-bridge: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.16.0.1 netmask 255.255.0.0 broadcast 172.16.255.255
inet6 fe80::348f:3aff:fefa:5d40 prefixlen 64 scopeid 0x20<link>
ether 6e:1c:85:f5:c4:1f txqueuelen 1000 (Ethernet)
RX packets 73075 bytes 8438713 (8.0 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 59881 bytes 11712820 (11.1 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

kube-dummy-if: flags=195<UP,BROADCAST,RUNNING,NOARP> mtu 1500
inet 172.17.0.1 netmask 255.255.255.255 broadcast 0.0.0.0
inet6 fe80::74ba:eeff:fe84:f2f8 prefixlen 64 scopeid 0x20<link>
ether de:41:9d:48:2b:c7 txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 8 bytes 560 (560.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 48852 bytes 39388684 (37.5 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 48852 bytes 39388684 (37.5 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth31dcc515: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 fe80::a87b:6aff:fe40:9e9a prefixlen 64 scopeid 0x20<link>
ether ba:01:ad:04:95:f2 txqueuelen 0 (Ethernet)
RX packets 12425 bytes 2599891 (2.4 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 11631 bytes 2245952 (2.1 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth3fe6b465: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 fe80::787c:7fff:fe98:6809 prefixlen 64 scopeid 0x20<link>
ether 7a:7c:7f:98:68:09 txqueuelen 0 (Ethernet)
RX packets 3484 bytes 686976 (670.8 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 4053 bytes 1492480 (1.4 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth6b4cdf27: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 fe80::242a:7ff:fe7b:fb36 prefixlen 64 scopeid 0x20<link>
ether 26:2a:07:7b:fb:36 txqueuelen 0 (Ethernet)
RX packets 534 bytes 43230 (42.2 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 606 bytes 224072 (218.8 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth71a6ed38: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 fe80::544a:c0ff:fef8:f2f3 prefixlen 64 scopeid 0x20<link>
ether 56:4a:c0:f8:f2:f3 txqueuelen 0 (Ethernet)
RX packets 1 bytes 42 (42.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 93 bytes 11672 (11.3 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth8f81ad78: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 fe80::309c:eff:fe78:7f34 prefixlen 64 scopeid 0x20<link>
ether 72:54:ae:fa:2d:68 txqueuelen 0 (Ethernet)
RX packets 128 bytes 11446 (11.1 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 219 bytes 678086 (662.1 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

vethba2ef0da: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 fe80::b6:c9ff:fe2f:b547 prefixlen 64 scopeid 0x20<link>
ether 16:d5:00:80:9b:61 txqueuelen 0 (Ethernet)
RX packets 1102 bytes 154887 (151.2 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1162 bytes 499392 (487.6 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

vethc73e8958: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 fe80::3c43:eff:fe97:aac2 prefixlen 64 scopeid 0x20<link>
ether 36:49:f2:e5:4d:a6 txqueuelen 0 (Ethernet)
RX packets 1953 bytes 186693 (182.3 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 2067 bytes 203033 (198.2 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

vethf28b0b46: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 fe80::98af:e7ff:fe69:5e62 prefixlen 64 scopeid 0x20<link>
ether d2:e9:fa:41:89:f4 txqueuelen 0 (Ethernet)
RX packets 53453 bytes 5779320 (5.5 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 40717 bytes 6441461 (6.1 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

I enabled SR-IOV for the network card in Proxmox VE, and in the TrueNAS command line I can see that the NIC in use is a VF. But none of my Docker pods can connect to the management computer. Unless I turn SR-IOV off, the pods cannot communicate with the router or any other device.
Pods can ping each other, but cannot ping TrueNAS itself.
192.168.4.100-192.168.4.110 are the pod IPs; 192.168.4.85 is the TrueNAS IP.
A brief summary of the problem might be: can the SR-IOV VF not perform Layer 3 switching?
root@qbittorrent-ix-chart-7dc4895786-s4ghf:/# ping 192.168.4.85
PING 192.168.4.85 (192.168.4.85): 56 data bytes
^C
--- 192.168.4.85 ping statistics ---
7 packets transmitted, 0 packets received, 100% packet loss
root@qbittorrent-ix-chart-7dc4895786-s4ghf:/# ifconfig
eth0 Link encap:Ethernet HWaddr 4E:E9:BB:E7:78:C1
inet addr:172.16.4.223 Bcast:172.16.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:124424 errors:0 dropped:0 overruns:0 frame:0
TX packets:156930 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:18856471 (17.9 MiB) TX bytes:17659981 (16.8 MiB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:8282 errors:0 dropped:0 overruns:0 frame:0
TX packets:8282 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:794064 (775.4 KiB) TX bytes:794064 (775.4 KiB)

net1 Link encap:Ethernet HWaddr E2:17:91:4E:06:94
inet addr:192.168.4.102 Bcast:192.168.4.255 Mask:255.255.255.0
inet6 addr: 2408:8207:263b:b4d1:e017:91ff:fe4e:694/64 Scope:Global
inet6 addr: 2408:8206:2633:2e30:1d7e:b413:8b51:1f0/64 Scope:Global
inet6 addr: fe80::e017:91ff:fe4e:694/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3134 errors:0 dropped:0 overruns:0 frame:0
TX packets:12412 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:460881 (450.0 KiB) TX bytes:544424 (531.6 KiB)

root@qbittorrent-ix-chart-7dc4895786-s4ghf:/# ping 192.168.4.107
PING 192.168.4.107 (192.168.4.107): 56 data bytes
64 bytes from 192.168.4.107: seq=0 ttl=64 time=0.069 ms
64 bytes from 192.168.4.107: seq=1 ttl=64 time=0.042 ms
64 bytes from 192.168.4.107: seq=2 ttl=64 time=0.043 ms
^C
--- 192.168.4.107 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.042/0.051/0.069 ms
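One thing worth ruling out on the Proxmox host is the per-VF spoof checking and trust settings: the 82599's internal switch drops frames whose source MAC it does not recognize, which can affect pod traffic carrying its own MAC addresses. A sketch (the PF name ens1f0 and VF index 0 are assumptions):

root@pve:~# ip link show ens1f0                      # lists each VF with its MAC, spoof checking and trust state
root@pve:~# ip link set ens1f0 vf 0 spoofchk off     # stop dropping frames whose source MAC differs from the VF MAC
root@pve:~# ip link set ens1f0 vf 0 trust on         # allow the VF to enter promiscuous mode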
I've searched a lot, but I still can't solve this problem. Hope to get some help here, thanks.
 

accmo:
I also found that the pods can access the Internet. One of my pods automatically downloads videos; I checked its log and it is working normally and can reach Internet servers without problems.
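From inside a pod, the split looks like this (a sketch of the symptom; the addresses are examples):

root@pod:/# ping -c 3 223.5.5.5      # a public address replies: Internet access works
root@pod:/# ping -c 3 192.168.4.1    # the LAN router never replies
root@pod:/# ping -c 3 192.168.4.107  # another pod replies normally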
 

accmo:
I learned the cause of this problem: the SR-IOV VF's registers are accessed via direct memory access (DMA), and they cannot be read normally from inside a container, so no data exchange can take place there. The result is exactly what I saw: pods can reach the external network but cannot communicate with other devices on the intranet.
Intel anticipated this situation and built an SR-IOV adapter plugin (the SR-IOV network device plugin and CNI) for Kubernetes clusters. However, TrueNAS prohibits users from making any modifications to the kernel, so for now, users with SR-IOV network cards will not be able to use Docker pods on TrueNAS.
I have now switched to Debian 12 with a k3d + k3s + calico + multus-cni + sriov-cni stack, and so far my problem is solved. Goodbye, TrueNAS.
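For anyone going the same route: with Multus and the SR-IOV CNI installed, a pod gets a VF by referencing a NetworkAttachmentDefinition. A minimal sketch (the names, resource ID, and IP range here are placeholders, not my exact config):

cat <<'EOF' | kubectl apply -f -
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: sriov-lan
  annotations:
    # resource name advertised by the SR-IOV network device plugin
    k8s.v1.cni.cncf.io/resourceName: intel.com/intel_sriov_netdevice
spec:
  config: '{
    "cniVersion": "0.3.1",
    "type": "sriov",
    "ipam": {
      "type": "host-local",
      "subnet": "192.168.4.0/24",
      "rangeStart": "192.168.4.100",
      "rangeEnd": "192.168.4.110",
      "gateway": "192.168.4.1"
    }
  }'
EOF

A pod then requests a VF by adding the annotation k8s.v1.cni.cncf.io/networks: sriov-lan to its metadata; the device plugin picks a free VF and the sriov CNI moves it into the pod's network namespace.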
 