Strange connectivity issue

TNWO

Cadet
Joined
Feb 1, 2023
Messages
6
I have two TrueNAS Core servers (13U3.1) on my home network. The main one is physical while the secondary is a VM. I use OPNsense as my router and VPN server (OpenVPN).

The main LAN is 192.168.1.0/24 with the primary server on 192.168.1.21 and the secondary on .22. They both have multiple interfaces (Management, LAN, and a dedicated one for replication). The default GW is set on the management interface. The OpenVPN subnet is 10.111.250.0/24 and is routed through OPNsense.

Everything works just fine internally, but this week, while I was travelling, I noticed that the SMB shares on the main server were unavailable and the server could not be pinged remotely while connected through my VPN. On the same VPN connection I can ping and connect to the secondary server, as well as to other servers on the same LAN network. Once I got home, I was able to access those same shares from the LAN as expected.

I quadruple checked the FW rules, OpenVPN config and routing and everything is allowed. I enabled logging on the vpn rules and I can see traffic from the vpn client to the primary server IP on port 445 being passed, as well as icmp.

I never changed anything manually on the TrueNAS servers, only used the GUI. Is there anything on the server itself that would block connections not coming from the same subnet? Out of curiosity I checked hosts.deny and hosts.allow and they are empty, as expected.

Both TrueNAS servers have the same network configuration and routes, yet I can connect to the secondary, but not the primary for some reason.
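One thing worth reasoning about with this layout is which interface the primary server would use to send replies back to a VPN client. The sketch below is only an illustration under stated assumptions: it applies plain longest-prefix matching, assumes there are no static routes beyond the default route on the management interface, and takes the non-LAN prefixes from the ifconfig output posted later in this thread. The two destination addresses are hypothetical examples. It does not establish the cause of the problem, and it does not explain why the secondary server behaves differently.

```python
import ipaddress

# Interface addresses on the primary server. The LAN address is from this post;
# the other prefixes come from the ifconfig output posted later in the thread.
interfaces = {
    "LAN": ipaddress.ip_interface("192.168.1.21/24"),
    "MEDIA": ipaddress.ip_interface("192.168.177.52/24"),
    "MAN": ipaddress.ip_interface("172.16.10.17/27"),
    "RSYNC": ipaddress.ip_interface("10.111.111.2/30"),
}

# The post states the default gateway sits on the management interface.
# Assumption: no additional static routes exist.
DEFAULT_ROUTE_IF = "MAN"


def egress_interface(dst: str) -> str:
    """Pick the outgoing interface by longest-prefix match, else the default route."""
    addr = ipaddress.ip_address(dst)
    matches = [
        (iface.network.prefixlen, name)
        for name, iface in interfaces.items()
        if addr in iface.network
    ]
    if matches:
        return max(matches)[1]
    return DEFAULT_ROUTE_IF


# Hypothetical on-link LAN host: the reply stays on the LAN interface.
print(egress_interface("192.168.1.50"))
# Hypothetical VPN client: no connected network matches, so the default route is used.
print(egress_interface("10.111.250.2"))
```

If those assumptions hold, a request from a VPN client arrives on the LAN interface while the reply would leave through the management interface. Whether that asymmetry matters depends on the firewall and switch in between, so it is a lead to check rather than a diagnosis.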
 

GBillR

Contributor
Joined
Jun 12, 2016
Messages
189
Smells like a network configuration issue of some type... since the primary is seemingly not responding to pings when you connect via the VPN. Have you tried to remote in to a machine on the same subnet (SSH or RDP) while outside the home to see if the primary NAS shares are available, or if the primary responds to pings that way? Are the multiple interfaces per machine you refer to on different subnets? Multiple interfaces on the same subnet have the potential to cause issues, and I am pretty sure it is discouraged. Unfortunately, since the secondary machine is a VM, they are not "identical" in network setup, which makes its use as a troubleshooting tool not very helpful.

My advice would be to simplify the network setup as much as possible on the primary machine. Even keep everything on the same switch as well if possible. This could be a switch port setting causing this. Then, see if you can get a ping response across the VPN. Since other machines are already working on the same subnet across the tunnel... I would assume the tunnel setup is proper and that you've configured that routing correctly.
 

TNWO

Thanks for the reply. Yes, the primary server can be reached and pinged from other devices on my local network (even from different subnets) while I'm connected remotely. I RDP to my desktop through VPN (on the same subnet as TrueNAS btw) and verified that the same shares were accessible.

Yes, all interfaces are on different subnets. All are working fine and reachable within the network, even across subnets, when allowed by the FW rules. For troubleshooting purposes I allowed VPN clients full access and confirmed in the logs that OPNsense is passing the traffic with the primary server as destination.

I originally thought it could be a network mask mismatch somewhere, but I verified I can remotely ping 192.168.1.9 (windows server) and .22 (secondary) but I can't ping .21 (primary server). Whatever is causing this issue is very unusual.
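The mask check described above can be written down as a quick sanity test. This is a minimal sketch using only Python's standard ipaddress module. The LAN and VPN prefixes and the three host addresses are the ones given in this thread; the VPN client address 10.111.250.2 is a hypothetical example.

```python
import ipaddress

# LAN prefix and host addresses as given in the thread.
lan = ipaddress.ip_network("192.168.1.0/24")
hosts = {
    "windows server": "192.168.1.9",
    "primary TrueNAS": "192.168.1.21",
    "secondary TrueNAS": "192.168.1.22",
}

# VPN subnet from the thread; the client address is a hypothetical example.
vpn = ipaddress.ip_network("10.111.250.0/24")
vpn_client = ipaddress.ip_address("10.111.250.2")

# Each LAN host should fall inside the LAN prefix.
results = {name: ipaddress.ip_address(addr) in lan for name, addr in hosts.items()}
for name, ok in results.items():
    print(f"{name:18} in {lan}: {ok}")

# The VPN subnet must not overlap the LAN, otherwise traffic would not be routed.
print(f"VPN client in {vpn}: {vpn_client in vpn}")
print(f"VPN overlaps LAN: {vpn.overlaps(lan)}")
```

All three hosts sit in the same /24 and the VPN subnet does not overlap it, which is consistent with the observation that a simple mask mismatch does not explain why only .21 is unreachable.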
 

GBillR

TNWO said: "I originally thought it could be a network mask mismatch somewhere, but I verified I can remotely ping 192.168.1.9 (windows server) and .22 (secondary) but I can't ping .21 (primary server). Whatever is causing this issue is very unusual."
I agree... that is very strange. Is your VPN client a Windows PC? Can you run tracert from that machine to verify the routing? Perhaps that might provide some additional insight. Is it a DNS issue? I assume you are trying to ping the IP address and not the machine name... I have run across issues with Windows credentialing that necessitated editing the hosts file on the client before, but you are unable to even ping the NAS... so it is obviously bigger than that. How about the firewall on the VPN client? Unlikely... since you can see shares on the other NAS and ping that one.

Is this a new problem that just cropped up for you, or is this a new setup?

You seem to have a good grasp of the networking here... I also access TN shares across an OpenVPN tunnel at times, but I do not use my OPNsense to establish the tunnel. Rather, I use a dedicated VPN server on the remote side to handle the connection to the remote subnets.
 

GBillR

Also - What about the switch hardware? Any port isolation settings that might cause problems across the VPN?
 

TNWO

Spent the evening testing further. Yes, I use a Win 11 laptop as client, but just to eliminate it as a culprit I tried it with my iPhone: I installed the PhoSync app, which can connect to SMB shares to back up the phone's camera roll. While on my local WiFi I was able to connect to the main server (192.168.1.21) share, browse and transfer files as expected. I then disconnected from WiFi, went on LTE and used the OpenVPN client to connect remotely. I wasn't able to connect to the same SMB share anymore despite seeing the traffic on port 445 being allowed on the FW. I was able to RDP to different machines on my network, including my desktop, which is on the same subnet/VLAN as the primary TN server. I was also able to connect to the secondary TN server (192.168.1.22).

Second test. I shut down the secondary server (192.168.1.22), changed the IP on the interface on the primary server from 192.168.1.21 (the one I can't connect to or ping) to 192.168.1.22. Tried again to ping and connect through VPN to .22 and again no go. Reverted back to .21 and restarted the secondary TN and was able to connect to .22.

After all these tests I think I can conclude that it's not the client, it's not OpenVPN, it's not the FW or routing. There is something on the primary TN interface that is rejecting connections from VPN clients. I also verified that all the ports on the switch are not using any port isolation or mac address filtering.

I was able to connect using a different interface on the primary server. It's a different VLAN/subnet, so I can use that as a workaround, but I'm still baffled as to why the other interface is not working as it should. Are there any other settings in TN or at the OS level that would prevent or filter connections on a single interface?
 

GBillR

TNWO said: "I was able to connect using a different interface on the primary server. It's a different VLAN/subnet, so I can use that as a workaround, but I'm still baffled as to why the other interface is not working as it should. Are there any other settings in TN or at the OS level that would prevent or filter connections on a single interface?"
I assume you have removed and recreated the offending interface on the primary already... have you tried swapping the interfaces to rule out a weird HW issue? I am not aware of any setting that would cause your symptoms... TN has no firewall that I am aware of that could be filtering the traffic... which is the only configurable thing left that I can think of that would be causing similar symptoms.

What does your config look like? Can you provide the output of ifconfig?

Maybe that will trigger someone else to chime in.
 

TNWO

I solved the issue, but I still do not understand what the problem was. As you suggested, I deleted the 192.168.1.21 interface and recreated it, but it didn't solve the problem. I then tried to recreate it on a different physical interface that was not in use. Sure enough that worked both locally and through VPN! Same IP, same VLAN. I then went and moved (deleted and recreated) the other VLAN interface I had on the original NIC (which worked fine) to the new NIC. Surprise, it did not work even locally.

So in the end I put the LAN interface (192.168.1.21) on the "new" NIC and left the other interface on the original NIC. Both still use their own VLAN tag and both work locally and through VPN. Just for the record, the VLAN tagging and settings on the switch were set up correctly.

I'm happy I got it working, but I have no logical explanation for why two VLAN interfaces on one single NIC work perfectly fine when accessed locally, but only one works when accessed through a VPN connection. Then when I split the two VLANs across two separate NICs, everything works both locally and remotely. I'm no networking guru, but in my 20+ years of working experience, I never came across something so bizarre.

For the record, I also tried different cables during my troubleshooting. All NICs are genuine Intel and use the em driver. Is VLAN implementation in TrueNAS problematic? Is it not recommended?
 

GBillR

TNWO said: "Is VLAN implementation in TrueNAS problematic? Is it not recommended?"
Glad you got it sorted out. I personally have not used VLANs on any of my FN or TN interfaces before. I've always had enough physical NICs and available switchports for my needs, or just used the bridge interface for my VMs or jails and kept them in the same subnet.

Maybe someone who has used VLAN tagging on a TN interface can offer some insight.

If you are still looking to try to troubleshoot that, I would post your ifconfig with the VLAN setup. Maybe someone could try to reproduce the issue.
 

TNWO

I set everything up the way it was originally and this is the ifconfig output:

Code:
root@truenas[~]# ifconfig
em0: flags=8c22<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=481249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER,NOMAP>
        ether 14:da:e9:1d:59:2e
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
em1: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=481049b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,VLAN_HWFILTER,NOMAP>
        ether 00:15:17:51:3c:98
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
em2: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=481049b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,VLAN_HWFILTER,NOMAP>
        ether 00:15:17:51:3c:99
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
pflog0: flags=0<> metric 0 mtu 33160
        groups: pflog
vlan333: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: RSYNC
        options=4000403<RXCSUM,TXCSUM,LRO,NOMAP>
        ether 00:15:17:51:3c:99
        inet 10.111.111.2 netmask 0xfffffffc broadcast 10.111.111.3
        groups: vlan
        vlan: 333 vlanproto: 802.1q vlanpcp: 2 parent interface: em2
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
vlan666: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: MAN
        options=4000403<RXCSUM,TXCSUM,LRO,NOMAP>
        ether 00:15:17:51:3c:99
        inet 172.16.10.17 netmask 0xffffffe0 broadcast 172.16.10.31
        groups: vlan
        vlan: 666 vlanproto: 802.1q vlanpcp: 0 parent interface: em2
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
vlan700: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4000403<RXCSUM,TXCSUM,LRO,NOMAP>
        ether 00:15:17:51:3c:98
        inet 192.168.177.52 netmask 0xffffff00 broadcast 192.168.177.255
        groups: vlan
        vlan: 700 vlanproto: 802.1q vlanpcp: 0 parent interface: em1
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
vlan160: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4000403<RXCSUM,TXCSUM,LRO,NOMAP>
        ether 00:15:17:51:3c:98
        inet 192.168.1.21 netmask 0xffffff00 broadcast 192.168.1.255
        groups: vlan
        vlan: 160 vlanproto: 802.1q vlanpcp: 0 parent interface: em1
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>


em0 is the onboard Intel NIC that I was not using.
em1 and em2 are an Intel dual NIC. em1 is the one giving me problems.
em1.vlan 160 is LAN (192.168.1.21). I can only connect to it locally.
em1.vlan 700 is MEDIA (192.168.177.52). It serves my Media subnet and I can access it both locally and remotely.
em2.vlan 666 is the management interface and works fine.
em2.vlan 333 is dedicated to RSYNC with the secondary TN. Works fine.
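For anyone trying to reproduce this, the VLAN-to-NIC mapping can be pulled out of the ifconfig output mechanically. The following is a rough sketch, not a full ifconfig parser: it assumes the FreeBSD output format shown above, where the inet line precedes the "vlan:" line within each stanza. The excerpt is trimmed to the lines the parser needs, and the function name is made up for illustration.

```python
import re
from collections import defaultdict

# Trimmed excerpt of the ifconfig output above (only the lines the parser needs).
IFCONFIG = """\
vlan333: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        inet 10.111.111.2 netmask 0xfffffffc broadcast 10.111.111.3
        vlan: 333 vlanproto: 802.1q vlanpcp: 2 parent interface: em2
vlan666: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        inet 172.16.10.17 netmask 0xffffffe0 broadcast 172.16.10.31
        vlan: 666 vlanproto: 802.1q vlanpcp: 0 parent interface: em2
vlan700: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        inet 192.168.177.52 netmask 0xffffff00 broadcast 192.168.177.255
        vlan: 700 vlanproto: 802.1q vlanpcp: 0 parent interface: em1
vlan160: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        inet 192.168.1.21 netmask 0xffffff00 broadcast 192.168.1.255
        vlan: 160 vlanproto: 802.1q vlanpcp: 0 parent interface: em1
"""


def vlans_by_parent(text: str) -> dict:
    """Group (vlan tag, IPv4 address) pairs by their parent NIC."""
    parents = defaultdict(list)
    current, inet = None, None
    for line in text.splitlines():
        # A stanza header starts in column 0, e.g. "vlan160: flags=...".
        header = re.match(r"(\S+): flags=", line)
        if header:
            current, inet = header.group(1), None
            continue
        # Indented "inet" line (does not match "inet6").
        addr = re.match(r"\s+inet (\S+) netmask", line)
        if addr:
            inet = addr.group(1)
        # The "vlan:" line carries both the tag and the parent interface.
        vlan = re.search(r"vlan: (\d+) .*parent interface: (\S+)", line)
        if vlan and current:
            parents[vlan.group(2)].append((int(vlan.group(1)), inet))
    return dict(parents)


layout = vlans_by_parent(IFCONFIG)
for nic, vlans in sorted(layout.items()):
    print(nic, vlans)
```

This only restates the layout described above in a compact form: em1 carries VLANs 700 and 160, and em2 carries VLANs 333 and 666.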
 

TNWO

This is the configuration that solves the problem and lets me connect to 192.168.1.21 both locally and remotely

Code:
root@truenas[~]# ifconfig
em0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: LAN
        options=481249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER,NOMAP>
        ether 14:da:e9:1d:59:2e
        inet 192.168.1.21 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
em1: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: MEDIA
        options=481049b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,VLAN_HWFILTER,NOMAP>
        ether 00:15:17:51:3c:98
        inet 192.168.177.52 netmask 0xffffff00 broadcast 192.168.177.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
em2: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=481049b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,VLAN_HWFILTER,NOMAP>
        ether 00:15:17:51:3c:99
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
pflog0: flags=0<> metric 0 mtu 33160
        groups: pflog
vlan333: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: RSYNC
        options=4000403<RXCSUM,TXCSUM,LRO,NOMAP>
        ether 00:15:17:51:3c:99
        inet 10.111.111.2 netmask 0xfffffffc broadcast 10.111.111.3
        groups: vlan
        vlan: 333 vlanproto: 802.1q vlanpcp: 2 parent interface: em2
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
vlan666: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: MAN
        options=4000403<RXCSUM,TXCSUM,LRO,NOMAP>
        ether 00:15:17:51:3c:99
        inet 172.16.10.17 netmask 0xffffffe0 broadcast 172.16.10.31
        groups: vlan
        vlan: 666 vlanproto: 802.1q vlanpcp: 0 parent interface: em2
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
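
A note for reading the outputs above: FreeBSD's ifconfig prints IPv4 netmasks in hex. The sketch below, using only Python's standard ipaddress module, converts the masks from the working configuration into prefix lengths and checks them against the OpenVPN subnet from the first post. The helper name is made up for illustration.

```python
import ipaddress


def hex_netmask_to_prefix(mask_hex: str) -> int:
    """Convert an ifconfig-style hex netmask such as '0xffffff00' to a prefix length."""
    dotted = str(ipaddress.ip_address(int(mask_hex, 16)))
    # ip_network rejects non-contiguous masks, so this also validates the mask.
    return ipaddress.ip_network(f"0.0.0.0/{dotted}").prefixlen


# Addresses and masks copied from the ifconfig output of the working configuration.
addresses = {
    "em0 (LAN)": ("192.168.1.21", "0xffffff00"),
    "em1 (MEDIA)": ("192.168.177.52", "0xffffff00"),
    "vlan333 (RSYNC)": ("10.111.111.2", "0xfffffffc"),
    "vlan666 (MAN)": ("172.16.10.17", "0xffffffe0"),
}

networks = {
    name: ipaddress.ip_interface(f"{ip}/{hex_netmask_to_prefix(mask)}").network
    for name, (ip, mask) in addresses.items()
}

# OpenVPN subnet as stated in the first post.
vpn = ipaddress.ip_network("10.111.250.0/24")

for name, net in networks.items():
    print(f"{name:16} {net}  overlaps VPN: {net.overlaps(vpn)}")
```

None of the connected networks overlaps the VPN subnet, so under these masks the server treats VPN clients as off-link and reaches them through its routing table.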
 