Accessing NAS from VM Still Broken without Bridge

NickF

Guru
Joined
Jun 12, 2014
Messages
763
Hi All,

I just wanted to circle back to this since I haven't seen any update on this topic in a while. Is there an actual ticket number I can reference @morganL ?

Tom Lawrence (@LawrenceSystems on here) brought this up in his video on Bluefin today (and in previous videos):

And Wendell from Level1Techs brought this up about four months ago:

Basically, I am just complaining that VMs can't access the NAS they are running on without a network bridge (most folks followed this: https://www.truenas.com/docs/scale/scaletutorials/virtualization/accessingnasfromvm/) being manually created, or unless they go out to the switch and back over a different VLAN.

In my configuration I am doing the latter currently, which I think is totally silly. Looking at the past 24 hours, 5 out of 6 of the top addresses talking THROUGH my switch back to my NAS are VMs running on my NAS.
[screenshot: top talkers through the switch, last 24 hours]


If I expand the resolution out to a month, my Plex server literally used a terabyte of network bandwidth to talk to the server it lives on...

[screenshot: monthly traffic totals per host]

What's going on? Is this really not something folks other than me and a few tech YouTubers care about?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Basically, I am just complaining that VMs can't access the NAS they are running on without a network bridge (most folks followed this: https://www.truenas.com/docs/scale/scaletutorials/virtualization/accessingnasfromvm/) being manually created, or unless they go out to the switch and back over a different VLAN.

Yes, because to avoid this, the system IP needs to be configured on the bridge interface, and this is fiddly to set up. It is also possible to go out to the upstream switch and in the other interface (no VLAN is needed), which means that all your VM traffic is isolated to em1 (or whatever) while the NAS system uses em0 (or whatever), with the NAS IP address answering via em0.

This is inherent to the design of the networking stack. At best, developers could maybe implement a setup helper that would deprovision the main ethernet, bring up a bridge, add the main ethernet interface back to the bridge as a member, and then reprovision the bridge with the IP configuration that used to exist on the main ethernet. However, this is vaguely tricky because routes flush when an interface configuration is changed.
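
For the curious, the rough sequence such a helper would have to run through on SCALE looks something like this -- a minimal iproute2 sketch, with eth0 and the addresses as placeholder examples (on TrueNAS itself this would have to go through the middleware to persist across reboots):

Code:
# assume eth0 currently carries 192.168.1.10/24 with gateway 192.168.1.1 (examples only)
ip addr flush dev eth0                 # deprovision the main ethernet
ip link add name br0 type bridge       # bring up a bridge
ip link set eth0 master br0            # add the ethernet back as a bridge member
ip link set eth0 up
ip link set br0 up
ip addr add 192.168.1.10/24 dev br0    # reprovision the old IP config on the bridge
ip route add default via 192.168.1.1   # routes were flushed, so they must be restored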

In my configuration I am doing the latter currently, which I think is totally silly. Looking at the past 24 hours, 5 out of 6 of the top addresses talking THROUGH my switch back to my NAS are VMs running on my NAS.

You're certainly allowed to think it is totally silly. However, I'm curious, if you think this, then why don't you fix it?

It is not possible for traffic to short circuit from a VM directly to the host without a bridge. The bridge is the networking abstraction which controls the proper layer two propagation of virtual ethernet traffic. It is a virtual ethernet switch. Imagine you had three (or more) desktop computers that you wanted to connect to your NAS. Now imagine that you complained that you couldn't attach them all and that someone insisted that you had to use an ethernet switch. Sounds a little silly, right?

If I expand the resolution out to a month, my Plex server literally used a terabyte of network bandwidth to talk to the server it lives on...

Yes. So? You need something to let them communicate. Even if you had a bridge, it would use the same terabyte of network bandwidth, it would just be localized to the bridge (virtual ethernet switch) and would thus avoid traversing a physical interface.

If you want it to be virtual, by all means, go ahead and make it virtual.

What's going on? Is this really not something folks other than me and a few tech YouTubers care about?

If you ask me, as someone who does infrastructure professionally, it seems like some "tech" YouTubers are kvetching about some perceived deficiency in search of views.

IP networking is complicated. Water is wet. Those of us who care will actually take the time to properly design our systems, physical layer switches, and virtual layer switches to accomplish the desired goals. This requires some time and effort. I would absolutely NOT want design dictated to me by TrueNAS developers, because I might well want the freedom to do something substantially more clever than they could imagine.

I would be fine with them making a helper tool to help set up a basic bridging setup though. This is REALLY hard for many beginners, especially those that don't grasp the concepts.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
Mr. Greco, as always I appreciate your help and the resources you've contributed to the forum over the years. First, just so we're on the same page, I'm in charge of the IT infrastructure for a relatively large organization. While I may be a schmuck, I do know a thing or two, and while I certainly don't bring your depth of knowledge, I don't think I misunderstand how networking works.

I totally understand "how" and "why" SCALE is doing what it's doing. I just think that leaving the defaults in such a way that the user needs to take additional action for such a fundamentally basic and common use-case is an insane design choice. The reason folks use TrueNAS in the days of OpenZFS/ZoL is because it's easy to use. Ubuntu with something like Cockpit or Houston, or Proxmox with Portainer for Docker, can all do what TrueNAS can do. But people, myself included, like TrueNAS because of the simplicity it brings.

Yes, because to avoid this, the system IP needs to be configured on the bridge interface, and this is fiddly to set up. It is also possible to go out to the upstream switch and in the other interface (no VLAN is needed), which means that all your VM traffic is isolated to em1 (or whatever) while the NAS system uses em0 (or whatever), with the NAS IP address answering via em0.
Whether it's a separate physical interface or a VLAN, the situation is the same. Why would you purposely design a network where packets leave a host and go upstream to return back to the same host?

This is inherent to the design of the networking stack. At best, developers could maybe implement a setup helper that would deprovision the main ethernet, bring up a bridge, add the main ethernet interface back to the bridge as a member, and then reprovision the bridge with the IP configuration that used to exist on the main ethernet. However, this is vaguely tricky because routes flush when an interface configuration is changed.
They don't need to make a setup helper; they should just automatically generate the bridge interface on top of your system's main interface. Maybe a checkbox is all that would be required.

Let me point you to the source of my awareness of this, because I honestly didn't even bother to check that my server was doing this crazy loop out and back in again until after I saw the video from Level1Techs. I've just been too lazy to fix it, because in my homelab environment it really doesn't matter; my server and my network are not all that busy.

In any case, from a user perspective, for folks who don't know what they are doing, the expected workaround is insane. Some of this may no longer apply in Bluefin, but in any case it's crazy. From the Level1Techs guide:

Oh, but wait

If you dig into what I’m about to show you, you will see that there are probably a few hundred posts on the internet with people experiencing various versions of this issue with TrueNAS SCALE over the past 2-3 years. So this is part of why I broke the how-to out into its own video.

So if you ping your TrueNAS host, 192.168.1.1 for example, from the VM, it doesn’t work. What on earth?

[screenshot: ping from the VM to the TrueNAS host failing]


It is important for me to help you understand how it is failing so you can a) not be frustrated and b) level up your troubleshooting skills when encountering problems adjacent to this problem set.

This whole class of problems stems from having a poor mental model of host networking, virtualization and the glue that binds all that together. Without someone revealing the inner workings it can be quite difficult to intuit what on earth could be wrong.

Yes, this absolutely insane default has been gatekeeping people who just want to do what I’m trying to do for two years and counting, and the fix seems not to be forthcoming.

It’s actually impressive that it is broken this way, because Debian + virtualization a la KVM has not shipped this broken out of the box since circa 2008 (when containers were starting to come into vogue). You have to work to set this up in a broken way. I think the architects need a bit more Linux experience or misunderstand how the out-of-box defaults should be.

Users should not really have to deal with this.

Not to worry, we’ll fix it.

What we have to do is create a bridge and then assign our physical ethernet interface to the bridge. The bridge will automatically attach to the virtual network stack that’s on the VM we previously created. That’ll let us mix our real and virtual interfaces.



[screenshot]


Note: If you have multiple NICs, as on the dual-NIC Supermicro system from Microcenter, it is totally fine to use one NIC for the “host” (not in a bridge) and an IP-less interface in the bridge. What is shown here works fine for single-NIC setups, though.

The next thing to be aware of is that the web GUI for networking/bridges is a buggy piece of crap. Don’t bother. Use the console.



[screenshot]


In setting this video up I went through 5 systems total (the 3 in the video, a dual-2011 system, etc.) and they all had various problems getting the bridge set up.

The most reliable method is to use the CLI/console to set it up.

Step 1: Disable DHCP on the primary NIC.

Step 2: Create a new interface with N, choose bridge, and set the alias to 192.168.1.1/24 (or whatever is appropriate for your network).

Step 3: Apply the changes, and hit p to persist.



[screenshot]


There is some bug here where the bridged interface never actually enters the forwarding state. Even if the web gui works properly, it will time out and revert the changes.

So, once you “P” to persist the changes, reboot.

You should, hopefully, see “(interface name) entered the forwarding state” after you reboot. Drop to a linux shell and ping other things on your network.
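
(If you want to sanity-check from the shell, something like the following should confirm it -- interface names and addresses are just examples:)

Code:
bridge link show              # the member port should report "state forwarding"
ip -br addr show dev br0      # the bridge should now carry the host's IP address
ping -c 3 192.168.1.1         # other hosts on the LAN should answer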



[screenshot]


It’s working! And it was only mildly horribly painful for no reason!



[screenshot]

You're certainly allowed to think it is totally silly. However, I'm curious, if you think this, then why don't you fix it?

I can implement systems, design network topologies, mitigate security risks, manage dozens of unrelated and yet interconnected systems, manage thousands of users and do all sorts of relatively high level things. But I am, above all else, most certainly not a coder. I can barely read Python, let alone contribute to the design of the middleware or web frontend.

It is not possible for traffic to short circuit from a VM directly to the host without a bridge. The bridge is the networking abstraction which controls the proper layer two propagation of virtual ethernet traffic.
In hindsight, I think the language I used in my OP was unclear. I am not suggesting that magic voodoo happens and the packets get to where they are going without a bridge; I am just saying that the fact it has to be handled by the user is the problem.

The bridge is the networking abstraction which controls the proper layer two propagation of virtual ethernet traffic. It is a virtual ethernet switch. Imagine you had three (or more) desktop computers that you wanted to connect to your NAS. Now imagine that you complained that you couldn't attach them all and that someone insisted that you had to use an ethernet switch. Sounds a little silly, right?
It's not even that sophisticated. It's a hub. There's no logic to a bridge; it's not a VMware Distributed Switch or a Nexus 9000v.

To apply your example to what I am saying: I am in the room with those 3 computers, and I have a dumb switch in my hand. On the floor there is a box of ethernet wire, a crimper, and some RJ45 ends. I can make all of the wires and connect all of the things. But why should I have to when pre-terminated patch cables exist?

Yes. So? You need something to let them communicate. Even if you had a bridge, it would use the same terabyte of network bandwidth, it would just be localized to the bridge (virtual ethernet switch) and would thus avoid traversing a physical interface.

If you want it to be virtual, by all means, go ahead and make it virtual.
Sure, but this is exactly like having a collapsed core for your network topology when you have all of the equipment to build out a proper 3-tier core, distribution, and access layer network. I mean hell, at work I am routing all the way out in my IDF closets so I don't have to deal with layer-2 east-west traffic traversing all the way up to the distribution layer in my buildings, and with spanning-tree problems.

Systems should be kept isolated and as close to where they need to go as possible; the further out you go, the more variables exist that you might not consider.

If you ask me, as someone who does infrastructure professionally, it seems like some "tech" YouTubers are kvetching about some perceived deficiency in search of views.
I don't think that's really the case here, but we can agree to disagree on that point.

IP networking is complicated. Water is wet. Those of us who care will actually take the time to properly design our systems, physical layer switches, and virtual layer switches to accomplish the desired goals. This requires some time and effort. I would absolutely NOT want design dictated to me by TrueNAS developers, because I might well want the freedom to do something substantially more clever than they could imagine.

I would be fine with them making a helper tool to help set up a basic bridging setup though. This is REALLY hard for many beginners, especially those that don't grasp the concepts.

I think you and I are of a similar mindset in that regard. But really, all we should need is a checkbox on the interface screen that makes a bridge for you :) Maybe it should even be enabled by default :P

Maybe one of these days I'll play around with using a virtualized TNSR router to do some cool things with VMs under SCALE.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694


I appreciate @jgreco's defense of what we do, but agree with @NickF that we should ideally make it easier. The complexity comes from the bridge only being needed in some cases and in other cases, multiple bridges being required.

@NickF feel free to make a new suggestion with this info. If you provide the ticket number, others can upvote.

I was at Cisco for 13 years... networking is getting more complex rather than simpler. If there are any experienced Linux networking gurus, we'd love to have them join the team and help resolve these types of issues.

If there are
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I just think that leaving the defaults in such a way that the user needs to take additional action for such a fundamentally basic and common use-case is an insane design choice.
They don't need to make a setup helper; they should just automatically generate the bridge interface on top of your system's main interface.

This just causes other problems, not the least of which is that adding a bridge in there (especially if we argue for it by default) adds a performance penalty. iX has historically been hella-resistant to doing anything that causes performance to suffer. I have to imagine that this is due to the need for the product to benchmark well, a line of reasoning once put forth by the dev team.

Additionally, the idea that your system has a "main interface" is fundamentally broken. Many of us do not have any such thing. It's convenient to forget that there are use cases other than SOHO/hobbyist NAS. But in a situation where, for example, a LAGG failover configuration is the primary interface, you're already partway down the rabbit hole. And what about those of us who work in a VLAN-heavy setup, where subinterfaces serving dedicated storage networks are used?

It was hard enough to get TrueNAS developers to get all the setup for VLANs and LAGG's right in the first place. I was one of the people that pressed on this over the years. I didn't argue for any of those fancier options to be enabled by default, because even though I think of them as common scenarios, I know that everyone has a different layout for their network. Making a default bridge would be messy and, in many cases, incorrect. What happens when you have multiple networks connecting to your virtual machines, and need multiple bridges?

Why would you purposely design a network where packets leave a host and go upstream to return back to the same host?

That's a question for your network engineer, just as "why would you force data to be copied from one dataset to another to happen over the network" is a question for your storage engineer. It's a matter of how you chose to set things up. Setting up fewer datasets may allow you to simply move or rename files, rather than forcing a network traversal via copy. Really very similar problem in a way. It requires you to do more thinking about your setup.
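
To make the storage half of that analogy concrete (the paths below are just examples):

Code:
# within a single dataset, a move is only a rename -- no data actually traverses anything
mv /mnt/tank/media/incoming/movie.mkv /mnt/tank/media/movies/
# across two datasets (separate filesystems), the same command degrades into a
# full copy followed by a delete, i.e. all the data gets rewritten
mv /mnt/tank/media/movie.mkv /mnt/tank/plexdata/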

I can implement systems, design network topologies, mitigate security risks, manage dozens of unrelated and yet interconnected systems, manage thousands of users and do all sorts of relatively high level things. But I am, above all else, most certainly not a coder. I can barely read Python, let alone contribute to the design of the middleware or web frontend.

When I say "fix it", I'm not talking coding. I'm talking network engineering. Design your network environment to do the thing you want. This is like on ethernet switches where they do not come preconfigured to do the specific thing you wish they would do automatically. I don't think it unreasonable to be expected to take some time to step through a configuration process. For TrueNAS, if I needed a performance-hurting bridge, I would expect needing to set that up manually, perhaps with a helper tool of some sort.

In hindsight, I think the language I used in my OP was unclear. I am not suggesting that magic voodoo happens and the packets get to where they are going without a bridge; I am just saying that the fact it has to be handled by the user is the problem.

Fine, but the whole problem of setting things up to begin with is already shoveled upon the user. If you really want a simple NAS, the Synology and QNAP units are --->> thataway. If you want a NAS capable of incredibly sophisticated networking, then TrueNAS is your beast. But you also have to recognize that YOUR preferences for "incredibly sophisticated networking" are not likely to be the things that *I* need for networking.

Please consider that, as an example, the networking environment here at this one point of presence has ... 120 virtual lans, it seems, as of today, and that a lot of these would need to have bridges if I were to use the virtualization features on TrueNAS. Try to imagine what sort of setup would be needed for this, and then try to imagine how that would be configured in the NAS. Or, worse, on each of the 5 NAS's serving the site. The complexity level shoots up rapidly if you insist on default bridges on every potential IPv4 network, just sayin'.

It's not even that sophisticated. It's a hub. There's no logic to a bridge; it's not a VMware Distributed Switch or a Nexus 9000v.

Well, this is demonstrably untrue; FreeBSD certainly does switch-like MAC filtering that only sends appropriate traffic to a virtual interface, and also gained STP/RSTP support some time ago, and in fact for FreeBSD 13 some significant development went in to clear up a performance-hurting mutex issue. It is absolutely not just "a hub". I believe the Linux stuff is probably less sophisticated, for reasons that I don't really understand. Maybe theirs is "a hub".
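
(If anyone wants to see what the Linux bridge is actually doing, its learned MAC table and per-port state are visible via the iproute2 tools -- the bridge name here is just a placeholder:)

Code:
bridge fdb show br br0    # MAC addresses learned per member port
bridge -d link show       # per-port detail, including STP state and path cost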

To apply your example to what I am saying: I am in the room with those 3 computers, and I have a dumb switch in my hand. On the floor there is a box of ethernet wire, a crimper, and some RJ45 ends. I can make all of the wires and connect all of the things. But why should I have to when pre-terminated patch cables exist?

I don't see how that relates. You had a question that essentially was asking why a bridge (and let's just please agree to call it, functionally, a virtual switch) was needed. The point is that the switch is needed unless you're going to do a point-to-point connection between two of the hosts. One possible model has the traffic egress the host to a physical switch before returning to the host, which is simpler but also kinda dumb in a way, while another has the traffic traversing a built-in virtual switch. Both have pros and cons.

But really, all we should need is a checkbox on the interface screen that makes a bridge for you :)

I'm fine with that. That's basically one method of implementing the setup helper tool I was talking about. On the other hand, I watched the development team struggle with implementing both LAGGs and VLANs back in the day, and it caused me to give a great deal of thought to how difficult it really is to do these reconfigurations on the fly, especially with the unwind support to back out a change.

I just don't think it is desirable to have this done "by default". Some people have argued that it is an error for the NAS to ship with all services set to "OFF", but I don't think so. I still have one of RedHat's earliest CD's around, where they shipped with virtually every service coming out of the box in a ""usable"" format (which meant turned on and configured wide open), and for years I used that as a teaching lesson for padawans learning about UNIX and network security.

I see the role of TrueNAS as making it easier to set up a NAS, but that does not have to mean that it should roll onto the floor and try to be able to do everything it is capable of. Some configuration being required is not a bad thing. It is just a matter of how much is enough, I think.

Maybe it should even be enabled by default :P

I disagree, because it is just an ill-advised default misconfiguration that would get in the way of whatever I was really trying to do.

In any case, a pleasant discussion for an insomniac evening. Thank you. :smile:
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
The complexity comes from the bridge only being needed in some cases and in other cases, multiple bridges being required.

Having watched the poopstorm that surrounded handling this for VLAN and LAGG (on FreeNAS years ago), I don't think I'm going to hold my breath. If you go anywhere with this, I don't think it will be practical to build a comprehensive solution via the GUI. As a practical example, I've watched Ubiquiti and their screwed-up version of VyOS evolve over the years, and when it gets to a certain level of complexity, they just don't support it, and you have to go into the CLI instead. Even relatively simple tasks such as setting up a failover bond with a few bridges for OpenVPN are difficult --

Code:
set interfaces bonding bond0 hash-policy layer2
set interfaces bonding bond0 mode active-backup
set interfaces bonding bond0 vif 200 address [redacted]/28
set interfaces bonding bond0 vif 200 firewall local name external
set interfaces bonding bond0 vif 400 bridge-group bridge br0
set interfaces bonding bond0 vif 401 bridge-group bridge br1
set interfaces bridge br0 address 10.196.40.240/24
set interfaces bridge br0 aging 300
set interfaces bridge br0 bridged-conntrack disable
set interfaces bridge br0 hello-time 2
set interfaces bridge br0 max-age 20
set interfaces bridge br0 priority 32768
set interfaces bridge br0 promiscuous disable
set interfaces bridge br0 stp false
set interfaces bridge br1 address 10.196.41.240/24
set interfaces bridge br1 aging 300
set interfaces bridge br1 bridged-conntrack disable
set interfaces bridge br1 hello-time 2
set interfaces bridge br1 max-age 20
set interfaces bridge br1 priority 32768
set interfaces bridge br1 promiscuous disable
set interfaces bridge br1 stp false


That's a real configuration snippet, and it starts to hint at the complexity of making assumptions about what users want to do. How do you even do this via a GUI? If you do implement a GUI and it falls short, that presents a whole new set of challenges for the user. Ubiquiti has a design where the GUI doesn't necessarily trample on CLI-sourced configuration.

Certainly the user experience shown above in the Level1Techs stuff is... crappy crap. But it is difficult to solve, and it may not be right to attempt a comprehensive fix, but rather to provide some help or tools for the bumpy spots, so that a new user could reasonably accomplish common tasks.

networking is getting more complex rather than simpler.

Definitely true.

If there are

And @morganL's network hangs up on him mid-sentence, as if to prove the complexity of networking. :tongue:
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
They don't need to make a setup helper; they should just automatically generate the bridge interface on top of your system's main interface. Maybe a checkbox is all that would be required.
They chose that way for CORE and are now reaping many unintended consequences:
  • setup violating FreeBSD standards and documentation
  • thus breaking multicast
  • thus breaking IPv6
  • generating interesting bridge loops if people try to use more than one network interface and possibly a dedicated one for jails or VMs
  • ...
Oh the fun. I agree they should have planned for a "vswitch" like concept, but that's a heck of a lot of work to get right.

I don't know enough about the Linux network stack. The problem I am citing for FreeBSD is that a bridge member MUST NEVER have a layer 3 address configured.
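
For reference, the supported FreeBSD pattern puts the address on the bridge itself and leaves the member bare -- an rc.conf sketch with example names and addresses:

Code:
cloned_interfaces="bridge0"
ifconfig_bridge0="inet 192.168.1.10/24 addm em0 up"   # the L3 address lives on the bridge
ifconfig_em0="up"                                     # the member stays address-less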
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
@jgreco Without going direct tit-for-tat, and before I get onto doing actual work stuff I have to do, I wanted to post a response here.

I'm not looking to solve all of the world's problems here. It would indeed be wholly impractical for iX to focus on building advanced networking into the product at this stage of their development, although I think something more sophisticated will ultimately be required if they expect Enterprise customers to adopt and use SCALE as a hypervisor and application host.

As you've stated, using LAGs and VLANs is basically a required feature set for anyone not just using SCALE in their basement for fun. But they can't really address VLANs without not only a more sophisticated vswitch, but an actual vrouter. And layer 3 is just out of scope of the spirit of this conversation.

Let me explain what I mean. Most folks in SMB are going to do all of their routing on their firewall/router, because they don't have someone on staff with the expertise to manage L3 switches. So their network might look something like this (and assuming LACP is being generous, from what I've seen in the wild):
[diagram: typical SMB topology with routing on the firewall]


If there are multiple VLANs, then, the packets are not only going north out of TrueNAS to the switch, they are going north out of TrueNAS all the way to the firewall/router, and then back to the other VLAN in TrueNAS. I don't expect IX to build a solution for me here.

If the MAC table on the switch your box is egressing to looks like mine:
[screenshot: MAC address table on the switch]

Then TrueNAS can't fix it for you. Traffic has to be sent to the router. Unless there is a virtualized data-center-in-a-box router, this design inherently has to go north.

For my environment, my firewall and switch are connected at L3, and my switch is doing all of the inter-VLAN routing for my homelab, so the penalty is not nearly as high. But that doesn't mean it doesn't exist. I think, at the very least, I can commit to contributing back to the community here by building out a TNSR environment and showing folks how to solve these types of problems in a guide or something. TNSR, or something like TNSR, is the ultimate solution for complex network setups. It's the same reason why Cisco made the Nexus 9000v back in the day and the same reason why NSX exists in VMware now.

Now, back to the spirit of this conversation, I will admit I do not know all of the intricacies of the Linux network stack. But, what I am advocating for isn't really rocket science. If there were a checkbox like this on the interface screen:
[mock-up: proposed bridge checkbox on the interface configuration screen]


And the tooltip would be populated with information explaining what this all means, maybe with a link back to the documentation covering the limitations and consequences of checking that box. This would work fine with single interfaces or LAGs, and it's a courtesy for the folks who, unlike you and me, aren't up for doing something more involved.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
There's been a good bit of discussion on this in a number of threads, and my takeaway from them has been that this is by design--that the VMs aren't intended to have access to the host (isolation being desired because, well, that's what VMs are supposed to do). In that light, this isn't a "bug that's still present" but a design decision that hasn't been changed. Put differently, it isn't a bug, it's a feature.

And certainly not everyone wants VMs to behave as you're asking; here's just one example of a very recent thread where OP's concerned that the VM does have access to the host:

It's academic for me--I use Proxmox for VMs and don't anticipate changing that any time soon--but it doesn't seem like an unreasonable design decision.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
So I've spent a bunch of time with my thinking cap on today, toying with the idea that we (as in both the participants of this thread as well as the folks at IX) can't be the first ones trying to solve this problem. Way smarter people than I have surely cracked this long ago. I think I have a potential solution that solves all of the issues presented here:

They chose that way for CORE and are now reaping many unintended consequences:
  • setup violating FreeBSD standards and documentation
  • thus breaking multicast
  • thus breaking IPv6
  • generating interesting bridge loops if people try to use more than one network interface and possibly a dedicated one for jails or VMs
  • ...
The complexity comes from the bridge only being needed in some cases and in other cases, multiple bridges being required.
But in a situation where, for example, a LAGG failover configuration is the primary interface, you're already partway down the rabbit hole. And what about those of us who work in a VLAN-heavy setup, where subinterfaces serving dedicated storage networks are used?
danb35 said:
And certainly not everyone wants VMs to behave as you're asking; here's just one example of a very recent thread where OP's concerned that the VM does have access to the host:
Let's step back a minute here and identify the issue at hand. We're discussing the proper way to design a simple-to-use mechanism whereby we can facilitate east-west data flow between VMs running in KVM on TrueNAS SCALE and the storage resources on the box itself.

How is this problem solved at scale (bad pun, lol) in the enterprise? How does Kubernetes handle this problem? Really, my idea here is spawned from jgreco's quoted statement above.

Let's look at some of the common ways folks are getting around the problem right now and why none of them are really the best option. Most of these have already been discussed here, but I want to visualize it.

Something I'd like to get out there is that these diagrams don't include any references to a LAG or bond interface. That's because that construct can exist in any of these designs and does not impact how they function in the ways I am illustrating. Whether the interfaces below are single physical interfaces or bonds on top of multiple, the principles discussed remain true.

Option 1: A dedicated interface

[diagram: Option 1, dedicated interface]

This is a desirable design because:
  • It may be the simplest to implement for small systems and users inexperienced in networking.
  • If you understand your workflow well enough, it may be beneficial to dedicate interfaces to specific tasks.
This is an undesirable design because:
  • It requires two physical NICs.
  • Those NICs are statically allocated to a single purpose.
  • Traffic flow is inefficient.

Option 2: Using a VLAN
[diagram: Option 2, VLAN]

This is a desirable design because:
  • It only requires a single physical NIC.
  • It allows for network segmentation for security, QoS, differing MTU sizes, and a whole host of other reasons.
This is an undesirable design because:
  • Packets potentially have to go even further north than in the above example.
  • Routing the packets adds an additional latency penalty, however nominal that is on modern hardware.
  • Traffic flow is even more inefficient.
Option 3: The most recommended I've seen, the single bridge



[diagram: Option 3, single bridge]


This is a desirable design because:
  • It is the most efficient in terms of packet flow.
  • iX can potentially include an "easy button" in the GUI to help inexperienced users.
This is an undesirable design because:
  • It becomes complicated and difficult to manage if you start adding additional layers of abstraction (i.e., multiple VLANs).
  • You are binding bridges to physical interfaces that, by their nature, exist for north-south traffic, not east-west.
  • In SCALE's current form, it's confusing for new users to set up.
Option 4: My proposed design
In a nutshell, I asked myself the question: who says the VMs should only have a single interface? (A rough sketch of the host side follows the pros and cons below.)



[diagram: Option 4, proposed design with a second, internal VM interface]

This is a desirable design because:
  • TrueNAS and the VMs that need to access each other or the host can speak at L2 while completely isolated from the rest of the network. This design is not unlike a storage network, or how Kubernetes already works in SCALE.
  • Users can implement as much complexity and as many sub-interfaces as they want on their north-south physical NICs, and it has no effect on the "internal" network.
  • You can easily control which VMs have access to the resources on the internal network by omitting the second NIC.
  • iX can potentially include an "easy button" in the GUI to help inexperienced users, similar to the "Advanced" tab for Kubernetes:
    [screenshot: Kubernetes Advanced settings tab]
This is an undesirable design because:
  • In SCALE's current form, it's confusing for new users to set up.
  • ??
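
To give a concrete idea of the host side of Option 4, here is a rough iproute2 sketch -- names and addresses are only examples, and on SCALE the equivalent would need to be configured through the UI/middleware to persist:

Code:
# an internal, host-only bridge with no physical members
ip link add br-internal type bridge
ip link set br-internal up
ip addr add 10.10.10.1/24 dev br-internal   # the NAS's address on the internal network
# each participating VM then gets a second virtual NIC attached to br-internal,
# plus a static guest address such as 10.10.10.11/24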
@jgreco @morganL @Patrick M. Hausen @Kris Moore @danb35
Let me know what you think
 


morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
@NickF Great write-up

Options 3 and 4 would both seem to be viable.
Option 4 pushes the complexity to the VMs, but also provides more power.

Sometimes we've seen users want Option 1. VMs are a convenience... they are not storage-centric.
 

ptyork

Dabbler
Joined
Jun 23, 2021
Messages
32
So I know it's likely not the only use case. But if the primary use case is simply to share data with VMs, then why not just consider something like virtfs and allow the user to specify paths to pass through directly? This also does away with some of the inherent limitations of using NFS/SMB, such as being effectively unusable for database files. It's so nice to be able to do this with Apps and Docker/K8S. It would seem to be equally nice for VMs.
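
Just to illustrate what I mean, a bare-bones virtiofs setup with stock QEMU might look like the following -- the paths, socket, tag, and virtiofsd location are all made-up examples, and nothing like this is exposed in the SCALE UI today:

Code:
# host side: export a dataset path to the guest over virtiofs
/usr/libexec/virtiofsd --socket-path=/run/vfsd.sock -o source=/mnt/tank/share &
# (machine type, CPU, and disk options omitted)
qemu-system-x86_64 \
    -object memory-backend-memfd,id=mem,size=4G,share=on -numa node,memdev=mem \
    -chardev socket,id=char0,path=/run/vfsd.sock \
    -device vhost-user-fs-pci,chardev=char0,tag=tankshare

# guest side: mount the export like a local filesystem
mount -t virtiofs tankshare /mnt/tank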

FWIW, I did a search for virtfs on this forum and see that I'm basically the only one who's mentioned this...multiple times...so I'm sure there's a good reason and I'm just not privy to it.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Is virtfs production ready? I honestly don't know; I'm not that frequently exploring things in Linuxland. Back when we had that Release That Shall Not Be Named, an attempt was made to use 9pfs from the Plan 9 operating system. That definitely did not work reliably.

My opinion: if you need/want local filesystem mounts stick to lightweight containers like Docker/Containerd on Linux and jails on FreeBSD. VMs are called virtual machines for a reason. Isolation is the point.

P.S. And you still need a bridge to connect the VMs to the outside network.
 

ptyork

Dabbler
Joined
Jun 23, 2021
Messages
32
Is virtfs production ready?
No clue. Maybe it's unstable. Certainly it seems not to be a hugely popular thing since Google isn't exactly brimming with info. Morgan mentioned it once and I think I glommed onto it because it met my needs (wants) so well.

I guess there are really two different but similar things out there: virtio-9p (which I think is actually another name for VirtFS) and virtio-fs. 9p seems to be based on a network file system protocol, meaning it may still not help with the database-on-a-network-share issue. virtio-fs looks perhaps to be a more purpose-built, lower-level solution. But I'm not really a Linux person either, so my interpretation may be crap.

I generally agree with VM isolation as a principle, but it's also possible (nice, even) to allow features that optionally bridge that isolation. It's still an isolated machine... just one with virtualized disk(s) available to mount. I guess at least in my situation I don't see a huge distinction between virtualized/shared network interfaces and virtualized/shared file systems. Though obviously not every situation is the same.
 

urza

Dabbler
Joined
Mar 17, 2019
Messages
38
As an end user, I just want my newly created VM to be able to see and talk to TrueNAS and vice versa. Is that too much to ask? I believe that is what most users expect.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
As an end user, I just want my newly created VM to be able to see and talk to TrueNAS and vice versa. Is that too much to ask? I believe that is what most users expect.
Configure a bridge interface, then.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
As an end user, I just want my newly created VM to be able to see and talk to TrueNAS and vice versa. Is that too much to ask? I believe that is what most users expect.

Most users expect to sit down behind Tesla Autopilot and they would like it to drive them around town. Is that too much to ask? I believe that is what most users expect. Yet Teslas have a habit of flying into the backsides of parked emergency vehicles and other classic catastrophes.

Engineering is not like it is in Star Trek, where they just press a few buttons and suddenly their systems are doing something completely different from the original in-universe designed intent. It may be great for the convenience of the plotline. But real engineering in the real world is difficult, and making default configurations work magically for complicated things such as networking is difficult. If you make one set of people happy, you will be inconveniencing someone else. The discussion above actually outlines a bunch of this.
 

urza

Dabbler
Joined
Mar 17, 2019
Messages
38
Configure a bridge interface, then.
I did. It took days of reading and trying workarounds, because there is a bug in the TrueNAS GUI and it needed to be done from the console. I documented it here: https://www.truenas.com/community/threads/vms-cant-see-host.88517/page-2#post-697216

It helped a few people, but for some even this is not working, as you can see in the thread.

This problem has been reported by numerous people here and on Jira; even famous YouTubers like Wendell and Lawrence mention this issue explicitly.
 

urza

Dabbler
Joined
Mar 17, 2019
Messages
38
Most users expect to sit down behind Tesla Autopilot and they would like it to drive them around town. Is that too much to ask? I believe that is what most users expect. Yet Teslas have a habit of flying into the backsides of parked emergency vehicles and other classic catastrophes.

Engineering is not like it is in Star Trek, where they just press a few buttons and suddenly their systems are doing something completely different from the original in-universe designed intent. It may be great for the convenience of the plotline. But real engineering in the real world is difficult, and making default configurations work magically for complicated things such as networking is difficult. If you make one set of people happy, you will be inconveniencing someone else. The discussion above actually outlines a bunch of this.
What are you smoking? Comparing Tesla Autopilot to being able to ping TrueNAS from a newly created VM? It is a pretty basic expectation, and all normal hypervisors work this way or make it super easy to set up. In TrueNAS I had to go to a physical screen and keyboard and set it up from the command-line console.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
It is a pretty basic expectation, and all normal hypervisors work this way

Ahh. There's the problem. You seem to think TrueNAS is a "normal Hypervisor". It isn't. It's a NAS platform, which is optimized towards serving files. This collides with a completely different role of acting as a hypervisor, where hypervisors typically do include bridges by default because that is their raison d'etre. However, hypervisors also don't generally include NAS functionality, so you're failing to account for the TrueNAS developers aiming to develop a product that serves its primary role to the best of its ability. I suggest you go back and read this entire thread in its totality.

What are you smoking? Comparing Tesla Autopilot to being able to ping TrueNAS from a newly created VM?

Sure. You've got unreasonable expectations. The feature doesn't do exactly what you think it does, just like Tesla Autopilot doesn't do what many people "expect" it to.
 