Will Virtualization of FreeNAS EVER be a good idea?


NineFingers

Explorer
Joined
Aug 21, 2014
Messages
77
I know that in the current state of FreeNAS, virtualization is a bad idea. My question (aimed to those in the know) is: will it EVER be a good idea? Might there be some improvements coming down that would tend to make virtualization a worthwhile and safe endeavor?

Please don't flame about the very well-known stance that virtualization is a VERY bad idea... I'm speaking about future directions.

Thanks,
Dave
 
L

Guest
Dave,

I actually have the same question. I run it in VirtualBox all the time, and I know a lot of other people who do the same thing. It would be really nice to know specifics. I had assumed it was the lack of paravirtualized drivers, but those are there; in fact the FreeBSD PV drivers are probably some of the most mature. Every other OpenZFS platform runs in v12n. It actually helps with problems like the lack of mirrored boot drives, and you can also get very fast I/O by not having to hit the wire.

I'm still looking for a specific technical reason not to virtualize.
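
If you want to check whether a guest is actually using paravirtualized devices, here is a minimal look from inside a FreeBSD/FreeNAS guest; the names in the grep patterns are just examples of what different hypervisors might expose, not a definitive list:

# list PCI devices and loaded kernel modules, then look for paravirtual hardware
pciconf -lv | grep -i -B3 -e vmware -e virtio -e xen
kldstat | grep -i -e vmx -e virtio -e xen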
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
Here is a very interesting quote by jkh.

The truth is somewhat more prosaic. I did the HyperV/9.2.1 release as an experiment which wasn't repeated simply because the required code was on a branch and it was too problematic to keep it up-to-date (anyone else wishing to build their own hyperv images need simply check out the feature/hyperv branch, sync it up to date as desired, and build their own FreeNAS).

I consider virtualization in general somewhat inevitable, regardless of whatever cyberjock may say or feel to the contrary, even though FreeNAS is currently designed to work on native hardware without virtualization interposed. Why? Because it's just too convenient, virtualization technology is steadily getting better, and there will probably come a day when even standard consumer/enterprise PCs ship with a hypervisor and "only fools" actually try to talk to the hardware directly (which may, in fact, no longer even support the notion of native OSes due to TPM / secure boot technology and other BIOS interaction issues which simply make running under virtualization the only way to go). When that day comes, FreeNAS will be just one of many software appliances.

In any case, when FreeNAS 10 is released, HyperV support will come along for the ride along with the VMWare support that is already there. In point of fact, FreeNAS has already shipped with the vmware tools necessary to make it interoperate properly with vmware for a number of years now, so claiming that HyperV isn't supported because "virtualization is bad" would be a rather inconsistent and hypocritical argument to make - that's not the reason at all. We simply haven't gotten around to HyperV in an official capacity yet and have no plans to do so until FreeNAS 10.
There are some interesting posts by @jgreco on the matter as well that aren't as negative as the pervasive attitude.

I'm not a tenured wizard or mod, but I have no doubt at all that this will be done reliably at some point. I would even argue that it can be done now under the right conditions. (That doesn't mean it should be promoted to the masses.) When I have a year of uptime I'll stand by that; for now you can call upon your elder statesmen. Consider SmartOS, and watch tiered storage and software-defined storage evolve. This is a game being played very hard by big money. MS is betting big in this space and will force innovation.

Someone will get this right and it will be ubiquitous, imho.
 
L

Guest
I still haven't found the code that is missing. There was a question recently about whether you can take a boot drive and just move it to a new server; the answer was an overwhelming yes. If that is the case, then my suspicion is that the kernel is not hardware-specific, which would make it a good candidate for virtualization.
 

no_connection

Patron
Joined
Dec 15, 2013
Messages
480
Nothing wrong with virtualization, and FN works fine. It is, however, a lot easier to break something virtual than a physical machine whose hardware doesn't change as easily.
The problem is that FreeNAS (or ZFS) has a bad temper and loses pools for no apparent reason. If it were more resilient there would be no problem whatsoever with virtualizing it.

The fact that pools can be lost is demonstrated a lot, usually by using somewhat "flaky" or "lesser" hardware. So that is not exclusive to VMs.

I find it a bit funny that there is an official VMware release, but you are discouraged from talking about it in the forum and are unlikely to get help with it.
 
L

L

Guest
I will help you. :)

There is a zpool option for flakiness:
failmode=wait | continue | panic

It was put into ZFS early on for v12n. You can set failmode to continue so that if a disk goes missing ZFS will keep going. Wait is the default.
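
A minimal sketch of flipping it, assuming a pool named tank (substitute your own pool name):

zpool set failmode=continue tank   # return EIO on new writes instead of hanging all pool I/O when devices go missing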
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
The problem is that there are a huge number of documented failures, and there are too many variables to narrow down specific failure modes, imho. I think the opinion that many who try it go cheap and do it inadequately is accurate, and that skews our reliability data. There are also significant user errors that skew the data. In addition, there have been so-called "experts" who fall on their face just as hard with a random pool loss. Plus, how do we statistically measure the delta between bare-metal failures and virtualization failures when there are so many variables and so much user error? It's all subjective bs on tiny datasets.

Add to that the fact that it is incredibly complex to troubleshoot a newb's virtual environment, and the time involved in doing so... and you have a perfect recipe for mod burn-out. It's easy to get wrong, and it requires proper resources and expertise to have a shot at reliability. Even then it is simply an added risk that may not have any justification. It is orders of magnitude cheaper to throw hardware at a dedicated box than to have a pro fight ESXi for even a day or two to make it work reliably.

I'm probably crazy. But to me this is just another OS. It's a cherry-picked FreeBSD kernel with a nice GUI. That ain't scary and doesn't rule out virtualization in the slightest. But meh. ;) Obviously it can be screwed up easily and often. I can't break the thing, and I try hard.

We know ultra-paranoid is a valid approach to storage. It is also valid to protect people from themselves. Honestly, in my perfect world there would be an iX-supported VMware appliance. I could then hand off the big jobs to dedicated TrueNAS boxes, or a mid-level tier if they had it.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
If you know anything about the topic you know it's not really in our control. We not only don't know *why* it suddenly fails, we don't even know *how* it fails. We've never been able to get a case where it was easily reproduced. Some have no problems for months, others can't get it to stay up for 24 hours. It works, someone does something very simple like rebooting their ESXi server (properly, I might add), and when they try to power the VM back on their pool is gone. No warnings, no errors, nothing. It's there, then it's not there.

As for Linda wanting to know the technical reason: we are clueless (which I've said in quite a few threads you have been involved in on this topic). We don't have answers; an ESXi customer called VMware and their answer was "Sorry, FreeNAS isn't on our approved list, so we don't support it."

No clue if the bug is on ESXi's side of the house or ours. But, considering that we never see these problems on bare-metal installs, I tend to think the hypervisor is doing something inappropriate for whatever reason. The weird thing is that FreeBSD itself doesn't seem to have this issue (this could be because FreeBSD users are typically higher-end users, so they know better and fix stuff on their own; I don't know). There are too many variables to make the argument that the bug is FreeNAS-exclusive, but there's not enough info to point the finger at any one thing.

ZFS should have access to the bare drives, so regardless of whether the bug gets fixed tomorrow, next week, or never, unless the design considerations of ZFS were to change it's going to be hard to truly claim it's something that's "a great idea".

And VT-d would still be a very serious requirement because SMART won't function without it. So there are still technical hurdles, and nothing appears likely to change in the foreseeable future. But, who knows what comes next?
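
If you want to see the difference for yourself, here is a rough check from inside the guest; the device name is a placeholder and the exact output depends on the hypervisor and how the disks are presented:

camcontrol devlist      # with VT-d passthrough the real disk model strings show up; without it you see the hypervisor's virtual devices
smartctl -a /dev/ada0   # SMART data only comes back when the guest can talk to the physical disk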

The reality is that I wouldn't expect this to change until an ESXi 6 comes out. No clue if/when that's gonna happen, as I'm not a VMware customer. Assuming they rewrite much of their code, they might fix the problem by chance. We don't know.

Still as confused as I am? That's because we just don't know.

Lots of questions and damn few answers.

Nothing wrong with virtualization, and FN works fine. It is, however, a lot easier to break something virtual than a physical machine whose hardware doesn't change as easily.
The problem is that FreeNAS (or ZFS) has a bad temper and loses pools for no apparent reason. If it were more resilient there would be no problem whatsoever with virtualizing it.

The fact that pools can be lost is demonstrated a lot, usually by using somewhat "flaky" or "lesser" hardware. So that is not exclusive to VMs.

I find it a bit funny that there is an official VMware release, but you are discouraged from talking about it in the forum and are unlikely to get help with it.

Yes, but you are missing information on why that was released. I'm also not going to discuss it, because it's not in my purview to discuss it.

I will help you. :)

There is a zpool option for flakiness:
failmode=wait | continue | panic

It was put into ZFS early on for v12n. You can set failmode to continue so that if a disk goes missing ZFS will keep going. Wait is the default.

@Linda Kateley

Actually, the default is continue. But it doesn't do quite what it sounds like from your post. Asking around a year or so ago, the only answer I really got back was "it doesn't do what we had hoped and it never will without restructuring ZFS". I could find almost no useful documentation on this property, nor could I find an example where it actually worked. So I have to think it is useless. Do you have any detailed documentation on the property, or a link to someone who used it successfully?
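
For reference, the property is at least listed in the zpool man page, and you can see what a given pool is actually using (pool name "tank" is a placeholder):

man zpool | grep -A6 failmode   # the short description of the property
zpool get failmode tank         # what this pool is set to right now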

Virtualizing is like going to a ballet. If everyone stays in step then it's a beautiful thing. But if one person doesn't have their game face on, everyone ends up lying on the floor looking silly.
 

NineFingers

Explorer
Joined
Aug 21, 2014
Messages
77
Thanks CJ -- that's the answer I was looking for. From this point forward, I'll just be happy that my FN is functioning well, doing what it was designed to do.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Nothing wrong with virtualization, and FN works fine. It is, however, a lot easier to break something virtual than a physical machine whose hardware doesn't change as easily.
The problem is that FreeNAS (or ZFS) has a bad temper and loses pools for no apparent reason. If it were more resilient there would be no problem whatsoever with virtualizing it.

Generally speaking, the issue is that FreeNAS is intended to be an appliance, an interface for physical hard drives intended to make them a resource for use on the network.

When you create, let's say for the sake of argument here, a DHCP server VM, you take a copy of FreeBSD and install it on an 8GB virtual disk. Smart IT admins using it for production at their company will have it on a DAS RAID controller, or perhaps a SAN RAID controller. Home users may have it on an unprotected nonredundant datastore, because, y'know, a quality DAS RAID setup costs $1K+ just by itself (LSI 9271-8iCC plus CV). Loss of the datastore means loss of the VM. This is pretty much expected behaviour, right?

The problem is that virtualization encourages desperate people to do dumb things. So you probably want to share more than 8GB. So you decide to put a bunch of SATA drives on your ESXi box's SATA ports, make a bunch of datastores on them, and then make some monster-ass virtual disks on them. Put FreeNAS on top. Seems to work great. There are, however, multiple issues here, including that you cannot get status of the drives, so you lose out on early failure prediction, and that when a drive does fail, ESXi appears to wedge the VM's I/O subsystem until the datastore becomes functional again ... which places you in a real bad situation, because it isn't going to become functional again. So you have to hit reset on the VM, which is, frankly, dangerous. It becomes more fun because often these people want to use FreeNAS as the SAN for their ESXi ... adding much complexity to an already nonideal situation.

Now, honestly, if you have an enterprise grade virtualization setup, one with high reliability (RAID and maybe multipath) datastores, you can create virtual disks on top of that, and run FreeNAS on it, and FreeNAS will be as reliable as the underlying datastore. But doing so still causes some issues: what do you do about ZFS self-healing, for example? You need to use multiple disks so that ZFS has redundancy available at a layer that it can access. So now you're maybe doing RAID on top of RAID. This is not going to be great for performance.
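
One way to keep self-healing available in that kind of setup is to hand the VM two or more virtual disks, ideally carved from separate datastores, and mirror them inside FreeNAS; a rough sketch with placeholder device and pool names:

zpool create tank mirror da1 da2   # gives ZFS redundancy at a layer it can actually repair with
zpool status tank                  # confirm both sides of the mirror are ONLINE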

There are no improvements coming that can cure stupid, or mitigate against lack of knowledge. These are the slayers of ZFS. FreeNAS in a VM is trivially possible and very pleasant if done right, but generally speaking the people who come to the forum looking for help are looking for a blessing to do it in some Wrong Way, or some Very Bad Way, or some Absolutely Going To End In Tears way.

My theory is that if you want ZFS to store your data, then you value your data, and if you value your data, then you shouldn't do things in some Wrong Way, or some Very Bad Way, or some Absolutely Going To End In Tears way.

So it isn't really a software problem. It's a PEBCAK class issue.
 

no_connection

Patron
Joined
Dec 15, 2013
Messages
480
I get that the problem is not easily fixed or easily reproduced, and that the devs would fix it promptly if they could. But that does not change the fact that it happens, or that it should not be happening.
And throwing hardware at the problem does not seem like a long-term solution. But at least for now that solution gives you a really awesome NAS in the process.

Not sure what to say to the fact that no one seems to know how ZFS works or how to fix problems with it.

Yes, but you are missing information on why that was released.
I probably am, but it's not on the download page, nor in the manual from what I saw. So it's a valid option when choosing a deployment method. If it were not, it would be hidden, together with a separate manual that stated its special purpose.
But again, I'm not a mind reader, so I can't see the intention behind it, only that it is there, with equal status to the other deployment methods.
It is also fine if I don't get it, as I am not using it.

Does ZFS version matter to the problem, or will it happen just as much with v5000 as with v28?
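
(For anyone who wants to check which of those their own pool is on, a quick look, with the pool name as a placeholder:)

zpool get version tank   # legacy pools report a number such as 28; feature-flag (v5000) pools report -
zpool upgrade -v         # list what this system itself supports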

*edit* We were typing at the same time.
I won't argue those points, as I have witnessed first-hand how stupid ESXi is with corrupt/bad sectors.
Doing it wrong or doing it for the wrong reasons is applicable to most setups, including bare metal.
And emulating a physical topology in a virtual environment is just asking for problems.
I'm not saying that a VM deployment should be treated the same way a physical one would.
As you say, it can be done and give a pleasant result (if it keeps working).
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I get that the problem is not easily fixed or easily reproduced, and that the devs would fix it promptly if they could.

The problems are easily reproduced; you simply do stupid things and "poofda."

The devs are not allowed to fix PEBCAK class issues, sadly.

My own attempts to "fix" the problem through discussion have led lots of people to successful VT-d based deployments, and perhaps a few FreeNAS-on-top-of-SAN style deployments too.
 

no_connection

Patron
Joined
Dec 15, 2013
Messages
480
That sentence was directed at CJ's post: "If you know anything about the topic you know it's not really in our control. We not only don't know *why* it suddenly fails, we don't even know *how* it fails. We've never been able to get a case where it was easily reproduced."

I was typing at the same time you posted, so I only saw it afterwards. I added some edits to it.
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
The problems are easily reproduced; you simply do stupid things
QFMFT ;) Thanks for that one.

I often think FreeNAS is just a gateway drug to ZFS. Certainly sucked me in.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
(attached image: CrackMansion.png)
 
L

Guest
So VMware doesn't support any ZFS, although there are thousands of people using it. I had this problem at Sun and then at Nexenta; people would ask me, "do you support xyz?" I would say no, I support my product. VMware does support VMs on NFS and iSCSI, but they aren't going to help debug the storage issues unless maybe it is EMC or you buy one of those huge contracts that include storage problems. If you have one of those, let me know... I will help fix what's broken. BTW, EMC has never had support for ZFS, same with NetApp but for other reasons (actually they had support for a very short time and then...)

As far as the property goes, I actually got it put back for a customer, while at Sun, who had all their zones running on ZFS on EMC storage. The storage would go missing, and everything would seize up. It works and has worked for them ever since.

So if there are hundreds of documented issues, where are they documented? Are there bugs filed? If so, that is where we will all get an answer. Bugs are the way to make change. I will poke around and see if I can find a bug. If everyone continues to file against that bug, then you can influence change. Chatting in the forum won't change anything.

Actually I did check, and with a quick scan I can see that most of the fixes are put back into TrueNAS, which is what people should be using anyway in a commercial production environment. That includes the VAAI cases. If you are going to use VMware you will want the VAAI stuff, especially iSCSI UNMAP. I found 128 bugs with the word VMware in them. In the first dozen or so I scanned, people are doing weird stuff...

IMHO FreeNAS is great for test and dev... TrueNAS for prod.
 
L

Guest
I will also go along with the people who said that if you want to go cheap, you will get cheap. But if someone wants to use FreeNAS with a supported version of ESXi, then we already know they haven't gone cheap. Continue and get a supported storage platform. Play with FreeNAS, deploy TrueNAS.
 