Great topic from jgreco; it has made FreeNAS a reality in my home lab / consolidated server, even when at times it would have been easier to quit. I'll be honest, I was dragged kicking and screaming into compliance, but I'll share my journey and my reasoning so that you can avoid the pitfalls I fell into.
Years ago, I purchased a WD "MyBook" NAS-in-a-box. When I only got 5 MB/s over gigabit ethernet - too slow to use and essentially impossible to back up - I ripped the disks out of it and put them in my PC, mirrored on a cheap VIA RAID card: altogether better, but defeating the object. Fast forward five years: my PC was long in the tooth, and the VIA RAID driver, which hadn't been updated in years, proved to be the cause of increasingly frequent BSODs, so it was time for the next computer. Having gained a lot of ESXi experience in the meantime, I figured it was time for a home lab and another stab at NAS. Exactly what this thread cautions against.
My server build is based on the ASUS Z9PA-D8 dual-socket LGA 2011 (Xeon E5 series) motherboard, with LSI 9240-8i controllers for the storage. For decent performance and minimum rebuild times I wanted lots of small disks, which allowed SATA rather than the much more expensive SAS. I have 8 x 750 GB 2.5" WD Blacks as the NAS zpool, another four as RAID0 temporary storage (e.g. for HD video editing) and as spares to pillage when a NAS drive does need replacing, and 4 x 1 TB 3.5" WD Blacks in RAID10 for the hypervisor and VMs. The 12 x 750 GB disks are in a pair of Icy Dock drive bays.
I agree unequivocally that the ZFS volume has to be accessible from bare metal when the hypervisor fails; where I didn't particularly agree was that hardware RAID should be bypassed. The whole point of a smart LSI controller with its own PowerPC processor is to manage and recover from drive failures without loading the host CPUs or operating system. I set about doing performance tests with ESXi and was sorely disappointed with the LSI 9240-8i in RAID5: under VMware ESXi 5.5, just copying VMs (serial writes of large objects) averaged a pathetic 5 MB/s write and 100 MB/s read. Using disk-testing tools within VMs, write speed was up to 60 MB/s and read speed up to 2,400 MB/s, but that wasn't representative of normal use. I figured the LSI was underperforming, and reconfiguring as RAID50 would confirm it if that were just as slow - which it was! So at that point I shrugged, reconfigured the drives as RAID10 and got fantastic performance. A bit of googling then confirmed what I already knew by that stage: the LSI 9240-8i isn't good at XORing.
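For anyone wanting to reproduce this kind of test, rough-and-ready sequential numbers can be had from inside a guest with something like the following. The path and sizes are just examples, and caching will flatter the read figure unless the test file is bigger than the guest's RAM:

  # Sequential write: 4 GB of zeros in 1 MB blocks
  dd if=/dev/zero of=/mnt/test/bigfile bs=1M count=4096
  # Sequential read of the same file back into the void
  dd if=/mnt/test/bigfile of=/dev/null bs=1M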
I built my FreeNAS on this RAID10 (stubbornly determined to let the hardware do the job it is supposed to do), booting from a USB stick, and all was wonderful. Performance was fabulous. I then went to build it on ESXi 5.5 with PCI passthrough of the LSI 9240-8i ... and it started with tears and got nowhere. I eventually found that there's a bug in ESXi that has been there since version 4 and that they seem to be in no hurry to fix: the FreeBSD driver resets the LSI during boot configuration and it never comes back - just a succession of timeouts reported. At this point I was on the verge of giving up on FreeNAS, but lots of googling later I figured it was worth persisting, because there was no better candidate, and one thing I don't have time or enthusiasm for is reinventing the wheel.
So, I finally conceded defeat on the 9240-8i, on the basis that VMware won't fix the passthrough, and reflashed it with the IT-mode firmware so it presents as a plain 9211-8i HBA. I rebuilt my USB-stick install of FreeNAS and got a nice RAIDZ2 volume. What I don't understand is how 8 x 750 GB produces just 2.5 TB of RAIDZ2 volume, when I'd have expected about 4.5 TB; I must be missing something... My hunch is that FreeNAS configured it "optimally" as a pair of 4 x 750 GB RAIDZ2 vdevs, so I have lost half the disk space to parity, with the ability to lose four of the eight disks (two per vdev)! I then went to build it on ESXi with passthrough of the masquerading IT-mode HBA, and it works fine. I cheated on the VMware Tools installation - I just took the FreeBSD 9.0 drivers from the package and didn't install Perl or run any scripts - but this works a treat too, and I have my VMXNET3 interfaces working at 10 Gbps and FreeNAS delivering fabulous performance to both VMs and physical hosts.
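If anyone wants to check what FreeNAS actually did with the disks, the layout is visible from the shell; "tank" below is a placeholder for whatever the GUI named the pool:

  # Two raidz2 groups of four disks each would confirm my hunch
  zpool status tank
  # Raw vs usable space: two 4-disk RAIDZ2 vdevs give 2 vdevs x 2 data
  # disks x 750 GB = 3 TB decimal, i.e. about 2.7 TiB, and ZFS overhead
  # accounts for the rest of the drop to the ~2.5 the GUI reports
  zpool list tank
  zfs list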
There is, however, a downside. In 9240-8i RAID10 mode, I could unplug and replug a disk and it would just rebuild on the fly, as it should. Now that the LSI controller is in humble IT mode, I can unplug and replug a disk but FreeNAS doesn't detect the replacement, and a reboot is needed to pick it up again. Satisfactory for a home NAS, but not in a real production environment (read the subject of this thread again ;)). Thus my preference for hardware RAID appears vindicated - had it actually worked through ESXi! However, FreeNAS appears to be pretty clever at working out that the replugged drive is still fine... I will have to comb the syslog to see whether it actually did a rebuild (resilver) during boot, before I could get back into the GUI.
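On plain FreeBSD one can usually coax a replugged disk back without a reboot by rescanning the CAM bus; I haven't verified this on my FreeNAS build yet, and da3 below is an example device name (FreeNAS tends to refer to pool members by gptid, so yours will differ):

  # Rescan all buses so the reinserted disk reappears, then check it's back
  camcontrol rescan all
  camcontrol devlist
  # If the pool still shows it as OFFLINE/UNAVAIL, bring it back in,
  # or tell ZFS it's a replacement disk in the same slot
  zpool online tank da3
  zpool replace tank da3
  # Watch the resilver progress instead of combing syslog
  zpool status -v tank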
So, aside from the mysterious loss of storage space, I'm mostly there. It's not necessarily the end of my journey, though, because I'm also trying to pass through a GPU so that the ESXi server can double as a decent workstation. That makes reflashing the LSI seem trivial - there are guys changing the resistors on their GPUs to get them to masquerade as different graphics cards on the PCI bus so that they can be passed through. The crux of the issue is backward compatibility with all the previous, now-defunct graphics standards, which ESXi PCI passthrough doesn't deal with and VMware don't care about because it's a server hypervisor! I'm not prepared to put a soldering iron to a brand-new GPU, so I have a supposedly good bet on order, with passthrough reportedly possible on both ESXi and XenServer. If ESXi proves impossible and XenServer works, then I'll be revisiting my FreeNAS build and attempting to get it going on XenServer; for now it's a test bed, and I will be hammering it to make sure nothing breaks before transferring my data across for the final time.
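Incidentally, ESXi keeps its PCI passthrough quirks table in /etc/vmware/passthru.map, and some people report GPU passthrough working after adding an override for their card's vendor/device ID; the entry below is a placeholder illustrating the format rather than a recipe I've verified:

  # /etc/vmware/passthru.map columns: vendor-id device-id reset-method fptShareable
  # 10de is NVIDIA's vendor ID; ffff is a wildcard device ID; d3d0 forces a
  # D3-to-D0 power-state reset instead of the default link/bridge reset
  10de  ffff  d3d0  false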
I've spent the last few years working for software vendors who insisted on bare-metal deployments while the tide of virtualisation and cloud turned against them, before reluctantly bowing to the inevitable and adapting. Basically, if you don't support both virtualisation AND cloud, expect to become obsolete and forgotten within just a couple of years. I hope the FreeNAS team will choose to adapt rather than become obsolete, and that what we're doing here will become normal and supported.
