FreeNAS as an NFS head for an enterprise SAN?


justin12

Cadet
Joined
Oct 13, 2013
Messages
1
Details all below, but I'm basically asking whether I should expect 15+ms of latency when exporting ZFS to VMware over NFS, due only to the O_SYNC issue, regardless of IO speed or ZIL speed.

In our testlab we have a bunch of servers with a few different enterprise storage arrays. Currently we use a VNXe to provide NFS datastores for VMware hosts. The VMs that go on these datastores are generally not IO intensive and are used as basic Windows servers (web, AD, vmm, dhcp, etc); there are probably around 60 of these active at any given time. Anything that needs "good" IO goes on block storage over Fibre Channel. Using NFS for these types of things has the huge advantage of freeing up FC switch ports for other bandwidth-hungry applications. If SAN switches and FC HBAs were free everything would use FC, but alas, they are not, so NFS is where I'm at. I prefer not to use iSCSI.

We are migrating most of our general purpose testlab storage to a VMAX 20k. I'd love to use something like FreeNAS on a VM that sits on a host with FC access to provide NFS storage to various VMware servers.

I've read a bunch of forum posts and followed many links posted out there by cyberjock (thanks) and others in an attempt to understand the best way to set things up. I've looked for other instances of people using enterprise storage behind FreeNAS and haven't found any. I know that I can buy an enterprise NFS filer head for the array. Let's just say a requirement here is not to spend money and leave it at that.

The physical host is a Cisco B420 (2.8GHz Sandy Bridge); all hosts have dual 10Gbit NICs. The VM currently has 2 vCPUs and 4GB of RAM. I know that's a low amount of RAM for an actual in-use FreeNAS appliance, but right now I'm just using a synthetic loadgen to test random write performance against a single 1GB file from a single VM; scaling the VM up to much more memory and CPU is not a problem and is expected in the future. I'm using storage from a tiered pool of disks on the VMAX comprised of EFD and 15k FC, but none of that should really matter since all the writes get absorbed by the array cache.
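To be concrete about the write pattern: what my loadgen does is roughly equivalent to the fio job below (fio, the mount path, and the file name are just illustrative stand-ins for the actual tool):

# 8 threads of 8K random synchronous writes against a single 1GB file in the guest.
fio --name=nfs-randwrite --filename=/mnt/testdisk/testfile \
    --size=1g --bs=8k --rw=randwrite \
    --ioengine=psync --sync=1 \
    --numjobs=8 --runtime=60 --time_based --group_reporting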

I created a striped ZFS volume from a pair of 100GB disks, exported it over NFS, mounted it on a VMware host, and created a disk on an existing VM that also has a disk from the VNXe via NFS. Since the goal is comparable performance, I ran the same tests against both to see what happens.

Given what I'd read in the forums, I wasn't terribly surprised that writes had some latency. I'm seeing around 2,000 8K writes/s at 3ms of latency -- not bad, but the same test on the VNXe got 4,000 8K writes/s at 3ms. Bumping the number of threads in the test from 8 to 32, the VNXe jumps to 7,500 writes/s at 3ms, while FreeNAS stays at 2,000 writes/s and latency jumps to 15ms. iostat on the FreeNAS host shows IO service times around 1ms, so the 15ms seen by the guest VM isn't at the disk layer -- I'm guessing it's O_SYNC related.

I read that others have alleviated this by putting the ZIL on an SSD. Given that I've got a huge amount of array cache, any LUN from the array is basically an SSD from a write-latency standpoint. I tried presenting another LUN and designating it as a ZIL, but that didn't change anything (iostat did show that the ZIL was used and flushed to the data disks every few seconds). UFS is worse. Disabling sync on the volume drops latency to 1ms (or less) at 10,000 writes/s. I'd prefer not to rebuild a crap ton of VMs, so I'm guessing I shouldn't actually run with that, but it does help narrow down the issue. The results and a rough command sequence are below.

target              threads   writes/s   latency (ms)
VNXe                8         4,000      3
FreeNAS             8         2,000      3
VNXe                32        7,500      3
FreeNAS             32        2,000      15
FreeNAS (no sync)   8         10,000     1
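
For reference, the FreeNAS-side steps were roughly the ones below; the device names are placeholders, since the VMAX LUNs show up under different names on my system (and I actually did most of this through the GUI):

# Stripe across the two 100GB LUNs:
zpool create tank da1 da2
# Later experiment: present a third LUN and use it as a dedicated log (SLOG) device:
zpool add tank log da3
# The sync experiment -- not something I'd leave enabled for VM storage:
zfs set sync=disabled tank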

I expect some latency just due to the number of hops to the disk, but is this type of latency expected inherently with FreeNAS? If not, what did I do wrong?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
For starters, you said "given that I've got a huge amount of array cache any LUN from the array is basically an SSD from a write latency standpoint", which I take to mean that your hard drives are behind a RAID controller with on-board cache. I can tell you from personal experience that using hard drives on a RAID controller with write cache enabled will hurt your pool's performance significantly (think 50% in the best case), because ZFS takes turns with reads and writes and doesn't try to do both at the same time. Thanks to the RAID controller's on-board cache, ZFS will "write" to the disk (which is actually just cached), and then when ZFS tries to read, the cache will flush to disk. Instant performance killer.
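
If you want to watch this happen, run your test and keep an eye on the pool and the underlying devices from the FreeNAS shell (substitute your own pool name for "tank"):

# Per-vdev pool I/O, refreshed every second:
zpool iostat -v tank 1
# Per-device latency and queue depth at the GEOM layer:
gstat -I 1s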

Secondly, ZFS is tuned to provide high performance with lots of RAM. I realize you are planning to scale it up later, but you can expect poor performance until you actually do. So in essence, you need to change your test VM to have the amount of RAM you plan to use with it (or at least give it something like 16GB; 32GB would be better).
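
Once you've bumped the RAM, you can sanity-check how much of it the ARC is actually using from the FreeNAS shell (these are system-wide counters, so no pool name is needed):

# Current ARC size and its ceiling, in bytes:
sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c_max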

Thirdly, ZFS on FreeNAS is tuned for high throughput, and that comes at the cost of higher latency. Depending on your load, your latency needs, and the hardware FreeNAS runs on, you may need to tweak ZFS. I will warn you: ZFS tweaking is not for the faint of heart. It can take weeks of dedicated reading and experimenting to get it right. Most users give up and go to UFS, which provides much better latency because it doesn't have ZFS's caching. But UFS can still be helped or hurt by your on-card cache, so more experimenting will be necessary.
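
If you do go down that road, the knobs are mostly ZFS sysctls/loader tunables that you can set from the FreeNAS GUI (System -> Tunables). The two below are only examples of the kind of thing you'd be reading about, not values I'm recommending:

# Examples only -- do the reading before touching anything like this.
vfs.zfs.arc_max="3221225472"    # cap the ARC at ~3GB so the rest of the VM has headroom
vfs.zfs.txg.timeout="5"         # seconds between forced transaction group commits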

ZFS is very complex, and no matter how complex you think it is when you read up on it, it's even more complex than that. Trust me, I'm even scared to try to tweak ZFS because of how much you have to know.

Keep in mind that this is a limitation of ZFS and not FreeNAS.

Good luck.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
Possibly flooding that array cache. I wonder what he thinks "a huge amount of array cache" is, and what the actual flush policies are.

But anyway: I would say it is a bad idea to use an artificially low amount of memory (4GB); that stresses ZFS, and it is expected to be slower with less write cache. It may also be a bad idea to set up a SLOG device on an array with unknown characteristics.
 