Rules of thumb re: NFS vs iSCSI choices in virtualized environment

Status
Not open for further replies.

ZFS Noob

Contributor
Joined
Nov 27, 2013
Messages
129
I've delayed my testing for the last couple of weeks due to care for an ailing family member, but I've got FreeNAS on my mind today. Distractions are good sometimes.

Anyway, I'm running a Xenserver cluster of 3 hosts, and I'm evaluating a move from EqualLogic iSCSI storage to FreeNAS. Until now I've just used iSCSI because it was recommended, but now I have options and am trying to understand the implications of one choice over the other.

My test server has 72G of RAM, 4 drives in a RAID10-ish configuration, and a properly partitioned Intel 320 SSD as an SLOG. 4 GigE links in the server are set up as LAGG across a pair of switches that support LAGG, and it's working fine.

Now we get to choosing how to connect to FreeNAS. iSCSI just works, mostly. Xenserver supports it natively, and it looks fast because Xenserver does async writes to iSCSI. The problem is that I'm really giving up snapshotting ability with iSCSI unless I back the iSCSI extents with files, and if I do that then I'll need to create a separate iSCSI extent for each VM if I want to be able to roll back a snapshot for a particular VM.
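(For what it's worth, the per-VM layout I have in mind is roughly the following; the pool and VM names are just placeholders, and the same idea applies whether the extents are file-backed or zvol-backed:)

  zfs create tank/vms
  zfs create -V 50G tank/vms/web01     # one extent per VM...
  zfs create -V 50G tank/vms/db01      # ...so each can be snapshotted on its own
  zfs snapshot tank/vms/web01@pre-upgrade
  zfs rollback tank/vms/web01@pre-upgrade   # per-VM recovery; add -r if newer snapshots exist (they get destroyed)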

If I use NFS then snapshots work a bit better. I don't know that I can recover a single VM via rollback, but at least I can clone a snapshot and move the affected VM data back with a reasonable degree of granularity. The problem I'm seeing is that NFS is slower because Xenserver issues synchronous writes over it, and it stays slower even if I set sync=disabled on the ZFS side for testing. I'm assuming the remaining overhead is related to the synchronous write path, and the way to eliminate it would be to disable sync on the Xenserver cluster itself. Which, of course, is not recommended.
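(In case anyone wants to reproduce the test, the sync toggling is just a per-dataset property; the dataset name below is a placeholder:)

  zfs get sync tank/nfs-vms              # standard by default
  zfs set sync=disabled tank/nfs-vms     # testing only; unsafe for real VM data
  zfs set sync=standard tank/nfs-vms     # back to honouring client sync requests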

The performance penalty isn't huge, but it's noticeable. The simplest test was to reboot a VM and time it, then migrate the VM to the test array and try again. Initial results were:
  • Shutdown time: 21s for EQL, 40s for NFS on FreeNAS
  • Complete reboot: 51s for EQL, 1:18 for NFS on FreeNAS
Question 1: Before I get to deeper testing, is this level of slowness comparable to what you'd expect when moving from iSCSI to NFS in general, or is there something else slowing down my system that I haven't isolated yet? The EQL should be pretty heavily optimized.
Question 2: Documentation in Xenserver doesn't really cover this, but I'd assume the transaction-style writes that ZFS performs would make a ZFS-backed iSCSI store safer than straight iSCSI, wouldn't you? I'd also assume that async NFS writes with FreeNAS as a backend would be safer than straight iSCSI, though I think if I test enough I can get sync to be about as fast as async for the running applications. Measurable differences, but unnoticeable all the same.
So where are my expectations off, and where should I be looking to test next?
Applications are database-backed web servers, mostly, with APC on the web servers and MySQL doing caching on the SQL servers. Database size is variable, but < 8 gigs of tables for the largest now.
 

KTrain

Dabbler
Joined
Dec 29, 2013
Messages
36
Based on the reading that I've done, I feel like NFS is generally the "rule of thumb" option for shared storage in a virtualized environment. Not sure what to recommend for testing going forward, but I'd stick with NFS regardless.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
The snapshot issue is correct. Snapshots of a single iSCSI extent are easily made but then annoyingly useless because you'd basically have to overcome difficulties with making an additional read-only export, and then maybe problems in the virtualization layer such as the datastore name and identifiers colliding with existing production data (I imagine a train wreck is possible for ESXi and probably Xen).
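To be concrete about why it's annoying: the ZFS side is trivial, it's everything after that which hurts. Something like (names invented for illustration):

  zfs snapshot tank/vm-extent@before-patch
  zfs clone tank/vm-extent@before-patch tank/vm-extent-restore
  # now you still have to publish tank/vm-extent-restore as a *second* iSCSI extent,
  # attach it to the hypervisor without the duplicate volume/datastore identifiers
  # colliding with the live one, and copy the VM's data back out by hand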

Without specific knowledge of the pool setup, it is hard to comment on possible optimizations. Do note:

1) RAIDZn is bad for performance, and for this kind of workload it's bad for storage efficiency too. ZFS likes large blocks, but that interoperates poorly with something expecting to write small blocks, like virtual disks. The interactions can be rather opaque. (The relevant tuning knobs are shown after point 2 below.)

2) sync is simply a function of whether or not writes are committed to persistent storage before they're acknowledged. You always want this, unless you don't value your VM data. Basically, imagine the mayhem if you were running a computer, disconnected the hard drive, the OS then pushed out some writes to the disk without noticing it wasn't there, and then you reattached the hard drive. Might seem harmless, but what if what it tried to write was metadata such as the free block table? Then suddenly what the computer THINKS is on the disk and what's actually there are inconsistent, and things could go badly from there.
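On the block size point from 1), the knobs involved are recordsize for NFS datasets and volblocksize for zvols, and the latter is fixed at creation time. The dataset and zvol names below are only examples:

  zfs create -s -V 200G -o volblocksize=16K tank/vm-extent   # zvol backing an iSCSI extent
  zfs set recordsize=16K tank/nfs-vms                        # dataset backing an NFS SR

Smaller blocks line up better with what guests actually write, but on RAIDZn they also inflate the parity and padding overhead, which is part of why mirrors get recommended for this.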
 

ZFS Noob

Contributor
Joined
Nov 27, 2013
Messages
129
jgreco,

1) I agree, so I default to RAID10 for deployments. I just don't like the trade-offs with parity RAID, though I make exceptions for backup servers, which all run RAID6.

2) I understand this conceptually, but in practice I couldn't make the SLOG work. I cheaped out with an Intel 320, and the performance I was seeing from SLOG-backed NFS with sync=standard was around 2% of what I was getting with iSCSI. I also had a random reboot happen (I believe it was because the NFS system was so slow and I'd turned on 5-minute snapshots/snapshot pushes to another FreeNAS box in my rack), and when the storage came back up an fsck was required, even with the 320 configured as a SLOG. Add in the 30-second login process and other speed-related issues, and I just gave up on it.

[Attached image: iop_comparison.gif — IOPS comparison]
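(If anyone wants to reproduce the SLOG part of that test: attaching and detaching the log device is non-destructive, and the device name here is just an example of how mine was labelled:)

  zpool add tank log gpt/intel320-slog      # attach the SSD partition as a dedicated SLOG
  zpool status tank                         # confirm the log vdev is there
  zpool remove tank gpt/intel320-slog       # log vdevs can be removed again if they don't help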


For now, I think I'm stuck with iSCSI. If I had a larger budget I'd build a custom server for FreeNAS, include > 128G of RAM rather than my 72, and spring for something like a ZeusRAM just to see how things work, but I really can't justify all that at this point.
 

viniciusferrao

Contributor
Joined
Mar 30, 2013
Messages
192
ZFS Noob, I have a setup like yours, so let me post my experience.

Today I have shitty performance due to bad choices in the past: a zpool containing a single 10-disk RAID-Z2 vdev with 3TB disks. My IOPS suck hard. Bandwidth is fine, but my Exchange Server is waiting 20 seconds for some I/O operations! This is mainly due to that bad choice and the limited RAM in my system: 32GB, because of the Xeon E3 platform.

I'm migrating the VMs to another FreeNAS server just to offload the pool. Then I will set up new hardware with two Xeon E5s and 64GB of RAM. I will use the same disks (plus 4) to create a 14-disk RAID10-like ZFS beast with SLOG and L2ARC. We already have 120GB SSDs for the L2ARC (not SLC, but with high endurance).
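The planned layout boils down to striped mirrors plus the two SSD roles; roughly this, with made-up device names (on FreeNAS you'd build it through the GUI, which does the equivalent):

  zpool create tank \
    mirror da0 da1  mirror da2 da3  mirror da4 da5  mirror da6 da7 \
    mirror da8 da9  mirror da10 da11  mirror da12 da13 \
    log da14 \
    cache da15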

All our LVM-over-iSCSI pools run with sync=always. Keep in mind that sync=standard is NOT an option when running VMs. I did a test in the past "just to see" the performance of sync=disabled on a VM pool, and I lost my test VM. The xvda disk was beyond repair; it would only mount read-only and always complained about not having root permissions to run fsck. So, please, use zfs set sync=always <yourVMpools>.
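If the zvols all live under one parent dataset, you can set it once there and let the children inherit it (names are placeholders):

  zfs set sync=always tank/vms
  zfs get -r sync tank/vms    # children should show "inherited from tank/vms"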

The snapshot feature is a no-go. Unfortunately, I only create snapshots inside the LVM-over-iSCSI SR from XenCenter.

Another thing that I'm testing now is compression. It appears to improve I/O, but I'm not sure how compression works over iSCSI. I can understand it perfectly with file-level protocols like NFS, but with iSCSI serving blocks I can't see how compression will work, or how the available space will be presented to the hypervisors (XenServer in my case).
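As far as I can tell the block/file distinction doesn't matter much: compression happens per ZFS block underneath the zvol, and the LUN the hypervisor sees is still the full volsize; compression only changes how much of the pool the zvol actually consumes. Roughly this (dataset names are placeholders, and it only affects blocks written after it's turned on):

  zfs set compression=lz4 tank/iscsi-vms
  zfs get compression,compressratio,volsize,used tank/iscsi-vms/exchange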

Wrapping it up: RAID10-like layout, lots of RAM, iSCSI with sync=always, L2ARC, SLOG, snapshots on the hypervisor.
 

ZFS Noob

Contributor
Joined
Nov 27, 2013
Messages
129
All our LVM-over-iSCSI pools run with sync=always. Keep in mind that sync=standard is NOT an option when running VMs. I did a test in the past "just to see" the performance of sync=disabled on a VM pool, and I lost my test VM. The xvda disk was beyond repair; it would only mount read-only and always complained about not having root permissions to run fsck. So, please, use zfs set sync=always <yourVMpools>.
What kind of performance are you seeing there? sync vs async on NFS has a horrendous performance penalty on my server, and if synchronous iSCSI is as slow, then it simply won't be usable.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
There isn't a reason to expect that the performance hit wouldn't at least be similar.
 

ZFS Noob

Contributor
Joined
Nov 27, 2013
Messages
129
There isn't a reason to expect that the performance hit wouldn't at least be similar.
That's my thinking too. :( Is this comparable to the performance hit others see, or is it at least close enough to determine nothing's clearly broken with my setup?
With things running that slow (VMs take 8 minutes to boot, 20-30 seconds to login) things are just unusable. I'd prefer to run sync=always, and I'll probably set that as a goal when it's time to spec out a new storage server later on.
For now, I'll do iSCSI and rely on fsck and redundant backups in case of problems. I'm in a level 4 datacenter with solid power, but you never know what might happen...
 

viniciusferrao

Contributor
Joined
Mar 30, 2013
Messages
192
That's my thinking too. :( Is this comparable to the performance hit others see, or is it at least close enough to determine nothing's clearly broken with my setup?
With things running that slow (VMs take 8 minutes to boot, 20-30 seconds to login) things are just unusable. I'd prefer to run sync=always, and I'll probably set that as a goal when it's time to spec out a new storage server later on.
For now, I'll do iSCSI and rely on fsck and redundant backups in case of problems. I'm in a level 4 datacenter with solid power, but you never know what might happen...


Things are slow here due to a bad implementation: a 10-disk RAID-Z2. But it's not like 8 minutes to boot; I can reset a server right now and it will be up in a few seconds. I don't have numbers, because I haven't benchmarked the pool properly.

But if you want/need numbers, XenCenter reports the following disk usage at this moment for some servers:
  • Exchange 2013: 3412/7890
  • Linux Repo: 3268/9802
  • Zabbix: 312/431
  • 20 VMs averaging 2/3

I can post this screenshot of Windows System Monitor:
https://www.dropbox.com/s/b8q1njp4on9zfvd/Screenshot 2014-01-13 01.58.36.png

Even with the sluggish pool, the performance is acceptable. You can get the login page pretty fast at: https://webmail.if.ufrj.br
 