ZFS Noob
I would like to open by claiming to be nothing other than a n00b, as my username explains. What I'm trying to do here is learn about ZFS, particularly how it's implemented in FreeNAS, and especially those issues that affect its suitability as a backend to my XenServer cluster. My VMs are all Linux, and the ones most sensitive to storage subsystem performance are running MySQL, with a typical 70/30 read/write ratio.
So far I've been impressed with iSCSI performance, and greatly unimpressed with NFS performance. I'm willing to accept that I've misconfigured things somehow, or that something else is wrong with my deployment, but I thought the best way to open the discussion was to document what I've done, and the performance I'm seeing.
The server I'm testing was repurposed, but here's the layout:
- Dell R710
- SAS 6/i SATA HBA
- 4 Seagate Constellation ES.3 SAS drives
- 72 GB ECC RAM
- 1 Intel 320 SSD, currently installed as a log device.
I'd like people's opinions: are these results about what I should be expecting, or should I tinker with sync NFS more? I've run iSCSI for the last 5 years or so and it's performed fine; I'd just prefer to switch to NFS for administrative reasons (primarily that replicating snapshots is easier, and I can get away with one virtual machine repository rather than having to segregate them for snapshotting). If synchronous NFS is supposed to be this much slower than iSCSI, then so be it -- I'll rebuild the pool with my SSD as an L2ARC instead and be happy. If there's something else wrong, please point me in the right direction so we can try and figure out what it might be.
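(If it comes to that, I don't think the switch is even a full rebuild; unless I'm mistaken, it's just removing the log vdev and re-adding the SSD partition as cache, probably after repartitioning it larger, something like:)

Code:
# drop the SSD partition from its log role...
zpool remove nas1mirror gpt/slog
# ...and re-add it as an L2ARC (cache) device
zpool add nas1mirror cache gpt/slog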
PERFORMANCE TESTING
I'm no expert here, but the folks at StorageReview seem to be. My database workload is 71% read / 29% write, which is pretty close to their 70/30 tests, so I've duplicated their test in fio to generate the numbers here. My data file was only 1 GB in size, so I'm not arguing that these numbers represent what my server can do over sustained workloads, but since my databases are all under 10 GB in size I hope these numbers are close. I also hope the mirrored drives can keep up with my sustained workload, but there aren't that many writes in my environment, either.
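In case it helps anyone check my methodology, the fio job looked something like this (the 70/30 split and 1 GB file are as described above; the block size, queue depth, and runtime here are representative values rather than StorageReview's exact settings):

Code:
# 70% random reads / 30% random writes against a 1 GB file,
# run from inside the test VM
fio --name=sr7030 --rw=randrw --rwmixread=70 --bs=8k --size=1g \
    --ioengine=libaio --iodepth=16 --direct=1 \
    --runtime=60 --time_based --group_reporting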
My FreeNAS server is running with the default configuration, FreeNAS-9.2.0-RELEASE-x64 (Zpool created in the prior RELEASE version). The only thing non-standard is that I under-provisioned the SSD myself and added it to the zpool from the command line:
ZPOOL LAYOUT
Code:
zpool status
  pool: nas1mirror
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not
        support the features. See zpool-features(7) for details.
  scan: scrub repaired 0 in 0h0m with 0 errors on Mon Dec  9 22:45:50 2013
config:

        NAME                                            STATE     READ WRITE CKSUM
        nas1mirror                                      ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/1b5a485d-6129-11e3-a041-0026b95bb8bd  ONLINE       0     0     0
            gptid/1bce1e51-6129-11e3-a041-0026b95bb8bd  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            gptid/1c420e8a-6129-11e3-a041-0026b95bb8bd  ONLINE       0     0     0
            gptid/1cb736f5-6129-11e3-a041-0026b95bb8bd  ONLINE       0     0     0
        logs
          gpt/slog                                      ONLINE       0     0     0

errors: No known data errors
SLOG Layout (because I think I did this right, but a poorly configured SLOG would explain my problems)
Code:
gpart show ada0
=>       34  234441581  ada0  GPT  (111G)
         34       2014        - free -  (1M)
       2048    4194304     1  freebsd-zfs  (2.0G)
    4196352  230245263        - free -  (109G)
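From memory, the commands I ran were roughly the following (the 2 GB size and the slog GPT label match the output above):

Code:
# fresh GPT on the SSD
gpart create -s gpt ada0
# 2 GB partition for the SLOG, leaving the remaining ~109 GB unprovisioned
gpart add -t freebsd-zfs -l slog -s 2g ada0
# add the labeled partition to the pool as a log device
zpool add nas1mirror log gpt/slog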
THREE TEST LAYOUTS
I cleared one of my XenServer hosts and ran the tests on that machine. No other processing was going on other than testing. While the FreeNAS server being tested has 4 GigE ports LAGGed together, unfortunately the XenServer host is using active/passive LAGG, as required by XenServer/EqualLogic. That single GigE connection puts a ceiling on the performance we can see here.
The only thing that changed between tests is as follows:
- The first test had the test VM on an iSCSI share.
- Test 2 used an identical VM on an NFS share, with sync=disabled.
- Finally, sync was set back to standard, the test file was deleted, and the test was rerun. (The sync commands are shown just below this list.)
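The sync toggling itself was just a dataset property flip from the shell, along these lines (I'm showing the pool root here; substitute whatever dataset actually backs the NFS share):

Code:
# test 2: acknowledge writes immediately, bypassing sync semantics
zfs set sync=disabled nas1mirror
# test 3: honor sync requests again (the default)
zfs set sync=standard nas1mirror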
Results
What I expected was that iSCSI would be the fastest, async NFS would be a bit slower, and regular NFS would run at something like 50% of iSCSI.
I was wrong. Here's the quick summary:
[image: quick summary of results]
Here's a screenshot I took from the performance graphs on the FreeNAS server while the synchronous NFS test was running, to give you another look at the performance difference:
[image: FreeNAS network graphs during the three tests]
iSCSI: Blue. Async NFS: Green. Sync NFS: Red.
Here's the data I collected from FIO:
[image: FIO results table]
(Yeah, that's an image because I didn't know how to make a table in BBCode)
So, is this normal, or is something seriously wonky with my SSD, or is something else up?