FreeNAS 9.3, RAIDZ2, iSCSI slow read

Status
Not open for further replies.

Hobbel

Contributor
Joined
Feb 17, 2015
Messages
111
Hey guys and girls,

I've been playing around with FreeNAS since last year. Until now I only had some small test machines; now I use two FreeNAS boxes solely for serving iSCSI (plus one NFS share for the ESXi logs). But I'm getting strange read/write speeds.

This is the setup:

#1 FreeNAS 9.3 x64
  • Intel S1200KPR miniITX
  • Celeron G1610 (2.60GHz Dual-Core)
  • 8GB Kingston ECC RAM
  • 4x WD Red 1TB (WD10EFRX)
  • Dual-NIC onboard
  • RAIDZ2
#2 FreeNAS 9.3 x64
  • Intel S1200KPR miniITX
  • Celeron G1610 (2.60GHz Dual-Core)
  • 16GB Kingston ECC RAM
  • 4x WD Red 1TB (WD10EFRX)
  • Dual-NIC onboard
  • RAIDZ2

On #1 there are 7 ZVOLs, all shared via iSCSI. Compression is off and the ZVOLs were created with the default values. All targets are accessed by Windows servers using MPIO and are formatted with NTFS.
On #2 I have created 5 ZVOLs: 3 for an ESXi 5.1 host running Windows Server 2008 R2, Windows 8.1 x64 and Xubuntu 14.04.1, and the other 2 accessed by the Windows DC for the Exchange DB and some file shares (CIFS sharing from FreeNAS caused some trouble with certain devices, like scanners).
Everything is connected on the same subnet.

running "dd if=/dev/zero of=/mnt/ZPOOL/testfile bs=4M count=10000" results in

#1
41943040000 bytes transferred in 209.444176 secs (200258803 bytes/sec) => ~190 MB/s, CPU ~20%

#2 (with running VMs, but idle)
41943040000 bytes transferred in 238.388392 secs (175944137 bytes/sec) => ~167 MB/s, CPU ~20%
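
For completeness, reading the same testfile back locally shows what the pool alone can deliver; with only 8-16 GB of RAM a 40 GB file can hardly be served from ARC, so this mostly measures the disks:

dd if=/mnt/ZPOOL/testfile of=/dev/null bs=4M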

This seems OK to me. But when I copy files on the mounted iSCSI drives, the results confuse me.

Server
  • Intel XEON 5130
  • 24GB ECC Kingston RAM
  • SAS LSI-RAID10
  • Dual Intel Pro/1000 EB
  • Server 2008 R2 SP1 x64
Here I mounted an iSCSI target from FreeNAS #1 with MPIO round-robin. Writing a 4 GB ISO to FreeNAS runs at ~120 MB/s, but reading the same file back to the server only manages about 20-50 MB/s. (Network throughput on FreeNAS is about 700-800 MBit/s for writing and only up to 400 MBit/s for reading, using both adapters.)
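
While such a copy runs, I can watch where the time goes on the FreeNAS side with something like this (pool name as in the dd test above):

zpool iostat -v ZPOOL 1    # per-vdev throughput and IOPS, refreshed every second
systat -ifstat 1           # per-interface network throughput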


ESXi 5.1 host
  • XEON E3-1225
  • 32GB ECC Kingston RAM
  • iSCSI with Intel Gigabit ET Dual Port Server Adapter (E1G42ET)
  • Multipath, roundrobin, IOPS=1
running "dd if=/dev/zero of=/vmfs/volumes/datastore/testfile bs=1M count=40000" (FreeNAS #2) results in
~50-60 MB/s, CPU ~25%, Network 200MBit/s
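
(For reference, the round-robin/IOPS=1 path policy was set roughly like this; the device ID below is just a placeholder:)

esxcli storage nmp device set --device naa.xxxxxxxxxxxxxxxx --psp VMW_PSP_RR
esxcli storage nmp psp roundrobin deviceconfig set --device naa.xxxxxxxxxxxxxxxx --type iops --iops 1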


Does this look right?
I believe reads should be much faster, and the ESXi host should be getting far more MB/s.

Yes, I know I should use MPIO with more NICs and separate subnets, but I don't think that is the bottleneck... please correct me if I'm wrong.

Thanks in advance. :)
 

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
Have you tried testing this without MPIO? I have a suspicion that MPIO may cause some request reordering, which was found to affect the ZFS read-ahead prediction code very badly. In FreeNAS 9.3 I included a special optimization against this issue, and it greatly improved read performance for a single link, but I don't know whether it is as effective with multipath.
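
A quick way to check on the FreeNAS box whether prefetch is active at all (assuming the stock 9.3 sysctl name):

sysctl vfs.zfs.prefetch_disable    # 0 = ZFS prefetch/read-ahead enabled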
 

Hobbel

Contributor
Joined
Feb 17, 2015
Messages
111
Have you tried testing this without MPIO? I have a suspicion that MPIO may cause some request reordering, which was found to affect the ZFS read-ahead prediction code very badly. In FreeNAS 9.3 I included a special optimization against this issue, and it greatly improved read performance for a single link, but I don't know whether it is as effective with multipath.

I tried without MPIO on the ESXi host, but there was no change: still only about 400 MBit/s and ~35-45 MB/s. I know ESXi is capable of much more, but I can't get it working with FreeNAS. What am I missing? (I don't want to blame FreeNAS, but a QNAP NAS did 80-100 MB/s before the switch to FreeNAS.)

The Windows servers now write at up to 200 MB/s (MPIO), but reads are still stuck at ~20-50 MB/s. So MPIO should not be the problem, at least for the Windows iSCSI.
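
To rule out the network itself, I could also run a raw throughput test; FreeNAS 9.3 ships iperf, so something like this should work (the IP is a placeholder):

iperf -s                           # on the FreeNAS box
iperf -c 192.168.1.10 -t 30 -P 2   # on a client: 30 seconds, two parallel streams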
 

Hobbel

Contributor
Joined
Feb 17, 2015
Messages
111
Yesterday I forgot to run "dd" from ESXi to FreeNAS #2 with "bs=16384" (the ZVOL blocksize is 16 KB). That run gave me ~75 MB/s, with ~300 MBit/s on both FreeNAS NICs.

I have read so much about ZVOL blocksize and iSCSI targets... In the end I used the GUI's default values (without compression). But I think I need to test a different ZVOL blocksize for ESXi and 4K for the Windows servers. I will report back tonight.
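
For tonight's tests I'll probably create the test ZVOLs from the shell so I can set the block size explicitly, roughly like this (pool and names are placeholders; volblocksize can only be set at creation time):

zfs create -V 200G -o volblocksize=16K tank/esxi-test    # 16 KB blocks for the ESXi datastore
zfs create -V 200G -o volblocksize=4K tank/win-test      # 4 KB blocks for the Windows servers
zfs get volblocksize tank/esxi-test tank/win-test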
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Add more RAM. iSCSI is about the worst possible workload for FreeNAS.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
On top of that, get CPUs that we actually recommend. The reason I recommend Pentiums rather than Celerons on the low end is that Pentiums have 3 MB of L3 cache while Celerons have 2 MB.

If you think you're going to run VMs on that machine, try going with 32GB of RAM minimum. As Ericloewe said, iSCSI for VMs is the absolute worst possible workload for ZFS. So you have to either accept the poor speed or counter it with more powerful hardware.
 

Hobbel

Contributor
Joined
Feb 17, 2015
Messages
111
Thanks for the reply. I knew I would be advised to get more RAM. ;)

In the end, the VMs are not the real problem. They are mostly idle and don't need much throughput, but until now I had never seen such "poor" iSCSI performance on an ESXi host with MPIO. I expected FreeNAS #2 to reach up to 100 MB/s.
My real problem is the difference between read and write. Why can the Windows server write at about 200 MB/s while reads never get beyond 50 MB/s?

Don't get me wrong... I love FreeNAS. I'm just trying to find the bottleneck. Yes sir, more RAM... ;) But is that the only thing slowing reads down to a quarter of the write speed?
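
One thing I can watch during a read test is whether reads are being served from RAM at all; the ARC counters are exposed as sysctls (names as on a stock 9.3 box):

sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses
sysctl kstat.zfs.misc.arcstats.size    # current ARC size in bytes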
 

Hobbel

Contributor
Joined
Feb 17, 2015
Messages
111
To update my own results:

Alongside the ZVOL iSCSI targets I just added a file extent... what can I say:

Windows:
Write: ~130-150 MB/s, CPU ~20-40%, both NICs ~600-800 MBit/s
Read: ~100-120 MB/s, CPU ~20-40%, both NICs ~400-700 MBit/s

(These results are a mix of file copies, ATTO Disk Benchmark and CrystalDiskMark; e.g. ATTO came in at 220 MB/s write and 115 MB/s read. I consider the middle ground of all of them a reliable value.)

For my setup the file extents seem to fit better. Tonight I will test file extents with ESXi. (I thought ZVOLs were supposed to run better on 9.3, but my setup seems to love file extents.)
 

Hobbel

Contributor
Joined
Feb 17, 2015
Messages
111
So here I go (I also ran the ZVOL tests again to compare results):

ESXi 5.1 file extent:
Write: ~75 MB/s, CPU ~20-40%, both NICs ~300 MBit/s
Read: ~81 MB/s, CPU ~10%; FreeNAS shows only one NIC at ~600 MBit/s, while ESXi shows both NICs at ~340 MBit/s

ESXi 5.1 ZVOL extent:
Write: ~43 MB/s, CPU ~20-30%, both NICs ~180 MBit/s
Read: skipped; the VMs needed to get back online.

I will go with the file extents.
Thanks for all the replies. Looking forward to building a new system with more RAM ;)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
In the end, the VMs are not the real problem. They are mostly idle and don't need much throughput, but until now I had never seen such "poor" iSCSI performance on an ESXi host with MPIO. I expected FreeNAS #2 to reach up to 100 MB/s.
My real problem is the difference between read and write. Why can the Windows server write at about 200 MB/s while reads never get beyond 50 MB/s?

Don't get me wrong... I love FreeNAS. I'm just trying to find the bottleneck. Yes sir, more RAM... ;) But is that the only thing slowing reads down to a quarter of the write speed?

Great. Fantastic news... we don't care about throughput. The RAM isn't there to increase throughput directly; it's there to cache I/O so you *can* get lower latency out of your pool (and, as a result, the potential for higher throughput).

I'm sorry, but I've been through this conversation about VMs dozens of times and I'm not going to hash it out again; I'm not even the only one who has. If you want to run VMs on a ZFS-based system, you're going to need resources. Lots of resources. That means spending money. Most people will be disappointed. ZFS protects your data at any and all costs, and the way you counter that "at all costs" is by spending more money to offset the performance hit.

Writes are *always* cached in RAM with iSCSI unless you've deliberately forced sync writes. So writes will *always* be faster because writing to RAM *is* hella fast. Reading from the pool is your bottleneck, and you counter it with more cache.

Oh, and if you read up on how CrystalDiskMark works, you'll find that it writes zeros. If you read up on FreeNAS you'll see that lz4 compression is enabled by default. So that 1 GB or so of zeros that CrystalDiskMark wrote and read back was fast because it actually wrote only about 100 KB to the pool. It also had to cache only about 100 KB, so it was a test you were *guaranteed* to get amazing performance from. You are lying to yourself with the whole file-based extents vs. zvol-based extents comparison, because the tests don't do what you think they do and you don't realize it.
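
If you want to see what is actually in play on a given extent, the properties are easy to check from the shell (the dataset/zvol name below is just a placeholder), and temporarily forcing sync shows what writes look like without the RAM cache softening them:

zfs get compression,compressratio,sync tank/iscsi-extent
zfs set sync=always tank/iscsi-extent      # every write must hit stable storage before it is acknowledged
zfs set sync=standard tank/iscsi-extent    # back to the default afterwards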

The devil is in the nitty gritty details. Attention to detail is 100% key to getting answers that mean something.

I'll bet you the purchase price of the FreeNAS software that if you ran real-world benchmarks that actually meant something, you'd find zvols blow away file-based extents in 9.3. I see it at least once a week. :)
 

Hobbel

Contributor
Joined
Feb 17, 2015
Messages
111
Again, thanks.

Like I said, the three VMs are idle most of the time. I can live with "less" throughput compared to high-end systems.
The "dd" tests from ESXi are faster with file extents. Correct me if I'm wrong, but if I write 20 or 40 GB with dd, my cache is far too small to cover that whole "performance test", isn't it? (For now, the VM topic is settled for this setup.)

I'm aware of that issue with CrystalDiskMark; there is also an option to use "random" data. AND compression is disabled on my pools and on all ZVOLs/datasets. ;)
I understand your point about lying to myself, but when real file copies get more MB/s, that seems like a small step in the right direction.

I was hoping for some hints other than "more RAM"...
But that is only an option for a new system. :(
 

gpsguy

Active Member
Joined
Jan 22, 2012
Messages
4,472
In addition to what's already been said, you should run striped mirrors instead of RAIDZ2. And keep them less than 50% full.


Sent from my phone
 

Hobbel

Contributor
Joined
Feb 17, 2015
Messages
111
In addition to what's already been said, you should run striped mirrors instead of RAIDZ2. And keep them less than 50% full.

I had that before, but four drives in striped mirrors = RAID10. Don't hit me for this: performance was so terribly bad that I moved to RAIDZ2.
I needed more performance than RAID10 gave me, but not as much as cyberjock seems to be used to. ;) For now, performance seems quite stable and good enough for this setup.
 

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
RAIDZ2 can give good single-threaded sequential read/write speed in MB/s, but such access patterns are not very typical for VMs. RAID10, on the other hand, can give much better IOPS (proportional to the number of mirrored groups for writes and to the number of disks for reads), which is more useful for multi-threaded loads.
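
(With four disks, a striped-mirror layout would be created roughly like this from the shell; the disk names are placeholders, and on FreeNAS the GUI volume manager does the equivalent:)

zpool create tank mirror da0 da1 mirror da2 da3    # two mirrored pairs, striped together
zpool status tank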
 

Hobbel

Contributor
Joined
Feb 17, 2015
Messages
111
I really appreciate all the replies. But like I said, the VMs are not the most important part; file-based usage dominates in this (Windows) environment.
Whether ESXi gets 75 or 100 MB/s does not matter, but the 50 MB/s at the beginning was far too little compared to what I've seen in similar environments...
 

c0re

Dabbler
Joined
Feb 11, 2013
Messages
26
RAIDZ2 can give good single-threaded sequential read/write speed in MB/s, but such access patterns are not very typical for VMs. RAID10, on the other hand, can give much better IOPS (proportional to the number of mirrored groups for writes and to the number of disks for reads), which is more useful for multi-threaded loads.

I pretty much completely agree with this. I have six 3 TB drives on a Pentium G2120 platform with 16 GB of RAM.

I used to run RAIDz2 and then started doing much more iSCSI to VMware and found the performance to be... lacking. Two weeks ago I backed up everything and rebuilt it in RAID10. Thus far I have been very happy with the performance. On my simple setup it's saturating 2Gb adapters in MPIO fairly easily on sequential tests. The gains on smaller non-sequential reads and writes have been drastic as well. I'm massively happier overall with the performance across various workloads with striped mirrors than in RAIDz2.

In fact, I'm not even sure why RAIDZ2 gets recommended so often anymore. Having experienced such a massive difference across many workloads, I'd probably only use it for archiving purposes...
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
RAIDZ2 is great for archiving as well as basic file sharing (aka home use). If you don't need ultra-high I/O then RAIDZ2 (and RAIDZ3) gives you superior cost per GB vs redundancy.
 

Hobbel

Contributor
Joined
Feb 17, 2015
Messages
111
Comparing RAIDZ2 and RAID10 depends on your point of view. I don't expect these low-end systems to serve several ESXi hosts with a dozen VMs. The main goal is a balance of performance and data "security", and RAIDZ2 makes me feel more comfortable in case of a disk error or the unlikely case of losing two disks at once.

I have the two boxes up and running smoothly, with throughput that is acceptable in this environment. The iSCSI file extents are fine for me (yes, ZVOLs would be a good deal, but ESXi only runs those three VMs and wouldn't benefit much from the VAAI features), and the overall throughput is now what I expected from this hardware.

Thanks for joining this conversation :)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
The nasty part about I/O latency is that it doesn't take much to go from milliseconds of latency to 30+ seconds. If you get enough latency you *will* start corrupting VMs. I've seen it PLENTY of times, and I can tell you it really happens and it really sucks. ESXi gives you a warning at something like 20 seconds of latency, and it's funny because people assume it means milliseconds and not actual seconds.

Here at home I have a couple of "play" VMs. I tried to run them on a RAIDZ2 with 32 GB of RAM (my main system, in my sig). The problem: they kept getting corrupted because excessive latency led to discarded writes. ESXi will only cache writes for so long before deciding to discard them. Once you hit that threshold, life goes over the cliff pretty quickly.

It's *really* hard to swallow, but ZFS needs RAM, L2ARC, etc. to get good performance (and to end up with something that isn't a hop, skip, and a jump away from trashing your VMs). Running VMs on ZFS is a "do it right or don't do it at all" situation because of the nastiness that can result. To make things worse, I've seen some people who thought they'd win by taking lots of snapshots. Well, when you have to merge those snapshots someday, that creates lots of I/O. I've seen quite a few people whose VM had a dozen or more snapshots and the entire VM went up in smoke because, partway through a snapshot merge, some writes were lost and the end result was a non-viable VM.

I've even got a system with 48 GB of RAM and 3 vdevs. I run two VMs, a Windows 7 VM and a Linux Mint VM. Both are "appliances" for me and neither does much of anything except sit around. But there have been times when I could hit 10+ seconds just trying to update one VM while the other was idle. The ESXi latency chart might sit at 2-5 ms for days, and suddenly it's a vertical line to 10+ seconds.

This is just meant as a warning. You might not care about performance (I didn't care about performance with my "play" VMs). But when they are constantly corrupting themselves, that gets old and the whole purpose of the VMs goes out the window.
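
If you want to see it coming, per-disk latency on the FreeNAS side is easy to watch in real time; something as simple as this is usually enough:

gstat    # watch the ms/r, ms/w and %busy columns while the VMs are under load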
 

Hobbel

Contributor
Joined
Feb 17, 2015
Messages
111
@cyberjock:
You're making me think twice about a topic I had marked as finished. Damn. ;)
But in the end... when it comes to ZFS, you have much more experience than I will ever have.
 