Horrible Performance with FreeNAS as VM Storage


depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
It's only "faster" because right now you're presenting a sequential workload. Parity RAID does that quite well.

As soon as you start to put a multi-VM load or any other kind of random I/O against it, your performance drops significantly.
Isn't this due more to the fact that it's a single vdev, rather than to it being RAIDZ? IOW, if you had 10 mirror vdevs compared to 10 3-drive RAIDZ vdevs, I don't know if you would be able to tell the difference performance-wise.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Isn't this due more to the fact that it's a single vdev, rather than to it being RAIDZ? IOW, if you had 10 mirror vdevs compared to 10 3-drive RAIDZ vdevs, I don't know if you would be able to tell the difference performance-wise.

Yes, you would. Reads on the mirrors can be a lot faster, and the variable overhead lost to RAIDZ isn't there.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Isn't this due more to the fact that it's a single vdev, rather than to it being RAIDZ? IOW, if you had 10 mirror vdevs compared to 10 3-drive RAIDZ vdevs, I don't know if you would be able to tell the difference performance-wise.

I can tell you firsthand that 6 RAIDZ2 vdevs versus 6 mirrored vdevs on the same hardware in the same setup perform totally differently for VM storage. The RAIDZ2 kicked the living crap out of the 6 mirrored vdevs if you did things like dd tests (things that are sequential), but as soon as you started trying to do highly I/O-intensive work, the mirrors quickly polished off workloads while the RAIDZ2 was working hard. If memory serves me right, the sequential throughput of the RAIDZ2 was more than double that of the mirrors, but the mirrors did something like 7.5x more IOPS.
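To make the distinction between a dd-style run and IOPS-heavy work concrete, here is a minimal Python sketch of a sequential versus random 4K read loop against a single file. The path is hypothetical, and unless the test file is much bigger than RAM the numbers will mostly reflect ARC/cache hits, so treat it as an illustration of the two access patterns rather than a real benchmark.

import os, random, time

PATH = "/mnt/tank/vmtest.bin"   # hypothetical test file sitting on the pool
BLOCK = 4096                    # 4K reads, roughly what a busy VM issues
COUNT = 20000                   # reads per pass

def read_loop(path, offsets):
    # Time one read per offset and return reads per second.
    start = time.monotonic()
    with open(path, "rb", buffering=0) as f:
        for off in offsets:
            f.seek(off)
            f.read(BLOCK)
    return len(offsets) / (time.monotonic() - start)

last = os.path.getsize(PATH) // BLOCK - 1
sequential = [(i % (last + 1)) * BLOCK for i in range(COUNT)]        # dd-style streaming
scattered = [random.randint(0, last) * BLOCK for _ in range(COUNT)]  # VM-style random I/O

print("sequential reads/s:", round(read_loop(PATH, sequential)))
print("random     reads/s:", round(read_loop(PATH, scattered)))

On spinning disks the second number falls off a cliff compared to the first, which is the whole point.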
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I can tell you firsthand that 6 RAIDZ2 vdevs versus 6 mirrored vdevs on the same hardware in the same setup perform totally differently for VM storage. The RAIDZ2 kicked the living crap out of the 6 mirrored vdevs if you did things like dd tests (things that are sequential), but as soon as you started trying to do highly I/O-intensive work, the mirrors quickly polished off workloads while the RAIDZ2 was working hard. If memory serves me right, the sequential throughput of the RAIDZ2 was more than double that of the mirrors, but the mirrors did something like 7.5x more IOPS.

But one should note that that's highly dependent on the vdev composition. And the reason all this is true is obvious once you look at what's going on under the hood.

RAIDZn is optimized for the ZFS model. It isn't real "RAID5"-style parity, where there's a fixed, precomputed layout of where the parity blocks are, and therefore a RAID5 system might have to read existing data and parity back in before it can write new blocks (the classic read-modify-write penalty). As a combined filesystem and storage manager, ZFS has a different strategy, which basically boils down to "write parity blocks wherever is convenient for the current operation." This is the brilliant part of the ZFS dynamic stripe size.

For a ZFS RAIDZ3 vdev and a 128K block write, such as might happen during a sequential write of a large file, all ZFS has to do is locate 35 contiguous sectors (with 4K sectors, that's 32 for data and 3 for parity) and stream them out to the pool.

For a ZFS mirror vdev, and a 128K block write, ZFS needs to write 32 sectors to each disk, for 64 sectors written.

Mirror loses this one. Each mirror disk has to take the full brunt of the write, while for the RAIDZ3, each disk is only writing a handful of sectors.

For a ZFS RAIDZ3 vdev, and a 4K block write, what is that doing? A single data block write and three parity blocks? That's the sucky part of the ZFS dynamic stripe size. Ow, ow, ow. It's eating tons of space on the pool and writing to four devices for each individual 4K block stored.

For a ZFS mirror vdev, and a 4K block write, ZFS merely needs to blat out two sectors, one to each disk.

Mirror wins that. It's more space efficient AND writes less to the disks.
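To put rough numbers on the write-side arithmetic above, here is a back-of-the-envelope Python sketch. It assumes 4K sectors (ashift=12), a 2-way mirror, and a RAIDZ3 vdev wide enough that the whole block lays out in a single row, and it ignores padding and metadata, so it's an illustration of the argument rather than an exact model of the allocator.

SECTOR = 4096  # assuming 4K sectors (ashift=12)

def raidz_sectors_written(block_bytes, parity=3):
    # Data sectors plus one parity sector per parity level,
    # assuming the stripe fits in a single row of the vdev.
    data = -(-block_bytes // SECTOR)  # ceiling division
    return data + parity

def mirror_sectors_written(block_bytes, copies=2):
    # Every side of the mirror writes the full block.
    data = -(-block_bytes // SECTOR)
    return data * copies

for size in (128 * 1024, 4 * 1024):
    print(f"{size // 1024}K write: RAIDZ3 ~{raidz_sectors_written(size)} sectors, "
          f"2-way mirror ~{mirror_sectors_written(size)} sectors")

That reproduces the 35-versus-64 and 4-versus-2 sector counts above: the RAIDZ3 wins big on large sequential blocks and loses on small ones.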

For a ZFS RAIDZ3 reading a 128K block, you get to go read the data off the disks. There's only one copy. So since 32 sectors need to be read, spread across all of your vdev disks (you don't have something wider than 32 disks!), all the disks in the vdev are commanded to go to the given stripe and read that data. This is inherently inefficient because even though each drive is an independent mechanism, they end up moving their read heads in unison, like a lineup of Broadway dancers kicking their legs together.

For a ZFS mirror reading a 128K block, only one component needs to move its heads to read the data. The other disk can actually be serving some other transaction. Parallelism at work.

Mirror wins that. Mirror can be doing twice (or thrice, if you have a 3-way mirror) the work.
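Following the same reasoning, here is a crude Python estimate of what that parallelism does to random read IOPS at the pool level. The per-disk figure and the 12-disk layouts are hypothetical, and the model ignores ARC, prefetch, and small reads that don't touch a whole stripe; it's only meant to show why the gap is measured in multiples rather than percentages.

DISK_IOPS = 150   # hypothetical random-read IOPS for one 7200 RPM drive
TOTAL_DISKS = 12  # hypothetical pool

def mirror_pool_read_iops(total_disks, per_disk=DISK_IOPS):
    # Each side of each 2-way mirror can be serving a different read.
    return total_disks * per_disk

def raidz_pool_read_iops(total_disks, width, per_disk=DISK_IOPS):
    # A full-stripe read touches every disk in the vdev at once,
    # so each RAIDZ vdev acts roughly like a single disk for random reads.
    return (total_disks // width) * per_disk

print("6 x 2-way mirrors:", mirror_pool_read_iops(TOTAL_DISKS), "random read IOPS (rough)")
print("2 x 6-wide RAIDZ2:", raidz_pool_read_iops(TOTAL_DISKS, width=6), "random read IOPS (rough)")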

For a ZFS RAIDZ3 reading a 4K block, though, this is a tough one. You only really need to involve a single disk. In theory you could get a whole bunch of parallelism out of this, if there's other work to be done.

For a ZFS mirror reading a 4K block, it's the same as above: only one component needs to move its heads to read the data. The other disk can actually be serving some other transaction.

It isn't clear that there's a winner here. In theory, RAIDZn could win that because there's an opportunity for more parallelism. In practice, though, it doesn't seem like this gets its moment in the spotlight, because RAIDZ3 ends up hurting so badly for other operations. The mirror vdev has a much greater consistency to its performance.

In the end, the mirror vdev is much more consistent in its performance characteristics. It will read at up to twice the speed of the underlying disks, is capable of up to twice as many read IOPS as the underlying disks, and its write IOPS and speed closely resemble a single component device.

Meanwhile, RAIDZ3 is heavily influenced by the block size of the given transaction. There are obvious avenues for wins, but also paths to suck.
 