Raid 10 or RaidZ2

Status
Not open for further replies.

jkiel

Cadet
Joined
Aug 24, 2017
Messages
7
Currently configuring a 10-drive array (2 TB spindles) with a 960 GB NVMe SSD for L2ARC and a 400 GB Intel 750 NVMe SSD for SLOG. 128 GB RAM, 8-core Intel CPU (I forget which one), 10GbE.

This will be used primarily for VMware/vSphere storage over NFS. (IIRC, iSCSI is only faster because it uses unsafe async writes by default? Enabling sync on iSCSI makes it the same as or slower than NFS when used with VMware?)

I often see it recommended that RAID 10 is preferred over RAIDZx for VMware, but since everything will be written synchronously through the SLOG/ZIL, is there really any noticeable performance improvement over RAIDZ2? Isn't the SLOG device (or the network interface, if the SLOG is fast enough) the bottleneck for sync writes? What am I missing?
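
A quick back-of-envelope sketch of that reasoning in Python (the per-device figures are assumptions for illustration, not measurements):

Code:
# Which link caps synchronous NFS writes: the 10GbE network or the SLOG?
# Figures below are rough assumptions, not benchmarks.
network_bw = 10e9 / 8      # 10GbE payload ceiling in bytes/s, ~1.25 GB/s
slog_write_bw = 0.9e9      # assumed sequential write rate of a 400GB Intel 750

ceiling = min(network_bw, slog_write_bw)
print(f"sync write ceiling ~ {ceiling / 1e9:.2f} GB/s "
      f"({'SLOG' if slog_write_bw < network_bw else 'network'} bound)")
# Note: at low queue depth, per-write latency rather than bandwidth
# dominates, so the real ceiling can be far lower than either number.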
 

bestboy

Contributor
Joined
Jun 8, 2014
Messages
198
Mirrors achieve more read IOPS than raidz.

A raidz vdev of n drives gives you the read IOPS of 1 drive.
A raidz vdev of n drives gives you the bandwidth of n - p drives (p being the number of parity drives. e.g. 2 for raidz2).

For VM hosting, bandwidth is typically not an issue. Read IOPS of the disk subsystem, on the other hand, is an issue when the working set does not fit into ARC/L2ARC.
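
Those two rules written out as a sketch (Python; the per-drive numbers are assumed figures for illustration):

Code:
# One raidz vdev under the "vdev reads like a single drive" model.
def raidz_vdev_read(n_drives, parity, drive_iops=250, drive_bw=150e6):
    iops = drive_iops                            # read IOPS of 1 drive
    bandwidth = (n_drives - parity) * drive_bw   # bandwidth of n - p drives
    return iops, bandwidth

iops, bw = raidz_vdev_read(10, 2)            # the OP's 10-drive raidz2
print(iops, "IOPS,", bw / 1e9, "GB/s")       # 250 IOPS, 1.2 GB/s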
 

bestboy

Contributor
Joined
Jun 8, 2014
Messages
198
Because of how raidz works. A vdev behaves like a single drive.

Let's consider a read from a pool with a single raidz2 vdev of 8 physical devices. When an application issues a read request to the vdev, it enters a queue and is then processed. To fulfill a single read request, all 8 drives of the vdev have to do work: the 6 drives that contain a data block have to fetch it, and the 2 drives containing a parity block need to issue a fetch, too. When all 8 drives have executed their read operations and have their blocks, the original data is reconstructed by re-merging the 6 data blocks. Then the parity value is computed from the merged data and compared to the 2 parity blocks read from the drives. If the parity matches, everything is OK, and the merged data block can be decompressed and returned to the application.
The important thing to note here is that all 8 drives have to do work to reconstruct the original data. So 8 physical read operations add up to a single logical read operation, or:

A raidz vdev of n drives gives you the read IOPS of 1 drive.


Mirror vdevs work differently. If you have a 2-way mirror vdev and you issue a bunch of read requests to it, they first end up in a queue. Since each mirror device contains all the data and there is no parity to check, the read operation is simpler than in the raidz case: a single mirror device is asked to fetch all the data blocks needed to fulfill the read request. So in a mirror vdev, only 1 of the devices has to do work per request. And since we have 2 mirror devices, we can dequeue another read request and let the second device fetch those blocks while the first device fetches the ones for the first request. The 2 physical devices act independently for reads, which is why we get the read IOPS of each drive. Hence, in contrast to raidz:

A mirror vdev of n drives gives you the read IOPS of n drives.
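
The mirror counterpart of the earlier sketch, under this round-robin model (same assumed per-drive figures):

Code:
# One mirror vdev: each side can serve a different read concurrently,
# so read IOPS scale with the number of drives in the vdev.
def mirror_vdev_read(n_drives, drive_iops=250, drive_bw=150e6):
    return n_drives * drive_iops, n_drives * drive_bw

print(mirror_vdev_read(2))    # a 2-way mirror: (500, 300000000.0)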
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,358
bestboy said:
A mirror vdev of n drives gives you the read IOPS of n drives.

No. A mirror vdev still has the IOPS of a single drive. But since the vdev is only two drives wide, you probably have more vdevs and thus more IOPS.
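
That point in numbers, for the OP's 10 drives (drive_iops = 250 is an assumed figure):

Code:
drive_iops = 250

# Conservative model: every vdev, mirror or raidz, reads like one drive.
striped_mirrors = 5 * drive_iops   # 5 x 2-way mirror vdevs -> 1250 IOPS
raidz2_pool = 1 * drive_iops       # 1 x 10-drive raidz2 vdev -> 250 IOPS

# Under the round-robin model above, the mirror figure doubles to 2500,
# but the ordering is the same either way: striped mirrors win on read IOPS.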
 

bestboy

Contributor
Joined
Jun 8, 2014
Messages
198
Stux said:
No. A mirror vdev still has the IOPS of a single drive.
Are you sure? AFAIK that's true for writes, but not for reads.
Constantin Gonzalez said:
When reading from mirrored vdevs, ZFS will read blocks off the mirror's individual disks in a round-robin fashion, thereby increasing both IOPS and bandwidth performance: You'll get the combined aggregate IOPS and bandwidth performance of all disks.
and
Constantin Gonzalez said:
So if you have a number n of disks to create a single vdev with, what would be the fastest vdev?

This one is tricky. For writes, the situation is as follows:
  • Mirroring n disks will give you a single disk's IOPS and bandwidth performance.
  • RAID-Z with one parity drive will give you a single disk's IOPS performance, but n-1 times aggregate bandwidth of a single disk.
Does it make RAID-Z the winner? Let's check reads:
  • Mirroring n disks will give you n times a single disk's IOPS and bandwidth read performance. And on top, each disk gets to position its head independently of the others, which will help random read performance for both IOPS and bandwidth.
  • RAID-Z would still give you a single disk's performance for IOPS, but n-1 times the aggregate bandwidth of a single disk. This time, though, the reads need to be correlated, because ZFS can only read groups of blocks across the disks that are supposed to hang out together. Good for sequential reads, bad for random reads.
source: http://constantin.glez.de/blog/2010/06/closer-look-zfs-vdevs-and-performance
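
The model from that blog post, reads and writes together, as a sketch (assumed per-disk figures again):

Code:
# Single-vdev performance per the post above; parity=0 means an n-way mirror.
def vdev_perf(n, parity=0, drive_iops=250, drive_bw=150e6):
    if parity == 0:                               # n-way mirror
        write = (drive_iops, drive_bw)            # one disk's worth
        read = (n * drive_iops, n * drive_bw)     # round-robin over all disks
    else:                                         # raidz with `parity` disks
        write = (drive_iops, (n - parity) * drive_bw)
        read = (drive_iops, (n - parity) * drive_bw)  # correlated reads
    return {"write (iops, B/s)": write, "read (iops, B/s)": read}

print(vdev_perf(2))        # 2-way mirror
print(vdev_perf(10, 2))    # 10-disk raidz2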
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,175
Reads are distributed inside a vdev, to some extent.
 