Hi everyone,
I know this topic has been discussed a few times previously but there doesn't seem to be a consensus so I figured I'd throw in my experience. I have the following setup:
Supermicro X10SLM-LN4F motherboard
Intel Xeon E3-1230 v3 CPU
32GB ECC RAM
LSI 9211-8i PCIe HBA running IT firmware
10x 3TB WD Reds
8 of the drives are connected to the LSI adapter and the other 2 are connected to the integrated motherboard SATA3 ports. It is set up as a single 10-disk RAIDZ2 pool.
My main focus is on read/write throughput and not IOPS or concurrency. I'm trying to evaluate if it's worth making an investment into 10GigE because 1GigE is currently my bottleneck.
Using a completely untuned "default" FreeNAS 9.1.1 or 9.2.1.2 install, my local dd tests achieve about 225MB/s for both reads and writes. Looking at gstat I see the individual drives all hitting around 35MB/s very uniformly. Not abysmal, but also not very impressive for the hardware I'm using.
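For reference, the kind of test I'm running is roughly the following (/mnt/tank/test is just a stand-in for a dataset on my pool; compression needs to be off on that dataset or the zero-fill write numbers are meaningless):

# sequential write test (dataset compression off so the zeros aren't compressed away)
dd if=/dev/zero of=/mnt/tank/test/ddfile bs=1m count=50000
# sequential read test of the same file
dd if=/mnt/tank/test/ddfile of=/dev/null bs=1m
# watch per-disk throughput while a test runs (physical providers only)
gstat -p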
In FreeNAS 9.1.1, if I set vfs.zfs.vdev.max_pending to 1, or in 9.2.1.2 if I set vfs.zfs.vdev.max_active (which seems to have replaced vfs.zfs.vdev.max_pending) to 1, and change nothing else, my read and write speeds instantly more than double. Reads are in the 550MB/s range and writes are around 475MB/s. Looking at gstat again, all of the drives are now sustaining 70+MB/s. I see no negative impact in my typical usage and IOPS seem to remain exactly the same, although as I said that doesn't particularly matter to me.
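For anyone wanting to reproduce it, the change is literally just that one value, e.g. from the shell (or the equivalent Sysctl tunable in the GUI):

# FreeNAS 9.1.1
sysctl vfs.zfs.vdev.max_pending=1
# FreeNAS 9.2.1.2
sysctl vfs.zfs.vdev.max_active=1
# revert on 9.2.1.2 with the default of vfs.zfs.vdev.max_active=1000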
What's bugging me is why this isn't affecting more people, and figuring out what's unique to the handful of us that are seeing this behavior. Are WD Reds the common link? Perhaps the integrated motherboard SATA controller has some weird NCQ bug? I would love to take that out of the equation, but I have data on the pool now, so that would be a huge hassle.
I have checked the drives and cables and can't find any sign of anything being faulty. SMART reports are pristine, and if I use dd to read and write from the raw devices (e.g. /dev/da0), they all shoot right up to the advertised 150MB/s with no problem. Obviously I only ran that particular test while the drives were not part of a pool.
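That raw-device check was just a straight sequential read off each disk, something along these lines (the write variant with of=/dev/da0 obviously destroys data, hence only on drives that weren't in the pool yet):

# sequential read straight off the disk, bypassing ZFS
dd if=/dev/da0 of=/dev/null bs=1m count=20000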
Some additional things I've tried (somewhat just poking around randomly) after going back to the default value of vfs.zfs.vdev.max_active=1000, trying to get the performance up:
1) I've used camcontrol to try various NCQ queue depths, from the minimum of 2 up through 32 on the integrated controller and 255 on the LSI (commands below). Zero impact.
2) I've tried both disabling TLER and leaving it at the default of 7.0s for reads and writes, and nothing changes.
3) I've toggled all of the BIOS PCIe settings for power saving and such and it has made no impact.
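For the record, the queue depth and TLER poking in 1) and 2) looked roughly like this (da0 is just one example device; smartctl's scterc values are in tenths of a second):

# show, then set, the NCQ queue depth (CAM simultaneous transactions)
camcontrol tags da0 -v
camcontrol tags da0 -N 32 -v
# TLER / SCT ERC: 7.0s for reads and writes vs. fully disabled
smartctl -l scterc,70,70 /dev/da0
smartctl -l scterc,0,0 /dev/da0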
Does anyone have any other suggestions or theories? I completely realize I could change hardware or redesign the pool to achieve greater performance -- I'm not asking for that -- to reiterate, I'm just curious why this tweak is so helpful in some setups and not others. Thanks.