Poor performance with certain SAS drives...


tvsjr
Good day:

I've been building out my new FreeNAS box, and I've come across an interesting performance limitation. Here's the system config:

Chassis - Supermicro CSE-847E16-R1K28LPB (36-bay chassis with a 24-port SAS2 expander backplane up front and a 12-port SAS2 expander backplane in back)
Motherboard - Supermicro X9DRi-LN4F+
Processor - 2x Xeon E5-2670 (8C/16T, 2.6GHz each)
Memory - 128GB Samsung DDR3-1600 1.5V ECC Registered (16x8GB)
Boot Drives - 2x Intel 320 SSD, 40GB, mirrored (on rear backplane)
SLOG - 2x Intel S3700 SSD, 200GB, mirrored (on rear backplane - not yet configured)
Fast storage - 14x HGST HUS156045VLS600 450GB 15K SAS, 2-way mirrors (on front backplane)
Big storage - 6x HGST HUS724040ALS640 4TB 7.2K SAS, RAIDZ2 (on rear backplane)
HBA - LSI 9211-8i, flashed to IT mode, P20 firmware
OS - latest stable FreeNAS 9.3 build

Ignore the various RAID/ZFS configs; I haven't even reached that point yet.

While doing burn-in testing with the above drives, I observed some interesting performance behavior. The 15K SAS drives would cruise along at about 100MB/sec on reads... but dropped to ~24MB/sec on writes. The 7.2K SAS drives exhibited "normal" performance, with relatively equal read and write speeds in the 120-200MB/sec range.

The 15K drives spent some time in a NetApp, so they had 520-byte sectors. I reformatted them to 512-byte sectors, but still suspected they could be the issue. So I pulled all of the drives, leaving only the two boot drives installed on the rear backplane, and installed two spare drives I had lying around. These are brand-new, generic 512-byte-sector drives showing zero SMART issues and 2 accumulated start/stop cycles. One is a Seagate ST3300555SS (300GB 10K SAS), the other a Fujitsu MAX3147RC (146GB 15K SAS).
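For reference, that 520-to-512-byte reformat is typically done with sg_format from the sg3_utils package. A rough sketch only, assuming sg3_utils is installed from ports and using a placeholder device name; the format wipes the drive and can take hours:
Code:
# Low-level reformat of a NetApp-style 520-byte-sector SAS drive to 512-byte sectors.
# WARNING: destroys all data on the drive and can run for several hours.
sg_format --format --size=512 /dev/pass10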

Much to my surprise, I saw the same performance with these two new drives. Around 100MB/sec read, around 25MB/sec write.

I'm doing performance testing using simple dd commands:
dd if=/dev/zero of=/dev/daX bs=1M count=2000 (write)
dd if=/dev/daX of=/dev/null bs=1M count=2000 (read)

I've also done checks with iozone, and observed the performance charts in the FreeNAS GUI while running badblocks... all showed the same behavior and fairly similar performance numbers.
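To make these checks easier to repeat across a pile of drives, something like the loop below works; it's only a rough sketch of the same dd tests shown above, and the write pass destroys whatever is on the disks, so it's strictly pre-ZFS burn-in material:
Code:
#!/bin/sh
# Raw sequential write/read pass over every da device - burn-in only, overwrites the disks.
for d in /dev/da? /dev/da??; do
    [ -e "$d" ] || continue
    echo "=== $d write ==="
    dd if=/dev/zero of="$d" bs=1M count=2000
    echo "=== $d read ==="
    dd if="$d" of=/dev/null bs=1M count=2000
done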

Reads from the boot devices, and reads/writes from the soon-to-be-SLOG devices (all SSDs), deliver the expected performance: north of 200MB/sec in both directions.

I'm kinda flummoxed at this point. I've moved the drives to different bays; no change. If it were a SAS vs. SATA thing, it would make more sense, but the 4TB SAS drives perform as expected.

Any thoughts would be greatly appreciated!
 

tvsjr
> Try upgrading the firmware on the SAS expanders.
Well, I figured there must be firmware out there, but I sure haven't had any luck finding it. Do you know where Supermicro (or LSI) hides it, or do I need to contact Tech Support in hopes they'll help?
 


jgreco
Hm, well, that's odd. Are your SFF8087 cables long enough that you could disconnect the cables from the rear expander and the front expander and exchange the connections? If the problem suddenly "moved", we might get some hints here (i.e., it could be a cabling or HBA port issue).
 

tvsjr
I will try later today; however, I experience the issue with drives connected to either the front or the back backplane. I've tried with just one drive in the system - once in each backplane - same behavior. I'll try it again with a single device connected.

I suppose I should also order a fan-out cable - with some careful arranging, I could connect one drive directly to the HBA and take the backplanes out of the mix.
 

depasseg
> Ah, your search skills are obviously better than mine! Unfortunately, it appears my backplanes are running the same firmware version present in this ZIP, so they aren't terribly out of date. I emailed Tech Support to see if they had any previous experience with this, or if a newer firmware version was available.
Not better. I just remember seeing that post, because I had the same problem as you in trying to find it. :smile:
 

tvsjr
Ah-ha! The write caches were not enabled on the drives. Found the solution here:
https://forums.freebsd.org/threads/performance-problems-with-western-digital-sas-drives.40534/

In case that link goes away... to fix it via script:
Code:
#!/usr/local/bin/bash

# Show the current WCE (Write Cache Enable) bit on every da device, enable it, then verify.
for file in /dev/da? /dev/da??; do echo $file; camcontrol modepage $file -m 0x08 | grep WCE; done
for file in /dev/da? /dev/da??; do echo $file; echo "WCE: 1" | camcontrol modepage $file -m 0x08 -e; done
for file in /dev/da? /dev/da??; do echo $file; camcontrol modepage $file -m 0x08 | grep WCE; done


And, for single drives:
To read the current value: camcontrol modepage $DEVICE_NAME -m 0x08
To enable the write cache: echo "WCE: 1" | camcontrol modepage $DEVICE_NAME -m 0x08 -e
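As a quick follow-up check (da5 below is just an example device name), the read form of the command should now report the bit as set:
Code:
# The mode page dump should now contain "WCE: 1"
camcontrol modepage da5 -m 0x08 | grep WCE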
 

jgreco
You win today's Persistence Problem Solver award. I had briefly thought about drive settings or firmware, especially with the NetApp in the mix, but somehow I had gotten the impression it was the backplane that was causing the trouble.
 


tvsjr
Ha, persistent Googler award, maybe. I haven't tried a reboot yet; that's next. Right now, I'm testing the fan speed issue... the fans plugged into the backplane run wide open, while the ones connected to the MB are properly managed. I moved all of the fans over to MB connectors to see if they really need to run at 100% to keep the drives cool.

I'll report back after I try the reboot. If all else fails, I may have to figure out how to add a script at boot to enable the write cache.
 

depasseg
I forgot to mention that I did that as well, and it really helped with the noise. Temps were fine too: ~20-23C at idle (spun down) and ~30-33C running. Of course, since I don't have a motherboard in my JBOD enclosure, I hooked the fans up to the little power controller card. Works the same, though.
 

tvsjr

So, I decided to get the drives warm... I started up two dd processes per drive, writing from /dev/urandom continuously to every drive in the enclosure. 48 load average FTW. I let it bake a while with the fan speed control set to "normal" (a/k/a wife-acceptable). The warmest drive was the top drive, second column, which hit 49C (before anyone has a stroke: the house is 72F, there's only so much I can do in the current testing environment, and these are 15K SAS drives, not your typical 7.2K SATA drives). After letting the drive heat-soak, I set the fan speed control to full (a/k/a screaming banshee). After an hour of good heat soak with the dd processes continuing, the drives dropped by only 2 degrees C. To me, that's insignificant and an indication that the system is "over-fanned", at least in my build. If I had every PCIe slot filled, 15K drives in all 36 bays, etc., perhaps it wouldn't be. But 2C isn't worth the difference between reasonably quiet and holy-crap. And I actually have one fan disconnected at this point... I'm one fan header short on the MB.

Once the remodel of our new home is complete, I will have a dedicated closet with its own HVAC, so I'll be able to feed the servers substantially cooler air and keep the drive temps more in line. The 7.2K 4TB SAS drives never climbed above 42C in this test.
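For anyone wanting to repeat this kind of heat-soak test, here's a rough sketch of the approach described above. The device globs, polling interval, and temperature grep are all assumptions, smartmontools must be installed, and the dd write load overwrites the raw disks:
Code:
#!/bin/sh
# Heat-soak sketch: load every da device with writes, then poll drive temperatures.
# WARNING: overwrites the raw devices - burn-in use only.
for d in /dev/da? /dev/da??; do
    [ -e "$d" ] || continue
    dd if=/dev/urandom of="$d" bs=1M &
done

# Report temperatures every 5 minutes while the load runs (Ctrl-C to stop).
while true; do
    date
    for d in /dev/da? /dev/da??; do
        [ -e "$d" ] || continue
        printf '%s: ' "$d"
        smartctl -a "$d" | grep -i temperature
    done
    sleep 300
done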

Back to the write cache issue: the change does NOT survive a reboot. So I created /usr/local/bin/fix_write_cache.sh with the following contents:
Code:
#!/bin/sh
# Enable the write cache (WCE) on all da devices at boot; plain sh, since bash lives in /usr/local/bin on FreeNAS.
for file in /dev/da? /dev/da??; do echo "WCE: 1" | camcontrol modepage $file -m 0x08 -e; done

Then added it as a post-init script in the GUI. One more reboot and all is now well.
 