Massively higher read latency on 5400 rpm drives?

deafen

Explorer
Joined
Jan 11, 2014
Messages
71
I'm in the process of upgrading my raidz2 array from 3TB to 6TB drives by replacing and resilvering one drive at a time. The existing drives are a mix of Seagate, Toshiba, WD, and Hitachi, all 7200 rpm. The replacement drives are all 5400 rpm WD Blue drives (WD60EZAZ, with the larger cache).

gstat is reporting that the WD drives are showing 10x the read latency of the others. da1-4 are existing 7200 rpm 3TB drives, da5-7 are the new WD Blues, and da8 is the resilvering target.

[Attachment: Capture1.PNG, screenshot of the gstat output]


My question is this: Why are the read latencies on the WD drives so much higher? That can't be accounted for by just the rotational speed change, can it? Or is it because of the mismatch in rotational speeds between the disks in the array, and it'll go away when they all match again?
 

Linkman

Patron
Joined
Feb 19, 2015
Messages
219
I believe the 6TB Blue drives are SMR. There are posts on this forum indicating that the Blue drives with the larger cache (256MB) are SMR drives, which would show that kind of latency.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I think the EZAZ are SMR drives, but I'm not sure. It's probably not true read latency but rather write latency tying up the drive and causing delayed reads.
 

deafen

Okay, that makes a ton of sense, now that I've read up on SMR drives. Turns out I'm behind the times, who knew?

Once I'm done upgrading the array, would it be helpful to implement a SLOG device? That would eliminate the overwriting penalty on at least the ZIL stage of a write op, right? I haven't felt the need to use one yet, but this is a different class of performance issue than I've dealt with.
 

jgreco

Only if you're doing sync writes. Better option would be to disable any sync writes. SLOG is always slower than simply disabling sync.
 

deafen

jgreco said:
Only if you're doing sync writes. Better option would be to disable any sync writes. SLOG is always slower than simply disabling sync.

Right, right. I've already got sync writes disabled, so no benefit to be had. And with 192GB of RAM and a 98% ARC hit ratio, there's not a ton of reading hitting the disks anyway. Not going to concern myself with it.
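
For reference, the hit ratio quoted above is just hits divided by total lookups. Here is a minimal sketch of that arithmetic. It assumes the FreeBSD `kstat.zfs.misc.arcstats.hits` and `kstat.zfs.misc.arcstats.misses` counters are available on the system, and the `arc_hit_pct` helper name is made up for illustration.

```shell
#!/bin/sh
# arc_hit_pct is a hypothetical helper (not part of any ZFS tooling).
# It prints hits / (hits + misses) as a percentage with one decimal place.
arc_hit_pct() {
    hits="$1"
    misses="$2"
    awk -v h="$hits" -v m="$misses" 'BEGIN {
        total = h + m
        # Avoid dividing by zero on a freshly booted system.
        if (total == 0) { print "n/a"; exit }
        printf "%.1f\n", 100 * h / total
    }'
}
```

On FreeBSD, something like `arc_hit_pct "$(sysctl -n kstat.zfs.misc.arcstats.hits)" "$(sysctl -n kstat.zfs.misc.arcstats.misses)"` should give the figure accumulated since boot, which is not necessarily the same as the recent hit rate.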
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
@deafen, since WD Blues are considered desktop drives, you might check head un-loading and loading, as well as spin down. That would kill performance. I have a 2TB 2.5" drive in my media server, and it was collecting far too many load/unload counts. So I disabled that function.

I wish disk vendors would make a NAS type 2.5" spinner in 2TB, and above, in 9.5mm height or less.
 

deafen

Arwen said:
@deafen, since WD Blues are considered desktop drives, you might check head un-loading and loading, as well as spin down. That would kill performance. I have a 2TB 2.5" drive in my media server, and it was collecting far too many load/unload counts. So I disabled that function.

It's even worse than that - WD has removed the ability to disable the idle timer from the firmware. I found that out after I had already committed to using these drives. So far the resilvering process has kept them from unloading at all - the load cycle count for the first one I put in is at 101, and it's been several days. The pool hosts a VMware datastore, so I will probably spin up a tiny Linux VM with a script that uses dd to make a single synchronous 512-byte write every 5 seconds. That should keep the entire array from hitting the 8 second threshold without impacting performance too badly ... although now that I think about it, that's going to interact with the SMR issues discussed above. Have to see how it plays out in practice, I guess.
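
In case it helps anyone tracking the same thing, here is a small sketch for pulling the load cycle count out of `smartctl -A` output. It assumes smartmontools is installed and that the drive reports SMART attribute 193 (Load_Cycle_Count) in the usual ten-column attribute table. The `load_cycles` name is just something made up for this example.

```shell
#!/bin/sh
# load_cycles is a hypothetical helper: it reads `smartctl -A` output on stdin
# and prints the raw value (10th column) of attribute 193, Load_Cycle_Count.
# It prints nothing if the drive does not report that attribute.
load_cycles() {
    awk '$1 == 193 { print $10; exit }'
}
```

Usage would be along the lines of `smartctl -A /dev/da5 | load_cycles`. Running it a few hours apart and comparing the numbers should show whether the heads are parking more often than expected.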

As for NAS-type 2.5" 2TB SATA drives, I know they exist; NetApp and other array vendors use them for their capacity-optimized shelves. Although I know that NetApp works with the drive vendors to customize the firmware, so maybe they're just not available in the retail market.

OTOH, if you've got a spare PCIe slot, you could buy two of these and a bus adapter card for right around $200 ...

https://www.amazon.com/Intel-660p-1-0TB-80mm-978350/dp/B07GCL6BR4/ref=sr_1_7
 

Arwen

deafen said:
It's even worse than that - WD has removed the ability to disable the idle timer from the firmware. I found that out after I had already committed to using these drives. So far the resilvering process has kept them from unloading at all - the load cycle count for the first one I put in is at 101, and it's been several days. The pool hosts a VMware datastore, so I will probably spin up a tiny Linux VM with a script that uses dd to make a single synchronous 512-byte write every 5 seconds. That should keep the entire array from hitting the 8 second threshold without impacting performance too badly ... although now that I think about it, that's going to interact with the SMR issues discussed above. Have to see how it plays out in practice, I guess.

Skip the Linux VM and skip writing to the pool. Remember, you only want to cause the drive to reset its activity timer. So, create a script that performs a head movement task on each drive, in sequence, once every 5 seconds. This skips the ZFS overhead, and should keep the drives loaded and spinning.

deafen said:
As for NAS-type 2.5" 2TB SATA drives, I know they exist; NetApp and other array vendors use them for their capacity-optimized shelves. Although I know that NetApp works with the drive vendors to customize the firmware, so maybe they're just not available in the retail market.

OTOH, if you've got a spare PCIe slot, you could buy two of these and a bus adapter card for right around $200 ...

https://www.amazon.com/Intel-660p-1-0TB-80mm-978350/dp/B07GCL6BR4/ref=sr_1_7

Nice, but my use case is for a miniature media server that has a 2.5" 9.5mm high SATA slot. Right now I have a Seagate (which is really a Samsung Spinpoint) that works great for its purpose. I just worry that it's not designed to last years. Or that I will fill it up.
 

deafen

Arwen said:
Skip the Linux VM and skip writing to the pool. Remember, you only want to cause the drive to reset its activity timer. So, create a script that performs a head movement task on each drive, in sequence, once every 5 seconds. This skips the ZFS overhead, and should keep the drives loaded and spinning.

I like that idea, but I'm not sure exactly how I'd go about it. Would "dd if=/dev/daX of=/dev/null bs=512 count=1" do it? In theory, that should force the head to move to the first sector of the drive, right? Or is there a simpler userspace command I'm missing?
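
A rough sketch of what such a script might look like, untested against these particular drives. One caveat with the fixed-sector `dd` above: after the first pass the drive may answer a repeated read of the same sector from its own cache, and it is unclear whether the firmware counts that as activity. The sketch therefore reads one block at a pseudo-random offset each time. The device list, the 4096-byte block size, the `MAX_BLOCK` bound, and all function names are assumptions to adjust for the actual system.

```shell
#!/bin/sh
# Hypothetical keep-alive sketch: read one block at a random offset from each
# drive so the idle timer is reset before the heads park.
# All defaults below are assumptions, not values from the drive documentation.
DRIVES="${DRIVES:-/dev/da5 /dev/da6 /dev/da7 /dev/da8}"
INTERVAL="${INTERVAL:-5}"              # seconds between passes
BLOCK_SIZE="${BLOCK_SIZE:-4096}"       # multiple of the sector size
MAX_BLOCK="${MAX_BLOCK:-1000000000}"   # ~4 TB worth of 4K blocks, below a 6TB drive's size

# Print a pseudo-random block number in the range [0, MAX_BLOCK).
rand_block() {
    r="$(od -An -N4 -tu4 /dev/urandom | tr -d ' \n')"
    echo $(( r % MAX_BLOCK ))
}

# Read a single block from the given device at a random offset.
touch_drive() {
    dev="$1"
    blk="$(rand_block)"
    dd if="$dev" of=/dev/null bs="$BLOCK_SIZE" skip="$blk" count=1 2>/dev/null
}

# Touch every drive once, in sequence, warning on any failed read.
keepalive_pass() {
    for dev in $DRIVES; do
        touch_drive "$dev" || echo "warning: read failed on $dev" >&2
    done
}

# Repeat forever; meant to be run as root in the background.
keepalive_loop() {
    while :; do
        keepalive_pass
        sleep "$INTERVAL"
    done
}
```

Calling `keepalive_loop` at the end of the script starts it. Reads go straight to the raw device, so there is no ZFS overhead and no writes to make the SMR situation worse.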

Arwen said:
Nice, but my use case is for a miniature media server that has a 2.5" 9.5mm high SATA slot. Right now I have a Seagate (which is really a Samsung Spinpoint) that works great for its purpose. I just worry that it's not designed to last years. Or that I will fill it up.

I gotcha. Yeah, going to be hard to find those. I'd maybe keep an eye out on eBay for spares from array vendors. Or go SSD, since media isn't a high-rewrite use case - here's one for $219. https://www.amazon.com/Samsung-Inch-Internal-MZ-76Q2T0B-AM/dp/B07L31K2MK/
 