Massively higher read latency on 5400 rpm drives?

deafen · Jul 31, 2019

I'm in the process of upgrading my raidz2 array from 3TB to 6TB drives by replacing and resilvering one drive at a time. The existing drives are a mix of Seagate, Toshiba, WD, and Hitachi, all 7200 rpm. The replacement drives are all 5400 rpm WD Blue drives (WD60EZAZ, with the larger cache).

gstat is reporting 10x the read latency on the WD drives compared with the others. da1-4 are existing 7200 rpm 3TB drives, da5-7 are the new WD Blues, and da8 is the resilvering target.

My question is this: Why are the read latencies on the WD drives so much higher? That can't be accounted for by just the rotational speed change, can it? Or is it because of the mismatch in rotational speeds between the disks in the array, and it'll go away when they all match again?
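
For the rotational-speed part of that question, a rough back-of-the-envelope check helps. The sketch below assumes the usual approximation that average rotational latency is the time for half a revolution. The function name `rot_latency_ms` is made up for illustration, and seek time, drive cache, and queueing are all ignored.

```sh
#!/bin/sh
# Average rotational latency in ms, approximated as the time for half a
# revolution: 0.5 * 60000 ms / RPM. Ignores seek time, cache, and queueing.
rot_latency_ms() {
    awk -v rpm="$1" 'BEGIN { printf "%.2f\n", 0.5 * 60000 / rpm }'
}

rot_latency_ms 7200   # prints 4.17
rot_latency_ms 5400   # prints 5.56
```

That is a ratio of only about 1.33 (7200/5400), so spindle speed on its own would not account for a 10x gap.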

Linkman · Jul 31, 2019

I believe the 6TB Blue drives are SMR. There are posts on this forum indicating that the Blue drives with the larger cache (256MB) are SMR drives, which would show that kind of latency.

jgreco · Jul 31, 2019

I think the EZAZ are SMR drives, but I'm not sure. It's probably not true read latency, but rather write activity tying up the drive and delaying reads.

Johnnie Black · Jul 31, 2019

Linkman said:
I believe the 6TB Blue drives are SMR

They are. Note that some new Reds are also SMR:
https://www.ixsystems.com/community...ons-guide-discussion-thread.46494/post-536296

deafen · Jul 31, 2019

Okay, that makes a ton of sense, now that I've read up on SMR drives. Turns out I'm behind the times, who knew?

Once I'm done upgrading the array, would it be helpful to add a SLOG device? That would eliminate the overwriting penalty on at least the ZIL stage of a write op, right? I haven't felt the need to use one yet, but this is a different class of performance issue than I've dealt with.

jgreco · Jul 31, 2019

Only if you're doing sync writes. Better option would be to disable any sync writes. SLOG is always slower than simply disabling sync.

deafen · Jul 31, 2019

jgreco said:
Only if you're doing sync writes. Better option would be to disable any sync writes. SLOG is always slower than simply disabling sync.

Right, right. I've already got sync writes disabled, so no benefit to be had. And with 192GB of RAM and a 98% ARC hit ratio, there's not a ton of reading taking place anyway. Not going to concern myself with it.

Arwen · Aug 2, 2019

@deafen, since WD Blues are considered desktop drives, you might check head un-loading and loading, as well as spin down. That would kill performance. I have a 2TB 2.5" drive in my media server, and it was collecting far too many load/unload counts. So I disabled that function.
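
One way to check this is SMART attribute 193, which smartmontools reports as `Load_Cycle_Count`. The sketch below assumes smartmontools is installed and that the drive reports the attribute under that name. `load_cycle_count` is a hypothetical helper name, and `/dev/da5` in the usage line is simply one of the device names mentioned earlier in this thread.

```sh
#!/bin/sh
# Extract the raw Load_Cycle_Count value from `smartctl -A` output on stdin.
# In smartctl's attribute table the name is column 2 and the raw value is
# column 10. Prints nothing if the drive does not report the attribute.
load_cycle_count() {
    awk '$2 == "Load_Cycle_Count" { print $10 }'
}
```

Run it as `smartctl -A /dev/da5 | load_cycle_count` a few hours apart. A count that climbs quickly between runs suggests the heads are parking on the idle timer.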

I wish disk vendors would make a NAS-type 2.5" spinner in 2TB and above, at 9.5mm height or less.

deafen · Aug 2, 2019

Arwen said:
@deafen, since WD Blues are considered desktop drives, you might check head un-loading and loading, as well as spin down. That would kill performance. I have a 2TB 2.5" drive in my media server, and it was collecting far too many load/unload counts. So I disabled that function.

It's even worse than that - WD has removed the ability to disable the idle timer from the firmware. I found that out after I had already committed to using these drives. So far the resilvering process has kept them from unloading at all - the load cycle count for the first one I put in is at 101, and it's been several days. The pool hosts a VMware datastore, so I will probably spin up a tiny Linux VM with a script that uses dd to make a single synchronous 512-byte write every 5 seconds. That should keep the entire array from hitting the 8 second threshold without impacting performance too badly ... although now that I think about it, that's going to interact with the SMR issues discussed above. Have to see how it plays out in practice, I guess.

As for NAS-type 2.5" 2TB SATA drives, I know they exist; NetApp and other array vendors use them for their capacity-optimized shelves. Although I know that NetApp works with the drive vendors to customize the firmware, so maybe they're just not available in the retail market.

OTOH, if you've got a spare PCIe slot, you could buy two of these and a bus adapter card for right around $200 ...

https://www.amazon.com/Intel-660p-1-0TB-80mm-978350/dp/B07GCL6BR4/ref=sr_1_7

Arwen · Aug 7, 2019

deafen said:
It's even worse than that - WD has removed the ability to disable the idle timer from the firmware. I found that out after I had already committed to using these drives. So far the resilvering process has kept them from unloading at all - the load cycle count for the first one I put in is at 101, and it's been several days. The pool hosts a VMware datastore, so I will probably spin up a tiny Linux VM with a script that uses dd to make a single synchronous 512-byte write every 5 seconds. That should keep the entire array from hitting the 8 second threshold without impacting performance too badly ... although now that I think about it, that's going to interact with the SMR issues discussed above. Have to see how it plays out in practice, I guess.
...

Skip the Linux VM and skip writing to the pool. Remember, you only want to cause the drive to reset its activity timer. So, create a script that performs a head movement task on each drive, in sequence, once every 5 seconds. This skips the ZFS overhead, and should keep the drives loaded and spinning.

deafen said:
...
As for NAS-type 2.5" 2TB SATA drives, I know they exist; NetApp and other array vendors use them for their capacity-optimized shelves. Although I know that NetApp works with the drive vendors to customize the firmware, so maybe they're just not available in the retail market.

OTOH, if you've got a spare PCIe slot, you could buy two of these and a bus adapter card for right around $200 ...

https://www.amazon.com/Intel-660p-1-0TB-80mm-978350/dp/B07GCL6BR4/ref=sr_1_7

Nice, but my use case is a miniature media server that has a 2.5" 9.5mm-high SATA slot. Right now I have a Seagate (which is really a Samsung Spinpoint) that works great for its purpose. I just worry that it's not designed to last for years, or that I will fill it up.

deafen · Aug 7, 2019

Arwen said:
Skip the Linux VM and skip writing to the pool. Remember, you only want to cause the drive to reset its activity timer. So, create a script that performs a head movement task on each drive, in sequence, once every 5 seconds. This skips the ZFS overhead, and should keep the drives loaded and spinning.

I like that idea, but I'm not sure exactly how I'd go about it. Would "dd if=/dev/daX of=/dev/null bs=512 count=1" do it? In theory, that should force the head to move to the first sector of the drive, right? Or is there a simpler userspace command I'm missing?
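
One possible shape for such a script, as a sketch only. It builds on the dd idea above but reads from a varying offset, on the assumption that re-reading the same sector could be served from the drive's own cache without moving the heads. Whether a read actually resets the idle timer on these drives is untested here. The names `poke_drives` and `keepalive_loop` and the `max_blocks` bound are hypothetical, and `date +%s` is assumed to be available (it is on FreeBSD and Linux).

```sh
#!/bin/sh
# Read one 4 KiB block from a pseudo-random offset on each device given.
# $1 is an upper bound on the block index; the caller must keep it below the
# device size in 4 KiB blocks. Remaining arguments are the device paths.
poke_drives() {
    max_blocks=$1
    shift
    for dev in "$@"; do
        # Mix a running counter into the clock-based seed so that several
        # calls within the same second still pick different offsets.
        poke_seq=$(( ${poke_seq:-0} + 1 ))
        seed=$(( $(date +%s) + poke_seq ))
        skip=$(awk -v max="$max_blocks" -v seed="$seed" \
            'BEGIN { srand(seed); printf "%d\n", int(rand() * max) }')
        dd if="$dev" of=/dev/null bs=4096 skip="$skip" count=1 2>/dev/null
    done
}

# Poke every listed drive in sequence, then sleep, forever.
# $1 is the interval in seconds; the rest is passed on to poke_drives.
keepalive_loop() {
    interval=$1
    shift
    while :; do
        poke_drives "$@"
        sleep "$interval"
    done
}
```

With the device names from this thread, a call might look like `keepalive_loop 5 1000000 /dev/da5 /dev/da6 /dev/da7 /dev/da8`, which stays within the first 4 GB or so of each drive. Reads only, so nothing touches the pool or the SMR write path.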

Arwen said:
Nice, but my use case is a miniature media server that has a 2.5" 9.5mm-high SATA slot. Right now I have a Seagate (which is really a Samsung Spinpoint) that works great for its purpose. I just worry that it's not designed to last for years, or that I will fill it up.

I gotcha. Yeah, going to be hard to find those. I'd maybe keep an eye out on eBay for spares from array vendors. Or go SSD, since media isn't a high-rewrite use case - here's one for $219. https://www.amazon.com/Samsung-Inch-Internal-MZ-76Q2T0B-AM/dp/B07L31K2MK/
