Example of SMR ugliness, even on reads

deafen · Explorer · Joined Jan 11, 2014 · Messages: 71
I've been unwinding the mistake I made buying a pile of 6TB WD Blues, and captured an interesting snapshot of just how badly these drives perform, even on read operations.

In the gstat output below, I'm replacing da14 with da1 (a 7200 RPM NL-SAS drive). Some points of note:
  • da10-da14 are SMR drives. da8 and da9 are PMR SATA drives.
  • The only drives that exhibit "delete" operations are the SMR drives. Delete ops are issued by the OS to tell the drive that certain blocks are no longer in use; I don't know why this matters for an HDD, or why only the SMR drives are getting these ops. Might be a good question for the zfs devs? Based on the associated latency, I assume the drive is actually doing writes for these ops. Seems unnecessary. (A sample gstat invocation for watching this is sketched after this list.)
  • Because of these slow delete ops (and destaging writes from cache, see below), read latency is awful, making ops stack up in the queue and causing %busy to stay very high. (They're typically all in the red.) Compare this to the %busy for the PMR drives, which don't have to do any housekeeping.
  • As far as the OS is concerned, write latency is negligible because it's all going to cache. But the extra work required to destage those writes slows down reads, so the real write latency is shifted into the read latency.
  • This is a resilvering operation, so these are all basically large, sequential ops, mostly 64K and 128K. Even so, the unpredictable behavior of the SMR drives keeps them throttled.
  • This is the first disk replacement resilver I've seen where the bottleneck is on the read side. Usually the rate is limited by the disk being written to. You can see that in this case, da1 can easily handle what's being thrown at it, and is spending most of its time starving on an empty queue, while the SMR disks struggle to read data fast enough. The PMR drives aren't even breaking a sweat.
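For anyone who wants to watch this on their own system, a gstat invocation along these lines will show the delete column. This is a sketch of the kind of command behind the screenshot below, not necessarily the exact flags I used:

    # FreeBSD: per-physical-provider stats (-p), refreshed every second,
    # with BIO_DELETE statistics enabled (-d). The d/s and ms/d columns
    # show delete operations per second and their latency.
    gstat -dp -I 1s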

[Attachment: Capture.PNG (gstat screenshot)]
 
Joined May 10, 2017 · Messages: 838
Interesting stuff, thanks for posting, but do you know what is making the small writes to the pool? I would expect a resilver to be read-only except on the drive being resilvered. Maybe this small amount of writes is enough to cause performance issues with SMR; I would expect no issues if it were really only reads.
 

deafen · Explorer · Joined Jan 11, 2014 · Messages: 71
... do you know what is making the small writes to the pool?

Good question. There's an ESXi datastore mounted from this pool, but none of the VMs are powered on. smbstatus shows one client with a mount, but no activity. Whatever it is, it looks like it happens about every 5 seconds (or maybe that's just the zfs txg sync?). Regardless, it's sporadic enough not to represent an ongoing problem, IMO.
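If it is the txg sync, the interval is easy to check; I believe this is the relevant sysctl, and it defaults to 5 seconds:

    # ZFS transaction group sync interval, in seconds. A small write
    # burst roughly every 5s is consistent with txg commits.
    sysctl vfs.zfs.txg.timeout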

[Attachment: Capture.PNG (gstat screenshot)]
 

deafen · Explorer · Joined Jan 11, 2014 · Messages: 71
I'll say this, though: when it doesn't have to deal with delete ops, it can be pretty damned fast. Still read-bound, though, and several times less efficient than PMR reads (compare latency and %busy between da10 (PMR) and da11 (SMR)).

[Attachment: Capture.PNG (gstat screenshot)]
 

HoneyBadger · actually does care · Administrator · Moderator · iXsystems · Joined Feb 6, 2014 · Messages: 5,110
Whatever it is, it looks like it happens about every 5 seconds (or maybe that's just the zfs txg sync?)
That's likely what it is. Just having a ZVOL mounted as a VMFS datastore will cause a small amount of "heartbeat/liveness checking" to hit it. ZFS will copy-on-write the heartbeat LBA across the disk, which hopefully keeps it from reshingling as frequently, but the obfuscation of the drive-managed SMR behavior means it will still offer inconsistent performance overall.
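If you want to confirm where those writes are landing, zpool iostat can break activity down per vdev; "tank" below is a stand-in for the actual pool name:

    # Per-vdev read/write ops and bandwidth every 5 seconds. The VMFS
    # heartbeat should show up as a trickle of small writes each interval.
    # "tank" is a placeholder for the actual pool name.
    zpool iostat -v tank 5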

I do appreciate the amount and detail of the data you're providing on these drives though. Certainly a case of making lemonade.

If you get the chance to free one of them from the array, try hitting it with an ATA_SECURE_ERASE and see if that "clears the deck" as far as reshingling/delete operations go. Essentially, these drives need to support the SMR equivalent of TRIM. ;)
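On FreeBSD the sequence is roughly the following, assuming the drive shows up as a direct-attached ATA device; the device node and password here are placeholders, and this destroys everything on the disk:

    # Set a temporary user password at security level high, then issue
    # the ATA SECURITY ERASE with that password. ada5 and ErasePW are
    # placeholders -- double-check the device node first, since this
    # wipes the drive. Drives behind some SAS HBAs/expanders may not
    # pass ATA security commands through.
    camcontrol security ada5 -U user -l high -s ErasePW
    camcontrol security ada5 -U user -e ErasePW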
 

Arwen · MVP · Joined May 17, 2014 · Messages: 3,600
In my opinion, SMR drives are not suited to some workloads, like zvols, small files, or write-heavy applications.

In my case, I don't care how long a backup to my Seagate 8TB Archive SMR drive takes, as long as it's reliable and ZFS lets me validate the backups via scrubs.
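The validation part is just the usual scrub cycle; "backup" here is whatever the pool is actually named:

    # Start a scrub of the backup pool, then check progress and any
    # checksum errors it turned up. "backup" is a placeholder name.
    zpool scrub backup
    zpool status backup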
 

deafen · Explorer · Joined Jan 11, 2014 · Messages: 71
If you get the chance to free one of them from the array, try hitting it with an ATA_SECURE_ERASE and see if that "clears the deck" as far as reshingling/delete operations go. Essentially, these drives need to support the SMR equivalent of TRIM. ;)

Now that I've gotten them all swapped for PMR drives (see my post about replacing all the drives at once), I plan to do just that. Since I'm planning on selling them to my coworkers, I wanted to try to "reset" them anyway. I'll get to that later this week.
 