Checking for TLER, ERC, etc. support on a drive

NASbox

Guru
Joined
May 8, 2012
Messages
644
@jgreco very nice writeup... Congrats.

Based on your write up, IIUC the same model of external hard drive might contain different models internally - did I get that correctly?

So if I was to buy 3 (as an example) WDBBGB0040HBK-NESN which is an 8TB drive, they could contain 3 different drives???

I'm likely going to have to get mine from CostCo. Good news is if I can interrogate the drive model over the USB3, and I don't like it, I can just take it back.

Also can I plug these into TrueNAS and run badblocks?

I think the WDBBGB0040HBK-NESN is what CostCo is currently selling, so if anyone can share any info on the model it would be much appreciated.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
Based on your write up, IIUC the same model of external hard drive might contain different models internally - did I get that correctly?

So if I was to buy 3 (as an example) WDBBGB0040HBK-NESN which is an 8TB drive, they could contain 3 different drives???

I'll charitably say that you're likely only limited to two different kinds of drives. The buy I did two years ago on Black Friday ended up with this assortment in one machine:

<ATA WDC WD120EMFZ-11 0A81> at scbus3 target 9 lun 0 (pass2,da1)
<ATA WDC WD120EMFZ-11 0A81> at scbus3 target 10 lun 0 (pass3,da2)
<ATA WDC WD120EMFZ-11 0A81> at scbus3 target 11 lun 0 (pass4,da3)
<ATA WDC WD120EMAZ-11 0A81> at scbus3 target 12 lun 0 (pass5,da4)
<ATA WDC WD120EMAZ-11 0A81> at scbus3 target 13 lun 0 (pass6,da5)
<ATA WDC WD120EMAZ-11 0A81> at scbus3 target 14 lun 0 (pass7,da6)
<ATA WDC WD120EMAZ-11 0A81> at scbus3 target 15 lun 0 (pass8,da7)
<ATA WDC WD120EMAZ-11 0A81> at scbus3 target 16 lun 0 (pass9,da8)
<ATA WDC WD120EMFZ-11 0A81> at scbus4 target 0 lun 0 (pass10,da9)
<ATA WDC WD120EMFZ-11 0A81> at scbus4 target 1 lun 0 (pass11,da10)
<ATA WDC WD120EMFZ-11 0A81> at scbus4 target 2 lun 0 (pass12,da11)
<ATA WDC WD120EMFZ-11 0A81> at scbus4 target 3 lun 0 (pass13,da12)
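
If you want to tally a listing like that rather than eyeball it, here is a minimal sketch. It assumes the `camcontrol devlist` line shape shown above; `count_models` and the regex are hypothetical helpers of my own, not part of any tool.

```python
import re
from collections import Counter

# Matches the "<ATA model revision>" prefix of a devlist line,
# e.g. "<ATA WDC WD120EMFZ-11 0A81> at scbus3 ...".
# Assumption: every line of interest has the shape shown in the listing.
DEVLIST_RE = re.compile(r"<ATA\s+(?P<model>.+?)\s+(?P<rev>\S+)>\s+at\s+scbus\d+")

def count_models(devlist_text):
    """Return a Counter of drive model strings found in devlist output."""
    counts = Counter()
    for line in devlist_text.splitlines():
        match = DEVLIST_RE.search(line)
        if match:
            counts[match.group("model")] += 1
    return counts

# Three lines copied from the listing above.
sample = """\
<ATA WDC WD120EMFZ-11 0A81> at scbus3 target 9 lun 0 (pass2,da1)
<ATA WDC WD120EMAZ-11 0A81> at scbus3 target 12 lun 0 (pass5,da4)
<ATA WDC WD120EMAZ-11 0A81> at scbus3 target 13 lun 0 (pass6,da5)
"""
print(count_models(sample))
```

Run against the full twelve-line listing, this should count seven WD120EMFZ and five WD120EMAZ drives.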

The Reddit datahoarder guys have methods figured out to take a guess based on serial numbers that you can find on the box. I don't expect that to be super-reliable, but maybe better than nothing. I haven't checked your proposed part number but make certain that it isn't an SMR drive -- most 8TB's are now SMR. There's a guide in the Resources section that covers this.
 
Joined
Oct 22, 2019
Messages
3,579
Good news is if I can interrogate the drive model over the USB3, and I don't like it, I can just take it back.
With the external drives I shucked (8TB WDs, MyBook and Elements), I had 100% prediction using smartctl over a USB connection to assess the internal drive (the "real" model, so to speak), and even reveal the helium parameter (22) for He-filled drives.
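
A rough sketch of that check, assuming smartmontools is installed and the USB bridge passes ATA commands through (`-d sat`). Not every enclosure does. The function names are hypothetical, the device path is whatever your system assigns, and the sample text is illustrative rather than captured output.

```python
import re
import subprocess

def parse_model(info_text):
    """Pull the 'Device Model' value out of `smartctl -i` text, or None."""
    match = re.search(r"^Device Model:\s*(.+)$", info_text, re.MULTILINE)
    return match.group(1).strip() if match else None

def internal_model(device):
    """Ask smartctl for the drive behind a USB bridge.

    Assumption: the bridge supports SAT pass-through; if it does not,
    smartctl prints an error and this returns None.
    """
    completed = subprocess.run(
        ["smartctl", "-d", "sat", "-i", device],
        capture_output=True, text=True, check=False,
    )
    return parse_model(completed.stdout)

# Illustrative text only, shaped like the info section smartctl prints.
sample = "Model Family:     (example)\nDevice Model:     WDC WD80EMAZ-00WJTA0\n"
print(parse_model(sample))
```

Calling `internal_model` with the device node of the external drive would then report the internal model before you commit to shucking it.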
 
Joined
Oct 22, 2019
Messages
3,579
I haven't checked your proposed part number but make certain that it isn't an SMR drive -- most 8TB's are now SMR.
How is that possible?

No WD Blue nor Red exceed 6TB, and all of their Red Plus and Red Pro are CMR (regardless of capacity.)

From which excess stock would they shove into their plastic USB enclosures for their 8TB+ external drives that would originally have been SMR internal drives?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
How is that possible?

No WD Blue nor Red exceed 6TB, and all of their Red Plus and Red Pro are CMR (regardless of capacity.)

From which excess stock would they shove into their plastic USB enclosures for their 8TB+ external drives that would originally have been SMR internal drives?

Well, I guess I don't really care, it's better to check the part and verify it. I know Seagate's 8TB ST8000DM004 is SMR along with some others. I also know that WD Red absolutely comes in capacities greater than 6TB; an easy example is

[attached image: 51Hi+zL3szL._AC_SY450_.jpg]


so even if they've decided to relabel NEW drives with "Plus" and "Pro" designations, that doesn't change existing stock. And there are also 10TB SMR drives out there, though you are not likely to have one.

I do not think it is a bad thing to advise people to check, both to detect issues with their current inventory, but also to educate for future purchases.
 
Joined
Oct 22, 2019
Messages
3,579
I also know that WD Red absolutely comes in capacities greater than 6TB; an easy example is
That was before "SMR gate". :wink: Like you mentioned, they've since re-labeled such drives as "Red Plus". So even back then, a WD drive of 8TB or more was practically guaranteed to be CMR.

But I agree, might as well develop a habit of always checking and verifying. Who knows what tricks they might pull in the future.

I'm 4/4 with shucking WD 8TB externals and getting all CMR (two are He-filled and run cooler than their counterparts). :cool:
 

Alex_K

Explorer
Joined
Sep 4, 2016
Messages
64
Wouldn't we want that timeout to be zero in RAID? As in: error encountered, mark the sector unreadable and move on, and let ZFS read the missing data from another device or rebuild it from parity and write it somewhere else. Why wait 8 seconds?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
Wouldn't we want in RAID that timeout to be zero,

That's not how the real world works though. It seems like a valid question until you think about what's going on a bit.

First off, there are two high level "timeout" things that are going on here. I want to be clear that these are interrelated in some ways but also independent.

The first is a transaction timeout (going by various names) when a RAID controller issues a disk read request. It has a queue for these, and (not going into gory detail) it basically has an idea of what time it issued a request, so that it can timeout a request that is taking "too long". This lets the controller decide to fail over to the other drive, or parity, or whatever. That's kind of a high level function, and the controller is managing this for multiple queues to multiple drives over multiple channels.

The other is the drive itself. The controller on the drive has its own queue of transactions to run (see "NCQ") and may have a whole bunch of these stacked up at any given time, usually up to 31 transactions. But the thing is, if you enter the queue as the 31st transaction, and the drive can manage a maximum of 100 IOPS, even under ideal circumstances you're going to be waiting about a third of a second (31 / 100 = 0.31 s) for that answer to pop out. This is completely normal.
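
The arithmetic behind that wait, as a back-of-the-envelope sketch. It assumes the drive services queued requests strictly in order at a steady rate, which real NCQ does not do (it reorders), so treat the number as an order-of-magnitude figure. The function name is hypothetical.

```python
def queue_wait_seconds(queue_position, iops):
    """Time until the request at `queue_position` completes, assuming
    strictly in-order service at a constant `iops` rate."""
    return queue_position / iops

# 31st request in the queue, drive sustaining 100 IOPS.
print(queue_wait_seconds(31, 100))
```

Even with nothing wrong, the last request in a full queue waits hundreds of milliseconds.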

But the real problem is when something does NOT go right. The drive cannot find the track, or the sector reads badly, and retries come into play. Not only does this screw with the transaction being processed, but it also screws with ones behind it in the queue. Plus, often, if one sector is bad, the ones near it may be bad too.

So we never want the timeout to be zero, because that would really mean we might never read anything from any drive, because physics means there's some latency in any read operation.
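
To make that concrete, here is a toy simulation, not how any real controller or ZFS is implemented. The "disk" is a function that sleeps for 50 ms to stand in for ordinary latency, and all the names are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def read_with_failover(primary_read, redundant_read, timeout_s):
    """Try the primary device; fall back to redundancy if it misses the deadline."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(primary_read)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        return redundant_read()
    finally:
        pool.shutdown(wait=False)  # do not block on the abandoned read

def healthy_disk():
    time.sleep(0.05)  # stand-in for normal seek and rotational latency
    return "data from primary"

def from_redundancy():
    return "data rebuilt from redundancy"

# A sane deadline lets the healthy disk answer.
print(read_with_failover(healthy_disk, from_redundancy, timeout_s=1.0))
# A zero deadline abandons even a perfectly healthy disk.
print(read_with_failover(healthy_disk, from_redundancy, timeout_s=0.0))
```

With a zero timeout every read, good or bad, gets treated as a failure, and the redundancy ends up doing all the work.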

Where redundancy is available, yes, you can rapidly fail over to another disk or parity, but the individual devices have to have some guidance as to that abandonment of effort being desirable. That's what TLER/ERC is all about.

In practice, unless you are in mission-critical, space-shuttle-launch, data-must-flow territory, an I/O retry of several seconds is the general consensus for acceptability. Which particular single-digit number is acceptable depends on the vendor, but 8 is a common default. The flip side is that if there's a meta-issue, like a power brownout or vibration, you don't want to be too anxious to reject all your results.
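
For checking what a given drive is set to: smartmontools reports and sets SCT Error Recovery Control via `smartctl -l scterc`, with values in tenths of a second. A minimal sketch of reading that report follows. The helper names are hypothetical, and the sample text is illustrative of the report's shape rather than captured from a real drive.

```python
import re

def parse_scterc(text):
    """Return {'read': seconds, 'write': seconds} from `smartctl -l scterc` text.

    None means that timer is reported as disabled; an empty dict means no
    Read/Write lines were found (for example, the drive lacks the feature).
    """
    result = {}
    pattern = r"^\s*(Read|Write):\s+(\d+|Disabled)"
    for kind, value in re.findall(pattern, text, re.MULTILINE):
        result[kind.lower()] = None if value == "Disabled" else int(value) / 10.0
    return result

def scterc_set_args(seconds):
    """Build the smartctl option requesting the same limit for reads and writes."""
    tenths = round(seconds * 10)  # smartctl takes units of 100 ms
    return ["-l", f"scterc,{tenths},{tenths}"]

# Illustrative report for a drive set to 8 seconds.
sample = """\
SCT Error Recovery Control:
           Read:     80 (8.0 seconds)
          Write:     80 (8.0 seconds)
"""
print(parse_scterc(sample))
print(scterc_set_args(8))
```

On many drives the setting does not survive a power cycle, so it is worth re-checking after a reboot.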
 