Is a faulty disk with 34 errors really DEAD, should I replace it?

titust1

Explorer
Joined
May 10, 2022
Messages
66
I use TrueNAS SCALE, latest version. I noticed that one disk was labeled FAULTY because it had a few errors.
I mean, 34 errors does not mean a completely dead drive. This drive (WD Red 4TB) was purchased new from Amazon two weeks ago.
I can't believe that after two weeks it became faulty.
I'm starting to have doubts about TrueNAS's error reporting. As far as TrueNAS is concerned, the disk is not functional and I have to replace it. I don't see another way.
There are no error correction mechanisms built in, and there is no sector isolation. All modern OSes are able to set aside defective sectors and keep the drive functional. I ran the short and medium tests in TrueNAS and got no errors.
I'm pretty sure that if I take the disk to Windows and format it, it will work just fine.
I am not in an enterprise environment. Is there a way to bring this disk back to life? Reformat it? I don't know.
What if a disk has 1 error? Will TN SCALE declare it FAULTY? Should one replace a disk with one error?
It sounds ridiculous.
Any ideas? Am I doing something wrong? Maybe there is a way.
Thanks

Screenshot 2023-10-21 185019.png
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I can't believe that after two weeks it became faulty.
Why can you not believe this? Have you not heard of infant mortality?
All modern OSes are able to put aside the defective sectors and keep the drive functional.
This isn't a function of the OS; it's a function of the drive.
 

titust1

Explorer
Joined
May 10, 2022
Messages
66
Why can you not believe this? Have you not heard of infant mortality?

This isn't a function of the OS; it's a function of the drive.
So in conclusion, I have to replace the disk.
It's OK, I'm going to fight with WD to prove to them that the disk has errors. Maybe they will believe me and replace my drive.
BUT one more question... Is there a way to take a disk previously used on Windows, initialize/format it, and then use it in my TrueNAS box?
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
I ran the short and medium tests in TrueNAS and got no errors.
I'm pretty sure that if I take the disk to Windows and format it, it will work just fine.
I am not in an enterprise environment. Is there a way to bring this disk back to life? Reformat it? I don't know.
What if a disk has 1 error? Will TN SCALE declare it FAULTY? Should one replace a disk with one error?
It sounds ridiculous.
Any ideas? Am I doing something wrong? Maybe there is a way.
Thanks

A few things, in no particular order:
  • there are very different kinds of errors, and they mean very different things
  • you have to run a long SMART test, not a short or "medium" one (whatever that is, I guess conveyance?)
  • depending on the type of errors, the disk might be fine and you can easily fix it (for example, CRC errors due to improperly latched connectors or bad cables)
So, run a long SMART test and then report the results back to us (and I don't mean whether it says it has passed or not, but the actual values). You will have to use the shell (or SSH) to do so; tell us if you need guidance.
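For anyone following along, the long test is started from the shell with `smartctl -t long /dev/sdX`, and the values are read afterwards with `smartctl -a /dev/sdX` (replace `sdX` with your device). As a rough sketch of what "the actual values" means, the Python below pulls the raw values out of the attribute table and separates media-type counters from cabling-type counters. The sample table is made up for illustration and is not the OP's data, and the grouping of attributes is a rule of thumb, not an official classification.

```python
import re

# Rule-of-thumb grouping (an assumption for this sketch, not an official list).
MEDIA_ATTRS = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}
LINK_ATTRS = {"UDMA_CRC_Error_Count"}

# Made-up sample of the attribute table printed by `smartctl -a`.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       34
"""

# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw value.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+\d+\s+\d+\s+\d+\s+\S+\s+\S+\s+\S+\s+(\d+)"
)


def parse_attributes(text):
    """Return {attribute_name: raw_value} for every attribute row found."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            attrs[m.group(2)] = int(m.group(3))
    return attrs


def triage(attrs):
    """Return 'media', 'link', or 'clean' based on which counters are non-zero."""
    if any(attrs.get(name, 0) > 0 for name in MEDIA_ATTRS):
        return "media"  # reallocated or pending sectors point at the platters
    if any(attrs.get(name, 0) > 0 for name in LINK_ATTRS):
        return "link"   # CRC errors usually point at cables or connectors
    return "clean"


if __name__ == "__main__":
    attrs = parse_attributes(SAMPLE)
    print(attrs)
    print(triage(attrs))
```

With the made-up sample, only the CRC counter is non-zero, so the sketch points at cabling rather than the disk surface. Real output should still be posted in full, since other attributes and the error log matter too.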

Also, please provide your hardware specs.

Maybe they will believe me and replace my drive.
Show them errors and they will ship you a replacement.

BUT one more question... Is there a way to take a disk previously used on Windows, initialize/format it, and then use it in my TrueNAS box?
Yes. The drive will be wiped by TrueNAS, and you will have to format it again if you want to use it with Windows again.
 

MrGuvernment

Patron
Joined
Jun 15, 2017
Messages
268
WD has tools you can run to test the disk, and you can give the results to them. If one tool reports bad sectors, they all should.
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
This drive is likely an SMR drive and therefore completely unsuitable for ZFS/TrueNAS anyway. Please use Google to check whether the model number is SMR.
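To make the model-number check concrete, here is a small Python sketch that reads the `Device Model:` line from `smartctl -i` output and compares it against a short list of WD Red model prefixes that WD has disclosed as SMR. The list is an assumption for illustration and is not exhaustive, the sample strings use a placeholder suffix, and the function names are made up for this example, so still verify your exact model against the manufacturer's documentation.

```python
import re

# Assumption: a short, non-exhaustive list of WD Red prefixes disclosed as SMR.
KNOWN_SMR_PREFIXES = ("WD20EFAX", "WD30EFAX", "WD40EFAX", "WD60EFAX")

# Illustrative `smartctl -i` excerpts; the suffixes are placeholders.
SAMPLE_SMR = "Model Family:     Western Digital Red\nDevice Model:     WDC WD40EFAX-XXXXXXX\n"
SAMPLE_CMR = "Model Family:     Western Digital Red Plus\nDevice Model:     WDC WD40EFZX-XXXXXXX\n"


def device_model(info_text):
    """Extract the model string from `smartctl -i` output, or None if absent."""
    m = re.search(r"^Device Model:\s*(.+)$", info_text, re.MULTILINE)
    return m.group(1).strip() if m else None


def looks_like_smr(model):
    """True if any token of the model string starts with a known SMR prefix."""
    if not model:
        return False
    return any(tok.startswith(KNOWN_SMR_PREFIXES) for tok in model.upper().split())


if __name__ == "__main__":
    for sample in (SAMPLE_SMR, SAMPLE_CMR):
        model = device_model(sample)
        print(model, "-> SMR suspected" if looks_like_smr(model) else "-> not in the SMR list")
```

A result of "not in the SMR list" only means the prefix is not in this small table; it does not prove the drive is CMR.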
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
So in conclusion I have to replace the disk.
I can't address that one way or the other--I'm responding to your incredulity at that possibility. As I said, infant mortality happens, which is one reason you should thoroughly test hardware--especially drives--before putting it into production. But with that said, the page you're showing indicates pool errors, which may or may not result from disk errors.
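To illustrate the distinction between pool errors and disk errors: the counts shown on the pool page correspond to the READ, WRITE, and CKSUM columns of `zpool status`, which record what ZFS saw when talking to each device, not what the drive itself logged. Below is a rough Python sketch that lists the devices with non-zero counters. The sample output is invented (pool name, device names, and counts are all hypothetical), and the parser assumes plain integer counters, so abbreviated values such as `1.2K` are not handled.

```python
import re

# Invented `zpool status` excerpt; names and counts are hypothetical.
SAMPLE_STATUS = """\
  pool: tank
 state: DEGRADED
config:

    NAME        STATE     READ WRITE CKSUM
    tank        DEGRADED     0     0     0
      mirror-0  DEGRADED     0     0     0
        sda     ONLINE       0     0     0
        sdc     FAULTED      0    34     0  too many errors
"""

# name, state, then the three error counters as plain integers.
ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)\s+(\d+)\s+(\d+)\s+(\d+)"
)


def error_rows(status_text):
    """Return (name, state, read, write, cksum) for rows with any non-zero counter."""
    rows = []
    for line in status_text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        name, state = m.group(1), m.group(2)
        read, write, cksum = (int(m.group(i)) for i in (3, 4, 5))
        if read or write or cksum:
            rows.append((name, state, read, write, cksum))
    return rows


if __name__ == "__main__":
    for row in error_rows(SAMPLE_STATUS):
        print(row)
```

A non-zero counter here says ZFS had trouble with I/O to that device. Whether the cause is the disk, the cable, the controller, or (as with SMR drives) slow responses under load has to be worked out from the SMART data.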
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
I would ask for results of smartctl -a /dev/sdc

If there are certain errors in that result, they WILL replace the drive, even if it passes a SMART test.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
I would ask for results of smartctl -a /dev/sdc

If there are certain errors in that result, they WILL replace the drive, even if it passes a SMART test.
He has to run a long one first.
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
He doesn't have to; any previous errors would already be recorded by the drive. Those errors don't necessarily come from tests; they are logged in real time. But running a test is a good idea as well.
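One way to see that the drive keeps its own records independently of self-tests: `smartctl` reports them in its exit status, which is a bitmask documented in the smartctl man page. Bit 6 means the device error log contains errors and bit 7 means the self-test log does. Here is a minimal Python sketch that decodes that status; the descriptions are paraphrased from the man page and the function name is made up for this example.

```python
# Bit meanings of the smartctl exit status, paraphrased from the man page.
EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed, or device did not return identify data",
    2: "a SMART or ATA command failed, or SMART data had a checksum error",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes found at or below threshold",
    5: "attributes were at or below threshold at some time in the past",
    6: "device error log contains records of errors",
    7: "device self-test log contains records of errors",
}


def decode_exit_status(status):
    """Return the descriptions of every bit set in a smartctl exit status."""
    return [msg for bit, msg in EXIT_BITS.items() if status & (1 << bit)]


if __name__ == "__main__":
    # Example: status 64 has only bit 6 set.
    for msg in decode_exit_status(64):
        print(msg)
```

In a shell, the status is available as `$?` right after running `smartctl -a /dev/sdX`. A status with bit 6 set but bit 7 clear would match the case described here: errors logged during normal operation even though no self-test has failed.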
 

titust1

Explorer
Joined
May 10, 2022
Messages
66
I can't address that one way or the other--I'm responding to your incredulity at that possibility. As I said, infant mortality happens, which is one reason you should thoroughly test hardware--especially drives--before putting it into production. But with that said, the page you're showing indicates pool errors, which may or may not result from disk errors.
Thanks a lot danb35. Apparently the cause of the problem was not infant mortality (LOL) but the fact that the WD Red is an SMR drive, not a CMR drive like the WD Red Plus. Mea culpa here, I was confused about WD Red vs. Red Plus and the SMR vs. CMR stuff... anyway.
On the other hand, I did resuscitate the dead infant WD Red. First, I replaced the FAULTY WD Red disk with an IronWolf CMR disk, and my pool is now up and healthy after resilvering... :)
The WD Red declared dead by TrueNAS was then tested in Windows using WD's troubleshooting tools, and it displayed no errors.
Now it's up and running in my Windows machine just fine, so my intuition was OK.
Thanks a lot guys for your help
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
I had a hunch ;-)
 