Why are there two instances of a disk?

mysticpete

Contributor
Joined
Nov 2, 2013
Messages
148
Silly question maybe: I'm trying to get a disk replaced, and the pool is showing both a disk da0 and a gptid reference. Can anyone explain why this happens?


[Attached screenshot: 1711629760478.png]
 

chuck32

Guru
Joined
Jan 14, 2023
Messages
623
The gptid reference should be the disk you already removed / offlined. It is being replaced with da0, and resilvering is in progress.

I'm thrown off by the 1914 write errors on da0 though.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
da0 is your new disk, the gptid refers to the disk you're replacing: It is offline, but still a member of the degraded pool until the replacement is complete.

Having nearly 2000 write errors on the new disk is worrying. With a raidz1 and one drive missing, you're heading towards data loss…:eek:
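
To make the layout above concrete, here is a minimal Python sketch that picks out devices with non-zero error counters from the config section of `zpool status`. The sample text is illustrative only: the pool name `tank` and the gptid labels are hypothetical, and only da0 and its 1914 write errors come from this thread. It also assumes the counters are printed as exact integers (as far as I know `zpool status -p` does that; the default output may abbreviate large counts).

```python
import re

# Illustrative sample only: hypothetical pool name and gptid labels, laid out
# like the config section of `zpool status` while a replacement is running.
SAMPLE = """\
        NAME                 STATE     READ WRITE CKSUM
        tank                 DEGRADED     0     0     0
          raidz1-0           DEGRADED     0     0     0
            replacing-0      DEGRADED     0     0     0
              gptid/old-disk OFFLINE      0     0     0
              da0            ONLINE       0  1914     0  (resilvering)
            gptid/disk-b     ONLINE       0     0     0
            gptid/disk-c     ONLINE       0     0     0
"""

# name, state, then three integer counters; the header row does not match
# because READ/WRITE/CKSUM are not digits.
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)")


def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows with any non-zero counter."""
    found = {}
    for line in status_text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        counts = tuple(int(g) for g in m.group(3, 4, 5))
        if any(counts):
            found[m.group(1)] = counts
    return found


print(devices_with_errors(SAMPLE))
```

Note how the old gptid member and da0 both sit under the temporary `replacing-0` entry: that is why the disk seems to appear twice until the resilver finishes.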
 

mysticpete

Contributor
Joined
Nov 2, 2013
Messages
148
da0 is your new disk, the gptid refers to the disk you're replacing: It is offline, but still a member of the degraded pool until the replacement is complete.

Having nearly 2000 write errors on the new disk is worrying. With a raidz1 and one drive missing, you're heading towards data loss…:eek:

So the disk is very new. I ran the burn-in tests and everything was good, but two months down the road it failed. I have wiped the disk with 0's and 1's and am trying to add it back in to see how it goes, but the system still shows errors?
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
ZFS will report errors until these are cleared by the administrator.
I suppose that it could be the drive, the cable or controller but in any case you should be wary of replacing anything until you've identified the issue. Drives can pass burn-in and fail two months afterwards. What's the result of a long SMART test?
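
For reference, a rough sketch of the commands involved, written as a small Python helper that only builds and prints them rather than running anything. The pool name `tank` is a placeholder, the device is assumed to live at `/dev/da0`, and behind some controllers `smartctl` may need an extra `-d` device-type option, so treat this as a starting point rather than a recipe.

```python
import shlex


def diagnostic_commands(pool, device):
    """Build (but do not run) the commands for a long SMART test and an error reset.

    `pool` and `device` are caller-supplied; nothing here is specific to one system.
    """
    dev_path = f"/dev/{device}"
    return [
        ["smartctl", "-t", "long", dev_path],      # start the extended self-test
        ["smartctl", "-l", "selftest", dev_path],  # read the self-test log once it finishes
        ["zpool", "status", "-v", pool],           # inspect the current error counters
        ["zpool", "clear", pool, device],          # reset the counters for that device
    ]


# Print the commands for review instead of executing them.
for cmd in diagnostic_commands("tank", "da0"):
    print(shlex.join(cmd))
```

Clearing the counters only makes sense once the cause is understood; otherwise the same errors simply come back.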
 

chuck32

Guru
Joined
Jan 14, 2023
Messages
623
I have wiped the disk with 0's and 1's and am trying to add it back in to see how it goes, but the system still shows errors?
Wait, is da0 a different disk or the same (potentially) faulty disk you're trying to add again after running badblocks?

As @Etorix said, rule out any cabling issue and post the SMART results.
 

Cellobita

Contributor
Joined
Jul 15, 2011
Messages
107
One thing you should definitely not overlook is your PSU. Over the past decades I've had hundreds of customer disks - new, old and everything in between - blown away by what turned out to be a faulty power supply.
 

mysticpete

Contributor
Joined
Nov 2, 2013
Messages
148
Wait, is da0 a different disk or the same (potentially) faulty disk you're trying to add again after running badblocks?

As @Etorix said, rule out any cabling issue and post the SMART results.
Same disk, I just wiped it clean, the last two long SMART results were failures.
 

chuck32

Guru
Joined
Jan 14, 2023
Messages
623
Same disk, I just wiped it clean, the last two long SMART results were failures.
Wiping an HDD that has already failed two SMART tests (post the output, please) will not magically heal it.
The correct course of action in my book would be:
1) I assume the failed SMART tests do indicate a drive failure and that cabling etc. has been ruled out. @Cellobita isn't wrong, a bad PSU can also be a suspect, although I would expect to see CRC errors in that case.
2) Get a replacement disk ASAP. If you don't have backups, back up your most important data now.
Depending on your warranty status, the manufacturer may send you a new disk before receiving/checking the old disk.
3) This really depends on whether you have backups or can afford downtime, but the best thing would be to properly burn in the new hard drive.
You may take the server offline during that time, or at least offline your pool, to avoid stressing the remaining HDDs until you have restored parity.
4) Install the new HDD.

For future reference, if you run RAIDZ1 with irreplaceable data and no backups: have backups, or at least keep a cold spare at hand.
 

mysticpete

Contributor
Joined
Nov 2, 2013
Messages
148
Thank you all for your input, will work on your recommendations.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
the last two long SMART results were failures.
That means the disk is dead. Replace it. I mean, really replace it. Wiping it clean and resilvering it back into your pool won't do anything for you if the disk is dead or dying, and its own self-test tells you it is.
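
If you want to check that from a script, here is a minimal Python sketch that scans a SMART self-test log for failed extended (long) tests. The sample log is made up for illustration: the hours and LBA values are not from this thread. It follows the table layout `smartctl -l selftest` prints for ATA drives, which is an assumption; SAS drives report their self-test log in a different format.

```python
import re

# Illustrative sample only, in the layout of an ATA self-test log.
SAMPLE_LOG = """\
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%       1502         123456789
# 2  Extended offline    Completed: read failure       90%       1478         123456789
# 3  Short offline       Completed without error       00%       1320         -
"""

# Columns are separated by runs of two or more spaces; single spaces occur
# inside the description and status fields.
ENTRY = re.compile(r"^#\s*(\d+)\s+(.+?)\s{2,}(.+?)\s{2,}(\d+)%\s+(\d+)\s+(\S+)")


def failed_long_tests(log_text):
    """Return (num, status, lifetime_hours, first_error_lba) for failed extended tests."""
    failures = []
    for line in log_text.splitlines():
        m = ENTRY.match(line)
        if not m:
            continue
        num, desc, status, _remaining, hours, lba = m.groups()
        if desc.startswith("Extended") and "failure" in status.lower():
            failures.append((int(num), status, int(hours), lba))
    return failures


for entry in failed_long_tests(SAMPLE_LOG):
    print(entry)
```

Any entry returned here is the drive's own firmware reporting a failed test, which no amount of wiping will change.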
 

mysticpete

Contributor
Joined
Nov 2, 2013
Messages
148
That means the disk is dead. Replace it. I mean, really replace it. Wiping it clean and resilvering it back into your pool won't do anything for you if the disk is dead or dying, and its own self-test tells you it is.

Funny thing is that it has passed a badblocks test twice now with no errors?
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
The patient was in good health until diagnosed with a fatal disease…
 