Help with Degraded Zpool.

urbanrace6

Cadet
Joined
Sep 27, 2020
Messages
5
If anyone can help, I would appreciate it. I recently replaced a drive that was failing its SMART test, before my RAID had degraded. Since I replaced the drive, the pool has been resilvering continuously: as soon as a resilver finishes, it begins again. I now have another drive showing up as degraded that I was not having a problem with before. Any help or suggestions appreciated.
1601220216338.png
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
Can we start with you posting full system specs, please?

Of particular interest will be the makes and models of the HDDs (to determine if you have SMR drives), and the motherboard and any HBA in the system, but don't limit your list to those...
 

urbanrace6

Cadet
Joined
Sep 27, 2020
Messages
5
Hi, thanks for the reply. I have been running this rig for about 5 years now. I've only had to change one other disk in the past, and that went without a problem. This recent endeavor is much different. The motherboard is a Supermicro mATX workstation-grade board that supports ECC memory and Xeon CPUs. I have a Xeon X3470 CPU with 12 GB of DDR3 ECC memory. I have five 2TB Hitachi enterprise drives in a RAID 5 zpool. The setup has been running healthy for about 2 years now since the last HDD replacement. I've basically been purchasing the exact same Hitachi drives as replacements.
I purchased a drive to replace one that was failing its SMART test while the zpool was still healthy. I turned off my FreeNAS yesterday to swap out the failing drive, then replaced the drive in the pool status area. The resilver ran to completion, and after that the pool went degraded again, showing another drive in a degraded state, even though that drive passes its SMART test... The error reported under zpool status -v indicates a permanent error with the metadata 0x124. I keep running the resilver, but the status of the pool refuses to change.

The above picture is the status that just won't resolve. I'm currently in the process of copying all my data off the zpool in case I need to completely break it, or it breaks on its own... Ideally I would like to fix this pool without breaking it.
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
Sorry for this question, but are you sure that you replaced the correct drive?

Assuming yes, and a scrub of the pool does not clear this issue, then, as it looks like you know, the historical way to fix this is to back up your data, delete the pool and start over.
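
For reference, the scrub-then-check step could be sketched roughly as below. This is only an illustration under assumptions: the pool name `tank` is a placeholder (not the poster's actual pool), the helper `scrub_check_commands` is hypothetical, and the sketch only builds the command lines rather than running them. `zpool scrub` and `zpool status -v` are the standard ZFS commands.

```python
def scrub_check_commands(pool):
    """Build the zpool commands for a scrub followed by a verbose status check.

    `pool` is the pool name. The name used in the example call below is a
    placeholder and would need to be replaced with the real pool name.
    """
    # Reject empty names and anything that could be mistaken for an option.
    if not pool or pool.startswith("-") or any(ch.isspace() for ch in pool):
        raise ValueError(f"suspicious pool name: {pool!r}")
    return [
        ["zpool", "scrub", pool],         # start a scrub of the whole pool
        ["zpool", "status", "-v", pool],  # scan progress, device states, permanent errors
    ]


# Print the commands instead of executing them.
for cmd in scrub_check_commands("tank"):
    print(" ".join(cmd))
```

Each list could be handed to `subprocess.run` as root on the NAS. The scrub itself runs in the background, so the status command would need to be repeated until the scan line reports that it has finished.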
 

urbanrace6

Cadet
Joined
Sep 27, 2020
Messages
5
Yeah, it's the right disk... I checked the serial number... I'll try to run a scrub after the resilver completes. Hopefully that works... Otherwise I'm already kind of prepared for a week's worth of work... that I really hope I don't have to do.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
With a corrupt file/metadata, the resilver will never think that it's finished, so you'll be stuck like that until the corruption is removed.

Since it's in metadata, you'll have a tough time working out what to get rid of. You may need to back up and destroy your pool and restore your data to a recreated pool.
 

urbanrace6

Cadet
Joined
Sep 27, 2020
Messages
5
Yes, thank you for validating this... My array is 8 TB, 6 of which were filled. I've moved all my data off at this point, which took a few days... Half the data was just VMs shared out via NFS to my ESX servers... Those have proven the most difficult to move. At the risk of corruption in the VMs that I moved manually, I'm now prepared to destroy the array and rebuild it. What's more, the 5-bay drive enclosure I purchased years ago has 2 drives stuck in it. Luckily these drives are still good, but the latches no longer work. I guess over time, heat and moisture have caused the plastic housing to warp a bit around the drives, just enough to cause them to get stuck. The problem is at the two far ends of the enclosure. Kind of sucks... almost as bad as fixing a corrupt Active Directory.
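
As a rough check on the array size mentioned in this post: five 2 TB drives in a single-parity layout give up one drive's worth of space to parity. The sketch below assumes RAIDZ1 (the ZFS analogue of the "RAID 5" described earlier in the thread), uses a hypothetical helper name, and ignores ZFS metadata overhead and the TB/TiB difference.

```python
def raidz_usable_tb(drive_count, drive_tb, parity=1):
    """Approximate usable capacity of a single RAIDZ vdev, in TB.

    Assumes all drives are the same size and ignores filesystem overhead.
    """
    if drive_count <= parity:
        raise ValueError("need more drives than parity devices")
    return (drive_count - parity) * drive_tb


print(raidz_usable_tb(5, 2))  # 8
```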
 