Second drive failed during resilvering process

Status
Not open for further replies.

someone1

Dabbler
Joined
Jun 17, 2013
Messages
37
Hello All!

I'm running FreeNAS 8.3.0 with 5x3TB drives in a RAIDZ1 configuration. One of my drives failed and I finally got around to RMAing it. This is my own home server, so it's not "mission critical", but it holds a ton of information I'd rather not lose.

During the resilvering process, another HDD failed (it started clicking). Some of my data is still accessible, but there are a ton of errors remaining and entire datasets are inaccessible. I'm trying to identify the affected files and see whether I have backups of them, but I can't finish the "detach" of the original failed drive: it complains that insufficient replicas exist. That drive was already offlined before I replaced it, so it's not like it's being used as a replica anyway?
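For identifying the affected files, `zpool status -v` lists them under a "Permanent errors have been detected in the following files:" heading. Below is a minimal sketch of pulling that list out of the command's text output so it can be compared against backups. The pool name `tank`, the sample paths, and the helper name `permanent_error_paths` are made up for illustration, and the parser assumes the error list runs until the next `pool:` line or the end of the output.

```python
def permanent_error_paths(status_text):
    """Return the entries listed under 'Permanent errors' in `zpool status -v` output."""
    paths = []
    in_section = False
    for line in status_text.splitlines():
        if "Permanent errors have been detected" in line:
            in_section = True
            continue
        if in_section:
            entry = line.strip()
            if entry.startswith("pool:"):
                # Assumption: a new "pool:" line means the next pool's report has started.
                break
            if entry:
                paths.append(entry)
    return paths


# Hypothetical sample output; real output would come from something like
# subprocess.run(["zpool", "status", "-v", "tank"], capture_output=True, text=True).stdout
SAMPLE = """\
  pool: tank
 state: DEGRADED
errors: Permanent errors have been detected in the following files:

        /mnt/tank/backups/pc-image.bak
        tank/media:<0x1f>
"""

for path in permanent_error_paths(SAMPLE):
    print(path)
```

Entries of the form `dataset:<0x...>` are objects ZFS could not map back to a file name, so they would need to be checked by dataset rather than by path.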

Even if I replace the second failed drive, I am sure that my missing data is gone. My question is, what do I do about the first drive I tried replacing? The pool still shows it as offline, but I can't detach it. I'm tempted to copy all the data off of the drives, wipe out the array, and rebuild it from scratch. I'd probably also run each drive through SeaTools to make sure there aren't any more issues with any of them. (Is there a built-in tool in FreeNAS to do a sector-by-sector scan and attempt to fix any problems?)

Wondering if there's a way to try and recover my current array or just copy what I can and pull the rest from backups (or re-create them) and start from a fresh install.

P.S. It is my own fault for waiting so long before I decided to replace the failed drive (talking months here). This is by no means a poor reflection on FreeNAS or ZFS, just a result of my laziness.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
someone1 wrote: "Even if I replace the second failed drive, I am sure that my missing data is gone. My question is, what do I do about the first drive I tried replacing?"
You don't need to worry about the first drive: the second one has failed or is failing, which means you get to recreate the entire pool. Depending on how badly the second disk is failing, you can try to make a block copy of it to a new disk using ddrescue, but it sounds like that won't get you very far here.
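As a rough sketch of what such a block copy could look like, the snippet below only builds the argument lists for a common two-pass GNU ddrescue run and prints them; it does not execute anything. The device names `/dev/ada1` and `/dev/ada5`, the mapfile path, and the helper name `ddrescue_commands` are assumptions for illustration. The mapfile should live on a disk other than the two being copied, and `-f` is needed because the output is a device rather than a regular file.

```python
import shlex


def ddrescue_commands(src, dst, mapfile):
    """Build a two-pass GNU ddrescue invocation as argv lists (nothing is run here)."""
    # Pass 1: copy everything that reads cleanly; -n skips the slow scraping phase.
    first_pass = ["ddrescue", "-f", "-n", src, dst, mapfile]
    # Pass 2: go back over the bad areas recorded in the mapfile, retrying up to 3 times.
    retry_pass = ["ddrescue", "-f", "-r3", src, dst, mapfile]
    return [first_pass, retry_pass]


# Hypothetical device names: failing disk, replacement disk, mapfile on a third disk.
for argv in ddrescue_commands("/dev/ada1", "/dev/ada5", "/mnt/scratch/rescue.map"):
    print(shlex.join(argv))
```

Reusing the same mapfile for both passes is what lets the second run resume from where the first one left off instead of starting over.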

someone1 wrote: "P.S. It is my own fault for waiting so long before I decided to replace the failed drive (talking months here). This is by no means a poor reflection on FreeNAS or ZFS, just a result of my laziness."
Criminal laziness ;) combined with not using a double-parity array. Good thing you have some backups.
 

someone1

Well, I guess I'm lucky that the inaccessible data seems to be mostly backups of my running PC, which I can always just back up again. The majority of the data is still accessible on the FreeNAS box. I guess that even though the drive failed at some point during the resilvering process, ZFS still thinks the resilver completed, just with a bunch of errors. I did a ZFS scrub before the replacement and had no errors.

And I wanted the extra space with Z1. It's hard to swallow losing 3TB, let alone 6TB, to parity when you buy HDDs during the flooding in Thailand, though I guess if I have to rebuild this array, I'll get a WD Red (are those actually good as NAS drives?) to get a Z2 configuration going. I currently run 5 x Seagate ST3000DM001 drives (well, 4 now, until I get another one replaced!).
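The space trade-off can be put into numbers with a quick back-of-the-envelope calculation. This is only a sketch: it counts raw data disks times disk size and ignores ZFS metadata, padding, and the TB/TiB difference, and the helper name `raidz_usable_tb` is made up for illustration.

```python
def raidz_usable_tb(disks, disk_tb, parity):
    """Approximate usable capacity of one RAIDZ vdev: data disks times disk size."""
    if parity not in (1, 2, 3):
        raise ValueError("RAIDZ parity level must be 1, 2, or 3")
    if disks <= parity:
        raise ValueError("need more disks than parity devices")
    return (disks - parity) * disk_tb


print(raidz_usable_tb(5, 3, 1))  # current layout: 5 x 3TB in RAIDZ1
print(raidz_usable_tb(5, 3, 2))  # same 5 disks rebuilt as RAIDZ2
print(raidz_usable_tb(6, 3, 2))  # one extra 3TB disk, RAIDZ2
```

Under these simplifications, adding a sixth 3TB drive and moving to RAIDZ2 gives roughly the same usable space as the current five-drive RAIDZ1, while tolerating two failed drives instead of one.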

So it looks like I need to rebuild the array. It's going to take a good long while to back up the NAS data I don't have backups of... Thanks for the help!
 

paleoN

someone1 wrote: "So it looks like I need to rebuild the array. It's going to take a good long while to back up the NAS data I don't have backups of... Thanks for the help!"
Yup, it sucks. Sure, with RAIDZ2 you lose another drive to parity, but it would have saved you from having to recreate the pool. That being said, a single-parity pool with backups > a double-parity pool without backups.
 