Two drives died in RAIDZ1 setup - completely screwed?

Status
Not open for further replies.

brossow

Dabbler
Joined
Mar 17, 2012
Messages
15
I've got a homebuilt FreeNAS 9.3 system with five 750GB drives in a RAIDZ1 array (built almost exactly 4 years ago). One of the drives started making the click of death a few days ago, so I immediately ordered a new drive to replace it; it arrived yesterday, and replacing the bad drive was to be my first order of business today. However, a second drive has now started doing the same thing before I had a chance to replace the first one, and the system stopped responding. I've tried shutting the system down for a while and restarting, freezing the drives, and all the other usual tricks for clicking drives that rarely work.

My question is whether I am completely and utterly out of luck, or if there's any chance of recovering anything. The system won't start up with either drive connected, and with both disconnected the system starts but the volume is in an UNKNOWN status. I simply don't have enough experience to go further without help.

If I'm just SOL without any hope beyond physical drive repair, please say so. Lectures won't help at this point, and no one knows better than I do that I should have replaced the drive last night as soon as the new one arrived instead of waiting until today. I wouldn't have thought for a moment that I'd be so unlucky as to lose two drives within a span of a couple days after being completely trouble-free for years. :(

Thanks,
Brent
 

jdong

Explorer
Joined
Mar 14, 2016
Messages
59
Sorry to hear :(. Unfortunately, losing two disks in a RAID-Z1 VDEV is complete pool loss. The best thing you can hope for is maybe there's something wrong with your power supply such that the drives are actually good and it's the computer that failed. The chances of that are low.

That's really bad luck, because your drives (less than 1TB each) are smaller than the size at which the community stops recommending RAID-Z1.

If you rebuild your pool, I would highly recommend considering RAID-Z2 or striped mirror VDEVs to lower your risk in the future.
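
To put rough numbers on that trade-off, here is a minimal Python sketch for a single RAID-Z vdev. It assumes identical drives and ignores ZFS metadata and padding overhead, so the capacities are upper bounds; the function names `raidz_summary` and `pool_survives` are made up for illustration and are not part of any ZFS tool.

```python
def raidz_summary(n_drives, drive_gb, parity):
    """Return (usable_gb, tolerated_failures) for one RAID-Z vdev.

    Rough model: `parity` drives' worth of space goes to parity and the
    vdev survives at most `parity` failed drives. ZFS overhead is ignored.
    """
    if parity not in (1, 2, 3):
        raise ValueError("RAID-Z parity level must be 1, 2, or 3")
    if n_drives <= parity:
        raise ValueError("need more drives than parity devices")
    return (n_drives - parity) * drive_gb, parity


def pool_survives(parity, failed_drives):
    """A RAID-Z vdev stays importable only while failures <= parity level."""
    return failed_drives <= parity


if __name__ == "__main__":
    # The layout from this thread: five 750GB drives.
    for level in (1, 2, 3):
        usable, tolerated = raidz_summary(5, 750, level)
        print(f"RAID-Z{level}: ~{usable} GB usable, "
              f"survives {tolerated} failed drive(s)")
    # Two failed drives in a RAID-Z1 vdev, as happened here:
    print("RAID-Z1 with 2 failures survives:", pool_survives(1, 2))
```

With the five-drive layout discussed here, moving from RAID-Z1 to RAID-Z2 gives up one more drive of capacity in exchange for surviving a second failure.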
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
mod note: edited to reflect RAIDZ1 rather than RAID5
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
That SUCKS.

But...! Before you go and totally panic, carefully take the failed drives over to a standalone PC and see if maybe they aren't actually totally dead. In particular, if you could get one of them to copy onto the new drive, you would be "okay".
 

brossow

Dabbler
Joined
Mar 17, 2012
Messages
15
Thanks for the quick replies (and correction about the RAID array). I've put each drive individually into an external dock and they both continue to click. I figured that I was pretty much out of luck, but thought I should ask. If there's a bright side to this, it's that I at least copied my wife's data back to an external hard drive the same day drive #1 started clicking. I'm sure I've lost some stuff permanently, but much of what I had on the NAS was copied from other old hard drives that I still have sitting around in boxes, and ultimately it probably isn't anything that will impact my life in the long run. It's just a bit gutting regardless. At least it wasn't my photos, which I have backed up already but am backing up to yet another drive as I type this, just in case -- I've learned the hard way that drives can go bad quickly and in short succession.

Any other advice is still welcome. I'm going to sit on these for a while and not dive immediately into building a new array.
 

brossow

Dabbler
Joined
Mar 17, 2012
Messages
15
Followup question just on the very remote chance one of the drives isn't completely dead: If one of them did manage to come back to life long enough that I could duplicate one onto a new (bigger, better, etc.) drive with one of those external dual docks that can replicate a drive without a computer, would I be able to just duplicate onto a large drive, install the new drive in place of the old one, and be good to go (that is, long enough to use the degraded array to recover data), or is it not that simple? I have no expectation that either dead drive will be recoverable this way, but if one of them is I'd like to know that this would work. Last thing I want to do is buy another one of these small, not to mention crap, Seagate drives.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
As long as it copies the entire thing including the partition table without trying to do anything "smart", I'd expect it to work.
 

brossow

Dabbler
Joined
Mar 17, 2012
Messages
15
Thanks. I think I'll order one of those "cloning"-capable docks on the off chance one of these drives can be resurrected, and if not it would be handy to have anyway (as if I don't already have enough external docks).
 

jdong

Explorer
Joined
Mar 14, 2016
Messages
59
Also don't forget about a Linux Live distro with a copy of the "dd_rescue" tool, which does a better job of skipping around defect areas, etc.

I've also heard that cooling dying drives can breathe a little new life into them. But now you're in the world of voodoo -- ZFS in theory is more resilient if you're able to partially recover different parts of each failing drive.
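
For anyone curious what "skipping around defect areas" means in practice, here is a minimal Python sketch of the idea. It is not how dd_rescue itself is implemented, just an illustration under simple assumptions: the source is a seekable file-like object, a failed read raises `OSError`, and unreadable blocks are zero-filled so that every byte keeps its original offset. The name `rescue_copy` is made up for this example.

```python
import io


def rescue_copy(src, dst, total_size, block_size=64 * 1024):
    """Copy total_size bytes from src to dst, one block at a time.

    Blocks that cannot be read are filled with zeros so offsets are
    preserved. Returns a list of (offset, length) ranges that were lost.
    """
    bad_ranges = []
    offset = 0
    while offset < total_size:
        want = min(block_size, total_size - offset)
        try:
            src.seek(offset)
            data = src.read(want)
        except OSError:
            data = b""  # treat the whole block as unreadable
        if len(data) < want:
            # Record the missing part and pad with zeros to keep alignment.
            bad_ranges.append((offset + len(data), want - len(data)))
            data += b"\x00" * (want - len(data))
        dst.seek(offset)
        dst.write(data)
        offset += want
    return bad_ranges


if __name__ == "__main__":
    # Tiny in-memory demo; a real run would open a device and an image file.
    source = io.BytesIO(b"example data " * 10)
    target = io.BytesIO()
    lost = rescue_copy(source, target, len(source.getvalue()), block_size=16)
    print("unreadable ranges:", lost)
```

Real rescue tools go further (retries, shrinking the block size near bad spots, reading backwards, keeping a log so a run can be resumed), so treat this as an illustration of the concept rather than something to point at a dying drive.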
 
Joined
Apr 9, 2015
Messages
1,258
It is possible that you MIGHT be able to resurrect a drive to get the data off. It will be a shot in the dark, but sometimes the controller board is what's at fault when a drive starts clicking and appears dead. Replacing it can cost a bit, but if you really need to recover the data it is worth a shot: http://www.hdd-parts.com/ I thought about it with a Seagate drive that went bad, but in the end it really wasn't worth it to me since everything was replaceable.

From my understanding, "cooling" a drive worked on older drives with much lower density, but it's much less likely to help now, and getting condensation in and on the drive is worse than leaving it alone.
 

Mirfster

Doesn't know what he's talking about
Joined
Oct 2, 2015
Messages
3,215
brossow said: "Thanks. I think I'll order one of those "cloning"-capable docks on the off chance one of these drives can be resurrected, and if not it would be handy to have anyway (as if I don't already have enough external docks)."
You might have luck cloning the drive(s) using something like CloneZilla and then importing them into a virtualized environment, or dropping the drive image onto a new drive and then trying to use that drive in the pool/vdev. However, this would require the resources and ability to do so, and there is NO GUARANTEE it will work. From a "10 mile view", I would say that you have lost the pool/data (Sorry, just my honest opinion).

brossow said: "... not to mention crap, Seagate drives."
Careful, this as of lately has become a touchy subject for some (not me personally). ;)
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Mirfster said: "Careful, this as of lately has become a touchy subject for some (not me personally). ;)"

Oh I just enjoy beating on irrational positions. You're welcome to cry "Seagate drives are crap," which is fine, cuz some of them are, but implying that this means you should buy Western Digital instead is ridiculous, since Western Digital's had similar problems over the years.

Anyways, not in any manufacturer's defense here, but at the 4-year point, a 750GB drive is pretty long in the tooth. Take a peek at, for example,

http://www.extremetech.com/computing/170748-how-long-do-hard-drives-actually-live-for

Which uses some Backblaze data that I suspect is a little skewed, but we generally acknowledge that at the 5 year point drive failures tend to go way up. This is why the paranoid among us use RAIDZ3.
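
To illustrate why extra parity helps as drives age, here is a small Python sketch of the arithmetic. It assumes each drive fails independently with the same probability `p_fail` over whatever time window you care about. That is a simplifying assumption, and drives of the same vintage in the same chassis often fail together, so real risk is likely higher. The probability in the demo loop is a made-up placeholder, not a measured failure rate, and `prob_vdev_loss` is a name invented for this example.

```python
from math import comb


def prob_vdev_loss(n_drives, parity, p_fail):
    """Probability that more than `parity` of n_drives fail.

    Simple binomial model: independent failures, same per-drive
    probability p_fail over the window of interest.
    """
    return sum(
        comb(n_drives, k) * p_fail**k * (1 - p_fail) ** (n_drives - k)
        for k in range(parity + 1, n_drives + 1)
    )


if __name__ == "__main__":
    hypothetical_p = 0.05  # placeholder per-drive probability, not real data
    for level in (1, 2, 3):
        risk = prob_vdev_loss(5, level, hypothetical_p)
        print(f"RAID-Z{level}, 5 drives: P(vdev loss) = {risk:.6f}")
```

Whatever value you plug in for `p_fail`, each added parity level requires one more simultaneous failure before the vdev is lost, which is the whole argument for RAIDZ2 and RAIDZ3.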
 

brossow

Dabbler
Joined
Mar 17, 2012
Messages
15
Sorry, didn't mean to stoke a holy war. Of the five drives in the array, all of the same vintage, two are Seagates (different batches, different sources). Those are the two that failed within a few days of each other. I know there's not a manufacturer out there that someone isn't cursing due to failure and lost data. It just sucks that I went out of my way to ensure a good variety in this mix specifically to prevent this sort of thing and still got burned. I should have left off the editorial comment.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Anyone who loses data is well within their rights to editorial comments. It just goes to show that you can try to do it all right and still get screwed.
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
The clicking may mean you have a worse problem, but I will mention that many of the Seagate 750G-era drives have a firmware bug that leads to non-destructive failure. If you never did a firmware update, you may have hit this. The data would be fully recoverable. Bad firmware would be something like SD15, good would be SD1A. Seagate or a recovery company could do it, although it was actually possible to build a serial cable/device that would revive the drive.
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
rs225 said: "The clicking may mean you have a worse problem, but I will mention that many of the Seagate 750G-era drives have a firmware bug that leads to non-destructive failure. If you never did a firmware update, you may have hit this. The data would be fully recoverable. Bad firmware would be something like SD15, good would be SD1A. Seagate or a recovery company could do it, although it was actually possible to build a serial cable/device that would revive the drive."

I've recovered a 1TB drive with this problem and documented it on my blog. Keep in mind, though, that this is a specific problem with a specific series of drives. They start with a single "click" and then do nothing. If they are clicking repeatedly, they are likely toast.
 

brossow

Dabbler
Joined
Mar 17, 2012
Messages
15
These are both repeatedly clicking, and to be honest I don't think there's anything on there that I would be willing to either (a) pay more than about $50 to recover or (b) jump through too many hoops to repair. I've ordered a cloning dock and will give both drives a few more non-heroic attempts at revival, but then I'm going to give up and chalk it up to karma for something I've forgotten that I even did.

Thanks to everyone for the input. I really appreciate the support.
 