Drive is "UNAVAIL", replacement failed - what next?

Status
Not open for further replies.

kthelen

Cadet
Joined
Jun 10, 2016
Messages
6
Hello all!

I have an HP DL360 G6 running FreeNAS 9.10, with an LSI 9211-8i, an HP D2600 drive cage, and a dozen 4TB SAS drives. Recently I noticed my zpool's state was "DEGRADED", with one drive shown as "UNAVAIL".

I may have made a newbie mistake.

The only option available from the GUI was "Replace", so I clicked it. The only choice of replacement offered was da9. Simply choosing it wasn't good enough, so I checked Force. The box displayed "please wait", after which the GUI froze and wouldn't respond to subsequent requests. Knowing no other options, I ssh-ed in and rebooted.

After rebooting, everything was still working - but I still was down a drive, listed as "UNAVAIL". Unfortunately, now there were no choices of replacement drives in the Replace dialog box.

I determined my zpool now consists of drives da0-da10. Looking in /dev showed a da11 (presumably the "missing" drive) exists, but is not visible in the GUI.

I'm new to ZFS, and don't really want to just start experimenting - losing this pool wouldn't be the end of the world, but I'd rather avoid it.

What should I do?
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
After rebooting, everything was still working - but I still was down a drive, listed as "UNAVAIL". Unfortunately, now there were no choices of replacement drives in the Replace dialog box.

Unless you add a replacement drive to your array, there should be none available for replacement.

What you need to do is add a new, working drive to the array, and then tell FreeNAS to use that drive as a replacement.
 

kthelen

Two problems:

1) I have every slot filled on the D2600, so to replace the failed drive, I'll first have to locate it and remove it. The only method I can come up with is to shut the whole system down and check each drive against the list of serial numbers in the zpool, continuing until I hit the one that's not on the list. Is there a better way?

2) I still don't know for certain that the drive failed, or what happened in general - it was there, and now it's not. So I'm a little hesitant to just blindly replace. Once I find the drive in question, that is!
 

Nick2253

Have you read the documentation on replacing a failed drive? http://doc.freenas.org/9.10/freenas_storage.html?highlight=replace#replacing-a-failed-drive

check each drive against the list of serial numbers in the zpool

This is one of the reasons why people recommend putting a sticker on the front of each drive with the drive's serial number (or a partial serial number), or at least documenting which serial number goes where. The serial number is the most reliable way to identify a failed drive for replacement. I don't know if your setup has the ability to flash the activity LED for your drive, but that's another way to narrow down the drives.
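If you do keep a record of which serial number sits in which bay, finding the odd one out is just a set comparison. Here is a minimal Python sketch of that idea. Everything in it is made up for illustration: the function name, the bay numbers, and the serial numbers are hypothetical, and you would have to fill in the real values by hand (for example from your stickers and from whatever serial numbers your pool members report).

```python
def find_unlisted_bays(bay_record, pool_serials):
    """Return {bay: serial} for every recorded drive whose serial number
    is NOT among the serial numbers of the pool's current members."""
    # Normalize case and whitespace so hand-typed serials still match.
    pool = {serial.strip().upper() for serial in pool_serials}
    return {
        bay: serial
        for bay, serial in bay_record.items()
        if serial.strip().upper() not in pool
    }


# Hypothetical record: bay number -> serial number written on the sticker.
bay_record = {1: "SER-A", 2: "SER-B", 3: "SER-C"}

# Hypothetical list of serial numbers for drives still present in the pool.
pool_serials = ["ser-a", "SER-B "]

suspects = find_unlisted_bays(bay_record, pool_serials)
print(suspects)  # the bay(s) to pull and inspect
```

With the sample data above, only bay 3 is reported, since its serial number is the one missing from the pool list.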

I still don't know for certain that the drive failed, or what happened in general - it was there, and now it's not.

That's usually a pretty good indication that a drive failed. At the very least, it's the safest assumption moving forward. Once you have the array back to full working order, you can run some thorough tests on the possibly bad drive. Worst case, it's a bad drive, and you can go about RMAing it (if it's still under warranty). Otherwise, if you can get the drive to pass all your tests, you can store it for future use as another replacement.
 

Mirfster

Doesn't know what he's talking about
Joined
Oct 2, 2015
Messages
3,215
1) I have every slot filled on the D2600, so to replace the failed drive, I'll first have to locate it and remove it. The only method I can come up with is to shut the whole system down and check each drive against the list of serial numbers in the zpool, continuing until I hit the one that's not on the list. Is there a better way?

I have an HP DL360 G6 running FreeNAS 9.10, with an LSI 9211-8i

Since you are running an LSI controller (this also works with a cross-flashed PERC H200), you can try "sas2ircu".
See this thread for info: "SuperMicro Red Light implimentation plans??"

* sas2ircu should already be in FreeNAS so no need to obtain it...

*** I personally prefer to track by serial number, since that is the most certain method. Most of us keep a record of that. See Post#9 in this thread for an example: https://forums.freenas.org/index.ph...mber-of-failed-hdd-in-pool.41521/#post-265737
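For the sas2ircu route, the usual approach is to list the attached devices, note the enclosure and bay of the suspect drive, and then switch that bay's locate LED on. Below is a small Python sketch that only assembles the command line, so you can review it before running anything. The controller index, enclosure number, and bay number are placeholders, and the helper name is hypothetical. The `LOCATE <Encl:Bay> <ON|OFF>` form is how I understand the tool to work, so check it against the usage text that sas2ircu prints on your system. Whether the D2600 backplane actually lights an LED in response is also an assumption.

```python
def locate_command(controller, enclosure, bay, on=True):
    """Build the argument list for a sas2ircu LOCATE call.

    Assumed syntax (verify locally): sas2ircu <controller> LOCATE <Encl:Bay> <ON|OFF>
    """
    return [
        "sas2ircu",
        str(controller),
        "LOCATE",
        "{}:{}".format(enclosure, bay),
        "ON" if on else "OFF",
    ]


# Placeholder values: controller 0, enclosure 2, bay 5.
print(" ".join(locate_command(0, 2, 5)))
print(" ".join(locate_command(0, 2, 5, on=False)))
```

The returned list can be passed to `subprocess.run` once the values are confirmed. Remember to turn the LED off again afterwards.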
 