Volume Degraded

htrain933

Cadet
Joined
Nov 9, 2016
Messages
8
Hello,
I'm a newbie, and I've done a bit of reading around here. My system is
FreeNAS-11.1-U6
Supermicro X11SSM-F
Intel(R) Xeon(R) CPU E3-1275 v5 @ 3.60GHz
16GB ECC RAM
8x2TB HDD (4 WD RED WD20EFRX & 4 Seagate Ironwolf ST2000VN000)

My system was showing an error:
The volume vol1 state is DEGRADED: One or more devices has been removed by the administrator.

So I rebooted, and now it's showing:
The volume vol1 state is DEGRADED: One or more devices could not be opened. Sufficient replicas exist for the pool to continue functioning in a degraded state.

I've tried to look over some of the SMART tests, but I'm not really sure what I'm looking at, or how to tell what drives are bad. All 8 drives pass overall SMART short tests. Should I post the SMART outputs here?
So my question is: how do I tell which drive(s) or SATA cables are bad, and what steps should I take to protect my data? I do have backups, that's the good news! TIA.
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Hey htrain,

In the WebUI, you can go to Storage and Pool. There, you can click on the Settings and select Status. That should tell you (and us...) more about what is wrong.
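The same information is available from the shell as the text output of `zpool status`. As a rough sketch of how to pick the problem devices out of that output, the snippet below scans the config table for rows whose state is not ONLINE. The sample text and its device names are invented for illustration and are not taken from this system, and the function name `unhealthy_devices` is hypothetical.

```python
# Hypothetical sample shaped like `zpool status` output. The device names and
# counters are invented for illustration; they are not from this thread.
SAMPLE_STATUS = """\
  pool: vol1
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
config:

        NAME            STATE     READ WRITE CKSUM
        vol1            DEGRADED     0     0     0
          raidz2-0      DEGRADED     0     0     0
            gptid/disk0 UNAVAIL      0     0     0  cannot open
            gptid/disk1 ONLINE       0     0     0
            gptid/disk2 ONLINE       0     0     0

errors: No known data errors
"""

# States that can appear in the STATE column of the config table.
KNOWN_STATES = {"ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "UNAVAIL", "REMOVED"}


def unhealthy_devices(status_text):
    """Return (name, state) for every config row whose state is not ONLINE."""
    rows = []
    in_config = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("config:"):
            in_config = True
            continue
        if stripped.startswith("errors:"):
            in_config = False
        if not in_config or not stripped:
            continue
        fields = stripped.split()
        # The header row has "STATE" in the second column, so it is skipped here.
        if len(fields) >= 2 and fields[1] in KNOWN_STATES and fields[1] != "ONLINE":
            rows.append((fields[0], fields[1]))
    return rows


if __name__ == "__main__":
    for name, state in unhealthy_devices(SAMPLE_STATUS):
        print(name, state)
```

On a live system the text would come from running `zpool status vol1` instead of the sample string. The pool and vdev rows show up as DEGRADED too, so the leaf device row (UNAVAIL in the sample) is the one that points at the actual disk.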

Good luck and good for you to have backups,
 

htrain933

Cadet
Joined
Nov 9, 2016
Messages
8
Hi Heracles, it looks like ada0 is unavailable. Should I go straight to replacing the drive?
 

Attachments

  • Screen Shot 2019-09-25 at 9.23.11 PM.png
    252.8 KB · Views: 374

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Hi again,

In such a case, the very thing to avoid is panicking and acting too fast. Thanks to your RAID-Z2, your pool is still safe and still has redundancy.

Does your hardware support hot-plugging? If it does not, you will have to power down before unplugging the drive.
Do you already have a spare drive? Has it been tested already?

Can you plug an extra drive into your box?

Also, know that it can be as simple as a bad connection. Unplugging and replugging the cable may bring your disk back online. Again, be sure to respect your hardware specs and power down if required.

Good luck recovering from that,
 

htrain933

Cadet
Joined
Nov 9, 2016
Messages
8
So I powered down the NAS and pulled the drive. I put the suspect drive in another system, and it's not detectable. So I'm not sure if I can test it at all. It's under warranty, so hopefully I can do an RMA.
I do have a spare drive, so do you think it makes sense to replace the drive at this point?
 

IQless

Contributor
Joined
Feb 13, 2017
Messages
142
When you say "not detectable", do you mean "Windows does not show it under -My Computer-" or "It's not detectable in the BIOS"?

Anyway, it's probably best to replace it. If you can afford to keep a cold spare lying around, I would buy a drive now while waiting for the RMA. That way you would have another cold drive for the next time a drive bites the dust.
 

htrain933

Cadet
Joined
Nov 9, 2016
Messages
8
I mean it's not detectable in the BIOS or via bootable media (a PC-Doctor bootable USB).
I'll go ahead and replace the drive with a cold spare, like you mentioned.
I'm worried, though, that I've got something configured wrong that I don't know about in terms of SMART tests and/or email notifications.
Is there a log I could collect that would show whether I have my drive tests and scrubs set up correctly, so that I would have a chance to know a drive is failing before it actually dies?
 

IQless

Contributor
Joined
Feb 13, 2017
Messages
142
I'm worried, though, that I've got something configured wrong that I don't know about in terms of SMART tests and/or email notifications.
Is there a log I could collect that would show whether I have my drive tests and scrubs set up correctly, so that I would have a chance to know a drive is failing before it actually dies?
I used Uncle Fester's Guide, which helped me set up the testing schedule and notifications. There is also an 11.2 version of the guide (still a work in progress, I think).
@Spearfoot has a great GitHub repository of FreeNAS scripts, including disk burn-in and scripts for temps, SMART status and more.

These are just some that I use, there are probably a lot more.
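
To give a rough idea of what a SMART status check looks at, here is a minimal sketch that reads an attribute table shaped like the output of `smartctl -A` and flags a few commonly watched counters when their raw value is above zero. It is not taken from the guide or the scripts mentioned above. The sample table, the `WATCHED` selection and the function name `smart_warnings` are assumptions for illustration, and attribute IDs and names can differ between drive vendors.

```python
# Hypothetical sample shaped like the attribute table printed by `smartctl -A`.
# The values are invented for illustration; they are not from this thread.
SAMPLE_ATTRS = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   071   071   000    Old_age   Always       -       21540
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       3
"""

# Assumed watch list: counters that are normally zero on a healthy drive.
WATCHED = {
    5: "reallocated sectors (media wear)",
    197: "pending sectors (media problem)",
    198: "offline uncorrectable sectors (media problem)",
    199: "CRC errors (often cabling or connection)",
}


def smart_warnings(attr_text):
    """Return (id, name, raw_value, hint) for watched attributes with raw > 0."""
    warnings = []
    for line in attr_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        raw = fields[9]
        if attr_id in WATCHED and raw.isdigit() and int(raw) > 0:
            warnings.append((attr_id, fields[1], int(raw), WATCHED[attr_id]))
    return warnings


if __name__ == "__main__":
    for attr_id, name, raw, hint in smart_warnings(SAMPLE_ATTRS):
        print(attr_id, name, raw, hint)
```

On a live system the text would come from something like `smartctl -A /dev/ada0`. A check like this only helps while the drive still responds. A disk that is no longer detected at all, as in this thread, shows up as a missing device in the pool status instead.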
 