Drives Unavailable Issue

Status
Not open for further replies.

COTVJosh

Dabbler
Joined
Sep 21, 2015
Messages
24
Forgive me, I'm relatively new to this, but I'm trying to figure out why I have 7 drives in my array that are suddenly Unavailable. I've powered down the machine and checked the hardware array to ensure the disks were talking to the frame, and they are. The pool is showing "Error getting available space" in the Storage tab. Here is what I see after typing "zpool import":

Code:
 
[root@Storage_Server_1 ~]# zpool import
   pool: Storage_Server_1-2
     id: 4126844871692108960
  state: UNAVAIL
 status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
         devices and try again.
    see: http://illumos.org/msg/ZFS-8000-3C
 config:

        Storage_Server_1-2                            UNAVAIL  insufficient replicas
          raidz2-0                                    UNAVAIL  insufficient replicas
            881286988809705582                        UNAVAIL  cannot open
            16951986136311385172                      UNAVAIL  cannot open
            13880951671858679305                      UNAVAIL  cannot open
            761683466651723552                        OFFLINE
            15406529968983763880                      UNAVAIL  cannot open
            15309855938180331212                      UNAVAIL  cannot open
            15334817703217746897                      UNAVAIL  cannot open
            1120748788262365741                       UNAVAIL  cannot open
            3867962789067065351                       UNAVAIL  cannot open
            564971118873581478                        UNAVAIL  cannot open
            gptid/fce69c17-63b0-11e5-beda-002215883973  ONLINE
            gptid/fdbdc471-63b0-11e5-beda-002215883973  ONLINE
            gptid/fe93157c-63b0-11e5-beda-002215883973  ONLINE
            gptid/ff774f1f-63b0-11e5-beda-002215883973  ONLINE
            gptid/00715a4c-63b1-11e5-beda-002215883973  ONLINE
            gptid/014fd0bd-63b1-11e5-beda-002215883973  ONLINE
            gptid/02251362-63b1-11e5-beda-002215883973  ONLINE
            gptid/02fca670-63b1-11e5-beda-002215883973  ONLINE
            gptid/03e2324b-63b1-11e5-beda-002215883973  ONLINE
            gptid/04c51dc0-63b1-11e5-beda-002215883973  ONLINE
            gptid/05ab343f-63b1-11e5-beda-002215883973  ONLINE
            gptid/069f5a83-63b1-11e5-beda-002215883973  ONLINE
            gptid/07884eb6-63b1-11e5-beda-002215883973  ONLINE


Is this recoverable or should I just detach it and start over again? The 22 drives show up in the Volume Manager.

The history to this: the system had been running perfectly for the last 2 months. I inherited it from another engineer, and all of a sudden I had one drive fail. No big deal, I thought; I offlined it, replaced it, and powered back up to discover some sort of network problem, and I couldn't access the GUI. Once that was fixed, I got into the GUI to discover these drives unavailable and the one I replaced showing as ONLINE now (#19).

Thank you
 

Fuganater

Patron
Joined
Sep 28, 2015
Messages
477
Wait... you have all those drives in 1 vdev?
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Yeah, but it doesn't mean it's set up well.

I recommend destroying this pool and making a new one with vdevs no larger than 10 drives ;)
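
For example, with 40 bays that could be four RAIDZ2 vdevs of 10 drives each. Very roughly, from the CLI it would look something like this (the pool and device names below are just placeholders, and on FreeNAS you'd normally build it through the Volume Manager rather than running zpool by hand):

Code:
# Sketch only: one pool made of four 10-drive RAIDZ2 vdevs.
# "tank" and da0..da39 are placeholder names, not your actual devices.
zpool create tank \
  raidz2 da0 da1 da2 da3 da4 da5 da6 da7 da8 da9 \
  raidz2 da10 da11 da12 da13 da14 da15 da16 da17 da18 da19 \
  raidz2 da20 da21 da22 da23 da24 da25 da26 da27 da28 da29 \
  raidz2 da30 da31 da32 da33 da34 da35 da36 da37 da38 da39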
 

COTVJosh

Dabbler
Joined
Sep 21, 2015
Messages
24
That's why I wanted to ask the group, hahaha. I'm a broadcast engineer with enough IT knowledge to be dangerous; you guys (and girls) are the experts. I'll wipe and rebuild.

Is it possible to recover anything (without spending a ton)? Thank you for taking a look.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
It should be possible to recover if you can bring those drives online. Your pool configuration isn't optimal, and really should be changed when you have a chance (which unfortunately will mean destroying and rebuilding the pool), but it isn't what's causing 6 of your drives to disappear. What's your hardware, in detail? Motherboard, CPU(s), RAM, disk controller(s), backplane(s), etc. And which exact version of FreeNAS are you running?
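
While you're putting that together, the output of a few read-only shell commands (from the console or an SSH session) would also show what the OS itself sees, as opposed to what the controller BIOS reports:

Code:
# list every disk the OS can currently see, and which scbus (controller/channel) it's on
camcontrol devlist
# map the gptid/... labels from your zpool import output back to daX devices
glabel status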
 

COTVJosh

Dabbler
Joined
Sep 21, 2015
Messages
24
That's good to hear! The system is running on 9.3 Stable right now.

So as far as them disappearing goes: if I go into Volume Manager, they show up for some reason. That's where I'm confused, because I would think they wouldn't show up in there if there were truly a hardware issue. Am I wrong in that theory?

Working on gathering hardware details.
 

COTVJosh

Dabbler
Joined
Sep 21, 2015
Messages
24
Here's what I can find regarding the hardware information:
Running FreeNAS 9.3
Motherboard: ASUS DSEB-DG/SAS
CPU: Intel Xeon E5405 @ 2GHz
Frame: Aberdeen AberNAS LX
40 bays - 1TB Seagate drives - JBOD RAID config in the hardware

I hope that is helpful.

Thank you
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
JBOD raid config in the hardware
Is that a hardware RAID controller in JBOD mode, or has it been flashed to IT mode?

I believe you were in another thread talking about how this problem occurred right after you replaced a drive. It seems to me almost certain that one of two things happened:
  1. Physical disturbance occurred when you replaced the drive, and double-checking all the data and power connections might fix it.
  2. The RAID controller is upset about one of the drives being removed, because it wasn't flashed to IT mode and it's interposing itself between ZFS and the hard drives.
If it's #2, you probably won't be able to recover the pool unless you can somehow get the failed drive working again.
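
One quick, non-destructive way to get a feel for whether the controller is passing the disks through more or less raw is to see whether SMART data comes back from a disk directly (just a spot check, not definitive):

Code:
# if this returns full drive identity/SMART info, the disk is probably
# being passed through fairly transparently
smartctl -i /dev/da0
# if it doesn't, many RAID controllers need a vendor-specific -d option
# (see smartctl's man page), which is itself a hint that the controller
# is sitting between ZFS and the physical disks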
 

COTVJosh

Dabbler
Joined
Sep 21, 2015
Messages
24
Thank you Robert! Yes, I posted the network issue in the networking section (trying to follow the rules) and that problem is fixed.

So there are two hardware controllers, and it was set up as JBOD in those; the drives are then managed in the FreeNAS software. I'm going to try shutting it down again and pulling the cover off to see if something might have been jarred loose, then reseating the drive. If that doesn't work, I'll probably just detach the pool, wipe the drives, and start over, ugh.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
there are two hardware controllers and it was setup as JBOD in those
Time for obvious questions:
  1. Are all the UNAVAIL drives on the same controller?
  2. Is it the same controller as the drive you replaced?
  3. What happens if you replace the new drive with the failed drive?
 

COTVJosh

Dabbler
Joined
Sep 21, 2015
Messages
24
1: yes but what has me stumped is that when I log into the controllers, I can see the drives and identify them.
2: Yes
3: I tried that and nothing changed.

So after trying #3 again, I now have 11 drives UNAVAIL instead of 7. I'm really lost now.

Error: CRITICAL: The following multipaths are not optimal: disk13, disk2, disk3, disk4, disk5, disk6, disk7, disk8, disk9, disk10, disk11, disk12, disk1
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
when I log into the controllers, I can see the drives and identify them
Sounds like the controllers are not in IT mode, which is probably the root of the problem.
I now have 11 drives unavail instead of 7
Glad to be of service :oops:
The following multipaths are not optimal
Multipaths? Sorry, I'm in over my head now. Something about the setup is designed for redundancy, but I have precisely 0 experience with multipaths.
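
That said, FreeBSD's gmultipath(8) does have read-only status commands that should at least be safe to run while you're investigating:

Code:
# one line per multipath device: name, state (OPTIMAL/DEGRADED), and its components
gmultipath status
# more detail, including which daX providers make up each multipath device
gmultipath list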
 

COTVJosh

Dabbler
Joined
Sep 21, 2015
Messages
24
Here is what it shows:
Code:
[root@Storage_Server_1 ~]# camcontrol devlist
<Seagate ST31000340NS R001> at scbus0 target 0 lun 0 (pass0,da0)
<Seagate ST31000340NS R001> at scbus0 target 0 lun 1 (pass1,da1)
<Seagate ST31000340NS R001> at scbus0 target 0 lun 2 (pass2,da2)
<Seagate ST1000NM0011 R001> at scbus0 target 0 lun 3 (pass3,da3)
<SEAGATE ST31000340NS R001> at scbus0 target 0 lun 4 (pass4,da4)
<Seagate ST31000340NS R001> at scbus0 target 0 lun 5 (pass5,da5)
<Seagate ST31000340NS R001> at scbus0 target 0 lun 6 (pass6,da6)
<Seagate ST31000340NS R001> at scbus0 target 0 lun 7 (pass7,da7)
<Seagate ST31000340NS R001> at scbus0 target 1 lun 0 (pass8,da8)
<Seagate ST31000340NS R001> at scbus0 target 1 lun 1 (pass9,da9)
<Seagate ST31000340NS R001> at scbus0 target 1 lun 2 (pass10,da10)
<Seagate ST31000340NS R001> at scbus0 target 1 lun 3 (pass11,da11)
<Seagate ST31000340NS R001> at scbus0 target 1 lun 4 (pass12,da12)
<Seagate ST31000340NS R001> at scbus0 target 1 lun 5 (pass13,da13)
<Seagate ST31000340NS R001> at scbus0 target 1 lun 6 (pass14,da14)
<Seagate ST31000340NS R001> at scbus0 target 1 lun 7 (pass15,da15)
<Seagate ST31000340NS R001> at scbus0 target 2 lun 0 (pass16,da16)
<Seagate ST31000340NS R001> at scbus0 target 2 lun 1 (pass17,da17)
<Seagate ST31000340NS R001> at scbus0 target 2 lun 2 (pass18,da18)
<Seagate ST31000340NS R001> at scbus0 target 2 lun 3 (pass19,da19)
<Seagate ST31000340NS R001> at scbus0 target 2 lun 4 (pass20,da20)
<Seagate ST31000340NS R001> at scbus0 target 2 lun 5 (pass21,da21)
<Areca RAID controller R001> at scbus0 target 16 lun 0 (pass22)
<Seagate ST31000340NS R001> at scbus1 target 1 lun 0 (pass23,da22)
<Seagate ST31000340NS R001> at scbus1 target 1 lun 2 (pass24,da23)
<Seagate ST31000340NS R001> at scbus1 target 1 lun 3 (pass25,da24)
<Seagate ST31000340NS R001> at scbus1 target 1 lun 4 (pass26,da25)
<Seagate ST31000340NS R001> at scbus1 target 1 lun 5 (pass27,da26)
<Seagate ST31000340NS R001> at scbus1 target 1 lun 6 (pass28,da27)
<Seagate ST1000NM0011 R001> at scbus1 target 1 lun 7 (pass29,da28)
<Seagate ST31000340NS R001> at scbus1 target 2 lun 0 (pass30,da29)
<Seagate ST31000340NS R001> at scbus1 target 2 lun 1 (pass31,da30)
<Seagate ST31000340NS R001> at scbus1 target 2 lun 2 (pass32,da31)
<Seagate ST31000340NS R001> at scbus1 target 2 lun 3 (pass33,da32)
<Seagate ST31000340NS R001> at scbus1 target 2 lun 4 (pass34,da33)
<Seagate ST31000340NS R001> at scbus1 target 2 lun 5 (pass35,da34)
<Seagate ST31000340NS R001> at scbus1 target 2 lun 6 (pass36,da35)
<Seagate ST31000340NS R001> at scbus1 target 2 lun 7 (pass37,da36)
<Areca RAID controller R001> at scbus1 target 16 lun 0 (pass38)
<MATSHITA DVD-ROM SR-8178 PY19> at scbus4 target 0 lun 0 (pass39,cd0)
<ST380815AS 4.AAB> at scbus5 target 0 lun 0 (pass40,ada0)
<ST380815AS 3.AAD> at scbus6 target 0 lun 0 (pass41,ada1)
<SanDisk Cruzer 1.01> at scbus8 target 0 lun 0 (pass42,da37)
 

COTVJosh

Dabbler
Joined
Sep 21, 2015
Messages
24
Does anything look out of place danb35? I'm going to wipe this thing and start over by the end of the day unless anyone has a solution today b/c I need this operational. Thanks.
 

COTVJosh

Dabbler
Joined
Sep 21, 2015
Messages
24
Yeah, that seems to be the ongoing theme with this machine for some reason. Thank you for taking the time to review it though!
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
I would focus on trying to eliminate the multipath issue, just wish I knew more about it.
 