Determining status of drive

Status
Not open for further replies.

westyx

Cadet
Joined
Apr 5, 2014
Messages
4
One of the drives in my FreeNAS server appears to have had some problems and has been kicked out of the pool. I'm trying to run smartctl on it, but I can't find the device.

Hardware:
1. Motherboard: ASUS P9D WS
2. CPU: Intel i3-4150
3. RAM: 16GB ECC
4. Hard Drives: 1x USB used for OS (see below)
6x 2TB ST2000DM001-1ER164 Seagate drives
- all 6 Seagate drives are directly plugged into the motherboard SATA ports
5. Build: FreeNAS-9.3-STABLE-201504100216 (with all current zfs flags installed)
6. ZFS Pool: RAIDZ2

I saw that there was an alert in the GUI, so I ran zpool status and got:

Code:
[root@freenas] ~# zpool status
  pool: Base
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: scrub repaired 0 in 0h28m with 0 errors on Mon Apr 13 20:08:13 2015
config:

        NAME                                            STATE     READ WRITE CKSUM
        Base                                            DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     0     0     0
            gptid/335a1344-3d51-11e4-875a-40167e36b520  ONLINE       0     0     0
            gptid/339a8a90-3d51-11e4-875a-40167e36b520  ONLINE       0     0     0
            gptid/33dab725-3d51-11e4-875a-40167e36b520  ONLINE       0     0     0
            gptid/3431f09b-3d51-11e4-875a-40167e36b520  ONLINE       0     0     0
            gptid/34747be3-3d51-11e4-875a-40167e36b520  FAULTED      0    11     0  too many errors
            gptid/34c6d71d-3d51-11e4-875a-40167e36b520  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0h4m with 0 errors on Tue Apr 14 00:04:27 2015
config:

        NAME                                          STATE     READ WRITE CKSUM
        freenas-boot                                  ONLINE       0     0     0
          gptid/66f468c1-8436-11e4-a43c-40167e36b520  ONLINE       0     0     0

errors: No known data errors
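
On a pool with many disks, the one problem row can be pulled out of this kind of output with a small filter. The following is only a sketch: the function name `unhealthy_vdevs` is made up for illustration, and it assumes the table layout shown above (a `NAME STATE ...` header row, with the table ending at a blank line). It matches on the state column alone, so a whole pool or vdev in one of these states would be printed too.

```shell
#!/bin/sh
# unhealthy_vdevs (hypothetical helper): read `zpool status` output on stdin
# and print "name state" for every config row whose state is FAULTED,
# UNAVAIL, REMOVED or OFFLINE.
unhealthy_vdevs() {
    awk '
        # The config table starts right after its header row.
        $1 == "NAME" && $2 == "STATE" { in_table = 1; next }
        # A blank line ends the table.
        NF == 0 { in_table = 0 }
        in_table && $2 ~ /^(FAULTED|UNAVAIL|REMOVED|OFFLINE)$/ { print $1, $2 }
    '
}
```

It would be used as `zpool status | unhealthy_vdevs`. For the output above that should print the `gptid/34747be3-...` row with state FAULTED.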


I then ran zpool clear and rebooted, and now I get:

Code:
[root@freenas] /dev# zpool status
  pool: Base
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 0h28m with 0 errors on Mon Apr 13 20:08:13 2015
config:

        NAME                                            STATE     READ WRITE CKSUM
        Base                                            DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     0     0     0
            gptid/335a1344-3d51-11e4-875a-40167e36b520  ONLINE       0     0     0
            gptid/339a8a90-3d51-11e4-875a-40167e36b520  ONLINE       0     0     0
            gptid/33dab725-3d51-11e4-875a-40167e36b520  ONLINE       0     0     0
            gptid/3431f09b-3d51-11e4-875a-40167e36b520  ONLINE       0     0     0
            5899279064343081067                         UNAVAIL      0     0     0  was /dev/gptid/34747be3-3d51-11e4-875a-40167e36b520
            gptid/34c6d71d-3d51-11e4-875a-40167e36b520  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0h4m with 0 errors on Tue Apr 14 00:04:27 2015
config:

        NAME                                          STATE     READ WRITE CKSUM
        freenas-boot                                  ONLINE       0     0     0
          gptid/66f468c1-8436-11e4-a43c-40167e36b520  ONLINE       0     0     0



The next step would be to run smartctl on it, but I can't find the disk anymore (there should be 6 drives, with the missing one on scbus5, I imagine):

Code:
[root@freenas] /dev# camcontrol devlist
<ST2000DM001-1ER164 CC43>          at scbus1 target 0 lun 0 (pass0,ada0)
<ST2000DM001-1ER164 CC43>          at scbus2 target 0 lun 0 (pass1,ada1)
<ST2000DM001-1ER164 CC43>          at scbus3 target 0 lun 0 (pass2,ada2)
<ST2000DM001-1ER164 CC43>          at scbus4 target 0 lun 0 (pass3,ada3)
<ST2000DM001-1ER164 CC43>          at scbus6 target 0 lun 0 (pass4,ada4)
<SanDisk Cruzer Blade 1.27>        at scbus8 target 0 lun 0 (pass5,da0)
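
The guess about the missing bus can be checked mechanically. This is a rough sketch, not a definitive test: `missing_buses` is a made-up helper name, and it assumes the drives of one model normally sit on consecutive scbus numbers, which the kernel does not guarantee. A gap is a hint about where to look, nothing more.

```shell
#!/bin/sh
# missing_buses MODEL (hypothetical helper): read `camcontrol devlist` output
# on stdin and print every scbus number that is absent between the lowest and
# highest bus on which a device matching MODEL was seen.
missing_buses() {
    awk -v model="$1" '
        index($0, model) {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^scbus[0-9]+$/) {
                    n = substr($i, 6) + 0        # numeric part after "scbus"
                    seen[n] = 1
                    if (min == "" || n < min) min = n
                    if (max == "" || n > max) max = n
                }
            }
        }
        END {
            if (min == "") exit                  # no device of that model found
            for (n = min; n <= max; n++)
                if (!(n in seen)) print n
        }
    '
}
```

It would be used as `camcontrol devlist | missing_buses ST2000DM001`. For the listing above, the ST2000DM001 drives are on buses 1, 2, 3, 4 and 6, so the only gap reported is 5.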


and

Code:
[root@freenas] /dev# ls -l | grep ad
lrwxr-xr-x   1 root  wheel        4 Apr 19 18:36 ad10@ -> ada3
lrwxr-xr-x   1 root  wheel        6 Apr 19 18:36 ad10p1@ -> ada3p1
lrwxr-xr-x   1 root  wheel       10 Apr 19 18:37 ad10p1.eli@ -> ada3p1.eli
lrwxr-xr-x   1 root  wheel        6 Apr 19 18:36 ad10p2@ -> ada3p2
lrwxr-xr-x   1 root  wheel        4 Apr 19 18:36 ad14@ -> ada4
lrwxr-xr-x   1 root  wheel        6 Apr 19 18:36 ad14p1@ -> ada4p1
lrwxr-xr-x   1 root  wheel       10 Apr 19 18:37 ad14p1.eli@ -> ada4p1.eli
lrwxr-xr-x   1 root  wheel        6 Apr 19 18:36 ad14p2@ -> ada4p2
lrwxr-xr-x   1 root  wheel        4 Apr 19 18:36 ad4@ -> ada0
lrwxr-xr-x   1 root  wheel        6 Apr 19 18:36 ad4p1@ -> ada0p1
lrwxr-xr-x   1 root  wheel       10 Apr 19 18:37 ad4p1.eli@ -> ada0p1.eli
lrwxr-xr-x   1 root  wheel        6 Apr 19 18:36 ad4p2@ -> ada0p2
lrwxr-xr-x   1 root  wheel        4 Apr 19 18:36 ad6@ -> ada1
lrwxr-xr-x   1 root  wheel        6 Apr 19 18:36 ad6p1@ -> ada1p1
lrwxr-xr-x   1 root  wheel       10 Apr 19 18:37 ad6p1.eli@ -> ada1p1.eli
lrwxr-xr-x   1 root  wheel        6 Apr 19 18:36 ad6p2@ -> ada1p2
lrwxr-xr-x   1 root  wheel        4 Apr 19 18:36 ad8@ -> ada2
lrwxr-xr-x   1 root  wheel        6 Apr 19 18:36 ad8p1@ -> ada2p1
lrwxr-xr-x   1 root  wheel       10 Apr 19 18:37 ad8p1.eli@ -> ada2p1.eli
lrwxr-xr-x   1 root  wheel        6 Apr 19 18:36 ad8p2@ -> ada2p2
crw-r-----   1 root  operator  0x73 Apr 19 18:36 ada0
crw-r-----   1 root  operator  0x7f Apr 19 18:36 ada0p1
crw-r-----   1 root  operator  0xaf Apr 19 18:37 ada0p1.eli
crw-r-----   1 root  operator  0x81 Apr 19 18:36 ada0p2
crw-r-----   1 root  operator  0x75 Apr 19 18:36 ada1
crw-r-----   1 root  operator  0x83 Apr 19 18:36 ada1p1
crw-r-----   1 root  operator  0x93 Apr 19 18:37 ada1p1.eli
crw-r-----   1 root  operator  0x85 Apr 19 18:36 ada1p2
crw-r-----   1 root  operator  0x77 Apr 19 18:36 ada2
crw-r-----   1 root  operator  0x87 Apr 19 18:36 ada2p1
crw-r-----   1 root  operator  0x95 Apr 19 18:37 ada2p1.eli
crw-r-----   1 root  operator  0x89 Apr 19 18:36 ada2p2
crw-r-----   1 root  operator  0x79 Apr 19 18:36 ada3
crw-r-----   1 root  operator  0x8b Apr 19 18:36 ada3p1
crw-r-----   1 root  operator  0x97 Apr 19 18:37 ada3p1.eli
crw-r-----   1 root  operator  0x8d Apr 19 18:36 ada3p2
crw-r-----   1 root  operator  0x7b Apr 19 18:36 ada4
crw-r-----   1 root  operator  0x8f Apr 19 18:36 ada4p1
crw-r-----   1 root  operator  0x99 Apr 19 18:37 ada4p1.eli
crw-r-----   1 root  operator  0x91 Apr 19 18:36 ada4p2
lrwxr-xr-x   1 root  wheel       11 Apr 19 18:36 dumpdev@ -> /dev/ada0p1


Is the disk totally dead, or do I need to take it out and work on it in another computer? The data is backed up elsewhere as well.
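
For a pool member that is still attached, the usual way to get from the gptid in zpool status to a device name that smartctl accepts is the `glabel status` table. A minimal sketch, assuming the standard three-column layout (Name, Status, Components) and GPT partition names of the form adaNpM; `disk_for_label` is a made-up helper name:

```shell
#!/bin/sh
# disk_for_label LABEL (hypothetical helper): read `glabel status` output on
# stdin and print the whole-disk name (for example ada0) behind LABEL, by
# taking the Components column and stripping the trailing partition suffix.
disk_for_label() {
    awk -v label="$1" '
        $1 == label {
            d = $3
            sub(/p[0-9]+$/, "", d)   # ada0p2 -> ada0
            print d
        }
    '
}
```

It would be used as `glabel status | disk_for_label gptid/<id>`, and the result passed to `smartctl -a /dev/<disk>`. If the label is not in the table at all, nothing is printed, which is consistent with the drive no longer being attached.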
 
Joined
Jan 9, 2015
Messages
430
You can try a new cable and SATA port, but a new drive will probably be in order.
 
Joined
Jan 9, 2015
Messages
430
You're welcome. Now, I don't know everything and I'm not an IT professional. You may want to wait and see what some of the more experienced guys chime in and say. Good luck.
 

westyx

Cadet
Joined
Apr 5, 2014
Messages
4
I'll definitely keep an eye on this thread, but the fact that the drive isn't appearing in the list of devices at all after a reboot means something is pretty bad. I'll have to pull the server out and start tracing SATA cables (and marking things for next time), but that will happen tomorrow, so I'll see what else is suggested in the meantime.
 
Joined
Jan 9, 2015
Messages
430
10-4
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The drive should still be listed by camcontrol devlist.

If not, it really died.
 

westyx

Cadet
Joined
Apr 5, 2014
Messages
4
I plugged it into my Linux workstation and it powers on, makes some noises and then powers itself off without the Linux kernel detecting it. Time to get it RMA'd. Thanks for the help :)
 