Issues with two drives, looking for some help.

Status
Not open for further replies.

lpittman

Dabbler
Joined
May 2, 2013
Messages
35
Hey everyone,

My FreeNAS box has been running fine for over a year now without any issues. Recently I began noticing a sector error with one of the drives, so I went to replace it but it seems I have made things worse. I've been trying to work it out myself, however I am worried about my data at this point and am hoping for some help.

Unfortunately, I have too much data on the pool to back it all up. All the mission-critical stuff is backed up, however.

Disk 12129061989276932477 is the one with errors and needs to be physically replaced. The second disk, the one shown as faulted, has never shown any errors; unfortunately it was removed instead of the other disk. Both disks are installed and have power. I tried to shut down the server to replace the disk, but it won't shut down. So I have no GUI, only CLI.

Here is some information. If I missed anything important let me know.

Code:
[root@freenas] /var/log# zpool status tank
  pool: tank
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: scrub repaired 0 in 4h49m with 0 errors on Wed May 28 14:19:13 2014
config:
 
        NAME                                            STATE     READ WRITE CKSUM
        tank                                            DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     0     0     0
            gptid/67f41f2e-b5b4-11e2-80a1-001a4d5a464d  ONLINE       0     0     0
            gptid/6872deef-b5b4-11e2-80a1-001a4d5a464d  ONLINE       0     0     0
            gptid/68f30a78-b5b4-11e2-80a1-001a4d5a464d  ONLINE       0     0     0
            gptid/697049b0-b5b4-11e2-80a1-001a4d5a464d  ONLINE       0     0     0
            12129061989276932477                        OFFLINE      0     0     0  was /dev/da11
            gptid/6a6bb924-b5b4-11e2-80a1-001a4d5a464d  ONLINE       0     0     0
          raidz2-1                                      DEGRADED     0     0     0
            gptid/f03441ae-b615-11e2-82ec-001a4d5a464d  ONLINE       0     0     0
            gptid/f257c65f-b615-11e2-82ec-001a4d5a464d  FAULTED      3     0     0  too many errors
            gptid/f3e21ad8-b615-11e2-82ec-001a4d5a464d  ONLINE       0     0     0
            gptid/f4764cd6-b615-11e2-82ec-001a4d5a464d  ONLINE       0     0     0
            gptid/f609d8ab-b615-11e2-82ec-001a4d5a464d  ONLINE       0     0     0
            gptid/55d57ec3-b692-11e2-8734-001a4d5a464d  ONLINE       0     0     0
        cache
          gptid/574cf05e-b616-11e2-82ec-001a4d5a464d    ONLINE       0     0     0


Code:
[root@freenas] /var/log# camcontrol devlist
<ATA WDC WD20EARX-00P AB51>        at scbus0 target 9 lun 0 (pass0,da0)
<ATA WDC WD20EARX-00P AB51>        at scbus0 target 10 lun 0 (da1,pass1)
<ATA WDC WD20EARX-00P AB51>        at scbus0 target 11 lun 0 (pass2,da2)
<ATA WDC WD20EARX-00P AB51>        at scbus0 target 12 lun 0 (pass3,da3)
<ATA WDC WD30EZRX-00D 0A80>        at scbus0 target 13 lun 0 (pass4,da4)
<ATA WDC WD20EARX-00P AB51>        at scbus0 target 14 lun 0 (pass5,da5)
<ATA INTEL SSDSC2CT06 300i>        at scbus0 target 15 lun 0 (pass6,da6)
<ATA WDC WD20EZRX-00D 0A80>        at scbus0 target 16 lun 0 (pass7,da7)
<ATA WDC WD20EZRX-00D 0A80>        at scbus0 target 17 lun 0 (pass8,da8)
<ATA WDC WD20EZRX-00D 0A80>        at scbus0 target 18 lun 0 (pass9,da9)
<ATA WDC WD20EZRX-00D 0A80>        at scbus0 target 19 lun 0 (pass10,da10)
<ATA WDC WD20EARS-00M AB51>        at scbus0 target 21 lun 0 (pass11,da11)
<ATA WDC WD20EARS-00M AB51>        at scbus0 target 22 lun 0 (pass12,da12)
<Lexar JumpDrive 1100>             at scbus5 target 0 lun 0 (pass13,da13)

Code:
[root@freenas] /var/log# gpart show
=>        34  3907029101  da0  GPT  (1.8T)
          34          94       - free -  (47k)
         128     4194304    1  freebsd-swap  (2.0G)
     4194432  3902834696    2  freebsd-zfs  (1.8T)
  3907029128           7       - free -  (3.5k)
=>        34  3907029101  da2  GPT  (1.8T)
          34          94       - free -  (47k)
         128     4194304    1  freebsd-swap  (2.0G)
     4194432  3902834696    2  freebsd-zfs  (1.8T)
  3907029128           7       - free -  (3.5k)
=>        34  3907029101  da3  GPT  (1.8T)
          34          94       - free -  (47k)
         128     4194304    1  freebsd-swap  (2.0G)
     4194432  3902834696    2  freebsd-zfs  (1.8T)
  3907029128           7       - free -  (3.5k)
=>        34  5860533101  da4  GPT  (2.7T)
          34          94       - free -  (47k)
         128     4194304    1  freebsd-swap  (2.0G)
     4194432  5856338696    2  freebsd-zfs  (2.7T)
  5860533128           7       - free -  (3.5k)
=>        34  3907029101  da5  GPT  (1.8T)
          34          94       - free -  (47k)
         128     4194304    1  freebsd-swap  (2.0G)
     4194432  3902834696    2  freebsd-zfs  (1.8T)
  3907029128           7       - free -  (3.5k)
=>       34  117231341  da6  GPT  (55G)
         34         94       - free -  (47k)
        128    4194304    1  freebsd-swap  (2.0G)
    4194432  113036943    2  freebsd-zfs  (53G)
=>        34  3907029101  da7  GPT  (1.8T)
          34          94       - free -  (47k)
         128     4194304    1  freebsd-swap  (2.0G)
     4194432  3902834696    2  freebsd-zfs  (1.8T)
  3907029128           7       - free -  (3.5k)
=>        34  3907029101  da8  GPT  (1.8T)
          34          94       - free -  (47k)
         128     4194304    1  freebsd-swap  (2.0G)
     4194432  3902834696    2  freebsd-zfs  (1.8T)
  3907029128           7       - free -  (3.5k)
=>        34  3907029101  da9  GPT  (1.8T)
          34          94       - free -  (47k)
         128     4194304    1  freebsd-swap  (2.0G)
     4194432  3902834696    2  freebsd-zfs  (1.8T)
  3907029128           7       - free -  (3.5k)
=>        34  3907029101  da10  GPT  (1.8T)
          34          94        - free -  (47k)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834696     2  freebsd-zfs  (1.8T)
  3907029128           7        - free -  (3.5k)
=>        34  3907029101  da12  GPT  (1.8T)
          34          94        - free -  (47k)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834696     2  freebsd-zfs  (1.8T)
  3907029128           7        - free -  (3.5k)
=>      63  62652353  da13  MBR  (29G)
        63   1930257     1  freebsd  [active]  (942M)
   1930320        63        - free -  (31k)
   1930383   1930257     2  freebsd  (942M)
   3860640      3024     3  freebsd  (1.5M)
   3863664     41328     4  freebsd  (20M)
   3904992  58747424        - free -  (28G)
=>      0  1930257  da13s1  BSD  (942M)
        0       16          - free -  (8.0k)
       16  1930241       1  !0  (942M)
=>        34  3907029101  da11  GPT  (1.8T)
          34          94        - free -  (47k)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834696     2  freebsd-zfs  (1.8T)
  3907029128           7        - free -  (3.5k)
=>        34  3907029101  da1  GPT  (1.8T)
          34          94       - free -  (47k)
         128     4194304    1  freebsd-swap  (2.0G)
     4194432  3902834696    2  freebsd-zfs  (1.8T)
  3907029128           7       - free -  (3.5k)

Code:
[root@freenas] /var/log# glabel status
                                      Name  Status  Components
gptid/f03441ae-b615-11e2-82ec-001a4d5a464d     N/A  da0p2
gptid/f3e21ad8-b615-11e2-82ec-001a4d5a464d     N/A  da2p2
gptid/f4764cd6-b615-11e2-82ec-001a4d5a464d     N/A  da3p2
gptid/55d57ec3-b692-11e2-8734-001a4d5a464d     N/A  da4p2
gptid/f609d8ab-b615-11e2-82ec-001a4d5a464d     N/A  da5p2
gptid/574cf05e-b616-11e2-82ec-001a4d5a464d     N/A  da6p2
gptid/6872deef-b5b4-11e2-80a1-001a4d5a464d     N/A  da7p2
gptid/68f30a78-b5b4-11e2-80a1-001a4d5a464d     N/A  da8p2
gptid/6a6bb924-b5b4-11e2-80a1-001a4d5a464d     N/A  da9p2
gptid/67f41f2e-b5b4-11e2-80a1-001a4d5a464d     N/A  da10p2
gptid/697049b0-b5b4-11e2-80a1-001a4d5a464d     N/A  da12p2
                             ufs/FreeNASs3     N/A  da13s3
                             ufs/FreeNASs4     N/A  da13s4
                            ufs/FreeNASs1a     N/A  da13s1a
gptid/92f8d8d0-e6b0-11e3-b647-001a4d5a464d     N/A  da11p2
gptid/f237dfdf-b615-11e2-82ec-001a4d5a464d     N/A  da1p1
gptid/f257c65f-b615-11e2-82ec-001a4d5a464d     N/A  da1p2
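
For anyone matching a pool member to a physical disk: the gptid that `zpool status` reports can be looked up in the `glabel status` output. The snippet below is only a sketch of that lookup. It reuses three lines copied from the output above; `lookup_component` is a made-up helper name, and on a live system you would pipe `glabel status` into it instead of the sample text.

```shell
#!/bin/sh
# Sample lines copied from the `glabel status` output above.
glabel_output='gptid/92f8d8d0-e6b0-11e3-b647-001a4d5a464d     N/A  da11p2
gptid/f237dfdf-b615-11e2-82ec-001a4d5a464d     N/A  da1p1
gptid/f257c65f-b615-11e2-82ec-001a4d5a464d     N/A  da1p2'

# Hypothetical helper: print the component (third column) for one label.
# Reads `glabel status` text on stdin; $1 is the label to look for.
lookup_component() {
    awk -v label="$1" '$1 == label { print $3 }'
}

# The FAULTED member of raidz2-1 in the zpool status output.
faulted_label="gptid/f257c65f-b615-11e2-82ec-001a4d5a464d"

faulted_dev=$(printf '%s\n' "$glabel_output" | lookup_component "$faulted_label")

# Strip the partition suffix (p2) to get the whole-disk device name.
faulted_disk=${faulted_dev%p[0-9]*}

echo "$faulted_label -> $faulted_dev (disk $faulted_disk)"
```

With the sample lines, the faulted label resolves to da1p2, i.e. disk da1. Note also that da11p2 now carries a gptid (92f8d8d0-...) that does not appear anywhere in the `zpool status` output.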
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
So what's the problem? You have 2 failed drives, one in each vdev. The fix is simply to add a disk back to each vdev. This isn't a particularly bad issue and is very recoverable.
 

lpittman

Dabbler
Joined
May 2, 2013
Messages
35
Thanks for the reply. I guess I wasn't very clear. The issue is that I tried to issue a shutdown: the GUI shut down, but the system itself did not. So I am logged in via PuTTY as root and am attempting to do this via CLI. However, I have read that one should not attempt this via CLI, as the GUI runs a series of commands to accomplish it. I just don't want to cause any more damage.

My guess would be to try:

Code:
zpool replace tank 8296585602626686865 da1


to start. Theoretically it should start a resilver and put that vdev back online, correct?

Thanks again.

Luke
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
You shouldn't be doing any disk replacements from the CLI. You'll just end up doing it twice.

So what's the status right now? You did a shutdown from the WebGUI but it's not shut down?

If that is the case, you might just have to power cycle it. If the disk you offlined wasn't the one you meant to pull, put it back in the machine while the server is off. A scrub from the CLI later will give you back the lost redundancy and ensure consistency across the vdev.
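
As a rough sketch of that scrub step, assuming the pool name `tank` from the `zpool status` output earlier in the thread: the `run` wrapper and the `DRY_RUN` variable are made up for this example so that nothing touches the pool until you deliberately set `DRY_RUN=0`. `zpool scrub` and `zpool status` are the standard commands.

```shell
#!/bin/sh
# Sketch only. "tank" is the pool name from the zpool status output
# earlier in the thread; change it for another pool.
POOL="tank"

# Hypothetical wrapper: with DRY_RUN=1 (the default here) each command
# is only printed; with DRY_RUN=0 it is actually executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# First command starts a scrub of the whole pool.
# Second command shows the pool; its "scan:" line reports the scrub.
scrub_plan=$(
    run zpool scrub "$POOL"
    run zpool status "$POOL"
)

echo "$scrub_plan"
```

Run it once as-is to see the two commands it would issue, then again with `DRY_RUN=0` once the system is back up and the disk is back in place.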

Once the system is back up and the failed disk is replaced with a good disk, follow our manual to do the disk replacement.

Disk replacements should only be performed in the WebGUI, and only per the manual. ;)
 

lpittman

Dabbler
Joined
May 2, 2013
Messages
35
Correct - it wouldn't shut down and I could not access WebGUI. I'm trying to avoid power cycling it until I've resolved these issues, if I can.

I was able to get the GUI started again by restarting nginx and django; however, it is unbearably slow.

While in the GUI I tried to do the replacement, but the disk just went to UNAVAIL. Then, when I hit "replace", the only options I had were da1 or da11. So I selected da1; it processed for a while, then came up with an error. Unfortunately I wasn't able to copy/paste it, as it blinked away almost immediately.

While searching for more information I did try this:

Code:
zpool replace tank 8296585602626686865 da1


as this is on the vdev that I DO have backed up, and it is currently resilvering. I guess I'll have to wait and see if it works out. Once this is complete I'll shut it down and physically swap out the actual disk that needs replacing.

Luke
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
wow.. just didn't listen to me did you....


well, good luck. Can't wait to see how long it takes for your pool to fail you. /facepalm
 

lpittman

Dabbler
Joined
May 2, 2013
Messages
35
Actually I did listen to you. I worked on restarting the GUI and doing the disk replace from there but it didn't work as expected. I'm trying to get help but also work on it on my own instead of simply relying on other people. So instead of just coming back and saying "that didn't work" or "this is wrong" I took it upon myself to do further research based on what you said and what I just experienced and try what seemed logical.

I didn't just do the opposite of what you said. Take it easy.
 

lpittman

Dabbler
Joined
May 2, 2013
Messages
35
Just an update for anyone that might look here in the future. Once my backup was finished I was able to fully follow cyberjock's advice and do a power cycle. It booted back up and I was able to perform the disk replace from the GUI with no problems at all. I was hoping to be able to accomplish all of this without the power cycle, but hey, in the end it all worked out.

Sorry for being an 'askhole' ... it wasn't intentional. I was just trying to work it out and learn on my own too, while I was waiting for my backup.

Cheers and thanks for the advice.
 