Replace a drive without a spare

Status
Not open for further replies.

mnt_schred

Dabbler
Joined
Aug 6, 2012
Messages
29
I have a running FreeNAS-8.2.0-RELEASE-p1-x64 (r11950) machine with 8 drive bays, all members of a raidz2 pool.
I want to replace a drive with SMART errors, but I have no spares configured and no space for extra disks.

I've looked through the documentation for the right procedure to replace a disk, but none of the methods seem to work for me.

The first method uses the 'replace' feature, where you select a new disk, but no new disk is available because all disks are in use.
The second method is booting without the disk you want to replace, but when I do this the pool won't start and I cannot view the volumes.
The third suggestion is removing the disk, or failing it by sending it ATA faults so that FreeNAS sets its status to offline. I followed that suggestion, but it created another problem (http://forums.freenas.org/showthrea...vice-cannot-be-removed-from-pool-force-remove). I had to power the server off and put the old disk back in, and now I am back at square one.

So, does anyone know what the right procedure is to replace a disk in a running system without spares?
Or am I forced to add a spare with an external USB disk?
 

survive

Behold the Wumpus
Moderator
Joined
May 28, 2011
Messages
875
Hi mnt_schred,

It's pretty simple.

Offline the bad disk so the system can remove it properly
Shut down & physically replace the drive
Boot back up, go back into the GUI, select the old drive, hit "Replace" and select the new drive
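
For reference, the offline and replace steps above map roughly onto the shell commands below. This is only a dry-run sketch: it prints the commands unless DRY_RUN=0 is set. The pool and device names are the ones from this thread, NEWDISK is a placeholder, and the helper names are made up. The GUI also partitions the new disk (swap plus ZFS partition), which this sketch does not do, so the GUI remains the safer route.

```shell
#!/bin/sh
# Dry-run wrapper: prints the command by default, executes it only
# when DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper. usage: offline_disk POOL DEVICE
offline_disk() {
    run zpool offline "$1" "$2"
}

# Hypothetical helper. usage: replace_disk POOL OLD_DEVICE NEW_DEVICE
replace_disk() {
    run zpool replace "$1" "$2" "$3"
    run zpool status "$1"
}

# Example (names are assumptions taken from this thread):
#   offline_disk fs1 ada6p2
#   ... shut down, swap the drive, boot ...
#   replace_disk fs1 ada6p2 NEWDISK
```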

-Will
 

mnt_schred

Dabbler
Joined
Aug 6, 2012
Messages
29
Ah, yeah, forgot that one. That was method 4 that I tried. When I try to offline a disk I get this message:

Disk offline failed: "cannot offline ada6p2: no valid replicas, "
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
From a SSH session as root paste the output of:
Code:
zpool status -v

camcontrol devlist

gpart show

glabel status
Throw the output inside of some [code][/code] tags as it will preserve the formatting and keep my eyes from crossing.
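
If it helps, the four commands can be collected into one file with a header per command, ready to paste into a single code block. A minimal sketch; `collect` is a made-up helper name and the output path in the usage comment is an arbitrary choice:

```shell
#!/bin/sh
# Run each command string given as an argument and print its output
# (stdout and stderr) under a header line naming the command.
collect() {
    for cmd in "$@"; do
        printf '===== %s =====\n' "$cmd"
        sh -c "$cmd" 2>&1
        echo
    done
}

# Example usage as root:
#   collect "zpool status -v" "camcontrol devlist" "gpart show" "glabel status" > /tmp/diag.txt
```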
 

mnt_schred

Dabbler
Joined
Aug 6, 2012
Messages
29
zpool status -v :
Code:
  pool: fs1
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver in progress for 1h2m, 1.19% done, 87h6m to go
config:

        NAME                        STATE     READ WRITE CKSUM
        fs1                         DEGRADED     0     0     0
          raidz2                    DEGRADED     0     0     0
            ada0p2                  ONLINE       0     0     0
            ada1p2                  ONLINE       0     0     0
            ada2p2                  ONLINE       0     0    61  7.40G resilvered
            ada3p2                  ONLINE       0     0    60  7.40G resilvered
            ada4p2                  ONLINE       0     0     0
            ada5p2                  ONLINE       0     0     0
            replacing               DEGRADED     0     0     0
              ada6p2                ONLINE       0     0     0
              15616410834711156094  UNAVAIL      0     0     0  was /dev/gptid/a4504bf5-dfba-11e1-af32-90fba63af2d7
            ada7p2                  ONLINE       0     0     0
errors: Permanent errors have been detected in the following files:

        <metadata>:<0x19>
        <metadata>:<0x1b>
        <metadata>:<0x24>
        <metadata>:<0x29>
        <metadata>:<0x2e>
        <metadata>:<0x2f>
        <metadata>:<0x36>
        <metadata>:<0x37>
        <metadata>:<0x3c>
        <metadata>:<0x3e>
        <metadata>:<0x3f>
        <metadata>:<0x40>
        <metadata>:<0x41>
        <metadata>:<0x44>
        <metadata>:<0x46>
        <metadata>:<0x4a>
        <metadata>:<0x4c>
        <metadata>:<0x51>
        <metadata>:<0x53>
        <metadata>:<0x55>
        <metadata>:<0x56>
        <metadata>:<0x59>
        <metadata>:<0x5d>
        <metadata>:<0x60>
        <metadata>:<0x64>
        <metadata>:<0x70>
        <metadata>:<0x71>
        <metadata>:<0x72>
        <metadata>:<0x73>
        <metadata>:<0x74>
        fs1:<0x0>
        /mnt/fs1/5/file.jpg



camcontrol devlist :

Code:
 
[thies@fs1] /mnt/fs1/priveserver# camcontrol devlist
<SAMSUNG HD204UI 1AQ10001>         at scbus2 target 0 lun 0 (pass0,ada0)
<SAMSUNG HD204UI 1AQ10001>         at scbus3 target 0 lun 0 (pass1,ada1)
<SAMSUNG HD204UI 1AQ10001>         at scbus4 target 0 lun 0 (pass2,ada2)
<WDC WD20EARS-00MVWB0 51.0AB51>    at scbus4 target 1 lun 0 (pass3,ada3)
<ST2000DL003-9VT166 CC3C>          at scbus5 target 0 lun 0 (pass4,ada4)
<SAMSUNG HD204UI 1AQ10001>         at scbus5 target 1 lun 0 (pass5,ada5)
<WDC WD20EARS-00MVWB0 51.0AB51>    at scbus6 target 0 lun 0 (pass6,ada6)
<SAMSUNG HD204UI 1AQ10001>         at scbus7 target 0 lun 0 (pass7,ada7)
<Verbatim  8.07>                   at scbus8 target 0 lun 0 (pass8,da0)


gpart show :
Code:
=>     63  7802802  da0  MBR  (3.7G)
       63  1930257    1  freebsd  (943M)
  1930320       63       - free -  (32K)
  1930383  1930257    2  freebsd  [active]  (943M)
  3860640     3024    3  freebsd  (1.5M)
  3863664    41328    4  freebsd  (20M)
  3904992  3897873       - free -  (1.9G)

=>      0  1930257  da0s1  BSD  (943M)
        0       16         - free -  (8.0K)
       16  1930241      1  !0  (943M)

=>      0  1930257  da0s2  BSD  (943M)
        0       16         - free -  (8.0K)
       16  1930241      1  !0  (943M)

=>        34  3907029101  ada0  GPT  (1.8T)
          34          94        - free -  (47K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834703     2  freebsd-zfs  (1.8T)

=>        34  3907029101  ada1  GPT  (1.8T)
          34          94        - free -  (47K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834703     2  freebsd-zfs  (1.8T)

=>        34  3907029101  ada2  GPT  (1.8T)
          34          94        - free -  (47K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834703     2  freebsd-zfs  (1.8T)

=>        34  3907029101  ada3  GPT  (1.8T)
          34          94        - free -  (47K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834703     2  freebsd-zfs  (1.8T)

=>        34  3907029101  ada4  GPT  (1.8T)
          34          94        - free -  (47K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834703     2  freebsd-zfs  (1.8T)

=>        34  3907029101  ada5  GPT  (1.8T)
          34          94        - free -  (47K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834703     2  freebsd-zfs  (1.8T)

=>        34  3907029101  ada6  GPT  (1.8T)
          34          94        - free -  (47K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834703     2  freebsd-zfs  (1.8T)

=>        34  3907029101  ada7  GPT  (1.8T)
          34          94        - free -  (47K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834703     2  freebsd-zfs  (1.8T)


glabel status :
Code:
                  Name  Status  Components
         ufs/FreeNASs3     N/A  da0s3
         ufs/FreeNASs4     N/A  da0s4
ufsid/4fa405ab96518680     N/A  da0s1a
        ufs/FreeNASs1a     N/A  da0s1a
        ufs/FreeNASs2a     N/A  da0s2a
 

William Grzybowski

Wizard
iXsystems
Joined
May 27, 2011
Messages
1,754
The drive has been replaced, so you should be able to run:

# zpool detach fs1 15616410834711156094

If that doesn't work, try booting 8.3 ALPHA just once and run it again...

I also recommend a scrub afterwards..

# zpool scrub fs1
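
To avoid retyping the long GUID by hand, it can be pulled out of the status output. A small sketch, assuming the column layout shown earlier in this thread (member name first, STATE second); `unavail_members` is a made-up helper name:

```shell
#!/bin/sh
# Read `zpool status` text on stdin and print the name or GUID of every
# pool member whose STATE column is UNAVAIL. The pool-level "state:" line
# is skipped so that only config rows are reported.
unavail_members() {
    awk '$2 == "UNAVAIL" && $1 != "state:" { print $1 }'
}

# Example usage on the live system (pool name from this thread):
#   zpool status fs1 | unavail_members
```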
 

mnt_schred

Dabbler
Joined
Aug 6, 2012
Messages
29
$zpool detach fs1 15616410834711156094
cannot detach 15616410834711156094: no valid replicas

By booting into 8.3 alpha I assume you mean the F2 option? F2 didn't work, but with ALT-2 I could select it anyway.
I've booted with the F2 option, but the system still comes up as FreeNAS-8.2.0-RELEASE-p1-x64 (r11950).

Is there another way of booting into the other version?
 

mnt_schred

Dabbler
Joined
Aug 6, 2012
Messages
29
Just burn the ISO and boot once? Or do a new install on a new USB stick and boot from 8.3 alpha?
 

William Grzybowski

Wizard
iXsystems
Joined
May 27, 2011
Messages
1,754
Install on a new USB stick and use it just once, to try the "zpool detach" command again, and then put the 8.2 USB stick back...

8.3 has a newer ZFS version which will likely fix the issue...
 

mnt_schred

Dabbler
Joined
Aug 6, 2012
Messages
29
I've booted into 8.3 alpha on a new USB, but my volume is not found by zpool. (no volumes defined)
 

mnt_schred

Dabbler
Joined
Aug 6, 2012
Messages
29
Okay, the import worked. However, $zpool detach fs1 15616410834711156094 again gave the error 'no valid replicas' :(
 

mnt_schred

Dabbler
Joined
Aug 6, 2012
Messages
29
Code:
continue to function, possibly in a degraded state.                     
action: Wait for the resilver to complete.                                      
  scan: resilver in progress since Tue Aug  7 07:01:37 2012                     
        122G scanned out of 4.88T at 23.1M/s, 60h0m to go                       
        15.3G resilvered, 2.44% done                                            
config:                                                                         
                                                                                
        NAME                        STATE     READ WRITE CKSUM                  
        fs1                         DEGRADED     0     0     0                  
          raidz2-0                  DEGRADED     0     0     0                  
            ada0p2                  ONLINE       0     0     0                  
            ada1p2                  ONLINE       0     0     0                  
            ada2p2                  ONLINE       0     0    74  (resilvering)   
            ada3p2                  ONLINE       0     0    55  (resilvering)   
            ada4p2                  ONLINE       0     0     0                  
            ada5p2                  ONLINE       0     0     0                  
            replacing-6             DEGRADED     0     0     0                  
              ada6p2                ONLINE       0     0     0  (resilvering)   
              15616410834711156094  UNAVAIL      0     0     0  was /dev/gptid/a4504bf5-dfba-11e1-af32-90fba63af2d7
            ada7p2                  ONLINE       0     0     0                  
                                                                                
errors: 8922 data errors, use '-v' for a list 
 

mnt_schred

Dabbler
Joined
Aug 6, 2012
Messages
29
Yeah, I understand. The problem is that I want to replace ada2, ada3 and ada6 because of SMART errors, and these are the resilvering devices.
Is there a way to increase the resilver speed?
 

mnt_schred

Dabbler
Joined
Aug 6, 2012
Messages
29
Maybe it's a better idea to export the settings, back up the data and do a fresh install... Can you back up FreeNAS' users and settings?
 

mnt_schred

Dabbler
Joined
Aug 6, 2012
Messages
29
Well, the problem is that this is a live backup server and I want the quickest solution. And the fact that the resilvering disks are the failing disks is not a small risk.

But I got another server for the backup tasks, so now I can continue with the original plan.
This is what I am planning to do:

1) dd my 8.2 USB stick to a similar USB stick
2) upgrade the copy to 8.3 beta
3) resilver the disks
4) try the detach/offline procedure

What do you think?

Edit: I did it anyway and now it is resilvering a lot faster:

Code:
        continue to function, possibly in a degraded state.                     
action: Wait for the resilver to complete.                                      
  scan: resilver in progress since Wed Aug  8 11:19:09 2012                     
        55.4G scanned out of 4.88T at 114M/s, 12h20m to go                      
        13.1M resilvered, 1.11% done                                            
config:                                                                         
                                                                                
        NAME                        STATE     READ WRITE CKSUM                  
        fs1                         DEGRADED     0     0     0                  
          raidz2-0                  DEGRADED     0     0     0                  
            ada0p2                  ONLINE       0     0     0                  
            ada1p2                  ONLINE       0     0     0                  
            ada2p2                  ONLINE       0     0     0  (resilvering)   
            ada3p2                  ONLINE       0     0     0  (resilvering)   
            ada4p2                  ONLINE       0     0     0                  
            ada5p2                  ONLINE       0     0     0                  
            replacing-6             DEGRADED     0     0     0                  
              ada6p2                ONLINE       0     0     0  (resilvering)   
              15616410834711156094  UNAVAIL      0     0     0  was /dev/gptid/a4504bf5-dfba-11e1-af32-90fba63af2d7
            ada7p2                  ONLINE       0     0     0                  
                                                                                
errors: No known data errors                  
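
As a sanity check on the two estimates, the time-to-go figure appears to be simply the remaining data divided by the current scan rate. A minimal sketch, assuming the T/G/M suffixes are binary units and working from the rounded figures in the status output; `resilver_eta` is a made-up helper name, not a ZFS tool:

```shell
#!/bin/sh
# Estimate remaining resilver time from the figures zpool status prints.
# usage: resilver_eta TOTAL_TIB SCANNED_GIB RATE_MIB_PER_SEC
resilver_eta() {
    awk -v total="$1" -v scanned="$2" -v rate="$3" 'BEGIN {
        # Remaining data in MiB: convert TiB to GiB, subtract scanned GiB.
        remaining_mib = (total * 1024 - scanned) * 1024
        secs = remaining_mib / rate
        hours = int(secs / 3600)
        mins = int((secs - hours * 3600) / 60)
        printf "%dh%dm\n", hours, mins
    }'
}

resilver_eta 4.88 122 23.1    # figures from the 8.2 status output
resilver_eta 4.88 55.4 114    # figures from the 8.3 status output
```

With the rounded numbers from the two status outputs this prints 60h1m and 12h19m, within a minute of the 60h0m and 12h20m that zpool reported, so the shorter estimate comes from the scan rate going from 23.1M/s to 114M/s.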
 