Unable to replace disk

Joined
Jan 17, 2015
Messages
8
I had a drive fail in a ZFS volume. I replaced the drive without taking it offline first, and now I cannot get the replacement accepted. In the GUI the Replace dialog shows no Member Disk in the dropdown, so I cannot replace it there. I was able to take the old drive offline from the CLI, but I get an error when trying to do a zpool replace. I see that the gptid is still on the old drive. I have the new gptid but cannot get it attached to /dev/da2. Any help would be appreciated. I can provide any extra information that is needed. Thanks in advance.

Phil
 
Joined
Jan 17, 2015
Messages
8
Follow-up with more information. Here is what zpool status <zpoolname> shows for the replaced disk:

Code:
16071527640338614044  OFFLINE  0  0  0  was /dev/gptid/2bd1ce9c-506c-11e3-9635-0002b3a300b0

In /dev/gptid I have the new disk, which is c4355a0f-4d6c-11e3-a690-0025900ba372. If I do

Code:
zpool replace <zpoolname> /dev/gptid/c4355a0f-4d6c-11e3-a690-0025900ba372 /dev/da2

I get:

Code:
cannot replace /dev/gptid/c4355a0f-4d6c-11e3-a690-0025900ba372 with /dev/da2: no such device in pool

I have access to the CLI and GUI but am 3 hours from the server itself. I had some boots on the ground replace the drive for me.
Thanks

Phil
 

skirven

Cadet
Joined
Feb 27, 2012
Messages
9
Hi,
I'm actually in a similar situation, so I'd like to know your solution. Slightly different scenario: I had a drive come detached and didn't realize it. My zpool status -v is:

Code:
[root@freenas] /dev/gptid# zpool status -v
  pool: Main
state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
    the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub in progress since Mon Jan 19 19:47:06 2015
        138G scanned out of 3.92T at 122M/s, 9h0m to go
        0 repaired, 3.42% done
config:

    NAME                                            STATE     READ WRITE CKSUM
    Main                                            DEGRADED     0     0     0
      raidz1-0                                      DEGRADED     0     0     0
        gptid/930c8f19-693f-11e4-b2a7-40167e65cb67  ONLINE       0     0     0
        gptid/93e7d9d7-693f-11e4-b2a7-40167e65cb67  ONLINE       0     0     0
        9416180299382493584                         UNAVAIL      0     0     0  was /dev/gptid/94c09499-693f-11e4-b2a7-40167e65cb67

errors: No known data errors

  pool: freenas-boot
state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Wed Jan 14 03:45:58 2015
config:

    NAME                                          STATE     READ WRITE CKSUM
    freenas-boot                                  ONLINE       0     0     0
      gptid/ef7e329b-7ff7-11e4-962e-40167e65cb67  ONLINE       0     0     0

errors: No known data errors


I can't seem to mount the drive again. :( Any help would be appreciated! I'm on FreeNAS 9.3 STABLE with current fixes as well.
Thanks!
Stephen
 
Joined
Jan 17, 2015
Messages
8
Stephen,
I have found no solution at this time. I was hoping someone on the forum would be able to give me a clue.

Phil
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,972
@phil,
Did you Offline or Replace the drive before the failed drive was removed? Has the system been rebooted? Did you follow the user manual for replacing a drive? What FreeNAS version are you using? (I know, obvious question) How about your system specs too, those might help.
 
Joined
Jan 17, 2015
Messages
8
As I stated in the first post, I did not offline the drive before it was removed; that is what caused this issue. The system was taken down when the drive was replaced. As stated in the first post, no, I did not follow the user manual, since the drive was not taken offline. It is FreeNAS version 9.1.1, on a Supermicro server with 8 SAS drives off an LSI controller in JBOD mode. In the GUI the drives were labeled da0 - da7; when we replaced da2 it showed up with the new serial number, but I have lost the drive that was da3 and now only have da0 - da6. If there is anything I can provide that would help, please let me know.

Phil
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,972
What does zpool status -v show?
And just to double check, there are no options in the GUI to Offline or Replace the drive (I suspect not but it's just asking stupid questions time).
 
Joined
Jan 17, 2015
Messages
8
Code:
[root@freenas] ~# zpool status -v
  pool: bkup
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
    the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 0h0m with 0 errors on Sun Dec 28 00:00:05 2014
config:

    NAME                                            STATE     READ WRITE CKSUM
    bkup                                            DEGRADED     0     0     0
      mirror-0                                      DEGRADED     0     0     0
        7586713522588527392                         UNAVAIL      0     0     0  was /dev/gptid/c3cce3ae-4d6c-11e3-a690-0025900ba372
        gptid/c4355a0f-4d6c-11e3-a690-0025900ba372  ONLINE       0     0     0

errors: No known data errors

  pool: sdr
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
    Sufficient replicas exist for the pool to continue functioning in a
    degraded state.
action: Online the device using 'zpool online' or replace the device with
    'zpool replace'.
  scan: scrub repaired 0 in 1h47m with 0 errors on Sun Jan 4 01:47:28 2015
config:

    NAME                                            STATE     READ WRITE CKSUM
    sdr                                             DEGRADED     0     0     0
      raidz2-0                                      DEGRADED     0     0     0
        gptid/2a3730cc-506c-11e3-9635-0002b3a300b0  ONLINE       0     0     0
        gptid/2b165cdc-506c-11e3-9635-0002b3a300b0  ONLINE       0     0     0
        16071527640338614044                        OFFLINE      0     0     0  was /dev/gptid/2bd1ce9c-506c-11e3-9635-0002b3a300b0
        gptid/2c8724c8-506c-11e3-9635-0002b3a300b0  ONLINE       0     0     0
      raidz2-1                                      ONLINE       0     0     0
        gptid/2d4787b2-506c-11e3-9635-0002b3a300b0  ONLINE       0     0     0
        gptid/2e07e2ca-506c-11e3-9635-0002b3a300b0  ONLINE       0     0     0
        gptid/2ec2ae48-506c-11e3-9635-0002b3a300b0  ONLINE       0     0     0
        gptid/2f7d9b5f-506c-11e3-9635-0002b3a300b0  ONLINE       0     0     0

errors: No known data errors
 
Joined
Jan 17, 2015
Messages
8
Sorry, I missed the question on the GUI, but that was in the first post also. When I try to do the replace there is no member disk in the drop-down box.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,972
I'm going to fall back now, since there should be enough information for an expert in drive replacement (like @cyberjock or @Ericloewe ) to chime in, because you used the CLI, and if done wrong that really screws with things. I'm certain I could give you some advice here, but since your data is intact for now, let's see what someone else can say. This sure is a hard way to learn to offline the drive first.

My first thought: since the original drive is now offline, do you have another fresh (unused) drive, so that someone could shut down the system, swap out the drive, and power up? Then hopefully you will have the option to Replace the drive in the GUI (I'm not holding my breath).

Whatever you do, do not use the -f parameter to force the replacement unless someone tells you that you can. This is a last ditch effort in my opinion.
 
Joined
Jan 17, 2015
Messages
8
Thank you very much for your help. Hopefully someone can help me fix the mess that I created.

Phil
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,176
Have you actually looked in Volumes -> View Volumes -> sdr -> Volume Status? That's where the Replace/Offline options reside. Try to follow the manual from there.

If that screen doesn't show you the same layout as zpool status, first ensure your backups are in good condition. Next I'd try exporting and importing the pool to get the GUI synced with the pool status and proceed according to the manual from there.

The vdevs are still relatively healthy and completely healthy, respectively, so above all, don't panic and don't force things.
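If you do end up back at the CLI after the export/import, note that zpool replace names the *old* device first, by the numeric GUID zpool status shows for it, and the new device second. A sketch only, not verified on your system (substitute the gptid the new disk actually got):

```shell
# Sketch -- run as root from the FreeNAS shell.
# Old member is referenced by the GUID from zpool status,
# new member by its freshly created gptid label (placeholder here).
zpool replace sdr 16071527640338614044 gptid/<new-disk-gptid>

# Then watch the resilver:
zpool status -v sdr
```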
 
Joined
Jan 17, 2015
Messages
8
In the Volume Status there is one drive offline with a long number as a name. All the others are named da?p2, where ? is 0-6. If I click the Replace button there is no member disk in the drop-down, and I cannot continue. I will get a good backup done and then try the export and import. Thank you for your assistance with this.

Phil
 

skirven

Cadet
Joined
Feb 27, 2012
Messages
9
OK - I got mine resilvering, so maybe I can help. :)
To recap my issue: I had disconnected the drive and didn't realize it, so it became orphaned and I was in a degraded state. When I looked in Disks, though, I could see them all listed. I ended up being able to add the drive as a Spare using the FreeNAS GUI, then issuing the command "zpool replace Main /dev/gptid/94c09499-693f-11e4-b2a7-40167e65cb67 gptid/c0753057-a0ec-11e4-9370-40167e65cb67", and it was able to start resilvering. So the first entry is the drive marked as Unavailable, and the second entry is the new marker for the spare. Now I'm seeing:

Code:
[root@freenas] ~# zpool status -v
  pool: Main
state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
    continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Jan 20 16:49:33 2015
        128G scanned out of 3.90T at 245M/s, 4h28m to go
        42.5G resilvered, 3.20% done
config:

    NAME                                              STATE     READ WRITE CKSUM
    Main                                              DEGRADED     0     0     0
      raidz1-0                                        DEGRADED     0     0     0
        gptid/930c8f19-693f-11e4-b2a7-40167e65cb67    ONLINE       0     0     0
        gptid/93e7d9d7-693f-11e4-b2a7-40167e65cb67    ONLINE       0     0     0
        spare-2                                       UNAVAIL      0     0     0
          9416180299382493584                         UNAVAIL      0     0     0  was /dev/gptid/94c09499-693f-11e4-b2a7-40167e65cb67
          gptid/c0753057-a0ec-11e4-9370-40167e65cb67  ONLINE       0     0     0  (resilvering)
    spares
      8767576005723409996                             INUSE     was /dev/gptid/c0753057-a0ec-11e4-9370-40167e65cb67

errors: No known data errors

  pool: freenas-boot
state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Wed Jan 14 03:45:58 2015
config:

    NAME                                          STATE     READ WRITE CKSUM
    freenas-boot                                  ONLINE       0     0     0
      gptid/ef7e329b-7ff7-11e4-962e-40167e65cb67  ONLINE       0     0     0

errors: No known data errors
[root@freenas] ~#


Which leads me to think that I'll be good after resilvering. So in short, see if you can add the drive as a Spare, then activate it.
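In general terms, the sequence that worked for me looks like this (a sketch; the gptids are from my pool, so substitute your own):

```shell
# 1. In the GUI: Storage -> Volumes -> Volume Status -> add the new disk as a Spare.
# 2. From the shell, replace the dead member: the unavailable device's gptid/GUID
#    comes first, the spare's new gptid second.
zpool replace Main /dev/gptid/94c09499-693f-11e4-b2a7-40167e65cb67 \
    gptid/c0753057-a0ec-11e4-9370-40167e65cb67
# 3. Watch the resilver run to completion:
zpool status -v Main
```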
-Stephen
 

DCswitch

Explorer
Joined
Dec 20, 2013
Messages
58
Under Storage, go to Disks and click on Wipe.
 