unable to remove old faulted pool

Status
Not open for further replies.

dwoodard3950

Dabbler
Joined
Dec 16, 2012
Messages
18
I have had a long run of trying to remove and purge an old faulted pool. This started with a failed drive that I replaced. While it was resilvering, another drive failed. In the end I decided to "start over", as I had a backup. Now, when I reboot, the system fails to mount my newly created pool, and the old pool (of the same name) is listed from the CLI as FAULTED. I have tried wiping the disks, zpool labelclear, and so on, but still no luck. If I run zdb, I see details regarding the old pool and the disk replacement. How do I remove the history of this faulted pool and recycle these drives for use again?
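For reference, the label-clearing attempt was roughly along these lines (a sketch only; the device and partition names are illustrative):
Code:
# Sketch of the cleanup I attempted; device/partition names are illustrative.
zpool export tank                  # detach the pool first, if the system allows it
zpool labelclear -f /dev/ada0p2    # clear any ZFS label on each former member partition
zpool labelclear -f /dev/ada1p2
zpool labelclear -f /dev/ada2p2
zpool labelclear -f /dev/ada3p2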

Details:
FreeNAS-8.3.1-RELEASE-p2-x64
HP
Code:
[root@rome ~]# zpool list
NAME  SIZE  ALLOC  FREE    CAP  DEDUP  HEALTH  ALTROOT
tank      -      -      -      -      -  FAULTED  -

Code:
zdb
tank:
    version: 28
    name: 'tank'
    state: 0
    txg: 2399702
    pool_guid: 16661726734032266985
    hostid: 2270805793
    hostname: ''
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 16661726734032266985
        children[0]:
            type: 'raidz'
            id: 0
            guid: 13347037953643918062
            nparity: 1
            metaslab_array: 31
            metaslab_shift: 34
            ashift: 12
            asize: 3992209850368
            is_log: 0
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 439340949831327204
                path: '/dev/gptid/76708775-5e7d-11e2-8efb-6805ca09f08c'
                phys_path: '/dev/gptid/76708775-5e7d-11e2-8efb-6805ca09f08c'
                whole_disk: 1
                DTL: 33521
                create_txg: 4
            children[1]:
                type: 'replacing'
                id: 1
                guid: 5301956354414286026
                whole_disk: 0
                create_txg: 4
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 2323197472043909345
                    path: '/dev/gptid/af99d237-4b31-11e2-b9ae-6805ca09f08c'
                    phys_path: '/dev/gptid/af99d237-4b31-11e2-b9ae-6805ca09f08c'
                    whole_disk: 1
                    not_present: 1
                    DTL: 8540
                    create_txg: 4
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 2075023451200656815
                    path: '/dev/gptid/1735b055-c9a5-11e2-b4d4-6805ca09f08c'
                    phys_path: '/dev/gptid/1735b055-c9a5-11e2-b4d4-6805ca09f08c'
                    whole_disk: 1
                    DTL: 86017
                    create_txg: 4
                    resilvering: 1
            children[2]:
                type: 'disk'
                id: 2
                guid: 13740987679321907475
                path: '/dev/gptid/b0400532-4b31-11e2-b9ae-6805ca09f08c'
                phys_path: '/dev/gptid/b0400532-4b31-11e2-b9ae-6805ca09f08c'
                whole_disk: 1
                DTL: 8539
                create_txg: 4
            children[3]:
                type: 'replacing'
                id: 3
                guid: 15401327230276251409
                whole_disk: 0
                create_txg: 4
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 2348113808455552577
                    path: '/dev/dsk/gptid/b0b5f7cf-4b31-11e2-b9ae-6805ca09f08c'
                    phys_path: '/dev/gptid/b0b5f7cf-4b31-11e2-b9ae-6805ca09f08c'
                    whole_disk: 1
                    DTL: 8538
                    create_txg: 4
                    offline: 1
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 2714461895960674896
                    path: '/dev/gptid/9fb695f4-adb0-11e0-ae66-0025900b7d98'
                    phys_path: '/dev/gptid/9fb695f4-adb0-11e0-ae66-0025900b7d98'
                    whole_disk: 1
                    DTL: 106498
                    create_txg: 4
                    resilvering: 1
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403

dwoodard3950

Dabbler
Joined
Dec 16, 2012
Messages
18
Sorry about the mess with the code formatting; I'll fix it.

I did try to mark these disks as new from the GUI, with no success. Further, when that failed, I did the following after destroying the pool:
Code:
dd if=/dev/zero of=/dev/ada[0,1,2,3] bs=512 count=34
 
dd if=/dev/zero of=/dev/ada[0,1,2,3] bs=512 seek=<end - 34>


However, I still see the following after reboot:
Code:
[root@rome] ~# zpool list
NAME  SIZE  ALLOC  FREE    CAP  DEDUP  HEALTH  ALTROOT
tank      -      -      -      -      -  FAULTED  -


I then destroyed the pool once again with:

Code:
[root@rome] ~# zpool destroy -f tank


The output from zdb is attached.

Is a write of zeros across the entire disk required? What am I missing?
 

Attachments

  • zdb.txt
    3.2 KB

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
I did try to mark these disks as new from the GUI, with no success.
How did it fail? Check /var/log/messages. You need to stop the GUI from importing the pool. You could reset to factory defaults to clear the db.

Is a write of zeros across the entire disk required? What am I missing?
A quick wipe via the GUI is sufficient as long as GEOM isn't protecting the drives at the time.
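For context, zeroing the whole disk shouldn't be needed: ZFS keeps four 256 KiB labels in the first and last 512 KiB of each member partition (adaXp2 here), so dd runs that only touch the first and last 34 sectors of the raw disk clear the GPT areas but leave those labels alone. If you do it by hand instead of through the GUI, a sketch would look like the following (device names are illustrative, the old p2 partitions must still exist, and GEOM has to permit the writes):
Code:
# Sketch only: zero the ZFS label areas (first and last 512 KiB) of each former
# member partition. Device names are illustrative; adjust to suit.
sysctl kern.geom.debugflags=0x10                      # temporarily allow writes to in-use providers
for p in ada0p2 ada1p2 ada2p2 ada3p2; do
    bytes=$(diskinfo /dev/${p} | awk '{print $3}')    # partition size in bytes
    dd if=/dev/zero of=/dev/${p} bs=512k count=1                        # front labels (L0/L1)
    dd if=/dev/zero of=/dev/${p} bs=512k seek=$((bytes / 524288 - 1))   # back labels (L2/L3); dd stops at end of device
done
sysctl kern.geom.debugflags=0                         # restore the default afterwards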

What did you run zdb -l against, anyway?
 

dwoodard3950

Dabbler
Joined
Dec 16, 2012
Messages
18
I don't see much in /var/log/messages other than the devices being picked up, as shown below, but there is no mention of the pool, which was healthy when I initiated the reboot. Now it is as follows:
Code:
[root@rome] ~# zpool list
NAME  SIZE  ALLOC  FREE    CAP  DEDUP  HEALTH  ALTROOT
tank      -      -      -      -      -  FAULTED  -

And the log results:
Code:
Jun 20 14:12:36 rome kernel: ZFS filesystem version 5
Jun 20 14:12:36 rome kernel: ZFS storage pool version 28
Jun 20 14:12:36 rome kernel: GEOM_ELI: Device ada0p1.eli created.
Jun 20 14:12:36 rome kernel: GEOM_ELI: Encryption: AES-XTS 256
Jun 20 14:12:36 rome kernel: GEOM_ELI:    Crypto: software
Jun 20 14:12:36 rome kernel: GEOM_ELI: Device ada1p1.eli created.
Jun 20 14:12:36 rome kernel: GEOM_ELI: Encryption: AES-XTS 256
Jun 20 14:12:36 rome kernel: GEOM_ELI:    Crypto: software
Jun 20 14:12:36 rome kernel: GEOM_ELI: Device ada2p1.eli created.
Jun 20 14:12:36 rome kernel: GEOM_ELI: Encryption: AES-XTS 256
Jun 20 14:12:36 rome kernel: GEOM_ELI:    Crypto: software
Jun 20 14:12:36 rome kernel: GEOM_ELI: Device ada3p1.eli created.
Jun 20 14:12:36 rome kernel: GEOM_ELI: Encryption: AES-XTS 256
Jun 20 14:12:36 rome kernel: GEOM_ELI:    Crypto: software
Jun 20 14:12:36 rome kernel: GEOM_ELI: Device ada4p1.eli created.
Jun 20 14:12:36 rome kernel: GEOM_ELI: Encryption: AES-XTS 256
Jun 20 14:12:36 rome kernel: GEOM_ELI:    Crypto: software


The labels are present:
Code:
[root@rome] ~# glabel status
                                      Name  Status  Components
gptid/47fd2817-d9ed-11e2-a64a-6805ca09f08c    N/A  ada0p2
gptid/489f6c14-d9ed-11e2-a64a-6805ca09f08c    N/A  ada1p2
gptid/4a20da5e-d9ed-11e2-a64a-6805ca09f08c    N/A  ada2p2
gptid/4951d7b9-d9ed-11e2-a64a-6805ca09f08c    N/A  ada3p2


With regard to zdb, that's just it: no command-line options work when the pool is in this state. If I have a valid pool I can use some of the more typical options (e.g. zdb -l vdev). However, in this instance, with the pool faulted, there is nothing unless I run it with no command-line options at all. That is when I observe the output attached in the post above.

With the pool in this state, I can destroy it and create a new one of the same name and things will appear to work fine until a reboot.
Code:
[root@rome] ~# zpool destroy -f tank
[root@rome] ~# zpool list
no pools available
--- now create the new pool from the GUI ----
[root@rome] ~# zpool list
NAME  SIZE  ALLOC  FREE    CAP  DEDUP  HEALTH  ALTROOT
tank  3.62T  1.96M  3.62T    0%  1.00x  ONLINE  /mnt
[root@rome] ~# zpool status
  pool: tank
state: ONLINE
  scan: none requested
config:
 
    NAME                                            STATE    READ WRITE CKSUM
    tank                                            ONLINE      0    0    0
      raidz1-0                                      ONLINE      0    0    0
        gptid/9d37c588-d9f0-11e2-b93b-6805ca09f08c  ONLINE      0    0    0
        gptid/9dbe3467-d9f0-11e2-b93b-6805ca09f08c  ONLINE      0    0    0
        gptid/9e5ad759-d9f0-11e2-b93b-6805ca09f08c  ONLINE      0    0    0
        gptid/9f248126-d9f0-11e2-b93b-6805ca09f08c  ONLINE      0    0    0
 
errors: No known data errors
 

dwoodard3950

Dabbler
Joined
Dec 16, 2012
Messages
18
More information on this, as I never did solve it. The box is on a UPS and I just never shut it down (until now).

I just imported it under a new name as follows:
Code:
zpool import -R /mnt <pool_guid> <new_pool_name>
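
For anyone following along: the numeric pool GUID can be read from the "id:" field that a bare zpool import prints when it lists importable pools. A quick sketch:
Code:
zpool import                                       # list importable pools and their "id:" GUIDs
zpool import -R /mnt <pool_guid> <new_pool_name>   # then import that GUID under a new name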

This worked, and I can now reboot and get the newly named pool each time. However, the faulted pool (under the old name) remains in the output of zpool list. Its status is FAULTED. I ran zdb and it shows up in that result as well, whereas zdb -l for each member of the newly named pool is correct. How do I purge/destroy the faulted pool?
Code:
[root@rome] ~# zpool list
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
backup  2.72T  2.19T   542G    80%  1.00x  ONLINE  /mnt
rename  3.62T  3.26T   373G    89%  1.00x  ONLINE  /mnt
tank        -      -      -      -      -  FAULTED  -


Here is the status of the newly renamed pool:
Code:
[root@rome] ~# zpool status rename
  pool: rename
state: ONLINE
  scan: scrub repaired 0 in 8h46m with 0 errors on Sun Sep 15 08:46:41 2013
config:
 
    NAME                                            STATE    READ WRITE CKSUM
    rename                                          ONLINE      0    0    0
      raidz1-0                                      ONLINE      0    0    0
        gptid/9d37c588-d9f0-11e2-b93b-6805ca09f08c  ONLINE      0    0    0
        gptid/9dbe3467-d9f0-11e2-b93b-6805ca09f08c  ONLINE      0    0    0
        gptid/9e5ad759-d9f0-11e2-b93b-6805ca09f08c  ONLINE      0    0    0
        gptid/9f248126-d9f0-11e2-b93b-6805ca09f08c  ONLINE      0    0    0
 
errors: No known data errors


And the status of the old faulted pool, which I'm trying to eliminate:
Code:
[root@rome] ~# zpool status tank
  pool: tank
state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
    replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
  see: http://www.sun.com/msg/ZFS-8000-3C
  scan: none requested
config:
 
    NAME                      STATE    READ WRITE CKSUM
    tank                      UNAVAIL      0    0    0
      raidz1-0                UNAVAIL      0    0    0
        439340949831327204    UNAVAIL      0    0    0  was /dev/gptid/76708775-5e7d-11e2-8efb-6805ca09f08c
        replacing-1            UNAVAIL      0    0    0
          2323197472043909345  UNAVAIL      0    0    0  was /dev/gptid/af99d237-4b31-11e2-b9ae-6805ca09f08c
          2075023451200656815  UNAVAIL      0    0    0  was /dev/gptid/1735b055-c9a5-11e2-b4d4-6805ca09f08c
        13740987679321907475  UNAVAIL      0    0    0  was /dev/gptid/b0400532-4b31-11e2-b9ae-6805ca09f08c
        replacing-3            UNAVAIL      0    0    0
          2348113808455552577  OFFLINE      0    0    0  was /dev/dsk/gptid/b0b5f7cf-4b31-11e2-b9ae-6805ca09f08c
          2714461895960674896  UNAVAIL      0    0    0  was /dev/gptid/9fb695f4-adb0-11e0-ae66-0025900b7d98
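
My suspicion is that this entry now lives only in a cached pool configuration (the same data a bare zdb dumps), not on the disks themselves, since the members are listed purely by GUID. A sketch of inspecting that cache directly; the /data/zfs/zpool.cache path is my assumption for FreeNAS 8.x, so adjust as needed:
Code:
# Sketch: point zdb at a specific cache file instead of the default one.
# The /data/zfs/zpool.cache location is an assumption for FreeNAS 8.x.
zdb -U /data/zfs/zpool.cache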
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
How do I purge/destroy the faulted pool?
Did you export the old pool from the GUI? Do not mark as new.

For each disk run:
Code:
zdb -l adaX

zdb -l adaXp1
You already checked adaXp2 for each disk, and the four labels are all consistent, yes?
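Something like this covers all of them in one pass (a sketch; ada0 through ada3 assumed from your glabel output):
Code:
# Sketch: dump the ZFS labels from each disk and its partitions in one pass.
for d in ada0 ada1 ada2 ada3; do
    for dev in ${d} ${d}p1 ${d}p2; do
        echo "== /dev/${dev} =="
        zdb -l /dev/${dev}
    done
done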
 

xMx

Cadet
Joined
Mar 25, 2018
Messages
1
A brute-force solution is the following (proceed with extreme caution!); a rough command sketch follows the steps:
1) Change the cachefile for the valid pools you want to keep (zpool set cachefile=<new cachefile> <poolname>), then export them.
2) Move the ZFS cachefile /etc/zfs/zpool.cache to a backup location.
3) Reboot forcefully, or try to restart the ZFS daemon.
4) Import the zpools you want to keep from their cachefiles (zpool import -c <cachefile> <poolname>).

(I am running Ubuntu 16 with ZFS filesystem version 5; package zfs-zed/xenial-updates, version 0.6.5.6-0ubuntu19.)
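
A rough rendering of those steps as commands (the pool name and file paths are placeholders; step 4 uses zpool import, and exact behavior may vary between ZFS versions):
Code:
# Rough sketch of the steps above; <pool> and the file paths are placeholders.
zpool set cachefile=/root/<pool>.cache <pool>      # 1) record the pool's config in its own cachefile
zpool export <pool>                                #    then export the pool
mv /etc/zfs/zpool.cache /etc/zfs/zpool.cache.bak   # 2) move the system cachefile aside
reboot                                             # 3) or restart the ZFS services/daemon instead
zpool import -c /root/<pool>.cache <pool>          # 4) re-import the pools you want to keep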
 