Pool offline. Integrity check failed

mdeheus

Cadet
Joined
Feb 7, 2022
Messages
3
Hello,

I recently set up TrueNAS on an HP MicroServer to replace my failing QNAP. TrueNAS is running as a VM under ESXi 7.0.3.
The ESXi datastore this pool lives on was a logical disk built on a single 3 TB drive in RAID 0 (single drives on an HP P222 RAID controller are RAID 0 by default).
I added a second 3 TB drive to the RAID controller in order to expand the array. That by itself shouldn't have had any effect (you add the disk to the array first, and only once it has been added do you specify what to do with it), but for some reason, when the controller finished adding it, it took the logical disk offline, leaving the datastore and the VM disk inaccessible to TrueNAS.
I shut down TrueNAS and had to power off the VM because the shutdown never finished (no output on screen; CPU and memory usage were close to zero, so I assumed a missed terminate signal). I then expanded the RAID array from the controller, rebooted the machine (which started ESXi), expanded the datastore and the VM disk, and booted the TrueNAS VM.
TrueNAS now showed the pool as offline, and I started troubleshooting it using the TrueNAS forums.

Running zpool import shows that the pool should be importable:
Code:
root@truenas[/]# zpool import
pool: storage
id: 17907705850229598220
state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

storage ONLINE
gptid/e2fb8334-850c-11ec-8eff-000c29e2b72e ONLINE

but when I try to import it normally, it reports that one or more devices are unavailable:
Code:
root@truenas[/]# zpool import storage
cannot import 'storage': one or more devices is currently unavailable

and the same happens when I try to force it:
Code:
root@truenas[/]# zpool import -f storage
cannot import 'storage': one or more devices is currently unavailable
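
Two variants I have not actually run yet, so take the exact commands as assumptions rather than something tested on this box: pointing the device scan at /dev/gptid (the path recorded in the label), and attempting a read-only import so nothing has to be written to the vdev:
Code:
# untested here: scan only the gptid device nodes that the label refers to
zpool import -d /dev/gptid storage

# untested here: read-only import under /mnt, so ZFS does not try to write to the pool
zpool import -o readonly=on -R /mnt -f storage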

When I try to import it with recovery (-F), I get an integrity check failure and a core dump:
Code:
root@truenas[/]# zpool import -F storage
internal error: cannot import 'storage': Integrity check failed
zsh: abort (core dumped)  zpool import -F storage

Running it with -n, however, produces no output at all.
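
For reference, the recovery variants I am aware of look like this; the dry run is harmless, but the extreme rewind is a destructive last resort that I have not attempted:
Code:
# dry run: report what -F would roll back to without actually changing the pool
zpool import -F -n storage

# extreme rewind: searches older txgs for a consistent state; last resort only
zpool import -FX storage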

gpart show lists all partitions:
Code:
root@truenas[/]# gpart show
=>      40  33554352  da0  GPT  (16G)
        40      1024    1  freebsd-boot  (512K)
      1064  33521664    2  freebsd-zfs  (16G)
  33522728     31664       - free -  (15M)

=>        40  3865470488  da1  GPT  (1.8T)
          40          88       - free -  (44K)
         128     4194304    1  freebsd-swap  (2.0G)
     4194432  3861276096    2  freebsd-zfs  (1.8T)

=>         40  11596411624  da2  GPT  (5.4T)
           40           88       - free -  (44K)
          128      4194304    1  freebsd-swap  (2.0G)
      4194432   5851993440    2  freebsd-zfs  (2.7T)
   5856187872   5740223792       - free -  (2.7T)

=>    17  476131  cd0  MBR  (930M)
      17  476131       - free -  (930M)

=>    17  476131  iso9660/TRUENAS  MBR  (930M)
      17  476131                   - free -  (930M)

Initially, da2 showed as corrupt, but that was fixed with:
Code:
gpart recover /dev/da2
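
To double-check that the recovered table is clean now (my assumption of what to look for, not output I have saved):
Code:
# the [CORRUPT] marker should no longer appear next to da2
gpart show da2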


Listing all partitions with gpart list returns the following:
Code:
root@truenas[/]# gpart list
Geom name: da0
modified: false
state: OK
fwheads: 255
fwsectors: 63
last: 33554391
first: 40
entries: 128
scheme: GPT
Providers:
1. Name: da0p1
   Mediasize: 524288 (512K)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 20480
   Mode: r0w0e0
   efimedia: HD(1,GPT,095db132-5e92-11ec-a80d-000c29e2b72e,0x28,0x400)
   rawuuid: 095db132-5e92-11ec-a80d-000c29e2b72e
   rawtype: 83bd6b9d-7f41-11dc-be0b-001560b84f0f
   label: (null)
   length: 524288
   offset: 20480
   type: freebsd-boot
   index: 1
   end: 1063
   start: 40
2. Name: da0p2
   Mediasize: 17163091968 (16G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 544768
   Mode: r1w1e1
   efimedia: HD(2,GPT,096925dc-5e92-11ec-a80d-000c29e2b72e,0x428,0x1ff8000)
   rawuuid: 096925dc-5e92-11ec-a80d-000c29e2b72e
   rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 17163091968
   offset: 544768
   type: freebsd-zfs
   index: 2
   end: 33522727
   start: 1064
Consumers:
1. Name: da0
   Mediasize: 17179869184 (16G)
   Sectorsize: 512
   Mode: r1w1e2

Geom name: da1
modified: false
state: OK
fwheads: 255
fwsectors: 63
last: 3865470527
first: 40
entries: 128
scheme: GPT
Providers:
1. Name: da1p1
   Mediasize: 2147483648 (2.0G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 65536
   Mode: r1w1e1
   efimedia: HD(1,GPT,78463898-85ff-11ec-8eff-000c29e2b72e,0x80,0x400000)
   rawuuid: 78463898-85ff-11ec-8eff-000c29e2b72e
   rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 2147483648
   offset: 65536
   type: freebsd-swap
   index: 1
   end: 4194431
   start: 128
2. Name: da1p2
   Mediasize: 1976973361152 (1.8T)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 2147549184
   Mode: r1w1e2
   efimedia: HD(2,GPT,784ed2de-85ff-11ec-8eff-000c29e2b72e,0x400080,0xe62665c0)
   rawuuid: 784ed2de-85ff-11ec-8eff-000c29e2b72e
   rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 1976973361152
   offset: 2147549184
   type: freebsd-zfs
   index: 2
   end: 3865470527
   start: 4194432
Consumers:
1. Name: da1
   Mediasize: 1979120929792 (1.8T)
   Sectorsize: 512
   Mode: r2w2e5

Geom name: da2
modified: false
state: OK
fwheads: 255
fwsectors: 63
last: 11596411663
first: 40
entries: 128
scheme: GPT
Providers:
1. Name: da2p1
   Mediasize: 2147483648 (2.0G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 65536
   Mode: r0w0e0
   efimedia: HD(1,GPT,e2eebb81-850c-11ec-8eff-000c29e2b72e,0x80,0x400000)
   rawuuid: e2eebb81-850c-11ec-8eff-000c29e2b72e
   rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 2147483648
   offset: 65536
   type: freebsd-swap
   index: 1
   end: 4194431
   start: 128
2. Name: da2p2
   Mediasize: 2996220641280 (2.7T)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 2147549184
   Mode: r0w0e0
   efimedia: HD(2,GPT,e2fb8334-850c-11ec-8eff-000c29e2b72e,0x400080,0x15cce5560)
   rawuuid: e2fb8334-850c-11ec-8eff-000c29e2b72e
   rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
   label: (null)
   length: 2996220641280
   offset: 2147549184
   type: freebsd-zfs
   index: 2
   end: 5856187871
   start: 4194432
Consumers:
1. Name: da2
   Mediasize: 5937362789376 (5.4T)
   Sectorsize: 512
   Mode: r0w0e0

I checked the rawuuid, and it matches the gptid that the pool is expecting.
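
Roughly how that can be cross-checked (a sketch, not the exact commands I saved):
Code:
# the gptid device node that the pool label points at should exist
ls -l /dev/gptid/e2fb8334-850c-11ec-8eff-000c29e2b72e

# and glabel should map that gptid back to da2p2
glabel status | grep e2fb8334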

To top things off, I have also attached my console.log file from /var/log/.

I assume something got corrupted when the logical disk went offline and took the virtual disk down with it, but I can't figure out how to resolve this. The datastore and the virtual disk are both healthy, so my guess is that something is preventing TrueNAS from mounting the pool again. What can I do to try and fix this?
 

Attachments

  • console.zip
    4.6 KB · Views: 224

mdeheus

Cadet
Joined
Feb 7, 2022
Messages
3
Just thinking: the pool is still showing up in the TrueNAS GUI as offline.
Would it work if I click Export/Disconnect and then click Add and try to import it again? I can imagine that the import from the shell could be failing because, technically, it is still imported.
1644237778279.png
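
Before trying that in the GUI, it is probably worth confirming from the shell whether the pool really is still considered imported; something along these lines (not run yet):
Code:
# if 'storage' appears here, it is already imported and would need an export first
zpool list
zpool status storage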
 

mdeheus

Cadet
Joined
Feb 7, 2022
Messages
3
I created a snapshot in VMware, then tried to disconnect the pool and reconnect it again. This didn't work, so I restored the snapshot.

When I use zdb to check the pool, it reports that labels 0 and 1 can be unpacked but that labels 2 and 3 cannot. This might be why it isn't mounting:
Code:
root@truenas[/]# zdb -l /dev/da2p2
------------------------------------
LABEL 0
------------------------------------
    version: 5000
    name: 'storage'
    state: 0
    txg: 56418
    pool_guid: 17907705850229598220
    errata: 0
    hostid: 256246684
    hostname: ''
    top_guid: 14963143195722883585
    guid: 14963143195722883585
    vdev_children: 1
    vdev_tree:
        type: 'disk'
        id: 0
        guid: 14963143195722883585
        path: '/dev/gptid/e2fb8334-850c-11ec-8eff-000c29e2b72e'
        metaslab_array: 68
        metaslab_shift: 34
        ashift: 12
        asize: 2996215742464
        is_log: 0
        create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 0 1
failed to unpack label 2
failed to unpack label 3

Am I correct that labels 0, 1, 2, and 3 are all supposed to be the same? Is there a way to copy the working labels over the ones that can't be unpacked, or some way to reset labels 2 and 3?
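
As far as I understand it (happy to be corrected), the four labels are meant to be identical copies, with 0 and 1 at the front of the partition and 2 and 3 in the last 512 KiB, and ZFS refreshes all of them itself once a pool is imported and syncing again, so they are not normally copied by hand. One check I could still run is dumping the uberblocks stored in the labels that do unpack:
Code:
# -u also prints the uberblocks held in each readable label
zdb -lu /dev/da2p2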
 