Pool is FAULTED (metadata is corrupted) and 1 Unassigned Disk

munghauzen

Cadet
Joined
Jan 9, 2024
Messages
6
Hi,
While TrueNAS (SCALE-22.12.4.2) was running, a couple of disks failed and the pool became inaccessible. The NAS is running in a VMware-based virtual machine, and the RAID is hardware RAID. I realize that this configuration is wrong, but this is the situation. After powering up again, the pool was unavailable and 1 disk was unassigned. Attempts to import the pool were unsuccessful.

I can try to mount the disk in another virtual machine, or recreate the pool without removing the disk. Would I lose data with either operation? There is nowhere to back up 25 TB.

Code:
root@truenas[~]# zpool import
   pool: basic
     id: 9811254123107785953
  state: FAULTED
status: The pool metadata is corrupted.
 action: The pool cannot be imported due to damaged devices or data.
    The pool may be active on another system, but can be imported using
    the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-72
 config:

    basic                                   FAULTED  corrupted data
      0a4b9dbb-323a-4954-b7f2-14eaead24ef1  ONLINE


Code:
root@truenas[~]# zpool import -f basic
cannot import 'basic': insufficient replicas
    Destroy and re-create the pool from
    a backup source.


Code:
root@truenas[~]# zpool status -v
  pool: boot-pool
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
    The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
    the pool may no longer be accessible by software that does not support
    the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:00:43 with 0 errors on Mon Jan  1 03:45:45 2024
config:

    NAME        STATE     READ WRITE CKSUM
    boot-pool   ONLINE       0     0     0
      sda3      ONLINE       0     0     0

errors: No known data errors

  pool: fast
 state: ONLINE
  scan: scrub repaired 0B in 00:01:07 with 0 errors on Sun Dec 24 00:01:08 2023
config:

    NAME                                    STATE     READ WRITE CKSUM
    fast                                    ONLINE       0     0     0
      555b047f-04f7-4643-afa9-028679b8dd84  ONLINE       0     0     0

errors: No known data errors

  pool: nvme
 state: ONLINE
  scan: scrub repaired 0B in 00:07:49 with 0 errors on Sun Dec 24 00:07:50 2023
config:

    NAME                                    STATE     READ WRITE CKSUM
    nvme                                    ONLINE       0     0     0
      c435c64d-d7d7-4930-b903-501f136827d4  ONLINE       0     0     0

errors: No known data errors

  pool: unimportant
 state: ONLINE
status: One or more devices has experienced an error resulting in data
    corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
    entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 11:24:09 with 0 errors on Sun Apr 30 11:24:10 2023
config:

    NAME                                    STATE     READ WRITE CKSUM
    unimportant                             ONLINE       0     0     0
      3fe7226e-62bf-44b0-a5ee-7321999c0691  ONLINE       0     0     8

errors: Permanent errors have been detected in the following files:

        unimportant/v2:<0x0>
root@truenas[~]# zpool import
   pool: basic
     id: 9811254123107785953
  state: FAULTED
status: The pool metadata is corrupted.
 action: The pool cannot be imported due to damaged devices or data.
    The pool may be active on another system, but can be imported using
    the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-72
 config:

    basic                                   FAULTED  corrupted data
      0a4b9dbb-323a-4954-b7f2-14eaead24ef1  ONLINE
root@truenas[~]# zpool status -v basic
cannot open 'basic': no such pool
root@truenas[~]# zpool import -f basic
cannot import 'basic': insufficient replicas
    Destroy and re-create the pool from
    a backup source.
root@truenas[~]# zpool import basic
cannot import 'basic': insufficient replicas
    Destroy and re-create the pool from
    a backup source.
root@truenas[~]# zpool status
  pool: boot-pool
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
    The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
    the pool may no longer be accessible by software that does not support
    the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:00:43 with 0 errors on Mon Jan  1 03:45:45 2024
config:

    NAME        STATE     READ WRITE CKSUM
    boot-pool   ONLINE       0     0     0
      sda3      ONLINE       0     0     0

errors: No known data errors

  pool: fast
 state: ONLINE
  scan: scrub repaired 0B in 00:01:07 with 0 errors on Sun Dec 24 00:01:08 2023
config:

    NAME                                    STATE     READ WRITE CKSUM
    fast                                    ONLINE       0     0     0
      555b047f-04f7-4643-afa9-028679b8dd84  ONLINE       0     0     0

errors: No known data errors

  pool: nvme
 state: ONLINE
  scan: scrub repaired 0B in 00:07:49 with 0 errors on Sun Dec 24 00:07:50 2023
config:

    NAME                                    STATE     READ WRITE CKSUM
    nvme                                    ONLINE       0     0     0
      c435c64d-d7d7-4930-b903-501f136827d4  ONLINE       0     0     0

errors: No known data errors

  pool: unimportant
 state: ONLINE
status: One or more devices has experienced an error resulting in data
    corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
    entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 11:24:09 with 0 errors on Sun Apr 30 11:24:10 2023
config:

    NAME                                    STATE     READ WRITE CKSUM
    unimportant                             ONLINE       0     0     0
      3fe7226e-62bf-44b0-a5ee-7321999c0691  ONLINE       0     0    12

errors: 1 data errors, use '-v' for a list
root@truenas[~]# zpool status -м
invalid option '�'
usage:
    status [-c [script1,script2,...]] [-igLpPstvxD]  [-T d|u] [pool] ...
        [interval [count]]
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
You have a single-disk pool... because of this:
The NAS is running in a VMware-based virtual machine, and the RAID is hardware RAID.
If that "single disk" is corrupt, you have nowhere to recover from with ZFS.

If the data is important to you, you will need to use a ZFS recovery tool like Klennet. (although with your setup, even that will be difficult). Sorry to say that you will probably lose your data.

Next time, you should follow the guidance for a virtualized instance.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
How do I know if a disk is corrupted or not?

The zpool import gave you this result:

Code:
status: The pool metadata is corrupted.


You can attempt a zpool import -fFn basic (little "f" for "force import", big "F" being "Force Recovery", and little "n" for "no-op" or "don't actually do the import, just display if it's possible")

However, based on the initial description of

a couple of disks failed and the pool became inaccessible

If several disks underpinning a hardware RAID failed, recovery will be determined first by the ability of that hardware RAID to survive that failure. Assuming that the virtual RAID volume is viable, you may be able to "rewind" the ZFS transactions, but may lose some data from this.
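
Because a dry run can finish without printing anything, it may help to look at the exit status as well as the output. A minimal sketch, assuming a POSIX shell; the `dry_run_import` helper and the `ZPOOL` override are hypothetical names introduced here for illustration and are not part of ZFS:

```shell
#!/bin/sh
# Hypothetical helper: dry-run a recovery import and show the exit status.
# ZPOOL can be overridden (for example with a stub) to try this safely.
ZPOOL="${ZPOOL:-zpool}"

dry_run_import() {
    pool="$1"
    # -f: force import, -F: recovery mode, -n: do not actually import
    out=$("$ZPOOL" import -fFn "$pool" 2>&1)
    rc=$?
    printf 'exit status: %s\n' "$rc"
    if [ -n "$out" ]; then
        printf '%s\n' "$out"
    fi
    return "$rc"
}

# Usage on the affected system (not run automatically):
#   dry_run_import basic
```

A zero status with no output only says the dry run did not report an error; it is not a guarantee that the real import will succeed.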
 

munghauzen

You can attempt a zpool import -fFn basic (little "f" for "force import", big "F" being "Force Recovery", and little "n" for "no-op" or "don't actually do the import, just display if it's possible")
I tried running this command. The system thinks for a couple of seconds and gives no response
Code:
root@truenas[~]# zpool import -fFn basic
root@truenas[~]#


If several disks underpinning a hardware RAID failed, recovery will be determined first by the ability of that hardware RAID to survive that failure. Assuming that the virtual RAID volume is viable, you may be able to "rewind" the ZFS transactions, but may lose some data from this.
I restarted the RAID with the same disks. Other data, and even a different pool on that disk, came up without problems. It remains unclear why this pool didn't start; possibly it is due to corruption.
 

HoneyBadger

I tried running this command. The system thinks for a couple of seconds and gives no response

No news is good news - it didn't throw a panic or an error at you. Try removing the n and sending in just zpool import -fF basic - follow that with zpool status -v if it appears to import.
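
The two steps can be chained so that the status is only printed when the import returns success. A sketch, assuming a POSIX shell; `import_and_check` and the `ZPOOL` override are hypothetical names used here for illustration:

```shell
#!/bin/sh
# Hypothetical helper: forced recovery import, then status if it worked.
ZPOOL="${ZPOOL:-zpool}"

import_and_check() {
    pool="$1"
    # Attempt the forced recovery import; stop here if it fails.
    if ! "$ZPOOL" import -fF "$pool"; then
        echo "import of $pool failed" >&2
        return 1
    fi
    # Import returned success: show per-device state and any damaged files.
    "$ZPOOL" status -v "$pool"
}

# Usage on the affected system (not run automatically):
#   import_and_check basic
```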
 

munghauzen

According to the console, the pool is damaged, not the disk. How connected are the pool and the disk? Can I try mounting the disk in a new pool with the same name on a clean NAS without losing data?
 

HoneyBadger

According to the console, the pool is damaged, not the disk. How connected are the pool and the disk? Can I try mounting the disk in a new pool with the same name on a clean NAS without losing data?

The pool is built on top of the disk - you'll have the same results on a new install of TrueNAS.
 

munghauzen

No news is good news - it didn't throw a panic or an error at you. Try removing the n and sending in just zpool import -fF basic - follow that with zpool status -v if it appears to import.
Looks like bad news :(

Code:
root@truenas[~]# zpool import -fF basic
cannot import 'basic': insufficient replicas
    Destroy and re-create the pool from
    a backup source.


The pool is built on top of the disk - you'll have the same results on a new install of TrueNAS.
It's also very sad
 

HoneyBadger

Looks like bad news :(

Code:
root@truenas[~]# zpool import -fF basic
cannot import 'basic': insufficient replicas
    Destroy and re-create the pool from
    a backup source.



It's also very sad

Try the following combination:

Code:
echo 0 >> /sys/module/zfs/parameters/spa_load_verify_data
echo 0 >> /sys/module/zfs/parameters/spa_load_verify_metadata
zpool import -fFX basic


The first two lines disable verification of data and metadata. Yes, this is normally a really bad thing to do, but when rewinding pools to earlier transactions, the verification can take "hours to days" on large pools. If this succeeds, check the status output.
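
Since these module parameters keep their new value until they are changed back or the module is reloaded, one option is to record the old values and restore them after the attempt. A sketch under assumptions: `with_verify_disabled` and the `PARAM_DIR` override are hypothetical names introduced here; the parameter files are the ones used above.

```shell
#!/bin/sh
# Sketch: disable load-time verification only for the duration of one command.
PARAM_DIR="${PARAM_DIR:-/sys/module/zfs/parameters}"

with_verify_disabled() {
    # Remember the current settings so they can be restored afterwards.
    old_data=$(cat "$PARAM_DIR/spa_load_verify_data")
    old_meta=$(cat "$PARAM_DIR/spa_load_verify_metadata")

    echo 0 > "$PARAM_DIR/spa_load_verify_data"
    echo 0 > "$PARAM_DIR/spa_load_verify_metadata"

    # Run the command passed as arguments, e.g. zpool import -fFX basic
    "$@"
    rc=$?

    # Restore the previous values regardless of the outcome.
    echo "$old_data" > "$PARAM_DIR/spa_load_verify_data"
    echo "$old_meta" > "$PARAM_DIR/spa_load_verify_metadata"
    return "$rc"
}

# Usage on the affected system (as root, not run automatically):
#   with_verify_disabled zpool import -fFX basic
```

The wrapper returns the exit status of the wrapped command, so it can be followed by a `zpool status -v` check in the usual way.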
 

munghauzen

Try the following combination:

Code:
echo 0 >> /sys/module/zfs/parameters/spa_load_verify_data
echo 0 >> /sys/module/zfs/parameters/spa_load_verify_metadata
zpool import -fFX basic


The first two lines disable verification of data and metadata. Yes, this is normally a really bad thing to do, but when rewinding pools to earlier transactions, the verification can take "hours to days" on large pools. If this succeeds, check the status output.
It didn't work
Code:
root@truenas[~]# echo 0 >> /sys/module/zfs/parameters/spa_load_verify_data
root@truenas[~]# echo 0 >> /sys/module/zfs/parameters/spa_load_verify_metadata
root@truenas[~]# zpool import -fFX basic
cannot import 'basic': one or more devices is currently unavailable

The other import options didn't help either, even though the devices are available:
Screenshot 2024-01-09 at 20.12.39.png

Screenshot 2024-01-09 at 20.25.01.png
 