Unable to Import Pool After Reboot

hootyhoot (Cadet, joined Jan 9, 2024, 1 message)
I am running into an issue with a pool import after a restart.

Context: I have a 5x8TB raidz1 array (I know), and one of the drives failed. I replaced the drive, and the resilver froze at ~95%. I rebooted, since I had read that it is safe to do so during a resilver. On reboot, the start job for Import ZFS Pools produced a warning:

WARNING: Pool 'storage' has encountered an uncorrectable I/O failure and has been suspended.

I shut off the server, unplugged the drives, and booted normally. I was able to export the pool via the GUI, but when I attempt to import it, the import task hangs and the GUI becomes unresponsive. I get the same behavior when I run the import command via the CLI.

zpool status output:
root@truenas[/home/admin]# zpool status
pool: boot-pool
state: ONLINE
scan: scrub repaired 0B in 00:00:10 with 0 errors on Tue Jan 9 03:45:11 2024
config:

    NAME           STATE     READ WRITE CKSUM
    boot-pool      ONLINE       0     0     0
      mirror-0     ONLINE       0     0     0
        nvme0n1p3  ONLINE       0     0     0
        nvme1n1p3  ONLINE       0     0     0

errors: No known data errors

pool: ssd-storage
state: ONLINE
status: Some supported and requested features are not enabled on the pool.
The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
the pool may no longer be accessible by software that does not support
the features. See zpool-features(7) for details.
scan: scrub repaired 0B in 00:02:45 with 0 errors on Sun Dec 17 00:02:46 2023
config:

    NAME           STATE     READ WRITE CKSUM
    ssd-storage    ONLINE       0     0     0
      mirror-0     ONLINE       0     0     0
        nvme0n1p5  ONLINE       0     0     0
        nvme1n1p5  ONLINE       0     0     0

errors: No known data errors

zpool import output:
root@truenas[/home/admin]# zpool import
pool: storage
id: 7305793554303295809
state: ONLINE
status: One or more devices were being resilvered.
action: The pool can be imported using its name or numeric identifier.
config:

    storage                                   ONLINE
      raidz1-0                                ONLINE
        sdb                                   ONLINE
        sde                                   ONLINE
        63cf87f7-a3b8-468b-9c07-af2c82d425e2  ONLINE
        sdd                                   ONLINE
        42a25b1c-ca62-4050-9092-e160cd5e1e23  ONLINE

I am able to import with the -o readonly=on flag, but my data is not accessible. I noticed that only 16TB (the equivalent of two 8TB drives) shows as available when running df -h on the mount directory.
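For reference, here is a rough back-of-the-envelope sketch of the capacity I would expect from this layout. It assumes usable raidz space is simply (drives - parity) x drive size and ignores metadata, padding, and the TB/TiB difference, so the number is approximate. The function name raidz_usable_tb is only illustrative.

```shell
# Rough usable capacity of a single raidz vdev, in TB.
# Assumption: usable = (drives - parity) * size_per_drive; all overhead is ignored.
raidz_usable_tb() {
    drives=$1     # number of drives in the vdev
    size_tb=$2    # nominal size of each drive, in TB
    parity=$3     # 1 for raidz1, 2 for raidz2, 3 for raidz3
    echo $(( (drives - parity) * size_tb ))
}
```

Running raidz_usable_tb 5 8 1 prints 32, so the 16TB reported by df -h is roughly half of what I would expect from a healthy 5x8TB raidz1.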

A permanent error is reported when running zpool status -xv:
root@truenas[/home/admin]# zpool status -xv
pool: storage
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Mon Jan 8 23:17:29 2024
11.4T / 15.3T scanned, 11.4T / 15.3T issued
2.27T resilvered, 74.16% done, no estimated completion time
config:

    NAME                                      STATE     READ WRITE CKSUM
    storage                                   ONLINE       0     0     0
      raidz1-0                                ONLINE       0     0     0
        sdb                                   ONLINE       0     0     0
        sde                                   ONLINE       0     0     0
        63cf87f7-a3b8-468b-9c07-af2c82d425e2  ONLINE       0     0     0
        sdd                                   ONLINE       0     0     0
        42a25b1c-ca62-4050-9092-e160cd5e1e23  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

<metadata>:<0x10a0>

I cannot scrub because the pool only imports read-only.

I ran fdisk -l and noticed that only two of the five drives show a partition table:
Disk /dev/sda: 7.28 TiB, 8001563222016 bytes, 15628053168 sectors
Disk model: WDC WD80EMZZ-11B
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 6F205583-8712-49BD-AE94-AB6AA58601AE

Device Start End Sectors Size Type
/dev/sda1 40 15628053134 15628053095 7.3T Solaris /usr & Apple ZFS


Disk /dev/sdb: 7.28 TiB, 8001563222016 bytes, 15628053168 sectors
Disk model: WDC WD80EFAX-68L
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sdc: 7.28 TiB, 8001563222016 bytes, 15628053168 sectors
Disk model: WDC WD80EFAX-68L
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 52AEB779-DA35-4E13-B851-296232E4FA8C

Device Start End Sectors Size Type
/dev/sdc1 40 15628053134 15628053095 7.3T Solaris /usr & Apple ZFS


Disk /dev/sdd: 7.28 TiB, 8001563222016 bytes, 15628053168 sectors
Disk model: WDC WD80EFAX-68L
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/sde: 7.28 TiB, 8001563222016 bytes, 15628053168 sectors
Disk model: WDC WD80EFAX-68L
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
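
The same check can be scripted. This is a minimal sketch, assuming the fdisk -l layout shown above: one "Disk /dev/..." line per disk, followed by a "Disklabel type:" line only when a partition table is found. disks_without_label is a made-up helper name. It only reads text from stdin and does not touch the disks.

```shell
# Print every disk from `fdisk -l` output that has no "Disklabel type:" line.
# Assumption: input follows the fdisk -l layout pasted above.
disks_without_label() {
    awk '
        /^Disk \/dev\// {
            # A new disk section starts: report the previous disk if it had no label.
            if (disk != "" && !labeled) print disk
            disk = $2
            sub(/:$/, "", disk)   # strip the trailing colon from "/dev/sdX:"
            labeled = 0
        }
        /^Disklabel type:/ { labeled = 1 }
        END { if (disk != "" && !labeled) print disk }
    '
}
```

Usage would be fdisk -l | disks_without_label. Against the output above, it should list /dev/sdb, /dev/sdd, and /dev/sde.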

Next I ran gdisk /dev/sdb on one of the drives without a partition table:
root@truenas[/home/admin]# gdisk /dev/sdb
GPT fdisk (gdisk) version 1.0.9

Caution: invalid main GPT header, but valid backup; regenerating main header
from backup!

Warning: Invalid CRC on main header data; loaded backup partition table.
Warning! Main and backup partition tables differ! Use the 'c' and 'e' options
on the recovery & transformation menu to examine the two tables.

Warning! Main partition table CRC mismatch! Loaded backup partition table
instead of main partition table!

Warning! One or more CRCs don't match. You should repair the disk!
Main header: ERROR
Backup header: OK
Main partition table: ERROR
Backup partition table: OK

Partition table scan:
MBR: not present
BSD: not present
APM: not present
GPT: damaged

Found invalid MBR and corrupt GPT. What do you want to do? (Using the
GPT MAY permit recovery of GPT data.)
1 - Use current GPT
2 - Create blank GPT

Your answer:

All three of the drives without a partition table show the same issue.

So, how screwed am I? Is it possible to recover this?
 