Disconnecting broken drives from pool not able to restore

Joined
May 27, 2021
Messages
1
Hi,

My TrueNAS system went into an infinite reboot loop, with the boot log showing the following errors:
Screenshot 2021-05-29 194835.png


With this error I suspected that one of my drives was failing, so I went ahead and started unplugging drives until the system was able to boot.


My pool setup is RAID-Z2, with 4 drives as storage and 2 as spares. I eventually found that I had to unplug 2 of the storage drives in the pool in order to boot into TrueNAS, but now TrueNAS says that the pool is disconnected. Here is the output of zpool import:
Code:
freenas# zpool import
   pool: Media_Vault
     id: 8772846821260645687
  state: FAULTED
status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
        devices and try again.
        The pool may be active on another system, but can be imported using
        the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-3C
 config:

        Media_Vault                                     FAULTED  corrupted data
          raidz2-0                                      FAULTED  corrupted data
            12576065168032760338                        UNAVAIL  cannot open
            gptid/42925300-05e4-11eb-914e-7824af4787b1  ONLINE
            gptid/428974f0-05e4-11eb-914e-7824af4787b1  ONLINE
            gptid/429e6126-05e4-11eb-914e-7824af4787b1  UNAVAIL  cannot open
        spares
          gptid/37144456-0613-11eb-914e-7824af4787b1
          gptid/37201b28-0613-11eb-914e-7824af4787b1


From my understanding and this post, RAID-Z2 can tolerate up to 2 drive failures and I should still be able to bring the data back, so I tried to start the resilver process, but got an error indicating that my pool is not online:

Code:
freenas# zpool resilver Media_Vault
cannot open 'Media_Vault': no such pool
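
As a rough sanity check of the RAID-Z2 reasoning above, here is a minimal sketch that counts how many raidz2-0 members the zpool import listing reports as UNAVAIL and compares that with the two-device parity of RAID-Z2. The helper name count_unavail is hypothetical (it is not a ZFS command), and the sample lines are copied from the listing above. It only counts missing devices and says nothing about whether the remaining data is intact.

```shell
#!/bin/sh
# Count devices reported as UNAVAIL in saved `zpool import` output.
# count_unavail is a hypothetical helper, not part of ZFS.
count_unavail() {
    # Read the listing on stdin; count lines whose second field is UNAVAIL.
    awk '$2 == "UNAVAIL" { n++ } END { print n + 0 }'
}

# raidz2-0 member lines copied from the zpool import output above.
missing=$(count_unavail <<'EOF'
            12576065168032760338                        UNAVAIL  cannot open
            gptid/42925300-05e4-11eb-914e-7824af4787b1  ONLINE
            gptid/428974f0-05e4-11eb-914e-7824af4787b1  ONLINE
            gptid/429e6126-05e4-11eb-914e-7824af4787b1  UNAVAIL  cannot open
EOF
)

parity=2   # RAID-Z2 tolerates up to two missing members per vdev
if [ "$missing" -le "$parity" ]; then
    echo "missing=$missing: within RAID-Z2 parity"
else
    echo "missing=$missing: exceeds RAID-Z2 parity"
fi
```

With the listing above this counts two missing members, which is within what RAID-Z2 should tolerate on paper, so the FAULTED state is not explained by the missing drives alone.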


In order to proceed with the resilver, I tried to import the pool in the following ways:
  1. zpool import
    Code:
    freenas# zpool import Media_Vault
    internal error: cannot import 'Media_Vault': Integrity check failed
    Abort trap (core dumped)
  2. zpool import -F
    Code:
    freenas# zpool import -F Media_Vault
    cannot import 'Media_Vault': one or more devices is currently unavailable
  3. zpool import -f
    Code:
    freenas# zpool import -f Media_Vault
    internal error: cannot import 'Media_Vault': Integrity check failed
    Abort trap (core dumped)

Now it seems I have gotten myself into a deadlock: if I reconnect all my drives, TrueNAS can't boot correctly, and if I disconnect the failing drives, the resilver can't happen. Am I missing something or doing anything wrong here?

Here is some more information about my system:
  • OS: TrueNAS-12.0-U3.1
  • HW:
    • CPU: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
    • Ram: Kingston HyperX 32 GB CL15 DIMM DDR4 2400
    • Drives: A mix of WD Red and Seagate BarraCuda 2TB drives, 6 in total

Many thanks in advance!
 

mav@

iXsystems
Joined
Sep 29, 2011
Messages
1,428
Analyzing the debug provided, I found 5 completely different kernel panics there on April 20. Whatever happened to your system was not a couple of lost disks, which indeed would not be a problem for RAIDZ2, but probably some more serious memory and data corruption.
 