PANIC at zil.c:423:zil_parse()

A.LeMinh · Feb 22, 2021

Hello,
I'm facing this problem while importing a pool that has unfortunately lost all LOG disks.

this is the output:

Code:


---------------------------------------------------------------------------------------------------------------
truenas# zpool import Primary-Pool
cannot import 'Primary-Pool': pool was previously in use from another system.
Last accessed by lvmtruenas.localdomain (hostid=361b7376) at Sun Feb 21 13:53:04 2021
The pool can be imported, use 'zpool import -f' to import the pool.
---------------------------------------------------------------------------------------------------------------
truenas# zpool import Primary-Pool -f
The devices below are missing or corrupted, use '-m' to import the pool anyway:
            mirror-2 [log]
              6170d7e7-5025-11eb-8786-23dd6f7a6bf1
              617847e3-5025-11eb-8786-23dd6f7a6bf1
              307d3791-4562-492c-890e-538e470bb9e6

cannot import 'Primary-Pool': one or more devices is currently unavailable
---------------------------------------------------------------------------------------------------------------
truenas# zpool import Primary-Pool -f -m
2021 Feb 22 11:04:23 truenas VERIFY(!claimed || !(zh->zh_flags & ZIL_CLAIM_LR_SEQ_VALID) || (max_blk_seq == claim_blk_seq && max_lr_seq == claim_lr_seq) || (decrypt && error == EIO)) failed
2021 Feb 22 11:04:23 truenas PANIC at zil.c:423:zil_parse()

is it possible to save the content of the Pool ?

many many thanks

morganL · Feb 22, 2021

i'd suggest you describe the history of how you got to this state.... it may help someone work out how to get out of the state. Where was the pool created?

jgreco · Feb 22, 2021

Actually, I'm going to do @morganL one better.

Please take a few moments to review the Forum Rules, conveniently linked at the top of every page in red. Included within is a description of how to post a good request for help, which includes a description of your hardware, your software, other important items such as pool configuration, what you are using it for, what's happened, etc.

These bits of information serve as a backdrop to your crisis, and help people understand what is going on. Those bits of information may all seem incredibly obvious to you, but it is not incredibly obvious to anyone else. It is better to provide a bit too much information than too little; if you provide too little information, many people will skip over your message, finding it frustrating to speculate about what might have happened to you. When you provide too much information, you've still given us the stuff we need, and forum members have a good history of digging into the gory details to find those key clues. We cannot do that if you do not provide them.

Please take a few moments to post details about your system, what's transpired, what you are doing with the pool, and related stuff. Help us help you!

Thanks.

A.LeMinh · Feb 23, 2021

Hello and thanks,

here is my hardware&SW:

I'm actually using a VM spawned in an OVIRT environment

HW:
Motherboard: ASRock H470M-ITX/AC
RAM: 32GB (16x2) working at 2666MHz
Storage:
- 2 Mechanical Disks Seagate Technology - IronWolf NAS da 3,5 (4TB each)
- 2 Solid State Disks Samsung Evo 860 (500GB each) (RAID 10 SW aggregated for the Storage_domain A)
- 2 Transcend M.2 NVMe (120GB each) (1 for Hypervisor OS and one for the Storage_domain B )
NET: 1 link 1Gbit
CPU: Intel Core i5-10400

VM:
the Vm was:
RAM: 8GB
Vcpu: 4
Storage:
- 2 Mechanical Disks presented as direct Lun
- 2 Vdisks 20GB each as LOG (from the Storage_domain A)
- 1 Vdisks 20GB each as CACHE (from the Storage_domain A)

the Pool:
I use the Pool to store home documents and pictures it has, if I can remember well, 6 dataset

What Happened:
1) I was Migrating all the disks of the storage_domain A to the storage_domain B through the Ovirt Migration Tool and unfortunately 2 Disks (the LOG ones) went "Broken"
2) The Vm was not able to start again (because of the "Broken" vdisks) so I detached them and start the vm
3) I detached one of the mechanical to use the space for another recover
3) I mounted the Pool with the "-f -m" option but I was overconfident and
- I put offline/detach/remove with success one of the mechanical disks
- I attached 1 new disk from the storage_domain A to the ZIL
- I tried to put offline/detach/remove the unavailable original 2 disks without success
- I thought that the environment was stable and I planned I clean reboot
- After the reboot unfortunately the import operation failed as shown

thank you very much it was my very first post :)

morganL · Feb 24, 2021

Great write-up, but a very complex situation. So each of the original mechanical disks should be one side of a single VDEV mirror?

SweetAndLow · Feb 24, 2021

A.LeMinh said:
I'm actually using a VM spawned in an OVIRT environment

Lol virtual disks. That's never going to play nice with zfs. You're probably dead in the water

morganL · Feb 25, 2021

I was looking here: https://www.freebsd.org/cgi/man.cgi?zpool(8)

You found the zfs import -m option which I thought might work given the loss of the SLOG... however, that wasn't enough.

The -F option might handle the case where the SLOG had data that was missing. I've never used it myself... just interested in the problem. Perhaps others with more experience can comment.

A.LeMinh · Mar 2, 2021

SweetAndLow said:
Lol virtual disks. That's never going to play nice with zfs. You're probably dead in the water

I agree with you, that's why i'm creating all vdevs of the new pool using direct luns..

SweetAndLow · Mar 2, 2021

A.LeMinh said:
I agree with you, that's why i'm creating all vdevs of the new pool using direct luns..

If it's not using the physical disk you are always going to have issues. Lun will not solve your problem

A.LeMinh · Mar 6, 2021

SweetAndLow said:
If it's not using the physical disk you are always going to have issues. Lun will not solve your problem

DirectLuns i'ts a way to call the PD by the VM, in this way the PD are direct attached :)

tlvenn · Mar 6, 2021

Hi,

I am actually facing the same error when I am trying to import my pool.

Hardware:
Motherboard: AsRock Rack C3758D4I-4L
Ram: 64Gb RDIMM
Boot dist: Satadom
Storage:
2 x WD60EFRX (6TB)
1 x ST6000VN001 (6TB)
Pool:
Simple raidz1 with the 3 disks above

Context:
I am trying to migrate from Ubuntu 20.10 to the latest Truenas Scale. The pool is healthy, has been scrubbed with no error on Ubuntu side (Running zfs 8.0.4). Before starting the installation of the new system, i did a zfs export of my pool named "data".

Some more context it that matters given the stacktrace below mentions log block, I used to have a ssd slog device on that pool that i removed sometime ago when zpool remove command became avail.

The zpool import command shows this:

When i try to import it, it crashes:

I can import the same pool with Ubuntu 20.10 without any issue, pretty much instantly.

Any idea what the issue might be ?

Thanks a lot in advance.

morganL · Mar 6, 2021

Ubuntu 20.10 did not use the OpenZFS 2.0 .. It used

zfs-linux 0.8.4-1ubuntu11.1 (main)

There's been no testing for compatibility.

tlvenn · Mar 7, 2021

Ya but zfs jumped from 0.8.6 to 2.0.0 and it would be very weird if the 2.x version is incapable of adopting a pool from the 0.8.x line. Looking at the changelog of ZFS, I actually dont see anything that would suggest it is the case.

And if it was actually not supported, given the amount of metadata on the pool itself, the zfs import command would simply refuse to import that pool but actually says nothing about it but that it does not run the latest feature set and that the pool can actually be upgraded.

So I believe the issue is most probably elsewhere.

morganL · Mar 7, 2021

tlvenn said:
Ya but zfs jumped from 0.8.6 to 2.0.0 and it would be very weird if the 2.x version is incapable of adopting a pool from the 0.8.x line. Looking at the changelog of ZFS, I actually dont see anything that would suggest it is the case.

And if it was actually not supported, given the amount of metadata on the pool itself, the zfs import command would simply refuse to import that pool but actually says nothing about it but that it does not run the latest feature set and that the pool can actually be upgraded.

So I believe the issue is most probably elsewhere.

You could be correct... my observation/information is that there is no testing that we have done for this case. There is a lot of testing of important pools from FreeNAS/TrueNAS CORE.

Important Announcement for the TrueNAS Community.

PANIC at zil.c:423:zil_parse()

A.LeMinh

Cadet

morganL

Captain Morgan

jgreco

Resident Grinch

A.LeMinh

Cadet

morganL

Captain Morgan

SweetAndLow

Sweet'NASty

morganL

Captain Morgan

A.LeMinh

Cadet

SweetAndLow

Sweet'NASty

A.LeMinh

Cadet

tlvenn

Cadet

Attachments

morganL

Captain Morgan

tlvenn

Cadet

morganL

Captain Morgan

Similar threads

Important Announcement for the TrueNAS Community.

PANIC at zil.c:423:zil_parse()

Cadet

Captain Morgan

Resident Grinch

Cadet

Captain Morgan

Sweet'NASty

Captain Morgan

Cadet

Sweet'NASty

Cadet

Cadet

Attachments

Captain Morgan

Cadet

Captain Morgan

Important Announcement for the TrueNAS Community.

Related topics on forums.truenas.com for thread: "PANIC at zil.c:423:zil_parse()"

Similar threads