PANIC at zil.c:423:zil_parse()

A.LeMinh

Cadet
Joined
Feb 22, 2021
Messages
4
Hello,
I'm facing this problem while importing a pool that has unfortunately lost all of its LOG disks.

This is the output:

Code:

---------------------------------------------------------------------------------------------------------------
truenas# zpool import Primary-Pool
cannot import 'Primary-Pool': pool was previously in use from another system.
Last accessed by lvmtruenas.localdomain (hostid=361b7376) at Sun Feb 21 13:53:04 2021
The pool can be imported, use 'zpool import -f' to import the pool.
---------------------------------------------------------------------------------------------------------------
truenas# zpool import Primary-Pool -f
The devices below are missing or corrupted, use '-m' to import the pool anyway:
            mirror-2 [log]
              6170d7e7-5025-11eb-8786-23dd6f7a6bf1
              617847e3-5025-11eb-8786-23dd6f7a6bf1
              307d3791-4562-492c-890e-538e470bb9e6

cannot import 'Primary-Pool': one or more devices is currently unavailable
---------------------------------------------------------------------------------------------------------------
truenas# zpool import Primary-Pool -f -m
2021 Feb 22 11:04:23 truenas VERIFY(!claimed || !(zh->zh_flags & ZIL_CLAIM_LR_SEQ_VALID) || (max_blk_seq == claim_blk_seq && max_lr_seq == claim_lr_seq) || (decrypt && error == EIO)) failed
2021 Feb 22 11:04:23 truenas PANIC at zil.c:423:zil_parse()



Is it possible to save the content of the pool?

[Screenshot attached: Error_Console_Stack.JPG]

many many thanks
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,691
I'd suggest you describe the history of how you got to this state; it may help someone work out how to get out of it. Where was the pool created?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
Actually, I'm going to do @morganL one better.

Please take a few moments to review the Forum Rules, conveniently linked at the top of every page in red. Included within is a description of how to post a good request for help: your hardware, your software, and other important items such as pool configuration, what you are using it for, what has happened, etc.

These bits of information serve as a backdrop to your crisis and help people understand what is going on. They may all seem incredibly obvious to you, but they are not obvious to anyone else. It is better to provide a bit too much information than too little; if you provide too little, many people will skip over your message, finding it frustrating to speculate about what might have happened to you. If you provide too much, you have still given us what we need, and forum members have a good history of digging into the gory details to find the key clues. We cannot do that if you do not provide them.

Please take a few moments to post details about your system, what's transpired, what you are doing with the pool, and related stuff. Help us help you!

Thanks.
 

A.LeMinh

Cadet
Joined
Feb 22, 2021
Messages
4
Hello and thanks,

here is my hardware & software:

I'm actually using a VM spawned in an oVirt environment.

HW:
Motherboard: ASRock H470M-ITX/AC
RAM: 32GB (2x16GB) at 2666MHz
Storage:
- 2 mechanical disks: Seagate IronWolf NAS 3.5" (4TB each)
- 2 solid state disks: Samsung 860 Evo (500GB each) (software RAID 10, aggregated for Storage_domain A)
- 2 Transcend M.2 NVMe (120GB each) (one for the hypervisor OS and one for Storage_domain B)
NET: 1 link 1Gbit
CPU: Intel Core i5-10400

VM:
The VM was:
RAM: 8GB
vCPU: 4
Storage:
- 2 mechanical disks presented as direct LUNs
- 2 vdisks of 20GB each as LOG (from Storage_domain A)
- 1 vdisk of 20GB as CACHE (from Storage_domain A)

The pool:
I use the pool to store home documents and pictures; if I remember correctly, it has 6 datasets.

What happened:
1) I was migrating all the disks of Storage_domain A to Storage_domain B through the oVirt migration tool, and unfortunately the 2 LOG disks went "broken".
2) The VM was not able to start again (because of the "broken" vdisks), so I detached them and started the VM.
3) I detached one of the mechanical disks to use the space for another recovery.
4) I imported the pool with the "-f -m" options, but I was overconfident and:
- I offlined/detached/removed one of the mechanical disks successfully
- I attached 1 new disk from Storage_domain A to the ZIL
- I tried to offline/detach/remove the two unavailable original disks, without success
- I thought that the environment was stable and planned a clean reboot
- After the reboot, the import operation unfortunately failed as shown above
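In zpool terms, the steps above were roughly the following (reconstructed from memory; the device names are placeholders, not the real GPT ids):

Code:

zpool import -f -m Primary-Pool

# offline/detach one side of the mechanical mirror (this worked)
zpool offline Primary-Pool <mechanical-disk-1>
zpool detach Primary-Pool <mechanical-disk-1>

# add a new disk from Storage_domain A as a log device (this worked)
zpool add Primary-Pool log <new-log-disk>

# try to remove the two unavailable original log disks (this failed)
zpool remove Primary-Pool <old-log-disk-1> <old-log-disk-2>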

Thank you very much; this was my very first post :)
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,691
Great write-up, but a very complex situation. So each of the original mechanical disks should be one side of a single VDEV mirror?
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
I'm actually using a VM spawned in an oVirt environment.
Lol, virtual disks. That's never going to play nice with ZFS. You're probably dead in the water.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,691
I was looking here: https://www.freebsd.org/cgi/man.cgi?zpool(8)

You found the zpool import -m option, which I thought might work given the loss of the SLOG... however, that wasn't enough.

The -F option might handle the case where the SLOG held data that is now missing. I've never used it myself... just interested in the problem. Perhaps others with more experience can comment.
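If you do try it, a cautious sequence might look something like this (untested on my side; -n combined with -F does a dry run of the recovery rewind without actually importing, and a read-only import lets you copy data off without writing anything to the pool):

Code:

# dry run: check whether a recovery import would succeed
zpool import -f -m -F -n Primary-Pool

# if that looks promising, import read-only first and rescue the data
zpool import -f -m -F -o readonly=on Primary-Pool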
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
I agree with you, that's why I'm creating all the vdevs of the new pool using direct LUNs.
If it's not using the physical disks, you are always going to have issues. LUNs will not solve your problem.
 

tlvenn

Cadet
Joined
Mar 6, 2021
Messages
2
Hi,

I am actually facing the same error when I am trying to import my pool.

Hardware:
Motherboard: AsRock Rack C3758D4I-4L
RAM: 64GB RDIMM
Boot disk: SATA DOM
Storage:
2 x WD60EFRX (6TB)
1 x ST6000VN001 (6TB)
Pool:
Simple raidz1 with the 3 disks above

Context:
I am trying to migrate from Ubuntu 20.10 to the latest TrueNAS SCALE. The pool is healthy and has been scrubbed with no errors on the Ubuntu side (running ZFS 0.8.4). Before starting the installation of the new system, I did a zpool export of my pool named "data".
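For completeness, the steps on the Ubuntu side were roughly these (from memory):

Code:

zpool scrub data
zpool status data     # waited for the scrub to complete: 0 errors
zpool export data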

Some more context, if it matters given that the stack trace below mentions a log block: I used to have an SSD SLOG device on that pool, which I removed some time ago when the zpool remove command became available.
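That removal was just the standard log-device removal, something along these lines (the device name here is a placeholder, I don't recall the exact id):

Code:

zpool remove data <old-slog-device>
zpool status data     # the log vdev was no longer listed afterwards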

The zpool import command shows this:

[Screenshot: output of zpool import]


When I try to import it, it crashes:

[Screenshot: panic during zpool import]


I can import the same pool with Ubuntu 20.10 without any issue, pretty much instantly.

Any idea what the issue might be?

Thanks a lot in advance.
 


tlvenn

Cadet
Joined
Mar 6, 2021
Messages
2
Yeah, but ZFS jumped from 0.8.6 to 2.0.0, and it would be very weird if the 2.x versions were incapable of importing a pool from the 0.8.x line. Looking at the ZFS changelog, I actually don't see anything that would suggest that is the case.

And if it were actually not supported, given the amount of metadata on the pool itself, the zpool import command would simply refuse to import that pool. Instead, it says nothing about it other than that the pool does not use the latest feature set and can be upgraded.

So I believe the issue is most probably elsewhere.
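For what it's worth, one way to sanity-check the feature-flag theory would be to compare the features enabled on the pool with what the target system's ZFS supports; roughly something like this (pool name from my setup):

Code:

# on the Ubuntu side: list the feature flags enabled/active on the pool
zpool get all data | grep feature@

# on the TrueNAS SCALE side: list the feature flags this ZFS build supports
zpool upgrade -v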
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,691
Yeah, but ZFS jumped from 0.8.6 to 2.0.0, and it would be very weird if the 2.x versions were incapable of importing a pool from the 0.8.x line. Looking at the ZFS changelog, I actually don't see anything that would suggest that is the case.

And if it were actually not supported, given the amount of metadata on the pool itself, the zpool import command would simply refuse to import that pool. Instead, it says nothing about it other than that the pool does not use the latest feature set and can be upgraded.

So I believe the issue is most probably elsewhere.
You could be correct... my observation is that we have not done any testing for this case. There is a lot of testing of importing pools from FreeNAS/TrueNAS CORE.
 