Kernel Panic when trying to destroy dataset

sammael

Explorer
Joined
May 15, 2017
Messages
76
Hi,
I have two TrueNAS SCALE systems, with "truenas1" replicating two datasets to "truenas2". This has worked without any issue for months and survived all the upgrades. Both are currently on 22.12.3.2.

Yesterday I woke up to "truenas2" being in a reboot loop. Long story short, I narrowed it down to, of all things, one replication task: as it tried to reconnect, it was crashing "truenas2". Since there is inexplicably no convenient way to stop a running replication task, I did what most of the threads I could find suggested and killed the zfs send / zfs receive processes on both machines. This stopped the crashing, but whenever I tried to manually start the replication it would crash "truenas2" again. Since the data is also backed up elsewhere, I decided to delete the snapshot task and replication task on "truenas1" and the dataset on "truenas2", and re-replicate it.
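For anyone who ends up here in the same situation, what I ran was along these lines (a sketch; the exact process names may differ between versions, so check the ps output before killing anything):

```shell
# List any running zfs send/receive processes, excluding this shell and awk
PIDS=$(ps ax -o pid= -o args= | awk -v self="$$" \
    '/zfs (send|recv|receive)/ && !/awk/ && $1 != self {print $1}')

# Polite SIGTERM first
for pid in $PIDS; do kill "$pid" 2>/dev/null; done
sleep 2

# SIGKILL only for anything still alive after the grace period
for pid in $PIDS; do
    if kill -0 "$pid" 2>/dev/null; then kill -9 "$pid"; fi
done
```

Note that a zfs destroy or receive stuck inside the kernel (state D, like the trace below) will ignore even SIGKILL; this only helps with processes that are still responsive.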

Here a host of issues began. Trying to delete the dataset produces a kernel panic, the zfs destroy process becomes stuck and unkillable, and the machine is unable to reboot or shut down (I left it sitting for an hour after issuing the shutdown command). Everything else works, but the panic keeps repeating roughly every 5 minutes:

Code:
[10392.560613] INFO: task txg_sync:3396 blocked for more than 1208 seconds.
[10392.560629]       Tainted: P           OE     5.15.107+truenas #1
[10392.560637] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[10392.560642] task:txg_sync        state:D stack:    0 pid: 3396 ppid:     2 flags:0x00004000
[10392.560652] Call Trace:
[10392.560656]  <TASK>
[10392.560662]  __schedule+0x2f0/0x950
[10392.560674]  schedule+0x5b/0xd0
[10392.560681]  vcmn_err.cold+0x66/0x68 [spl]
[10392.560702]  ? spl_kmem_cache_alloc+0x36/0x100 [spl]
[10392.560718]  ? bt_grow_leaf+0xdc/0xe0 [zfs]
[10392.560841]  ? pn_free+0x30/0x30 [zfs]
[10392.561000]  ? zfs_btree_find_in_buf+0x59/0xb0 [zfs]
[10392.561118]  zfs_panic_recover+0x6d/0x90 [zfs]
[10392.561296]  range_tree_add_impl+0x168/0x570 [zfs]
[10392.561454]  ? mutex_lock+0xe/0x30
[10392.561461]  ? __raw_spin_unlock+0x5/0x10 [zfs]
[10392.561628]  ? list_head+0x9/0x30 [zfs]
[10392.561748]  metaslab_free_concrete+0x115/0x250 [zfs]
[10392.561902]  metaslab_free_impl+0xad/0xe0 [zfs]
[10392.562055]  metaslab_free+0x168/0x190 [zfs]
[10392.562212]  zio_free_sync+0xde/0xf0 [zfs]
[10392.562398]  dsl_scan_free_block_cb+0x66/0x1b0 [zfs]
[10392.562546]  bpobj_iterate_blkptrs+0x102/0x320 [zfs]
[10392.562664]  ? dsl_scan_free_block_cb+0x1b0/0x1b0 [zfs]
[10392.562810]  bpobj_iterate_impl+0x243/0x3a0 [zfs]
[10392.562928]  ? dsl_scan_free_block_cb+0x1b0/0x1b0 [zfs]
[10392.563074]  dsl_process_async_destroys+0x2cf/0x570 [zfs]
[10392.563221]  dsl_scan_sync+0x1dd/0x8e0 [zfs]
[10392.563369]  ? kfree+0x1fc/0x250
[10392.563375]  spa_sync_iterate_to_convergence+0x11f/0x1e0 [zfs]
[10392.563537]  spa_sync+0x2e9/0x5d0 [zfs]
[10392.563696]  txg_sync_thread+0x229/0x2a0 [zfs]
[10392.563866]  ? txg_dispatch_callbacks+0xf0/0xf0 [zfs]
[10392.564034]  thread_generic_wrapper+0x59/0x70 [spl]
[10392.564052]  ? __thread_exit+0x20/0x20 [spl]
[10392.564067]  kthread+0x127/0x150
[10392.564074]  ? set_kthread_struct+0x50/0x50
[10392.564078]  ret_from_fork+0x22/0x30
[10392.564087]  </TASK>


I tried: multiple reboots (no change), disabling all apps and sharing services (none of which access the pool in question), scrubbing (no errors), crying (no effect), and deleting the files from inside the dataset to at least recover the space (no can do: read-only filesystem).

I have also discovered that, despite deleting the periodic snapshot task, the replication task, and all snapshots of the dataset in question from "truenas1" (and despite there NEVER having been a periodic snapshot task on "truenas2"), snapshots of the dataset keep appearing on "truenas2" between reboots. Trying to delete these produces a kernel panic as well; after a reboot they seem to be gone, but then a new one appears.

As the other replicated dataset is 26TB and I only have a 1G network, I'd rather find a solution other than destroying the pool altogether (which I think should "fix" it?).

Any suggestion welcome!
 

samarium

Contributor
Joined
Apr 8, 2023
Messages
192
I agree trashing the pool will likely fix the problem, which is probably some sort of pool corruption, but keep that as a last resort.

A panic is usually the last-gasp message just before the kernel crashes. What you are seeing is a kernel stack backtrace from a blocked kernel task: the transaction group sync, which normally happens every 5 seconds.

I expect you have checked around these messages and that there aren't any other kernel messages near where this is happening, maybe indicating some other hardware or kernel software issue?

I also expect you have double-checked tn1 for replication tasks pushing to tn2. Hopefully you have also checked, just for completeness, whether there is a replication task on tn2 pulling data from tn1, and that there are no snapshot tasks on tn2 which might generate snapshots.

Seems like you need to stop the new snapshots appearing first; you don't need extra snapshots complicating things. You say there is a disabled/deleted replication task on tn1 that would send snaps to tn2, but you are still getting snaps appearing? Are the snaps normally replicated over ssh? Can you disable sshd on tn2? Can you disable the authentication key on tn2 that tn1 uses to send snaps? Does this stop the snaps appearing? If so, then it seems you have more work to track down the snaps on tn1.
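If it helps, one way to tell whether snapshots are still appearing on tn2 is to list them sorted by creation time and watch for new names between checks (the dataset name below is just an example):

```shell
# Example dataset name; substitute the real replication target on tn2
DATASET="backup/media"

if command -v zfs >/dev/null 2>&1; then
    # -H: no header, -s creation: oldest first, -r: include children;
    # rerun this after a few minutes and diff the output
    zfs list -H -t snapshot -o name,creation -s creation -r "$DATASET"
else
    echo "zfs not found; run this on the TrueNAS box itself"
fi
```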
 

sammael

Explorer
Joined
May 15, 2017
Messages
76
There weren't any other kernel messages apart from the hung task. I tried disabling the ssh service and removing the key, to no avail. There are no, nor have there ever been, tasks pushing from tn2 to tn1; it's all one way, from tn1 to tn2. tn2 is just a raidz2 target for backup, and it runs TrueNAS because I wanted to leverage ZFS replication, plus I'm used to it and run ~15 apps on each box on their own mirrored M.2 SSD pools.

I've seen some threads about replication tasks causing crashes/panics, on Reddit and here as well, and I have to confess that as a hobbyist homelab user, setting up a replication task was the most hostile, user-unfriendly thing I've experienced in TrueNAS yet. Not to mention how the auth keys seem to go missing every other upgrade, with replication failing and needing to be set up from scratch despite no data having changed on the source dataset. For comparison, my rsync tasks are literally "set and forget" and have been syncing data for years, even back from when "truenas1" was Core, not SCALE.

In the end I just destroyed the pool and recreated it with one dataset, for the sole use as the target of the replication task from "truenas1" (all the others had already been converted to rsync tasks; I just didn't want to deal with moving 26TB of data around, but that's moot now). Should that fail in the future, I'll just convert that one to rsync as well and forget ZFS replication exists.
 

samarium

Contributor
Joined
Apr 8, 2023
Messages
192
Problem solved it seems.

I agree replication setup can be convoluted: you have to do the SSH key setup, then the SSH connection setup, then the replication task setup, in that order.

I think ZFS replication is a better solution in general, but rsync works more easily for you, and it is your system.
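For what it's worth, a single push replication cycle boils down to roughly the following at the CLI (dataset and host names are examples; the GUI layers incremental streams, retention, and resume handling on top of this):

```shell
# Example names; on a real system the middleware manages these for you
SNAP="tank/media@manual-2023-07-19"

if command -v zfs >/dev/null 2>&1; then
    # -w sends a raw stream (source-side encryption survives intact);
    # -u receives the dataset without mounting it on the target
    zfs snapshot "$SNAP" &&
        zfs send -w "$SNAP" | ssh tn2 zfs receive -u backup/media || true
fi
```

Incremental runs would add `-i previous-snapshot` to the send; the point is just that the moving parts are snapshot, send, a transport, and receive.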
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Trying to delete the dataset produces kernel panic, the zfs destroy process becomes stuck and unkillable, and the machine is unable to reboot or shut down (I left it sitting for an hour after issuing the shutdown command.) Everything else works, but the panic keeps repeating about every 5 minutes-ish

Please do fill out a bug report (see "Report a Bug" at the top of the page) and post the issue number here if possible. It's important that ZFS avoid bringing the system down; pool corruption may be an exception, but even there it is desirable for the system to be as resilient as possible.
 

sammael

Explorer
Joined
May 15, 2017
Messages
76
@samarium, indeed, and as it is just a backup of my movies, not some critical production data, I just went with rsync, since I had already rsync'd 5TB into tn2 before I tried to re-replicate. But I forgot to encrypt the pool, and the source is encrypted, and I just don't have it in me to redo it again. I mean, it's supposed to be a backup, yet I had to wipe it, so that's no good. Rsync all the way for me now :)

@jgreco https://ixsystems.atlassian.net/browse/NAS-122883
 

ikarlo

Dabbler
Joined
Apr 21, 2021
Messages
18
Hi,
I have the same problem with the exact same setup (both systems, source and destination, are on TrueNAS SCALE 22.12.3.2):
ZFS panics on the target system while trying to destroy an old snapshot, whether manually or automatically through retention on the replication jobs.

After another reboot, TrueNAS tries to mount the pool without success, with the same kernel messages as Sammael's.

This is a big problem, please please find a fix as soon as possible.

Thanks
Carlo
 

samarium

Contributor
Joined
Apr 8, 2023
Messages
192
Hi,
I have the same problem with the exact same setup (both systems, source and destination, are on TrueNAS SCALE 22.12.3.2):
ZFS panics on the target system while trying to destroy an old snapshot, whether manually or automatically through retention on the replication jobs.

After another reboot, TrueNAS tries to mount the pool without success, with the same kernel messages as Sammael's.

This is a big problem, please please find a fix as soon as possible.

Thanks
Carlo
You should comment on the Jira ticket if you have the same issue.
 

TempleHasFallen

Dabbler
Joined
Jan 27, 2022
Messages
34
I'm also facing the same issue when pushing a replication from TN1 (22.12.3.2) to TN2 (Core 13.0-U5.2).
However, at the time of my initial push, my TN1 box also resets.

Is the jira ticket private? I cannot see the contents
 

ikarlo

Dabbler
Joined
Apr 21, 2021
Messages
18
I'm also facing the same issue when pushing a replication from TN1 (22.12.3.2) to TN2 (Core 13.0-U5.2).
However, at the time of my initial push, my TN1 box also resets.

Is the jira ticket private? I cannot see the contents

I don't know; since yesterday I can't read the ticket anymore.
 

sammael

Explorer
Joined
May 15, 2017
Messages
76
I can still read the ticket (it'd be weird if I couldn't, as I was the one who opened it) and there's nothing new, apart from someone being assigned to it.

I'm also surprised it doesn't get more coverage or a faster resolution; to me it seems like quite a serious issue. Indeed, in all my years of using TrueNAS since Core 9, this is the only issue where I "lost" data (I had to destroy the whole pool on the target TrueNAS; quotes because what I lost was a backup of the data, so in reality I didn't lose anything, but it's still troublesome).

I've pretty much lost all faith in ZFS replication (an absolutely unbelievable, massive pain to set up compared to rsync), and even if they "fix" it I'm sticking with rsync, which has never failed me like this.

edit: the Jira ticket lists the priority as Low, so /shrug
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I'm also surprised it doesn't get more coverage or a faster resolution; to me it seems like quite a serious issue. Indeed, in all my years of using TrueNAS since Core 9, this is the only issue where I "lost" data (I had to destroy the whole pool on the target TrueNAS; quotes because what I lost was a backup of the data, so in reality I didn't lose anything, but it's still troublesome).

These things usually aren't simple problems, and rushing to find a solution without actually fully understanding what is going on is rarely advisable.

I've pretty much lost all faith in ZFS replication (an absolutely unbelievable, massive pain to set up compared to rsync), and even if they "fix" it I'm sticking with rsync, which has never failed me like this.

Replication is awesome for some specific use cases, as it is more easily able to take advantage of things like only copying modified blocks; this means it can handle block storage, snapshots, and other similarly tricky ZFS features. rsync can't do a lot of that, but on the other hand, rsync is a whole lot less fragile. If you're looking for file-based copies/backups, and can cope with things like snapshots on your own, then rsync definitely has a lot of upsides.

I find it easier to design file storage around rsync's relatively minor quirks. The place it can hurt you is with large files: rsync can't easily recognize that a file was moved, and may want to recopy the entire file.
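The rename problem is easy to demonstrate locally; this is just an illustrative sketch with throwaway directories, not a TrueNAS task:

```shell
# Two throwaway trees to stand in for source and destination
SRC=$(mktemp -d)
DST=$(mktemp -d)
printf 'x%.0s' $(seq 1 1000) > "$SRC/big.bin"

if command -v rsync >/dev/null 2>&1; then
    rsync -a "$SRC/" "$DST/"                 # first pass copies big.bin
    mv "$SRC/big.bin" "$SRC/renamed.bin"     # same content, new name
    # second pass deletes the old copy and re-sends the file in full,
    # because rsync matches files by path, not by content
    rsync -a --delete "$SRC/" "$DST/"
fi
```

With a 26TB media library of large, rarely renamed files this rarely bites; with datasets that get reorganized often, it can mean recopying a lot of data.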
 

ikarlo

Dabbler
Joined
Apr 21, 2021
Messages
18
Until the fix is released, what is the best way to restore replication without disabling encryption on the source dataset?

I'm thinking about disabling "Include dataset properties" in the replication settings.
That way, I believe, an unencrypted stream arrives at the destination, avoiding the panic.
What do you think?
 

TempleHasFallen

Dabbler
Joined
Jan 27, 2022
Messages
34
Until the fix is released, what is the best way to restore replication without disabling encryption on the source dataset?

I'm thinking about disabling "Include dataset properties" in the replication settings.
That way, I believe, an unencrypted stream arrives at the destination, avoiding the panic.
What do you think?
This is what I am currently doing:

- Disabled "Include dataset properties"
- Enabled "Encryption" and set a key or passphrase
- Keep in mind the dataset has to be unlocked on the remote system after every boot unless the key is saved.

I had to destroy my target pool completely to get rid of the broken datasets.
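At the CLI level, my understanding is that this is roughly equivalent to the following (names are examples; a plain, non-raw stream received under a dataset that is itself encrypted gets re-encrypted with the target's own key):

```shell
# Example names; 'backup/encrypted' would be created on tn2 with its own key
SNAP="tank/media@auto-2023-07-19"
TARGET="backup/encrypted/media"

if command -v zfs >/dev/null 2>&1; then
    # No -p and no -w: source properties (including encryption settings)
    # are omitted, and the plaintext stream inherits encryption from the
    # target's encrypted parent dataset
    zfs send "$SNAP" | ssh tn2 zfs receive -u "$TARGET" || true
fi
```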
 

TempleHasFallen

Dabbler
Joined
Jan 27, 2022
Messages
34
Do you mean that the pool on the remote system can remain encrypted even if it contains unencrypted replicated datasets?
No, I'm referring to setting encryption in your replication job as such:
[screenshot: the Encryption options in the replication task settings]


Basically, it will encrypt the data with the set passphrase/key before it is written to the target disk.
On the first replication, the dataset on the target will show as unlocked; afterwards, you will have to unlock the dataset with the key/passphrase after each boot in order for future replications to work.
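From the shell, unlocking after a reboot looks something like this (the GUI's "Unlock" dialog does the same; the dataset name is an example):

```shell
DATASET="backup/encrypted/media"   # example name

if command -v zfs >/dev/null 2>&1; then
    # load-key prompts for the passphrase unless keylocation points to a
    # key file; mount then makes the dataset usable again
    zfs load-key "$DATASET" && zfs mount "$DATASET" || true
fi
```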
 

ikarlo

Dabbler
Joined
Apr 21, 2021
Messages
18
Ok, so I can recreate the target pool encrypted, with its own key, as before?
 