Crash under busy load - ZFS ARC

TDaxGav
Cadet
Joined: Nov 4, 2021
Messages: 1
Running a TrueNAS SCALE setup.

Once a day or so, when doing a heavy disk transfer, the system hangs. It took a while to get to the syslog, but I eventually traced it to something within the ZFS subsystem.

Code:
Nov  4 21:34:54 discovery kernel: CPU: 1 PID: 361 Comm: l2arc_feed Tainted: P           OE     5.10.70+truenas #1
Nov  4 21:34:54 discovery kernel: Hardware name: System manufacturer System Product Name/P8Z77-V LX, BIOS 2403 03/14/2014
Nov  4 21:34:54 discovery kernel: Call Trace:
Nov  4 21:34:54 discovery kernel:  dump_stack+0x6b/0x83
Nov  4 21:34:54 discovery kernel:  spl_panic+0xd4/0xfc [spl]
Nov  4 21:34:54 discovery kernel:  ? abd_verify+0x61/0x230 [zfs]
Nov  4 21:34:54 discovery kernel:  ? abd_to_buf+0x1a/0x50 [zfs]
Nov  4 21:34:54 discovery kernel:  ? zio_do_crypt_abd+0xb1/0x150 [zfs]
Nov  4 21:34:54 discovery kernel:  l2arc_apply_transforms+0x6ee/0x730 [zfs]
Nov  4 21:34:54 discovery kernel:  l2arc_write_buffers+0x3cd/0xb40 [zfs]
Nov  4 21:34:54 discovery kernel:  l2arc_feed_thread+0x119/0x3d0 [zfs]
Nov  4 21:34:54 discovery kernel:  ? l2arc_remove_vdev+0x2a0/0x2a0 [zfs]
Nov  4 21:34:54 discovery kernel:  thread_generic_wrapper+0x78/0xb0 [spl]
Nov  4 21:34:54 discovery kernel:  ? IS_ERR+0x10/0x10 [spl]
Nov  4 21:34:54 discovery kernel:  kthread+0x11b/0x140
Nov  4 21:34:54 discovery kernel:  ? __kthread_bind_mask+0x60/0x60
Nov  4 21:34:54 discovery kernel:  ret_from_fork+0x22/0x30
Nov  4 21:46:04 discovery syslog-ng[9708]: syslog-ng starting up; version='3.28..
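
The trace passes through zio_do_crypt_abd on its way into l2arc_apply_transforms, so one thing I plan to check is whether any encrypted datasets are in play when the L2ARC is being written. Rough sketch of what I'll run (the pool name "tank" is just a placeholder for mine):

Code:
# which datasets are encrypted, if any
zfs get -r -t filesystem encryption tank

# confirm the SSD is attached as a cache (L2ARC) device
zpool status -v tank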


Config of the system

TrueNAS-SCALE-22.02-RC.1-1
CPU - Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz
Memory - 16GB
13 x 4TB Seagate IronWolf (data)
1 x 1TB SSD (cache)

I'm still digging into why it throws a panic; I suspect it is when the L2ARC is being transferred from SSD to spinning rust.

I'm not sure why it seems to happen more and more now that there is data on the disks.
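
To catch it in the act, I'm thinking of watching the L2ARC write counters and per-device I/O while a big transfer runs, something along these lines (again, "tank" stands in for my pool name):

Code:
# L2ARC counters from the kernel stats (l2_size, l2_write_bytes, l2_io_error, ...)
grep '^l2_' /proc/spl/kstat/zfs/arcstats

# per-vdev I/O every 5 seconds, including the cache device
zpool iostat -v tank 5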

Thanks
 

sretalla

Powered by Neutrality
Moderator
Joined: Jan 1, 2016
Messages: 9,703
I suspect it is when the L2ARC is being transferred from SSD to spinning rust.
Transfers shouldn't be going in that direction...

It goes from HDD to ARC (RAM); then, if the ARC is full, the oldest/least-used data is evicted from RAM to L2ARC (SSD).
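
If you want to see that relationship on your box, arc_summary (it ships with ZFS on SCALE) breaks out ARC and L2ARC stats; the exact section headings vary a bit between versions, but something like this will show them:

Code:
# full ARC report (RAM side)
arc_summary | less

# just the L2ARC lines (size, hit ratio, writes)
arc_summary | grep -i l2arc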

If you're running those 13 disks in a single VDEV, 13 wide, that's above the recommended maximum of 12 per VDEV (but it should not be the cause of any panics).
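
You can confirm how the 13 disks are laid out with zpool status; count how many disks sit under each group (the pool name below is just an example, use yours):

Code:
zpool status tank
# each indented group under the pool name is one VDEV;
# 13 data disks under a single raidzN line means one 13-wide VDEV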
 