I recently built a new TrueNAS Core machine, and I'm having trouble with it regularly crashing and rebooting. I first built it with a fresh install of TrueNAS-13.0-U3.1. Then I spent a week or so replicating all the datasets from my previous TrueNAS machine (currently running TrueNAS-12.0-U6.1). After the replication finished, I exported the config and the jails from the old machine, imported both on the new machine, and upgraded the jails. This is about when the crashes began.

At first I thought the jails were causing problems because of a version mismatch, which is one of the reasons I upgraded them, but the issues continued. I tried disabling different combinations of jails, and that seemed to affect how often the crashes happened, but earlier today it crashed with no jails running. At the time I was stress testing it by copying a few large files over SMB, running WinDirStat over the network (measuring disk space usage by subtree), and running a scrub. It crashed within 10 minutes or so of starting those, so I'm thinking network load may have something to do with it. On the other hand, it didn't have this problem during the week of dataset replication.

I have run MemTest86 through a few rounds and it didn't find any problems. I haven't yet tried reverting to the state from before importing the config, though that's the next thing I can think of to try.
I've attached all the crash dumps I've been saving. Except for #4, they are all page faults. Most (but not all) of them have ether_nh_input in the stack, and "epair_task" and "swi5: fast taskq" also seem to show up a lot, but I'm naive when it comes to analyzing BSD core dumps and might be looking in the wrong places.
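For anyone who wants to check how often those names actually appear across the dumps, something like the script below could tally them. It's only a rough sketch: it assumes each attachment is a gzip-compressed tar of plain-text files, it simply counts lines that mention each name rather than parsing the backtraces, and the function name `count_symbols` is made up for illustration.

```python
import io
import sys
import tarfile

# Names taken from the post; edit this tuple to look for others.
SYMBOLS = ("ether_nh_input", "epair_task", "swi5: fast taskq")


def count_symbols(archive_bytes, symbols=SYMBOLS):
    """Count lines mentioning each symbol across every file in one archive."""
    counts = {s: 0 for s in symbols}
    # "r:*" lets tarfile detect the compression (gzip here) on its own.
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            text = tar.extractfile(member).read().decode("utf-8", errors="replace")
            for line in text.splitlines():
                for s in symbols:
                    if s in line:
                        counts[s] += 1
    return counts


if __name__ == "__main__":
    # Usage: python tally.py textdump.tar.*.gz
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            print(path, count_symbols(fh.read()))
```

A line count like this only shows which names recur; it says nothing about which frame actually faulted, so the backtrace in each dump still needs to be read by hand.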
Hardware
Motherboard: ASRock B550 PG Velocita
CPU: AMD Ryzen 5 5600G
RAM: 32GB
Hard Drives (storage pool): 6 x WD Red Pro 16TB in RAIDZ1
Hard Drive (boot): Teamgroup MP33 512GB
Attachments
textdump.tar.0.gz (60 KB)
textdump.tar.1.gz (69.3 KB)
textdump.tar.2.gz (60.2 KB)
textdump.tar.3.gz (60 KB)
textdump.tar.4.gz (59.7 KB)
textdump.tar.5.gz (60.6 KB)
textdump.tar.7.gz (64.5 KB)
textdump.tar.8.gz (64.6 KB)
textdump.tar.9.gz (59.2 KB)
textdump.tar.10.gz (58.3 KB)