Pool seems to be causing reboot loop?

oguruma

Patron
Joined
Jan 2, 2016
Messages
226
TL;DR: TrueNAS goes into a reboot loop if I boot the box with 2 or more of the drives from my RAID-Z1 pool installed. It runs fine with 1 (or none) of the drives installed.

I have TrueNAS 12 installed on a Kingston SSD. Xeon 1225 with 32 GB of Kingston ECC RAM, in a Supermicro chassis with redundant PSUs. 3x WD Gold 4TB HDDs in RAID-Z1.

Worked fine for over a year. No recent hardware or software changes prior to this problem.

This morning, I couldn't connect to the NAS, and I noticed it was in a reboot loop. It looked like it was trying to boot into the OS, as I would see things like "Loading middleware, syncing disks" etc. on the console. No beeping of any kind.

I figured there might be something up with the boot drive, and I had been meaning to install the OS on a more suitable piece of hardware anyway. So, I re-installed on an Intel Enterprise SSD.

The system dataset was on the data pool.

After re-installing TrueNAS, the system booted up and I could access the UI. As soon as I imported the ZFS pool, though, the reboot loop started again!

I started to suspect the data pool was the problem. So, I removed the drives from the box. Sure enough, it boots up just fine, and I can access the UI. It survived multiple reboots without issue. I connected one of the drives from the data pool back. No problem, I can reboot the box with one of the drives connected.

So, if I boot the box up with two of the drives connected, though, it causes the reboot loop. It doesn't matter which two. If 2 of the 3 (or all 3) of the data drives are connected when I boot the box, it gets put into the reboot loop. If I boot the box up with one (or none) of the drives connected, it works fine.
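
One way to read the pattern above: a 3-disk RAID-Z1 vdev tolerates one missing member, so the pool can only be imported (degraded or healthy) when at least two drives are present. The sketch below just tabulates that against the observed behaviour. The drive labels are hypothetical, and the "observed" function only encodes what is described in this post. It is an illustration of the reasoning, not a diagnosis.

```python
from itertools import combinations

# Hypothetical labels for the three WD Gold data drives.
DRIVES = ("disk1", "disk2", "disk3")
PARITY = 1  # RAID-Z1 tolerates one missing member


def pool_importable(present, total=len(DRIVES), parity=PARITY):
    """A RAID-Z vdev is importable if at most `parity` members are missing."""
    return total - len(present) <= parity


def reboot_loop_observed(present):
    """Behaviour reported in this post: loop with two or more data drives attached."""
    return len(present) >= 2


# Walk every combination of attached drives and compare the two columns.
for n in range(len(DRIVES) + 1):
    for present in combinations(DRIVES, n):
        print(present, pool_importable(present), reboot_loop_observed(present))
```

The two columns agree for every combination, which would be consistent with the crash being tied to the pool becoming importable rather than to any one particular drive.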

Any idea what's going on here?
 

yottabit

Contributor
Joined
Apr 15, 2012
Messages
192
Try to capture the kernel panic message before it reboots.

Do you use the dedupe feature? Or have you recently cloned any datasets or zvols?

I ran into this due to a ZFS Livelist bug that has been patched, but I don't think the fix has been merged into TrueNAS yet. See my post for more info.
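
If the console output can be saved to a text file (for example from a serial console or a remote console session), a small script can pull out the panic lines before they scroll away. This is a minimal sketch: the marker strings are assumptions based on typical FreeBSD-style panic output, and `extract_panic` is a hypothetical helper name, not part of TrueNAS.

```python
import re

# Assumed markers for FreeBSD-style panic output; adjust to what the console shows.
PANIC_MARKERS = re.compile(r"(panic:|Fatal trap|KDB: stack backtrace)", re.IGNORECASE)


def extract_panic(log_text, context=15):
    """Return the first panic-looking line plus up to `context` following lines."""
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        # match() anchors at the start of the stripped line
        if PANIC_MARKERS.match(line.strip()):
            return lines[i:i + context + 1]
    return []  # nothing that looks like a panic was found


if __name__ == "__main__":
    import sys

    # Usage: python extract_panic.py console.log   (file name is hypothetical)
    with open(sys.argv[1], errors="replace") as fh:
        print("\n".join(extract_panic(fh.read())))
```

Posting the extracted lines, especially the `panic:` line and the backtrace that follows it, usually makes it much easier for others to tell which bug is being hit.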
 

oguruma

Patron
Joined
Jan 2, 2016
Messages
226
Nope, don't use dedupe and haven't cloned any datasets/zvols.
 