Boot hangs with data pool installed

swildig

Cadet
Joined
Feb 27, 2024
Messages
2
Hi,

I've got a TrueNAS SCALE system that hard locked and is now unable to boot. It hangs during boot when starting "ix-zfs.service" or when running the "Import ZFS pools" job. Either way I get a call trace and an error saying a process was blocked for too long (normally middlewared), and the job's running timer stops counting up.

I've tried swapping the drives into a new chassis, a new HBA, etc., with no luck. It does, however, boot with the data drives removed, so I'm guessing there's a drive misbehaving in the data pool.

So my question is: how do I find that bad drive? I want to remove chunks of drives until it boots to find it, but what will happen if it does boot and TrueNAS sees the missing drives? I don't want to get into a state where it fails the pool and won't detect the drives when I add them back. The other option I was thinking of was booting with no drives and then slowly adding them, but again I'm wondering what that will do when the pool is detected with most of its drives missing.

Is there a maintenance mode or anything where it won't make any changes to the state of the pool whilst I test the drives?
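For reference, the closest thing I'm aware of in ZFS itself is a read-only import, which should avoid writing to the pool while the drives are being tested. Below is a minimal sketch of the command, assuming a pool named `tank` (a placeholder, not my real pool name) and an alternate root of `/mnt`. The helper function name is made up for illustration, and I haven't confirmed how the TrueNAS middleware reacts to a pool imported this way.

```python
def readonly_import_cmd(pool, altroot="/mnt"):
    """Build a zpool command that imports `pool` read-only without mounting datasets."""
    return [
        "zpool", "import",
        "-o", "readonly=on",  # import the pool read-only
        "-N",                 # do not mount any datasets
        "-R", altroot,        # temporary alternate root for this import
        pool,
    ]

# "tank" is a placeholder pool name (assumption).
print(" ".join(readonly_import_cmd("tank")))
# prints: zpool import -o readonly=on -N -R /mnt tank
```

The printed command would be run as root from a shell. Running `zpool import` with no arguments first lists the pools that are available for import without importing anything.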

Thanks in advance,
Sam
 

swildig

It's booting to recovery mode okay now. I can manually import the pool, and it shows as healthy and online. But a minute or so after importing the pool I get a call trace and an error that I can't read before the whole machine reboots.

I can't find any log files with the error. Any ideas how I can pin down what's causing the issue?
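One approach that might capture the trace is to stream kernel messages to a file on the boot device and force every line to disk as it arrives, so that whatever was logged before the reboot survives. This is only a sketch under assumptions: the function names and the log path are hypothetical, it relies on `dmesg --follow` being available, and a hard lockup may still stop the last lines from being written.

```python
import os
import subprocess

def persist_lines(lines, path):
    """Append each line to `path`, forcing it to disk before reading the next one."""
    with open(path, "a", encoding="utf-8") as out:
        for line in lines:
            out.write(line if line.endswith("\n") else line + "\n")
            out.flush()
            os.fsync(out.fileno())  # push the line to disk, not just the page cache

def follow_kernel_log():
    """Yield kernel messages as they appear, using `dmesg --follow`."""
    proc = subprocess.Popen(["dmesg", "--follow"], stdout=subprocess.PIPE, text=True)
    yield from proc.stdout
```

Calling `persist_lines(follow_kernel_log(), "/root/kernel-trace.log")` as root before importing the pool would keep the log on the boot disk rather than on the pool being tested. The path is just an example.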
 