boot-pool uncorrectable I/O failure keeps occurring

joree11 (Cadet; joined Oct 22, 2022; 1 message)
I realize that this topic has been covered a few times, but after having experienced this across 3 separate machines in the last few months I figured I should reach out and see if there is something I'm doing that is causing it.

First off, all three machines have completely different hardware and cables, with different boot drives, and two are on a separate UPS from the other one.

1st Failure (July)
MacMini 2012. The boot-pool was created on a new Crucial CT950 SSD (1TB). I partitioned the SSD during install so that I could make use of the rest of the 1TB drive. Within a week the device was unresponsive, and after plugging in a monitor I saw the "boot-pool uncorrectable I/O failure" message. I popped the drive out, put it into another machine, and ran the SMART tests (short and long): no issues.

2nd Failure (July)
Different MacMini 2012, same SSD, but this time I didn't partition it. I installed a separate 2nd HDD alongside it for storage and apps. Again the boot-pool failed within a few days.

3rd Failure (Aug)
I abandoned the MacMinis and the Crucial SSD (I installed Ubuntu on the last one, which is still running fine). Instead I tried a new/unused (but purchased a while back) Samsung 850 Pro (512GB) in a DNK-H with Intel H110T and G4400T. I partitioned it again. It failed again, but only after about 3 weeks.

4th Failure (Oct)
Custom server with Asus P105-M W5 and G4600T with 8 HDDs. The boot-pool was installed directly onto a WD Black 250GB M.2 NVMe drive. It lasted 2 months idle before failure. I rebooted and it seemed fine again, and I was able to back up the configs. It is still running now, but I'm not confident in it.


I do have another server that has been up and running fine for the last 4 months or so. It's an 8-bay device running a Biostar Hi-Fi B85N-3D with an i7-4790T. The boot-pool is installed directly on a new Kingston SSDNow V300 III (120GB). It has been relatively active compared to all the other devices (which mainly sat idle).

I know the above is not really that useful without logs to debug with but I suppose my hope with this thread is just to figure out if something I'm doing should be avoided or if I've just been really unlucky?
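For next time, the kind of check I could run to capture the pool state before a failure might look something like the following. This is only a minimal Python sketch: it assumes the `zpool` binary is on the PATH and that `zpool status` prints its usual `pool:`, `state:` and `errors:` lines. The function names are hypothetical, and the "healthy" rule is my own assumption rather than anything TrueNAS defines.

```python
import subprocess


def parse_zpool_status(text):
    """Pull the pool, state and errors fields out of `zpool status` text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key in ("pool", "state", "errors"):
            fields[key] = value.strip()
    return fields


def pool_looks_healthy(fields):
    # Assumption: healthy means state ONLINE and no known data errors.
    return (fields.get("state") == "ONLINE"
            and fields.get("errors") == "No known data errors")


def check_pool(name="boot-pool"):
    # Runs the real `zpool status <pool>` command, so ZFS must be installed.
    result = subprocess.run(["zpool", "status", name],
                            capture_output=True, text=True, check=False)
    return parse_zpool_status(result.stdout)
```

Calling `pool_looks_healthy(check_pool())` from a cron job and logging the result to a different pool would at least leave a record of when the boot-pool state changed.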

Honestly, I would have preferred to run off a USB stick (I have ESXi servers that do and have been fine for ~6 years), but everything I read about TrueNAS suggested it was much more advisable to go with SSDs.

These are my main questions:
  1. In every case I've set the "System Dataset Pool" as the boot-pool. Is this generally ok?
  2. Despite my having had issues both ways, if you have an SSD is it generally ok to partition it and use a portion as the boot-pool?
  3. Would adding a USB stick as a mirror to the boot-pool be worth it? What would happen when one fails? (Most of my devices do not have spare SATA ports I can use for this.)
  4. What about altering the boot-pool to store 2 copies? Would this offer any self-healing and prevent downtime?
  5. Is there anything I can do to reduce boot-pool reads/writes so that I could instead use 2x USB sticks in a mirror?
I guess there are two annoyances here, (A) the boot-pool is corrupting and (B) the server freezes when it happens. If I can resolve (B) at least, then I might be able to take action to back stuff up and fix things. But obviously solving (A) would be ideal.

My experience so far is that everything feels very fragile compared to what I'm used to with ESXi or Ubuntu. I suspect it's just because I'm using a single drive for the boot-pool and ZFS is not very forgiving in that configuration. Any suggestions would be really appreciated.

Cheers,
Jo
 