Unable to boot with data pool drives attached.

StephenW

Cadet
Joined
Feb 9, 2022
Messages
5
I have a repurposed PC that I am running TrueNAS Core 12 on as a media server.
Mobo - Pegatron IPMMB-FM with latest BIOS (2014)
CPU - Core i7 3770
RAM - 4 x 4 GB Mushkin Redline DDR3 non ECC
HBA - IBM M1015 9220-8i flashed with IT firmware
Boot pool - 2 x Kingston SSD 200GB connected to the Mobo SATA ports in a mirror.
Data pool - 6+1 x WD Red Plus CMR 4 TB HDD in two RAIDZ1 vdevs plus one hot spare all connected to the IBM HBA.

So I'll start out by saying I'm an idiot: I replaced a failing drive in one of my vdevs without following the proper procedure, because I was in a hurry that day and just plain forgot. When my box didn't reboot I knew immediately what I had done wrong, powered it off, and returned the failing drive to the machine. Feel free to point and giggle or send dunce-cap emojis.
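For the record, the procedure I skipped boils down to roughly this at the ZFS level (TrueNAS Core normally walks you through it from the pool's Status page in the web UI; the disk names below are placeholders, not my actual devices):

zpool status Gallifrey            # identify the failing member of the vdev
zpool offline Gallifrey da3       # take the failing disk offline before pulling it
# physically swap the disk, then point ZFS at the new one so it can resilver:
zpool replace Gallifrey da3 da8
zpool status Gallifrey            # watch the resilver until it completes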

With the original drive back in place it still wouldn't boot, so I plugged in a monitor and keyboard, and it gives me the same error every time.

good
not supported
ZFS can only boot from drive, mirror, raidz1, raidz2, raidz3
better
not supported
done
ZFS found the following pools: Gallifrey boot-pool
UFS found no partition
Consoles: EFI consoles

ZFS can only boot from drive, mirror, raidz1, raidz2, raidz3


Some of those lines appear only once; the others repeat several times. I have attached a photo of the screen with the errors below.

If I disconnect the cables from the IBM HBA the system boots. If I plug them back in it won't.

I have gone into the BIOS and removed every device except the SSDs from the boot order. I have also entered the setup menu and used the boot menu to boot from one or the other of the SSDs. Nothing helps: if the data pool drives are attached, the same errors come up.

It seems as if the IBM HBA is somehow able to override the boot order but I'm not sure how or why.

I was prepared to blow away my data pool and start over, as I have backups of everything that matters and can re-rip the video collection if need be, but I can't even get the system to boot while the data pool drives are connected.

Short of pulling every drive, putting it in my HDD dock, and formatting it, I'm not sure what else to try. I'm not even convinced that would work.

I would greatly appreciate any insights you may have.

Thank you for taking the time to read my post.
 

Attachments

  • Error message.jpg (211.2 KB)

StephenW

Cadet
Joined
Feb 9, 2022
Messages
5
So with nothing left to lose I tried a very sketchy solution. I unplugged the data pool drives from the HBA and booted, logged into the web interface, and then plugged the SATA cables back into the HBA. My drives appeared. I then had to delete the pool and add it back, and it reappeared intact. It is busy resilvering now. Once that has completed I will offline the defective drive and add the replacement. Hopefully once that is done I'll have a working NAS again.
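In case it helps anyone else, I assume the delete-and-add dance amounts to an export followed by an import at the ZFS level (the web UI's pool import is the supported route on TrueNAS; this is only a sketch):

zpool import                      # list pools found on the newly attached disks
zpool import Gallifrey            # import the data pool by name
zpool status Gallifrey            # confirm the vdevs and check the resilver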
 

StephenW

Cadet
Joined
Feb 9, 2022
Messages
5
So the pool is fully restored to 2 RaidZ1 vdevs and a hot spare. No data was lost. Next is to see what happens when I have to reboot.
 

AJCxZ0

Dabbler
Joined
Mar 11, 2020
Messages
13
So the pool is fully restored to 2 RaidZ1 vdevs and a hot spare. No data was lost. Next is to see what happens when I have to reboot.
That's great news. Thank you for sharing it.

While an update to my thread is long overdue, I was able to get my datasets and services back online with no data loss by booting with all the data drives removed, leaving only the OS drives, then inserting the data drives, importing the zpool, mounting the ZFS datasets and starting the services. Since I strongly suspect that the system remains unbootable (with uncertainty only due to a ZFS clean-up step in the process clearing an error), the effort to migrate all data off the system and rebuild it (this time with a Hybrid pool) was underway when I had to step away for a long while.
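Roughly, those steps look like this at the command line (the pool name "tank" and the /mnt mount point are placeholders for my setup, and on TrueNAS the middleware normally does all of this for you):

zpool import -R /mnt tank         # import with an alternate root so the datasets land under /mnt
zfs mount -a                      # mount all datasets in the imported pool
zpool status -x                   # quick health check: "all pools are healthy" or a fault summary
zpool clear tank                  # if the clean-up step was a zpool clear, this is what would have wiped the logged error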
 

StephenW

Cadet
Joined
Feb 9, 2022
Messages
5
So after running fine for 72 hours I performed an upgrade of the TrueNAS software, and it rebooted fine except that another 4 TB WD Red seems to have failed. It is no longer visible in the list of drives, but it was there prior to the reboot and seemed to be working fine. One of my two vdevs is degraded and currently resilvering to the spare drive. It seems a bit like progress; at least it reboots properly now. I have RMA'd the first drive to fail, and once I get the replacement back I will replace the one that seems to have just failed. I'll do it properly this time. If that works, then the only other thing to fix is that TrueNAS sends me alerts that the first drive to fail is not connected.
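Once the replacement arrives, the plan is the textbook version this time, which I understand looks roughly like this at the ZFS level (disk names are placeholders):

zpool status -v Gallifrey         # the degraded vdev should show the failed disk with the spare as INUSE
zpool replace Gallifrey da5 da9   # resilver the new disk in place of the failed one
# when that resilver finishes, the hot spare should return to AVAIL on its own;
# alternatively, detaching the failed disk promotes the spare to a permanent member:
zpool detach Gallifrey da5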
 