Migrating to Scale breaks the boot pool

ctag

Contributor
Joined
Jun 16, 2017
Messages
188
I'm now encountering an unexpected rough edge of migrating from Core to Scale: the boot environment cannot be healed after losing a disk in the 2-disk mirror pool. At first I thought this was the more general issue of disk size discrepancies that I've complained about previously. Instead it turned out to be that same problem (not enough space on an identical disk to resilver), but created intentionally.


Here is my remaining, working boot disk:

Code:
root@bns-citadel:~# fdisk -l /dev/sdm
Disk /dev/sdm: 111.79 GiB, 120034123776 bytes, 234441648 sectors
Disk model: CT120BX500SSD1
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: F51A048E-C72D-11EB-9F7B-F04DA2301444

Device     Start       End   Sectors   Size Type
/dev/sdm1     40      1063      1024   512K BIOS boot
/dev/sdm2   1064 234441607 234440544 111.8G FreeBSD ZFS


And this is what TrueNAS attempted to do with the new disk that was swapped in to replace the failed one:

Code:
root@bns-citadel:~# fdisk -l /dev/sdn
Disk /dev/sdn: 111.79 GiB, 120034123776 bytes, 234441648 sectors
Disk model: CT120BX500SSD1
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: E8A71C3D-4A58-45FB-91EB-EB0FB52432BA

Device       Start       End   Sectors   Size Type
/dev/sdn1       40      2087      2048     1M BIOS boot
/dev/sdn2     2088   1050663   1048576   512M EFI System
/dev/sdn3  1050664 234441614 233390951 111.3G Solaris /usr & Apple ZFS



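So the system, post upgrade, cannot recover a boot mirror because it attempts to create a partition scheme that was not used previously. The new layout adds a 512M EFI System partition, which leaves the ZFS partition on the replacement disk (233390951 sectors) smaller than the one on the surviving disk (234440544 sectors).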
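
For anyone who wants to check the numbers, here is a minimal sketch in Python that works out the shortfall from the two fdisk listings above. It assumes ZFS refuses to resilver onto a partition smaller than the existing mirror member; the variable names are just illustrative, and the sector counts are copied from the fdisk output.

```python
# Sector size reported by fdisk for both disks.
SECTOR_BYTES = 512

# Sector counts copied from the fdisk listings.
existing_zfs_sectors = 234440544     # /dev/sdm2, Core-era layout
replacement_zfs_sectors = 233390951  # /dev/sdn3, layout Scale tried to create

# How much smaller the new ZFS partition is than the surviving one.
shortfall_sectors = existing_zfs_sectors - replacement_zfs_sectors
shortfall_mib = shortfall_sectors * SECTOR_BYTES / 2**20

# Assumption: the replacement must be at least as large as the existing member.
fits = replacement_zfs_sectors >= existing_zfs_sectors

print(f"shortfall: {shortfall_sectors} sectors ({shortfall_mib:.1f} MiB)")
print(f"replacement partition large enough: {fits}")
```

The shortfall is roughly the size of the new EFI System partition plus the larger BIOS boot partition, which is why an identical disk model no longer fits.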
 

Kris Moore

SVP of Engineering
Administrator
Moderator
iXsystems
Joined
Nov 12, 2015
Messages
1,190
That sounds like a bug that needs to be addressed so that mirroring / partitioning works properly in this case. Can you open a ticket for us please?

 

ctag

Contributor
Joined
Jun 16, 2017
Messages
188
I think I've shot myself in the foot here. Before I saw the suggestion to open a bug report (in which case I would have left the system as it was so I could help test any fix), I had planned to swap in a larger drive and let the resilver run to completion.

It never got that far: I shut down, swapped in the larger drive, and on power-up the system hung. After waiting for a while, I forced a reboot with the physical power button, and when that didn't work I swapped the smaller replacement drive back in. The system still didn't come back, so I finally plugged in a monitor. This GRUB error is persistent across all of the Scale boot options I tried: "error: symbol 'grub_register_command_lockdown' not found"

I'm not sure if this is related to me futzing with the boot drives, or the recent update to 22.02.3, but right now the system is off and won't boot up. I have a backup of the config to use if I need to reinstall, but I'd like to figure out what went wrong here too.
 

Attachments

  • IMG_20220817_101916.jpg

ctag

Contributor
Joined
Jun 16, 2017
Messages
188
I reinstalled the previously removed, failing drive and the system booted successfully from it. TrueNAS claims to be loaded and running from the good drive, though, and still shows the failing drive as removed.
 