System dataset fails to mount on boot after upgrade to 22.12.2

dystrust

Cadet
Joined
Apr 24, 2023
Messages
2
This morning I upgraded both of my TrueNAS SCALE machines to 22.12.2. My primary machine upgraded without issue, but reporting is broken on my on-site backup server. I tried manually restarting services as suggested in another thread here, which revealed that /etc/collectd/collectd.conf did not exist. I copied the file from my working machine, which allowed the service to start, but reporting was still broken. Further investigation showed that my system dataset wasn't mounted. It mounts if I manually select it via System Settings > Advanced > Storage, but the change doesn't survive a reboot. At about the same time I noticed that swap is also failing to mount at boot, and I haven't been able to resolve that so far. Booting into the 22.12.1 boot environment does not resolve the issue either.

I see the following errors during boot, or via sudo journalctl -xe:

Failed to start Configure swap filesystem on boot pool.
Failed to start Import ZFS pools.

Hardware of affected system:
SCALE 22.12.2
Intel Xeon E3-1220L
SuperMicro X9SCM-F
8GB ECC
Boot: 2x Samsung MZ-7LN128HAHQ PM871B 128GB - Attached to MB SATA ports as /dev/sda & /dev/sdb, mirrored
Storage: 2x Samsung HD204UI 2TB - Attached to HBA as /dev/sde & /dev/sdf, mirrored
HBA: LSI 9207-8i

The boot pool drives each have a 16GB swap partition, but I don't believe they've been used; AFAIK the system swap partitions are on the storage drives. I also have a pair of 6TB WD Red Pros in the system that are currently unused.
 

dystrust

SOLVED: My backup server now successfully mounts the system dataset and swap partitions on boot.

My WD Red Pros had previously been part of an mdadm mirror, and I hadn't realized this could be a problem. The boot/import process was finding them first, and they failed to import because they had never been set up for use with TrueNAS.

The two clues that led to this discovery were:
  1. A failed job stating "mdadm: Cannot get exclusive access to /dev/md0"
  2. Running cat /proc/mdstat showed "md0 : active (auto-read-only)" with member disks /dev/sdc and /dev/sdd.
Since these device IDs were assigned to the aforementioned unused Red Pros, I decided to disconnect them and try again. The system booted and mounted the system dataset and swap on the first attempt. Just to be sure, I'll wipe those drives before reinstalling them in the system.
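
For anyone who wants to run the same check, here is a rough sketch of how leftover md arrays and their member disks could be listed from /proc/mdstat. The function name list_md_members is made up for illustration, and the parsing assumes the usual mdstat layout of "mdN : state level member[index] ...". The wipe commands at the bottom are left commented out: they are destructive, and the device names are only the ones from my system, so treat them as an assumption rather than a recipe.

```shell
#!/bin/sh
# list_md_members (hypothetical helper): print each md array followed by
# its member devices, one array per line. Reads /proc/mdstat by default,
# or any file in the same format passed as the first argument.
list_md_members() {
    awk '/^md/ {
        line = $1
        for (i = 2; i <= NF; i++) {
            # Member devices look like "sdc[0]" (optionally with a flag
            # such as "(F)" or "(S)" after the bracket).
            if ($i ~ /\[[0-9]+\]/) {
                dev = $i
                sub(/\[.*/, "", dev)   # strip "[index]" and anything after
                line = line " " dev
            }
        }
        print line
    }' "${1:-/proc/mdstat}"
}

# Possible cleanup once the members are confirmed to be unused disks.
# DESTRUCTIVE and untested here; device names are assumptions.
#   mdadm --stop /dev/md0
#   mdadm --zero-superblock /dev/sdc /dev/sdd
```

Calling list_md_members with no argument reads the live /proc/mdstat. If it prints an array whose members are disks you never added to a TrueNAS pool, that is the same situation I ran into.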
 