SubnetMask · Contributor · Joined Jul 27, 2017 · Messages: 129
First, let me say that no data was lost - this was a test instance and all pools were empty.
Yesterday, I decided to give TrueNAS another go and installed TrueNAS 13.0-U1.1 onto a PowerEdge R510 that had 18 disks attached, some internal and some in an external Supermicro JBOD enclosure. That went as well as it could, though it still gave me the same VMFS6 issues related to extent block size. After that, I decided to move the setup from the R510 to a PowerEdge R620 and move the 3TB SAS disks internal to the R510 into another Supermicro enclosure. Both enclosures have dual-expander SAS2 backplanes connected to dual HBAs on the host, etc. I didn't re-use the install from the R510. Instead, I made a backup of the config, installed a fresh copy onto the pair of SAS disks internal to the R620, and once that was complete, restored the config I had backed up and made the required NIC adjustments, since the NIC setup is a little different than it was on the R510.
But what I noticed on boot after restoring the config was that a bunch of disks were giving these messages:
GEOM: daXX: the secondary GPT table is corrupt or invalid.
GEOM: daXX: using the primary only -- recovery is suggested.
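From what I've read, gpart should be able to rebuild the backup GPT from the primary copy, so unless someone sees a reason not to, my plan for each affected disk is something like this (da5 is just a placeholder for whichever disk is complaining):

gpart show da5      # confirm the primary table still looks sane first
gpart recover da5   # rewrite the secondary GPT at the end of the disk

But before I do that, I'd like to understand why this happened at all.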
Looking in the UI, the pool is offline, and all of the disks are listed under 'Disks', but with no pool association. I did not enable encryption on any of these pools, although when I backed up the config, I did select the option to export the secret seed with the backup.
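From the shell, I figure I can at least check whether ZFS still sees the pool labels, independent of what the UI shows ('tank' is just a placeholder pool name, and I haven't forced anything yet):

zpool import           # list pools that are visible but not yet imported
zpool import -f tank   # only if the pool shows up healthy in the listing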
I've done pretty much the exact same thing before with TrueNAS installations. One box in particular would randomly reboot on newer versions, but 11.1-U7 was rock solid, so I had to do a clean install to get it back to stable operation. I've never seen this before, though, and I've never failed to either restore a config or import the pools manually and be back in business. This corruption of the GPT table is very concerning.
Any ideas what caused this, and how to prevent it? After thinking about it for a bit, I wondered whether it was related to the installer prompt, which I answered yes to, about creating a 16GB swap partition. But I don't really think that's it: the installer didn't ask me that when I installed to the R510 on a 16GB USB stick, and all of the disks were attached during that install as well. And if that were the cause, I'd expect it to have hit all of the disks, not just one pool's worth.
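For what it's worth, to test the swap theory I figured I could compare the partition layout of a complaining disk against one from a pool that came up clean (again, da5/da6 are placeholders for my actual devices):

gpart show da5   # disk throwing the GPT warning
gpart show da6   # disk from a pool that imported fine

If the installer had really rewritten anything for swap, I'd expect the freebsd-swap partition sizes or offsets to differ between the two, and they don't appear to.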