Reboot produces GPT Table Corrupt or Invalid


cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Can you post your bare-metal hardware? I don't see it listed in this thread.

I will say you are the only person I've ever seen with this issue, and it has definitely piqued my interest.
 

Todd Marimon

Dabbler
Joined
Dec 31, 2013
Messages
19
Tyan S7012 dual socket LGA1366 Server Motherboard
1x Xeon L5520 2.26GHz Quad Core CPU
24GB (6x4GB) DDR3 ECC-Reg
Dual 720W PSU
LSI9211-IT (R14) HBA
Chenbro RM23212 2U 12-bay (3x4 port SAS backplanes, 8 ports connected via SAS cables to the LSI, the remaining 4 to be connected once I get another LSI card)
Currently playing with 2x Samsung HD753LJ 750GB SATA drives.
Ultimately, I want to put in 6x 3TB WD Reds for RAIDZ2 -- haven't purchased them yet.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Ok, are the hard drives connected directly to the HBA or are they connected through the Chenbro?

I mean, the 9211 is very well supported on FreeBSD. I'm really confused, so I'm trying to figure out whether you did something wrong, are using hardware that isn't reliable, or something else of that sort.

While I'm not a fan of Tyan, I don't expect that to be the problem (at least, not because of the brand itself).

Just out of curiosity, have you tried booting a Linux live CD (or any OS except FreeNAS/FreeBSD), writing a partition table to the disks, and then rebooting to see what happens?
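For instance, from an Ubuntu live session a quick test could look roughly like this (sdb/sdc are placeholders; confirm the device names with lsblk first):

Code:
# WARNING: destructive -- run only against the two test drives
sudo parted --script /dev/sdb mklabel gpt mkpart test 1MiB 100%
sudo parted --script /dev/sdc mklabel gpt mkpart test 1MiB 100%
sudo reboot
# after the reboot, check whether the tables are still intact:
sudo parted --script /dev/sdb print
sudo parted --script /dev/sdc print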
 

Todd Marimon

Dabbler
Joined
Dec 31, 2013
Messages
19
They are connected via the Chenbro. It isn't a "SAS expander"... so I'm fairly sure it's equivalent to attaching the drives with a breakout (spider) cable. Maybe I'm wrong about that. The backplanes aren't detected as devices in the chain, anyway. If the backplanes were the problem -- poor communication to the drives, etc. -- I'd expect flaky, intermittent behavior, not something this consistently reproducible.

I have not yet tried another OS. I was going to try Ubuntu next. Honestly, I suspect that will work fine. I feel like this is something FreeBSD is doing (but I'm not a BSD expert).

Maybe I just have bad BSD karma, and this is certainly unrelated... but I recall having issues when I first installed pfSense -- and I still hit the same type of problem when I install it these days. I find that before installing pfSense, you HAVE to zero out the drives, because whatever is already on them somehow trips up the partitioning tools. So, let's just say I always have issues with drives under FreeBSD-based OSes. I've never had such issues under Linux, though.
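For what it's worth, a full zero pass shouldn't be needed just to clear old partition data; wiping the first and last MB or so (where the primary and backup GPT live) usually does it. A rough sketch on FreeBSD, assuming the disk is ada0 with 512-byte sectors:

Code:
# destructive: clears both GPT copies on ada0 (device name is just an example)
SECTORS=$(diskinfo /dev/ada0 | awk '{print $4}')
dd if=/dev/zero of=/dev/ada0 bs=512 count=2048                          # start of disk (primary GPT)
dd if=/dev/zero of=/dev/ada0 bs=512 seek=$((SECTORS - 2048)) count=2048 # end of disk (backup GPT)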
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I had problems with pfSense initially. The installer didn't like the fact that I had a 30GB drive. Apparently if you use anything smaller than 32GB it screws stuff up. That was my luck.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
And I'm 99% sure it is a SAS expander. Otherwise your SAS cable would run only 4 drives. ;)
 

Todd Marimon

Dabbler
Joined
Dec 31, 2013
Messages
19
My SAS cables _are_ only running 4 drives each. The backplane is actually 3 separate PCBs, each with 4 drive slots and a mini-SAS connector on the back. I have 3 SAS cables in the case; one of them is not connected because I still need another LSI HBA (since I'm starting with only a handful of drives for now, I didn't invest in a second card yet).

The case came with a separate SAS expander board, but it only ran the drives at SATA I speeds. A single platter disk can't really saturate that anyway, but several drives aggregated behind one expander link certainly could. That's why I'm connecting directly to the HBAs.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Doh... I found an internal picture (finally), and it's actually the whole computer. I thought it was simply a shallow case that provided power for the hard drives and nothing else.

So now things are messy. I'll have to think for a minute or two about this...
 

Todd Marimon

Dabbler
Joined
Dec 31, 2013
Messages
19
It is a single case-- holds the server and the backplanes. No external SAS cables, if that is what you are asking. Only internal "mini" SAS cables. Only 2 of the 3 backplanes are currently hooked up 1:1 to the HBA ports. I can only use 8 drives currently.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Ok, so let's try something. Make your zpool and MD5 the partition tables like you did before. But before shutting down the server, go to the CLI and do zpool export yourpoolname. Then physically pull the drives from the server. Reboot the server like you have been. When the server has booted back up, put the disks back in the machine, do a zpool import -R /mnt yourpoolname, and check the MD5s.

If that works, then we've more or less narrowed the problem down to something happening at shutdown or at bootup (which I think we are kind of expecting right now). So now leave the disks in and do a shutdown. After the system powers off, pull the drives, boot up the server, then add your drives back and do the same zpool import command again. Check your MD5s.

If that works, then export your pool again, do a shutdown with the disks physically removed, then plug the drives in and boot up.

This is to figure out whether the boot and/or shutdown is responsible for the corruption. I'm wondering if powering down the system causes a voltage spike or something that results in data being written over the backup GPT. The backup GPT lives in the sectors closest to where the heads park, so if something goes wrong during shutdown while the heads are still writing to the media, the backup GPT is the one you'd expect to get trashed.

One thing is weird to me: your problem on bare metal bears a striking similarity to how ESXi users see their pools die when they use RDM. It could be random chance, or there may be a correlation. I'm not sure which (I lean toward random chance), but I'm not ready to make that assumption just yet.
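In rough commands, assuming the pool is named tank and the data disks are da0 and da1 (substitute your own names), the first pass would look something like this:

Code:
# one way to checksum the partition table as the kernel currently sees it (per disk):
gpart backup da0 | md5
gpart backup da1 | md5

# cleanly export the pool before shutting down:
zpool export tank

# pull the drives, reboot, reinsert them, then re-import and re-check:
zpool import -R /mnt tank
gpart backup da0 | md5
gpart backup da1 | md5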
 

Todd Marimon

Dabbler
Joined
Dec 31, 2013
Messages
19
OK, well, interesting tidbit here...

I've installed Ubuntu (and played with the LiveCD before) and I've found that it sees the GPT partitions that FreeNAS created prior.

I'm currently zeroing out the drives completely. After that I'll create new GPTs (might as well replicate this as closely as possible, right?), create some file systems, mount them, then reboot and see if they survive. Honestly, I suspect they will.
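Roughly what I have in mind, with sdb and ext4 just as stand-ins for the test:

Code:
# destructive -- test drives only
sudo parted --script /dev/sdb mklabel gpt mkpart test ext4 1MiB 100%
sudo partprobe /dev/sdb
sudo mkfs.ext4 /dev/sdb1
sudo mount /dev/sdb1 /mnt && sudo umount /mnt
sudo reboot
# after the reboot: is the table (and partition) still there?
sudo parted --script /dev/sdb print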

What this indicates to me is that, for some reason, when FreeBSD looks at these drives after a reboot it isn't reading them the "same way". I come to that conclusion because other OSes have no problem reading the GPT.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I beat you to a post by less than 2 seconds. Haha. Scroll up and give what I wrote a read.

Well, it should see the GPT partitions. The primary is good but the backup is not, according to your first post. The real question is whether the ZFS data is still valid. You'd have to install ZFS on Linux to find out (assuming ZFS on Linux handles v5000 pools).
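If you end up trying that, something along these lines should be enough for a read-only check (the PPA and package names are from memory and may differ; tank is a placeholder for the pool name):

Code:
# install ZFS on Linux (package names may vary by release)
sudo add-apt-repository ppa:zfs-native/stable
sudo apt-get update && sudo apt-get install ubuntu-zfs

# import the pool read-only so nothing gets written to the disks
sudo zpool import -o readonly=on -d /dev/disk/by-id -R /mnt tank
sudo zpool status tank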
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Stupid thing we never thought of... can you run the following command and post the output on pastebin or something?

smartctl -a /dev/XXX, substituting your drive device names. I just want to check your hard drives.
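If you're still booted into Ubuntu, a quick loop like this would dump every drive's report to its own file for pasting (the /dev/sd[a-z] pattern is a guess; adjust to your system):

Code:
for d in /dev/sd[a-z]; do
    sudo smartctl -a "$d" > "smart_$(basename "$d").txt"
done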
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
One thing you might hate that I'm about to say: quite often server vendors provide a list of supported OSes, and generally the list is fairly complete. That is, if FreeBSD isn't on the list and people are having problems, it tends to make sense. If you look up your motherboard's compatibility matrix, http://www.tyan.com/tech/OS_Support_IntelWins.aspx , you'll find that FreeBSD isn't on it. Every time I've tried to help someone with a problem we couldn't solve, the OS matrix never showed FreeBSD as supported. I don't normally put much stock in those matrices, because the vendor may simply not have tested FreeBSD. But it does make me wonder if there's some kind of incompatibility at work...
 

Todd Marimon

Dabbler
Joined
Dec 31, 2013
Messages
19
cyberjock said:
Ok, so let's try something. Make your zpool and MD5 the partition tables like you did before. But before shutting down the server, go to the CLI and do zpool export yourpoolname. Then physically pull the drives from the server. Reboot the server like you have been. When the server has booted back up, put the disks back in the machine, do a zpool import -R /mnt yourpoolname, and check the MD5s.

If that works, then we've more or less narrowed the problem down to something happening at shutdown or at bootup (which I think we are kind of expecting right now). So now leave the disks in and do a shutdown. After the system powers off, pull the drives, boot up the server, then add your drives back and do the same zpool import command again. Check your MD5s.

If that works, then export your pool again, do a shutdown with the disks physically removed, then plug the drives in and boot up.

This is to figure out whether the boot and/or shutdown is responsible for the corruption. I'm wondering if powering down the system causes a voltage spike or something that results in data being written over the backup GPT. The backup GPT lives in the sectors closest to where the heads park, so if something goes wrong during shutdown while the heads are still writing to the media, the backup GPT is the one you'd expect to get trashed.

One thing is weird to me: your problem on bare metal bears a striking similarity to how ESXi users see their pools die when they use RDM. It could be random chance, or there may be a correlation. I'm not sure which (I lean toward random chance), but I'm not ready to make that assumption just yet.


Oops, sorry, I didn't see this post in time. If the problem keeps happening in FreeNAS, I'll carry out your procedure.

I think the voltage spike part is interesting. However, when I "rebooted" in ESXi, I was only rebooting the VM (not the host), and I don't think that powers off the drives. I might be wrong -- maybe booting the FreeNAS VM does spin the drives down and back up -- but I don't recall noticing that.

Why does FreeBSD get so upset when the backup GPT is corrupt? Yes, it logs the problem, but it then also fails to find any partitions on the drive.

I think when I carry out your procedure, I will actually capture the bytes themselves so I can do a binary diff and see exactly which bytes differ. It would be really interesting to see whether random bits are flipping or whether something is clearly writing the wrong data.
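Something like this should capture both copies for a before/after comparison (da0 and 512-byte sectors are assumptions; repeat per disk):

Code:
SECTORS=$(diskinfo /dev/da0 | awk '{print $4}')
# primary GPT: protective MBR + header + table = first 34 sectors
dd if=/dev/da0 bs=512 count=34 of=/tmp/da0.gpt-primary.before
# backup GPT: table + header = last 33 sectors
dd if=/dev/da0 bs=512 skip=$((SECTORS - 33)) count=33 of=/tmp/da0.gpt-backup.before
# copy the .before files somewhere that survives the reboot (scp them off, for example),
# capture the same regions again afterwards as *.after, then:
cmp -l /tmp/da0.gpt-backup.before /tmp/da0.gpt-backup.after | head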


Theory time:

What if FreeBSD saves a copy of the GPT as it sees it at start-up, and when a new GPT is created it never updates that saved copy? Then at shutdown it notices the on-disk data has changed and writes back the stale copy it took at start-up. That could create a situation where both the primary and secondary GPTs are internally valid (their hashes check out), but because they differ from each other the OS doesn't know which one to trust -- and therefore trusts neither?

Just a theory-- I don't have any working knowledge of how these things are dealt with.
 

Todd Marimon

Dabbler
Joined
Dec 31, 2013
Messages
19
cyberjock said:
One thing you might hate that I'm about to say: quite often server vendors provide a list of supported OSes, and generally the list is fairly complete. That is, if FreeBSD isn't on the list and people are having problems, it tends to make sense. If you look up your motherboard's compatibility matrix, http://www.tyan.com/tech/OS_Support_IntelWins.aspx , you'll find that FreeBSD isn't on it. Every time I've tried to help someone with a problem we couldn't solve, the OS matrix never showed FreeBSD as supported. I don't normally put much stock in those matrices, because the vendor may simply not have tested FreeBSD. But it does make me wonder if there's some kind of incompatibility at work...


This is possible. Good eye. Like you, I never trust those lists either. I worked in software for several years, and I know you can't test 100% of everything. They probably only test the primary OSes (Windows and a few Linux distributions).
 

Todd Marimon

Dabbler
Joined
Dec 31, 2013
Messages
19
cyberjock said:
Stupid thing we never thought of... can you run the following command and post the output on pastebin or something?

smartctl -a /dev/XXX, substituting your drive device names. I just want to check your hard drives.



Code:
todd@ubuntu-test:~$ sudo smartctl -a /dev/sdb
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.5.0-23-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
 
=== START OF INFORMATION SECTION ===
Model Family:    SAMSUNG SpinPoint F1 DT
Device Model:    SAMSUNG HD753LJ
Serial Number:    S13UJDWQ601618
LU WWN Device Id: 5 0000f0 003066181
Firmware Version: 1AA01112
User Capacity:    750,156,374,016 bytes [750 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:  8
ATA Standard is:  ATA-8-ACS revision 3b
Local Time is:    Mon Jan  6 22:34:10 2014 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
 
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
 
General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (  0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (10939) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (  2) minutes.
Extended self-test routine
recommended polling time:        ( 183) minutes.
Conveyance self-test routine
recommended polling time:        (  20) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
 
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate    0x000f  100  099  051    Pre-fail  Always      -      0
  3 Spin_Up_Time            0x0007  065  065  011    Pre-fail  Always      -      11290
  4 Start_Stop_Count        0x0032  100  100  000    Old_age  Always      -      101
  5 Reallocated_Sector_Ct  0x0033  100  100  010    Pre-fail  Always      -      0
  7 Seek_Error_Rate        0x000f  100  100  051    Pre-fail  Always      -      0
  8 Seek_Time_Performance  0x0025  100  100  015    Pre-fail  Offline      -      9971
  9 Power_On_Hours          0x0032  097  097  000    Old_age  Always      -      14018
10 Spin_Retry_Count        0x0033  100  100  051    Pre-fail  Always      -      0
11 Calibration_Retry_Count 0x0012  100  100  000    Old_age  Always      -      2
12 Power_Cycle_Count      0x0032  100  100  000    Old_age  Always      -      66
13 Read_Soft_Error_Rate    0x000e  100  099  000    Old_age  Always      -      0
183 Runtime_Bad_Block      0x0032  100  100  000    Old_age  Always      -      0
184 End-to-End_Error        0x0033  100  100  099    Pre-fail  Always      -      0
187 Reported_Uncorrect      0x0032  100  100  000    Old_age  Always      -      85
188 Command_Timeout        0x0032  100  100  000    Old_age  Always      -      0
190 Airflow_Temperature_Cel 0x0022  084  065  000    Old_age  Always      -      16 (Min/Max 11/16)
194 Temperature_Celsius    0x0022  078  064  000    Old_age  Always      -      22 (Min/Max 11/22)
195 Hardware_ECC_Recovered  0x001a  100  100  000    Old_age  Always      -      2099
196 Reallocated_Event_Count 0x0032  100  100  000    Old_age  Always      -      0
197 Current_Pending_Sector  0x0012  100  100  000    Old_age  Always      -      0
198 Offline_Uncorrectable  0x0030  100  100  000    Old_age  Offline      -      0
199 UDMA_CRC_Error_Count    0x003e  100  100  000    Old_age  Always      -      0
200 Multi_Zone_Error_Rate  0x000a  100  100  000    Old_age  Always      -      2
201 Soft_Read_Error_Rate    0x000a  253  253  000    Old_age  Always      -      0
 
SMART Error Log Version: 1
No Errors Logged
 
SMART Self-test log structure revision number 0
Warning: ATA Specification requires self-test log structure revision number = 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline      Completed without error      00%    13880        -
# 2  Short offline      Aborted by host              00%    13879        -
# 3  Short offline      Aborted by host              00%    13878        -
# 4  Short offline      Aborted by host              00%    13877        -
# 5  Short offline      Aborted by host              00%    13876        -
# 6  Short offline      Aborted by host              00%    13875        -
# 7  Short offline      Aborted by host              00%    13874        -
# 8  Short offline      Aborted by host              00%    13873        -
# 9  Short offline      Aborted by host              00%    13872        -
 
Note: selective self-test log revision number (0) not 1 implies that no selective self-test has ever been run
SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


Code:
todd@ubuntu-test:~$ sudo smartctl -a /dev/sdc
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.5.0-23-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
 
=== START OF INFORMATION SECTION ===
Model Family:    SAMSUNG SpinPoint F1 DT
Device Model:    SAMSUNG HD753LJ
Serial Number:    S13UJDWQ601619
LU WWN Device Id: 5 0000f0 003066191
Firmware Version: 1AA01112
User Capacity:    750,156,374,016 bytes [750 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:  8
ATA Standard is:  ATA-8-ACS revision 3b
Local Time is:    Mon Jan  6 22:34:19 2014 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
 
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
 
General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (  0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (11772) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (  2) minutes.
Extended self-test routine
recommended polling time:        ( 197) minutes.
Conveyance self-test routine
recommended polling time:        (  21) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
 
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate    0x000f  099  099  051    Pre-fail  Always      -      218
  3 Spin_Up_Time            0x0007  067  067  011    Pre-fail  Always      -      10800
  4 Start_Stop_Count        0x0032  100  100  000    Old_age  Always      -      107
  5 Reallocated_Sector_Ct  0x0033  100  100  010    Pre-fail  Always      -      0
  7 Seek_Error_Rate        0x000f  100  100  051    Pre-fail  Always      -      0
  8 Seek_Time_Performance  0x0025  100  100  015    Pre-fail  Offline      -      9851
  9 Power_On_Hours          0x0032  097  097  000    Old_age  Always      -      14004
10 Spin_Retry_Count        0x0033  100  100  051    Pre-fail  Always      -      0
11 Calibration_Retry_Count 0x0012  100  100  000    Old_age  Always      -      3
12 Power_Cycle_Count      0x0032  100  100  000    Old_age  Always      -      73
13 Read_Soft_Error_Rate    0x000e  099  099  000    Old_age  Always      -      218
183 Runtime_Bad_Block      0x0032  100  100  000    Old_age  Always      -      0
184 End-to-End_Error        0x0033  100  100  099    Pre-fail  Always      -      0
187 Reported_Uncorrect      0x0032  100  100  000    Old_age  Always      -      218
188 Command_Timeout        0x0032  100  100  000    Old_age  Always      -      0
190 Airflow_Temperature_Cel 0x0022  085  066  000    Old_age  Always      -      15 (Min/Max 10/15)
194 Temperature_Celsius    0x0022  079  066  000    Old_age  Always      -      21 (Min/Max 10/21)
195 Hardware_ECC_Recovered  0x001a  100  100  000    Old_age  Always      -      21385
196 Reallocated_Event_Count 0x0032  100  100  000    Old_age  Always      -      0
197 Current_Pending_Sector  0x0012  100  100  000    Old_age  Always      -      0
198 Offline_Uncorrectable  0x0030  100  100  000    Old_age  Offline      -      0
199 UDMA_CRC_Error_Count    0x003e  100  100  000    Old_age  Always      -      0
200 Multi_Zone_Error_Rate  0x000a  100  100  000    Old_age  Always      -      0
201 Soft_Read_Error_Rate    0x000a  253  253  000    Old_age  Always      -      0
 
SMART Error Log Version: 1
No Errors Logged
 
SMART Self-test log structure revision number 0
Warning: ATA Specification requires self-test log structure revision number = 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline      Completed without error      00%    13866        -
# 2  Short offline      Aborted by host              00%    13865        -
# 3  Short offline      Aborted by host              00%    13864        -
# 4  Short offline      Aborted by host              00%    13863        -
# 5  Short offline      Aborted by host              00%    13862        -
# 6  Short offline      Aborted by host              00%    13861        -
# 7  Short offline      Aborted by host              00%    13860        -
# 8  Short offline      Aborted by host              00%    13859        -
# 9  Short offline      Aborted by host              00%    13858        -
 
Note: selective self-test log revision number (0) not 1 implies that no selective self-test has ever been run
SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Well, both drives are showing non-zero values for attributes 187 and 195. That worries me somewhat. It may point to a problem, but I'm a long way from believing that's likely.

You are right: if you were only rebooting the VM and still seeing the disks get trashed, that theory isn't likely. But VMs have their own special kind of "fail", so it's not out of the woods. Unfortunately, whatever the cause is, it probably falls under the "not likely" category, since I've never seen this kind of problem in almost 2 years here.
 

Todd Marimon

Dabbler
Joined
Dec 31, 2013
Messages
19
cyberjock said:
Well, both drives are showing non-zero values for attributes 187 and 195. That worries me somewhat. It may point to a problem, but I'm a long way from believing that's likely.

You are right: if you were only rebooting the VM and still seeing the disks get trashed, that theory isn't likely. But VMs have their own special kind of "fail", so it's not out of the woods. Unfortunately, whatever the cause is, it probably falls under the "not likely" category, since I've never seen this kind of problem in almost 2 years here.


Yeah, as you can see, these drives have had some use -- they are not new.
 