Pool gone after reset, import does not help

Status
Not open for further replies.

Thomas

Dabbler
Joined
Jun 30, 2013
Messages
29
Now here I recognize some of the hex values you posted :P

Code:
[root@freenas] ~# cat /dev/ada0 | dd bs=1024 skip=1900000 | od -A x -x | grep 7a11 | grep b10c | head
c09bfd0      0000    0000    0000    0000    7a11    b10c    da7a    0210
c0b7fd0      0000    0000    0000    0000    7a11    b10c    da7a    0210
c0b83d0      0000    0000    0000    0000    7a11    b10c    da7a    0210
c0b87d0      0000    0000    0000    0000    7a11    b10c    da7a    0210
c0b8bd0      0000    0000    0000    0000    7a11    b10c    da7a    0210
c0b8fd0      0000    0000    0000    0000    7a11    b10c    da7a    0210
c0b93d0      0000    0000    0000    0000    7a11    b10c    da7a    0210
c0b97d0      0000    0000    0000    0000    7a11    b10c    da7a    0210
c0b9bd0      0000    0000    0000    0000    7a11    b10c    da7a    0210
c0b9fd0      0000    0000    0000    0000    7a11    b10c    da7a    0210
[root@freenas] ~#


And without the second grep:
Code:
[root@freenas] ~# cat /dev/ada0 | dd bs=1024 skip=1900000 | od -A x -x | grep 7a11 | head
0064bc0      b9aa    9953    ec1f    cfc5    7a11    f7b6    38d5    2330
007a110      a744    5abb    544e    b8f2    6886    f59a    b202    ccb0
0097120      4d36    117a    fa0a    5b76    7a11    3696    8b35    43b4
0099e80      7a11    8758    41d0    d5b1    5143    e502    7949    1aa2
00afc40      93c8    f0e0    1700    a9c0    7a11    33a0    3bdb    1979
01171c0      d136    a859    9773    35f7    7ef2    5c4f    7a11    d47c
017a110      62a5    4a1c    0679    79eb    7eaa    12df    5be3    2eca
01b2420      07ca    fb68    7a11    2e30    7d66    ddfc    c9eb    5bff
01b9af0      a36c    96be    44c7    4e23    7763    7a11    7657    4ace
01c3f90      3d3a    7a11    531b    32a7    9d6b    136a    e32a    f447
[root@freenas] ~#
 

FlynnVT

Dabbler
Joined
Aug 12, 2013
Messages
36
Cool! Looks like there's still a ZFS header at 2GB and you may simply have a corrupt disk label:

A) Search offset: 0xc09bfd0 = 201,965,520 bytes
B) dd skip offset: 1,900,000 * 1024 = 1,945,600,000 bytes
A+B = 2,147,565,520 bytes / 1024 / 1024 ~= 2048MB = 2GB
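
The same sum can be checked with shell arithmetic (hex constants work in Bourne-style $(( )) expansion):
Code:
# absolute position = od offset within the piped stream + bytes skipped by dd
echo $(( 0xc09bfd0 + 1900000 * 1024 ))                  # 2147565520
echo $(( (0xc09bfd0 + 1900000 * 1024) / 1024 / 1024 ))  # 2048 -> 2GB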

The next bit isn't rocket science and is almost risk-free, but it is definitely a bit fiddly, manual and undefined. Do this on bare metal, in case your ESXi system is indeed responsible for the original corruption.

Unfortunately, I have never actually been through this under these exact circumstances, don't have a live FreeNAS machine to try things on and haven't memorized the syntax for BSD's partitioning tools. I should have some time to look at it tonight if you can wait.

Essentially, you need to create an MBR or GPT partition table that will make the entire ZFS area appear as one perfectly framed block device. From there, everything should auto-import if there is no other corruption.

These are roughly the steps I'd take. Don't attempt this unless you understand what's happening at each stage:
  1. Back up parts of each of the 3 discs to some other independent and working device. I think FreeNAS uses GPT labels by default? These have 17,408 bytes at the head, but also 16,896 bytes at the tail of the disc (http://en.wikipedia.org/wiki/GUID_Partition_Table), so you should ideally back up the tail too.

    Code:
    dd if=/dev/ada0 of=/mnt/working_place/backups/ada0_gpt_head.dat bs=512 count=34


    Code:
    diskinfo -v ada0 | grep "mediasize in sectors"
    dd if=/dev/ada0 of=/mnt/working_place/backups/ada0_gpt_tail.dat bs=512 count=32 seek=<the number from diskinfo, minus 34>


    (That way you can always undo things later by dd'ing those backups back onto the disc. Don't mix up the "if" and "of" parameters!)

    (Come to think of it, the secondary GPT may be defeating the secondary "recovery" ZFS structures in this situation: ZFS would have checked the last 256KB and 512KB of ada0 for its backup labels, but found GPT data and/or everything shifted by an offset instead.)
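
    If you want to check for those tail labels directly, something like this would do it (a sketch; it reuses the od|grep trick from above and assumes diskinfo reports the size in 512-byte sectors):

    Code:
    # ZFS keeps its two backup labels in the last 512KB (1024 sectors) of the
    # device it was given; scan the raw disc's tail for the label magic.
    SECTORS=$(diskinfo -v ada0 | awk '/mediasize in sectors/ {print $1}')
    dd if=/dev/ada0 bs=512 skip=$(( SECTORS - 1024 )) | od -A x -x | grep b10c | head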

  2. Then, create that fresh partition table and reboot. I'd go with GPT (if that's what FreeNAS uses?), rather than MBR as it simplifies setting the end sector for the ZFS partition.

    I'd have to experiment with FreeNAS in a new VM somewhere else to see what the default layout is. Essentially, I'd want to see the new ZFS block device on physical disc having the same "cat | od -c | head" layout and offset as a new dummy VM ZFS FS that I'd create.

    Now, it could be as simple as: (don't try these commands)
    Code:
    gpart create -s gpt ada0
    gpart add -t freebsd-swap -s 2048M ada0
    gpart add -t freebsd-zfs ada0


...but I don't know the details without playing around first. (e.g. do the FreeBSD tools format or otherwise zero partitions, or just operate on the label areas? Do you have to reboot the machine to cause a re-scan of the discs, or can you initiate it manually, like in Linux?)
This page has a good introduction to label creation: http://www.wonkity.com/~wblock/docs/html/disksetup.html

Really, don't do anything that you don't understand or take any guesses. One mis-typed or out-of-sequence command operating on the disc could kill whatever chances you have for recovery.
 

FlynnVT

Dabbler
Joined
Aug 12, 2013
Messages
36
One other potential option: the secondary GPT at the tail of the disc may be completely intact too. The solution could be as simple as dd'ing 16k from the end of the disc to the correct offset at the start. Again, I'd need time to check things out to be certain. I don't know if the secondary copy is actively used by FreeBSD or simply put to one side to give manual recovery options?
 

Thomas

Dabbler
Joined
Jun 30, 2013
Messages
29
Awesome, that sounds promising! Thank you! I would greatly appreciate it if you would look into the matter and tell me exactly what I should do. I have too little experience with dd and sectors and stuff, so I'm not going to try anything now. I'll take the server home with me tomorrow, where I have a bunch of identical HDDs laying around. I'll back up the drives then. Can I just copy the drives entirely, or would it be wiser to use the specific data range you mentioned? (dd if=/dev/ada0 of=/mnt/working_place/backups/ada0_gpt_head.dat bs=512 count=34, and the tail of course.)
 

FlynnVT

Dabbler
Joined
Aug 12, 2013
Messages
36
The count=34 dd's would only grab about 16KB off the discs - just the areas that the GPT commands should touch. These backups would only be to undo anything accidentally done with gpart or dd'ing to the GPT areas. "/mnt/working_place" could be a USB stick or another UFS/ZFS volume attached to FreeNAS. I'm not sure what you're comfortable setting up here. I'd want it to be a separate, stable and removable device.

If you have many identical spare discs then things may get much simpler! (Physically label the disks carefully!) You could take a full copy of a disc to another before trying anything: "dd if=/dev/<source> of=/dev/<dest> bs=65536" ...this command is a disaster if you get the source and dest swapped around though!

I'm not sure I'd trust myself to do it without rechecking so many things. In all cases, the "if"/"of" directions and specified devices are critical. One mistake and you go the wrong way.

Another option: use one of the identical spare blank discs to create a new, separate, single-disk FreeNAS ZFS pool. Hopefully, this will cause FreeNAS to create a partition table identical to the one that was on the now-damaged discs - which saves waiting for me to figure out the specifics and commands.

Then, grab the good partition table from this disc:
Code:
dd if=/dev/<whatever> of=/mnt/working_place/backups/fresh_gpt_head.dat bs=512 count=34


Code:
diskinfo -v ada0 | grep "mediasize in sectors"
dd if=/dev/<whatever> of=/mnt/working_place/backups/fresh_gpt_tail.dat bs=512 count=32 seek=<the number from diskinfo, minus 34>


Dropping these "good" GPT tables onto the "bad" disc is one more simple dd operation. You can even combine them into one command if both discs are simultaneously attached to the machine. You could try it on one disc and see if ZFS picks up on a vdev in a pool. This is where my knowledge begins to run into theory that I'd have to practise first. Certainly, you wouldn't want any ZFS recovery or scrub to kick in with only a single disc attached.
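
For example, restoring a saved head table onto a damaged disc would be a single write; a sketch (triple-check the if/of direction and the target device first):
Code:
# Drop the 34-sector PMBR + primary GPT backup onto the head of the bad disc.
dd if=/mnt/working_place/backups/fresh_gpt_head.dat of=/dev/ada0 bs=512 count=34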


Changing course a bit, can you see if there's any mention of "corrupt" in the kernel messages after booting with the "bad" discs?

If the head GPT was wiped, but tail remains, then FreeBSD won't actually scan the device according to this:
http://howtounix.info/man/FreeBSD/man8/gpart.8#x5245434f564552494e47

This is particularly heartening: "If the first sector of a provider is corrupt, the kernel can not detect GPT even if partition table itself is not corrupt. You can rewrite the protective MBR using the dd(1) command, to restore the ability to detect the GPT."

Based on that, if the tail GPT was still intact, restoring may be as simple as (for each disc):
  1. "gpart bootcode -b /boot/pmbr-datadisk /dev/<whatever>"
    (or)
    "dd if=/boot/pmbr-datadisk of=/dev/whatever bs=512 count=1"
  2. Reboot and check dmesg for "corrupt" primary, secondary OK type messages
  3. Have gpart rewrite the head GPT from the secondary "gpart recover /dev/whatever"
  4. Reboot, done!
Again, I may have time to try this out in a VM tonight. Someone with FreeBSD-specific experience may be able to confirm or add detail in the meantime. There's little risk with this approach: the first 512 bytes are already junk on your discs, so dd'ing a PMBR over them isn't risking anything. Back up the 16KB head first if possible though. Keep your options open.

Aside from all of this, it seems that something did indeed clobber at least the start of all three of your discs. Fingers crossed, the upper sectors are OK. ESXi-related or not, I'm personally going to continue using a VM for FreeNAS at home. My backups (NTFS, over a network) are kept recent and the whole thing has been entirely stable to date.

I put that last bit in bold, to save cyberjock from needing to quote it back to me! ;-)

Edit: I want to add that my ESXi VM is set to have only 1 CPU core. I once read somewhere that FreeBSD was prone to kernel panics with multi-CPU/core operation in ESXi, or in VMs in general. Given that CIFS is single-threaded, I didn't think the performance was worth the risk.

Also, my ZFS discs are all "whole disk" volumes: no GPT/MBR headers. That means ZFS will still find its two backup labels at the tail of the disc even if the first few bytes were clobbered. Who knows - maybe my ESXi installation is suffering from this silent head-sector corruption problem too, but automatically heals itself because it doesn't rely on a partition map and its labels aren't displaced by GPT blobs at the head and tail of the raw block device. It also means I don't have any swap partition on the ZFS data discs.

Either way, it's working well for me.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
FlynnVT, you seem to have more experience with the command line than I.

Just thinking ahead on this one, so bear with me. Is there a dd command that will always back up the GPT partition table at the tail of the drive? I've tried to find a way to back up the GPT partition tables, since there have been a few users in the past who wished they had backed up theirs. But I don't know how you tell dd to pull from the end of the disk without actually doing all the math yourself (which I'd expect many users couldn't do on their own). I had thought about creating some kind of How-To for backing up GPT tables: backing up the GPT at the beginning of the disk is easy, but the one at the tail is proving hard to put into a one-size-fits-all command line.
 

FlynnVT

Dabbler
Joined
Aug 12, 2013
Messages
36
FlynnVT, you seem to have more experience with the command line than I.

Just thinking ahead on this one, so bear with me. Is there a dd command that will always back up the GPT partition table at the tail of the drive? I've tried to find a way to back up the GPT partition tables, since there have been a few users in the past who wished they had backed up theirs. But I don't know how you tell dd to pull from the end of the disk without actually doing all the math yourself (which I'd expect many users couldn't do on their own). I had thought about creating some kind of How-To for backing up GPT tables: backing up the GPT at the beginning of the disk is easy, but the one at the tail is proving hard to put into a one-size-fits-all command line.

Unfortunately dd doesn't have a "negative seek" type option, so you need to supply the start address. The math from the diskinfo size could be done on the shell through a pipe with sed/awk, backticking the result into the dd seek/skip. The more legible/modern way would probably be through a Perl or Python script. It's nothing specific to disc operations though.
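
As a sketch of the shell route (the awk pattern assumes diskinfo -v's "mediasize in sectors" line, as used earlier in this thread):
Code:
# Pull the size in 512-byte sectors out of diskinfo, then dd the last 33
# sectors: the secondary GPT's 32 entry sectors plus its header sector.
SECTORS=$(diskinfo -v ada0 | awk '/mediasize in sectors/ {print $1}')
dd if=/dev/ada0 of=/mnt/usb/ada0_gpt_tail.dat bs=512 count=33 skip=$(( SECTORS - 33 ))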

For a GPT backup, my understanding is that the secondary/tail 16KB GPT chunk will be identical to the (head+512) primary 16KB GPT chunk (I'd like to check this). That means you only have to take a single backup, and it'd be the restore operation that would need dd arithmetic to drop it at the correct location. That said, you could probably side-step that by simply restoring the PMBR and primary GPT and then using "gpart recover" to fix the other one, kind of like I've suggested for Thomas in the post above.

Having a GPT backup would definitely be better than not having one, but all FreeNAS data discs created through the GUI will essentially be the same. Worst case, you can find the partition and recreate from scratch, but it's a very command-line centric process. Maybe that could be automated? I'm not sure what scenarios this could protect against, or exactly what could be restored if it's a bigger disc/data failure rather than what is hopefully Thomas' situation where just the head of the disc was clobbered.
 

Thomas

Dabbler
Joined
Jun 30, 2013
Messages
29
Thanks for your input once more! Still, I'm not going to try any of it until you've tried it ;) I did try to back up the GPT of the disks though, which succeeded for the heads, but I got an error when trying to copy the tail:
Code:
[root@freenas] ~# dd if=/dev/ada0 of=/mnt/usb/ada0_gpt_tail.dat bs=512 count=32 seek=1953525134
dd: truncating /mnt/usb/ada0_gpt_tail.dat: No space left on device

I used diskinfo to determine the size in sectors and subtracted 34, just like you said.
 

Thomas

Dabbler
Joined
Jun 30, 2013
Messages
29
Bollocks, edit really is broken in FF -_- Well anyway: I wanted to say that I used an 8GB USB stick, and the command leaves a 0KB file on the device.
 

FlynnVT

Dabbler
Joined
Aug 12, 2013
Messages
36
No problem. I'll aim to try the PMBR & gpart recover method, assuming that the secondary GPT is OK. Creating the table from scratch is not impossible, but a lot more work for both of us. Plus, the risk is minimal if you don't mistype or choose the wrong devices!

My fault with the tail dd: the numbers were fine (diskinfo reports 512-byte sectors, and 1,953,525,134 + 34 = 1,953,525,168 sectors is roughly 1TB, which matches your disc), but the command had seek= where it needed skip=. seek= offsets into the output file, so dd tried to extend the file on your 8GB stick to nearly 1TB before writing anything - hence the "No space left on device" error and the 0KB file left behind. skip= is the option that offsets into the input device.

While I'm correcting myself: the secondary GPT actually spans the last 33 sectors (32 sectors of partition entries plus the header in the very last sector, 16,896 bytes in all), so count=33 with skip=<sectors minus 33> grabs the whole thing. The output file should only be about 16KB.
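
The corrected command for your disc (your diskinfo total of 1,953,525,168 sectors, minus 33):
Code:
dd if=/dev/ada0 of=/mnt/usb/ada0_gpt_tail.dat bs=512 count=33 skip=1953525135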
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I was thinking of making a script where you simply provide the disk you want to back up and it spits out a backup file. Then the reverse: you provide the backup file, tell it what drive, and it restores the GPT. I imagine it could be reasonable to only back up the front-end GPT and then run a command like gpart recover to restore the back-end one.
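
A minimal sketch of what such a script could look like (the name and layout are hypothetical, not an existing FreeNAS tool):
Code:
#!/bin/sh
# gpt_backup.sh <disk> <destdir>   e.g.: gpt_backup.sh ada0 /mnt/usb
disk=$1; dest=$2
# PMBR + primary GPT header + 32 entry sectors = 34 sectors = 17,408 bytes
dd if=/dev/$disk of=$dest/${disk}_gpt_head.dat bs=512 count=34

# Restore would be the reverse dd, then letting gpart rebuild the secondary copy:
#   dd if=$dest/${disk}_gpt_head.dat of=/dev/$disk bs=512 count=34
#   gpart recover $disk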

I agree that the GPT is supposed to be identical on the front and back end, but then I have to wonder how both happened to get trashed with the VM. The whole point of having both was to provide redundancy, but something went wrong. It's just very bizarre. I know someone else who broke things down to the sector level and tried to do what you guys are doing, but they had no success. They could have made a mistake with their math or something, but since I wasn't part of their troubleshooting and recovery plan (they just reported that they attempted to reconstruct the GPT and it didn't work), I don't know what they could have done wrong.

/shrug

Just ideas and watching...
 

FlynnVT

Dabbler
Joined
Aug 12, 2013
Messages
36
@cyberjock: here you go, using arithmetic operations on the shell. I'm not sure if BSD has blockdev, but you can replace (blockdev *) with an appropriate (diskinfo | grep | sed/awk) or some other command: http://superuser.com/questions/1288...stest-to-dd-to-the-last-512-kilobytes-of-disk
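
For reference, the idiom from that link computes the "negative" offset like so (Linux device naming assumed; blockdev --getsz returns the size in 512-byte sectors):
Code:
# Linux version: back up the last 33 sectors without doing the math by hand.
dd if=/dev/sda of=sda_gpt_tail.dat bs=512 count=33 \
    skip=$(( $(blockdev --getsz /dev/sda) - 33 ))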

Right now, I'm hoping that the problem (VM related or not) simply knocked out the first X sectors of the disk. My previous post points to a section of the gpart man page which mentions that FreeBSD won't even scan for the primary/secondary GPT if the MBR isn't present. Corrupting the MBR (sector 0) isn't an unimaginable bug or failure mode.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Right now, I'm hoping that the problem (VM related or not) simply knocked out the first X sectors of the disk. My previous post points to a section of the gpart man page which mentions that FreeBSD won't even scan for the primary/secondary GPT if the MBR isn't present. Corrupting the MBR (sector 0) isn't an unimaginable bug or failure mode.

That would explain why the end GPT isn't used, even if it is valid. Maybe there is a way to force a program to check the GPT at the end of the disk and use that. Maybe even something that isn't FreeBSD based, as long as it works.
 

FlynnVT

Dabbler
Joined
Aug 12, 2013
Messages
36
@cyberjock: I feel that you're skimming over my previous posts rather than fully reading everything. The kernel's response to various states of MBR and primary/secondary corruption, and the availability of "gpart recover", are detailed above.

Onto the main event: I made a FreeNAS VM in VirtualBox:
1x 512GB UFS HDD for constant storage (ada1)
3x 1TB in a ZFS RAID-Z1 pool for testing, using all the FreeNAS GUI defaults (ada2, ada3 & ada4)
I then booted to Linux and dd'd varying amounts of /dev/zero over the discs. FreeBSD's geom subsystem had locked exclusive access to the base raw /dev/adaX devices and I didn't feel like figuring that out tonight:
ada2 for 512 * 32 bytes (all the primary GPT)
ada3 for 512 * 1 bytes (just the PMBR)
ada4 for 512 * 1000 bytes (all the primary GPT, plus some of the swap partition)
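
In Linux terms, that zeroing would have looked roughly like this (the sdX names are my assumption for how the discs appeared there):
Code:
dd if=/dev/zero of=/dev/sdb bs=512 count=32     # ada2: the whole primary GPT area
dd if=/dev/zero of=/dev/sdc bs=512 count=1      # ada3: just the PMBR
dd if=/dev/zero of=/dev/sdd bs=512 count=1000   # ada4: primary GPT plus some swap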

Back in FreeNAS afterwards, (ada2, ada3, ada4) were missing from "gpart list". No zpools available. No lines from "dmesg | grep -i corrupt".

Rewrite the PMBR to each disk in FreeNAS:
dd if=/boot/pmbr-datadisk of=/dev/ada2 (bs=512 count=1, optional as long as the pmbr-datadisk file is 512b)
dd if=/boot/pmbr-datadisk of=/dev/ada3
dd if=/boot/pmbr-datadisk of=/dev/ada4

Reboot FreeNAS.

During bootup, I saw kernel lines saying that ada2 and ada4 had corrupt primary GPTs. There was no mention of ada3 problems - as you'd expect. The ada2 and ada4 problems were shown in "dmesg | grep -i corrupt" after boot.

Fix the two corrupt GPTs:
gpart recover /dev/ada2
gpart recover /dev/ada4

I didn't reboot here, as these legitimate gpart commands kicked the kernel into a rescan, unlike dd. (ada2, ada3, ada4) were then showing OK in "gpart list" ...and best of all, I was able to auto-import the ZFS tank. It also scrubbed with no errors.

So: 100% complete and easy recovery as long as 1) the secondary GPT at the tail of the disc is intact and 2) the ZFS filesystem itself is OK.
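
Condensed into a recipe, per affected disc (exactly the commands used above):
Code:
dd if=/boot/pmbr-datadisk of=/dev/ada2 bs=512 count=1   # 1. restore the protective MBR
# 2. reboot; "dmesg | grep -i corrupt" should now flag the primary GPT
gpart recover /dev/ada2                                 # 3. rebuild primary from secondary
# 4. "gpart list" should show the disc again; the pool can then be auto-imported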

In the case of ada4 here, I didn't check whether FreeBSD was actually happy with the swap partition. I seem to remember some Unix OSes needing a header for swap too, while others use it raw. In any case, it's non-critical and would simply take a standard non-ZFS BSD command to kick it back into place.

@Thomas: hopefully this will fix your system. If you're happy with the head and tail backups you already have, then there's little risk to this step. Your next worry will be what ZFS makes of the FS if you have an intact tail GPT.


To answer another question I set for myself:
Code:
[root@freenas] /mnt/backups# od -c ada2_16kb_head.dat | head -n 30
0000000    E  F  I      P  A  R  T  \0  \0 001  \0  \  \0  \0  \0
0000020    S 277 024 254  \0  \0  \0  \0 001  \0  \0  \0  \0  \0  \0  \0
0000040  377 377 377  |  \0  \0  \0  \0  "  \0  \0  \0  \0  \0  \0  \0
0000060  336 377 377  |  \0  \0  \0  \0 326  % 005 331 303  \t 343 021
0000100  243 273  \b  \0  '  9  F 346 002  \0  \0  \0  \0  \0  \0  \0
0000120  200  \0  \0  \0 200  \0  \0  \0 034  % 303  r  \0  \0  \0  \0
0000140  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0001000  265  |  n  Q 317  n 326 021 217 370  \0 002  -  \t  q  +
0001020    % 312 027 331 303  \t 343 021 243 273  \b  \0  '  9  F 346
0001040  200  \0  \0  \0  \0  \0  \0  \0 177  \0  @  \0  \0  \0  \0  \0
0001060  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0001200  272  |  n  Q 317  n 326 021 217 370  \0 002  -  \t  q  +
0001220  232  H  . 331 303  \t 343 021 243 273  \b  \0  '  9  F 346
0001240  200  \0  @  \0  \0  \0  \0  \0 336 377 377  |  \0  \0  \0  \0
0001260  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0040000

Code:
[root@freenas] /mnt/backups# od -c ada2_16kb_tail.dat | head -n 30
0000000  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0001000  265  |  n  Q 317  n 326 021 217 370  \0 002  -  \t  q  +
0001020    % 312 027 331 303  \t 343 021 243 273  \b  \0  '  9  F 346
0001040  200  \0  \0  \0  \0  \0  \0  \0 177  \0  @  \0  \0  \0  \0  \0
0001060  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0001200  272  |  n  Q 317  n 326 021 217 370  \0 002  -  \t  q  +
0001220  232  H  . 331 303  \t 343 021 243 273  \b  \0  '  9  F 346
0001240  200  \0  @  \0  \0  \0  \0  \0 336 377 377  |  \0  \0  \0  \0
0001260  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0040000


...the primary GPT 16KB table at head+512 isn't identical to the secondary GPT 16KB table at tail-16KB. The middle (the partition entries) is the same, but the header sits at a different position in each copy. This is actually shown clearly in the diagram here: http://en.wikipedia.org/wiki/GUID_Partition_Table

No need for panic, as "gpart recover" can readily juggle primary and secondary, automatically replacing the "bad" copy with the content from the "good" one. If you arrived at a situation where you had two "good" ones, you could choose which to keep by selectively zeroing out the other prior to running "gpart recover".

Hope this helps somebody.
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
Trivia lesson: endian issues aside, the ZFS tree root (or "uberblock") is marked with "00BA B10C", while the key/value pairs are "7A11 DA7A B10C" ("tall data block"?). Hex speak is the highest form of humour :)


Didn't notice the "tall data block". I think it's purposely "00BAB10C", as that's (roughly) "uba bloc". Nice to know the zfs programmers had a sense of humor.


If it's just the GPT that's messed up, would recreating the GPT with the command line help? When someone inadvertently exports a pool with the 'wipe disks' option, FreeNAS wipes the start and end of the disks, destroying both copies of the GPT. I had recovered such a case by using gpart to recreate the GPT as it existed before, and was able to reimport the pool.
 

Thomas

Dabbler
Joined
Jun 30, 2013
Messages
29
Sounds great! I just tried it, but now the system seems to hang and I can't look at the console to see what's wrong (long story). I'll have to leave it be for the moment and I'll have a look at it in the morning. :)
 

FlynnVT

Dabbler
Joined
Aug 12, 2013
Messages
36
If it's just the GPT that's messed up, would recreating the GPT with the command line help? When someone inadvertently exports a pool with the 'wipe disks' option, FreeNAS wipes the start and end of the disks, destroying both copies of the GPT. I had recovered such a case by using gpart to recreate the GPT as it existed before, and was able to reimport the pool.

Yes - I think recreating the GPT from scratch would be the next step if Thomas' GPT is entirely corrupted (post #22). Your experience would be great to have if it comes to that. Would the three commands in that post be sufficient to recreate the default FreeNAS label/layout?
 

Thomas

Dabbler
Joined
Jun 30, 2013
Messages
29
Not necessary, guess who has his data back? :D Your method worked flawlessly FlynnVT! Thank you thank you thank you! If you lived anywhere near Holland I'd thank you personally! :D

Ok, so for the technical part: the system wouldn't boot up at first after the PMBR copy, because it wanted to boot off one of the ZFS disks. That's only good news, I guess, because it shows the new PMBR is working haha!
I was still not able to back up the tail, by the way, but I wasn't in the mood to delve deeper into that. After rebooting, FreeNAS indeed reported the 3 disks as corrupted and even explicitly mentioned that it would use the secondary GPT:
Code:
GEOM: ada0: the primary GPT table is corrupt or invalid.
GEOM: ada0: using the secondary instead -- recovery strongly advised.
GEOM: ada1: the primary GPT table is corrupt or invalid.
GEOM: ada1: using the secondary instead -- recovery strongly advised.
GEOM: ada3: the primary GPT table is corrupt or invalid.
GEOM: ada3: using the secondary instead -- recovery strongly advised.
GEOM: da1s1: geometry does not match label (16h,63s != 255h,63s).
Trying to mount root from ufs:/dev/ufs/FreeNASs1a
GEOM_ELI: Device ada0p1.eli created.
GEOM_ELI: Encryption: AES-XTS 256
GEOM_ELI:     Crypto: hardware
GEOM_ELI: Device ada3p1.eli created.
GEOM_ELI: Encryption: AES-XTS 256
GEOM_ELI:     Crypto: hardware
GEOM_ELI: Device ada1p1.eli created.
GEOM_ELI: Encryption: AES-XTS 256
GEOM_ELI:     Crypto: hardware
ZFS filesystem version 5
ZFS storage pool version 28

Once in FreeBSD, I could simply use gpart recover and auto-import the pool! Worked flawlessly! The disks showed good in gpart list and gpart status:
Code:
[root@freenas] ~# gpart status
  Name  Status  Components
ada0p1      OK  ada0
ada0p2      OK  ada0
ada1p1      OK  ada1
ada1p2      OK  ada1
ada2s1      OK  ada2
ada3p1      OK  ada3
ada3p2      OK  ada3
 da1s1      OK  da1
 da1s2      OK  da1
 da1s3      OK  da1
 da1s4      OK  da1
da1s1a      OK  da1s1

It appears that all my data is still there and fully intact, but I'll have to double-check that. I also haven't run a scrub yet; I'm backing everything up like crazy first! ;) But everything looks good enough:
Code:
[root@freenas] ~# zpool status
  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
  scan: scrub repaired 0 in 0h9m with 0 errors on Sun Jul 28 00:09:19 2013
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/81b054d0-db49-11e2-bc54-000c29005847  ONLINE       0     0    79
            gptid/82483f11-db49-11e2-bc54-000c29005847  ONLINE       0     0     0
            gptid/82dbb69a-db49-11e2-bc54-000c29005847  ONLINE       0     0     0

errors: No known data errors
[root@freenas] ~#


I hope this topic will help others in the same scenario. And FlynnVT and cyberjock: you're heroes!
 

FlynnVT

Dabbler
Joined
Aug 12, 2013
Messages
36
Not necessary, guess who has his data back? :D Your method worked flawlessly FlynnVT! Thank you thank you thank you!

That's really great news. I'm glad it worked out!

I see mention of encryption on the p1 partitions. Is this the swap area? The ZFS section wasn't encrypted, given that we found the "tall data blocks". Is this the default in FreeNAS?

It'd be great to get to the bottom of this corruption issue. Affecting 3 discs simultaneously in more than sector 0 (the primary GPT was bad, not just the first 512-byte PMBR) is worrying.

That CHKSUM fault could indicate corruption due to clobbered sectors within the ZFS area. Fingers crossed that the redundant copies will fill in the blanks for you. The best bit with ZFS: if you read all your data and the status is still the same ("Applications are unaffected."), then you can be sure that you got everything back without corruption. Hard to beat that! Also, if the FS scrubs OK and the errors go away then there's no need to reformat and start over again - assuming you trust the physical discs.
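
In concrete terms, the follow-up that status output suggests would be:
Code:
zpool scrub tank       # re-read and verify every block against its checksums
zpool status -v tank   # confirm no new READ/WRITE/CKSUM errors appeared
zpool clear tank       # reset the error counters once you're satisfied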
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
The default configuration is a 2GB swap partition and all remaining space is for ZFS.

Strange, I posted a response to your comment that I was skimming. I definitely am not skimming, as I am curious how this would turn out (it seems it turned out to work). I've actually been Googling things as this thread has progressed to make sure I'm not making any stupid noob errors.
 