Reinstalling/Upgrading FreeNAS 9.10.2-U6 to 11.1-U6.3 ruined my zpool :(

phast1

Cadet
Joined
Jan 6, 2019
Messages
5
This appears to be a fairly common problem: I spent the last day searching and found others with the exact same results, but no specific solutions. It sounds like I need to manually recreate some partition info for the zpool to have any hope of recovering my nearly 15TB of data :( Unfortunately I'm having a hard time finding out how to recreate the zpool info, so I'm now posting here in the hope that someone can help. I'm willing to pay for that help if needed.

Hardware: Supermicro mobo, Xeon E3-1231v3, 16GB ECC RAM, 4x 6TB (HGST HDN726060ALE614) ZFS RAID array (raidz1 I think? my usable space is basically 3x 6TB)

I was running 9.10.2-U6 on a USB drive for years, which suddenly got unrecoverable errors the other day. I figured this was a good time to upgrade to 11.1 and decided to put it on a 250GB SSD this time: one of the features of 11 is the ability to finally run VMs on this server, and I've had lots of similar USB drive issues in the past that I'm hoping to avoid by going with a real SSD.

Side note: I didn't realize at the time that 11.2 had already been released as stable, since I had just gone straight to https://download.freenas.org/11/latest/x64/ and got the ISO from there.

So, I removed the USB drive (still have it in case something can be recovered from it), put the 11.1-U6.3 ISO on a different USB drive and proceeded to install on the SSD. I definitely picked the right drive when asked where to install, and everything seemed to go smoothly, except my zpool was now missing and not showing up in the GUI for importing, nor does it show up in the CLI using commands like "zpool import" or "zpool status".

Code:
[root@freenas] ~# zpool status
  pool: freenas-boot
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          ada4p2    ONLINE       0     0     0

errors: No known data errors
[root@freenas] ~# zpool import
[root@freenas] ~# zpool import -f freenas
cannot import 'freenas': no such pool available


After further research, I checked "gpart show" and can see that all of the partitions are still there:

Code:
[root@freenas] ~# gpart show
=>         34  11721045101  ada0  GPT  (5.5T)
           34           94        - free -  (47K)
          128      4194304     1  freebsd-swap  (2.0G)
      4194432  11716850696     2  freebsd-zfs  (5.5T)
  11721045128            7        - free -  (3.5K)

=>         34  11721045101  ada1  GPT  (5.5T)
           34           94        - free -  (47K)
          128      4194304     1  freebsd-swap  (2.0G)
      4194432  11716850696     2  freebsd-zfs  (5.5T)
  11721045128            7        - free -  (3.5K)

=>         34  11721045101  ada2  GPT  (5.5T)
           34           94        - free -  (47K)
          128      4194304     1  freebsd-swap  (2.0G)
      4194432  11716850696     2  freebsd-zfs  (5.5T)
  11721045128            7        - free -  (3.5K)

=>         34  11721045101  ada3  GPT  (5.5T)
           34           94        - free -  (47K)
          128      4194304     1  freebsd-swap  (2.0G)
      4194432  11716850696     2  freebsd-zfs  (5.5T)
  11721045128            7        - free -  (3.5K)

=>       34  488397101  ada4  GPT  (233G)
         34          6        - free -  (3.0K)
         40     204800     1  efi  (100M)
     204840  488192288     2  freebsd-zfs  (233G)
  488397128          7        - free -  (3.5K)
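
As a sanity check on the gpart output above, the sector counts can be converted to sizes by hand. The sketch below assumes 512-byte sectors (not stated in the post, though it matches the sizes gpart reports) and a 4-disk raidz1 layout, which the poster was only guessing at. The function names are made up for illustration, and the raidz1 figure ignores ZFS metadata and padding overhead.

```shell
#!/bin/sh
# Convert a gpart sector count to whole GiB.
# ASSUMPTION: 512-byte sectors unless a second argument says otherwise.
sectors_to_gib() {
    # $1 = sector count, $2 = optional sector size in bytes
    echo $(( $1 * ${2:-512} / 1024 / 1024 / 1024 ))
}

# Rough usable space of a raidz1 vdev: one disk's worth goes to parity.
# ASSUMPTION: equal-sized members; real usable space will be somewhat lower.
raidz1_usable_gib() {
    # $1 = number of disks, $2 = sectors per freebsd-zfs partition
    echo $(( ($1 - 1) * $(sectors_to_gib "$2") ))
}

sectors_to_gib 4194304            # swap partition, prints 2
sectors_to_gib 11716850696        # freebsd-zfs partition, prints 5587
raidz1_usable_gib 4 11716850696   # prints 16761
```

5587 GiB is about 5.46 TiB, which gpart rounds to 5.5T, and three data disks' worth comes to roughly 16.4 TiB, in line with "basically 3x 6TB" of usable space.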


However, the zpool info seems to be gone, including any labels. I don't think I ever created any specific labels in the past, but I hear it's bad if zdb doesn't see any, as here:

Code:
[root@freenas] ~# zdb -l /dev/ada0
--------------------------------------------
LABEL 0
--------------------------------------------
failed to unpack label 0
--------------------------------------------
LABEL 1
--------------------------------------------
failed to unpack label 1
--------------------------------------------
LABEL 2
--------------------------------------------
failed to unpack label 2
--------------------------------------------
LABEL 3
--------------------------------------------
failed to unpack label 3
[root@freenas] ~# zdb -l /dev/ada0p2
--------------------------------------------
LABEL 0
--------------------------------------------
failed to unpack label 0
--------------------------------------------
LABEL 1
--------------------------------------------
failed to unpack label 1
--------------------------------------------
LABEL 2
--------------------------------------------
failed to unpack label 2
--------------------------------------------
LABEL 3
--------------------------------------------
failed to unpack label 3


(All the other drives give the same results, except ada4p2 on the 250GB drive with the running FreeNAS on it.)
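
Rather than reading four long zdb dumps by eye, the output can be summarized per device. This is only a sketch: `count_failed_labels` is a hypothetical helper that counts the "failed to unpack label" lines seen above, and the device names in the commented loop are the ones from this post.

```shell
#!/bin/sh
# Read `zdb -l` output on stdin and print how many labels failed to unpack.
# A ZFS vdev carries four labels, so 4 means zdb found none at all.
count_failed_labels() {
    # grep -c exits non-zero when the count is 0; `|| true` hides that
    grep -c 'failed to unpack label' || true
}

# Hypothetical usage on the FreeNAS box (needs root and zdb):
# for dev in ada0p2 ada1p2 ada2p2 ada3p2; do
#     printf '%s: ' "$dev"
#     zdb -l "/dev/$dev" | count_failed_labels
# done
```

A count of 4 on every data partition matches what is shown above; it only says that zdb could not read labels at that device node, not why.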

I then tried going back to FreeNAS 9.10.2-U6 hoping that it was just an issue with 11 not being able to see my particular zpool, but unfortunately the results are the same there now too. I even tried restoring my FreeNAS config backup, which then shows my "freenas" zpool volume in the GUI, but it's in a LOCKED state and fails to unlock, since it doesn't actually exist anymore.

Other notes:

The data on this volume isn't mission critical and I didn't have anywhere big enough to back it all up to, so no, I don't have any backup that I could use to restore from, but it's still a lot of data that I really don't want to lose so I really hope someone here is able to help me recover it.

I'm not sure if the volume was encrypted, but I did see geli mentioned when I ran "fstyp -u /dev/ada0p2", so I think it may be and I don't remember making any encryption key backup, although maybe the USB drive that failed could have something recovered from it if absolutely necessary, or maybe my FreeNAS config backup has it?

I searched for ZFS recovery software and it doesn't seem like there is much available, other than https://www.r-explorer.com/, which looks promising, but I don't want to just start blindly trying things and make the situation worse than it already is.

The most concerning thing about all this is why did installing FreeNAS 11.1 do this to me? Did I do something wrong or is it some bug? Looking back, I guess I should have just unplugged the 6TB drives while doing the new install and I'd probably be fine, but I had no idea that a new FreeNAS install had such risks and have never experienced this before when reinstalling FreeNAS.

Thanks in advance for any help getting me out of this bad situation.
 

phast1

Cadet
Joined
Jan 6, 2019
Messages
5
I can't believe it, but I just fixed it and got my data back! The solution wasn't what I expected though, since I had already tried reverting to 9.10.2 and restoring my config backup, so I was pretty convinced that it was something trashed with the zpool on the drives themselves and was going to need a complicated gpart or similar solution to repair it, based on everything I read online over the past day.

But nope, what I ended up doing was putting my corrupt USB drive with my original FreeNAS 9.10.2 on it into my desktop Linux PC (actually Qubes OS) and dumping it to a file using dd like this:

dd if=/dev/xvdl of=freenas-boot-corrupt.bin iflag=direct conv=noerror,sync

And then I just dumped it back to a different USB drive like this:

dd if=freenas-boot-corrupt.bin of=/dev/xvdl

After that, I just put it back into my FreeNAS server and booted onto it successfully, including my lovely zpool volume in full working order, yay!!
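
For anyone repeating the dd approach above, here is a minimal sketch that wraps the two steps and adds a read-back check. The function names are hypothetical, and `iflag=direct` from the original command is left out so the sketch also runs against ordinary files. Double-check the target device before writing, since dd will overwrite whatever it is pointed at.

```shell
#!/bin/sh
# Copy a source device (or file) to an image, continuing past read errors.
# conv=noerror,sync keeps going on unreadable blocks and pads them with
# zeros, so offsets in the image stay aligned with the source.
image_device() {
    # $1 = source (e.g. /dev/xvdl in the post), $2 = image file
    dd if="$1" of="$2" bs=512 conv=noerror,sync 2>/dev/null
}

# Write an image to a target, then confirm that the first image-sized
# chunk of the target reads back identical to the image.
restore_and_verify() {
    # $1 = image file, $2 = target device (or file)
    dd if="$1" of="$2" bs=512 2>/dev/null || return 1
    size=$(( $(wc -c < "$1") ))
    head -c "$size" "$2" | cmp -s - "$1"
}

# Hypothetical usage, with the names from the post:
# image_device /dev/xvdl freenas-boot-corrupt.bin
# restore_and_verify freenas-boot-corrupt.bin /dev/xvdl && echo "verified"
```

A small block size is slow, but with `conv=noerror,sync` each unreadable block is replaced by zeros in full, so a smaller `bs` loses less data around a bad spot.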

So yeah, next step: get a working backup solution in place immediately, since I evidently value this data more than I thought I did, haha

In any case, I am still confused about why this happened in the first place and hope someone can shed some light on what I did wrong so I can try to avoid it the next time I attempt to reinstall/upgrade to 11.x
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
Creating a backup of your data is a must.
What you are missing is that you did a fresh install, so the config is not available.
Now that you have your SSD or USB, run an update to 11.2 and see how it goes.
If you go for a brand new install of 11.2, once the system is up, import the volumes from the Storage section and select all the disks. Make sure to select import and not start from scratch. This area is a very slippery road.

Is your volume encrypted?
If not, upon starting the import, it will show you the volumes it detects; proceed through all of them one at a time. It will take time, so be patient.
After your volumes are imported, you will have to set up user accounts, shares and the like as they used to be.
Ideally, upgrading from one release to the next is the better choice. You will always have the ability to upgrade, and if the upgrade fails you can still reboot from the previous stable boot configuration.
 

phast1

Cadet
Joined
Jan 6, 2019
Messages
5
Thanks very much for the info!

I ended up getting 4 more drives in the server and set them up as a 4x4TB stripe volume, then set up backups to this new volume using the snapshot/replication feature. This all went well and I now have 2 copies of my data, yay :)

I then decided to try the in-place upgrade to 11.2 STABLE using the web GUI. This also went well for the most part: it came back up on 11.2 with working pools, etc. The only problem now was that plexmediaserver wasn't online. The jail was running and jexec into it worked, but I noticed it couldn't ping anything from inside the jail.

I saw the ZFS pools needed an upgrade, so I thought upgrading them and rebooting the server was probably a good idea before putting much more effort into the plex situation.

This is when everything went downhill, since it wouldn't boot back up :( Originally it was in a reboot loop where it would reset as soon as it tried to boot off the USB, so I tried forcing a UEFI boot. It at least attempted to boot then, but gave an "Invalid Format!" error along with "BTX halted".

I searched Google and found a couple of other people who experienced this after an 11.x upgrade, but the only solution I found mentioned (other than setting AHCI on the SATA controller, which was already set) was to get a separate USB with the 11.2 install ISO on it and use the upgrade option pointed at the original USB. This sounded reasonable, so I tried it, but unfortunately I couldn't find how to make it upgrade instead of doing a fresh install. It seemed to only want to do a fresh install and I didn't want to overwrite the original USB, so I went ahead and just did a fresh install on the SSD drive for now.

So now I can either get this fresh install working with my pools, do a new install of plex, setup sharing, etc, or I can try to repair the original upgrade on the USB somehow. Any suggestions?

I looked in the Storage section on this fresh install and the only Import I can find is the "Import Disk" menu option, but I'm not convinced this is the right place to import my pools, since it only lets me select 1 disk (actually 1 partition on 1 disk), and asks for filesystem type and destination? I also went to "Disks" and tried selecting all the disks there, but it only gives a batch edit option when I do that?
 

phast1

Cadet
Joined
Jan 6, 2019
Messages
5
Never mind on the pool import question, I RTFM and figured it out (had to go to Storage -> Pools -> Add and then there is an import option) and have successfully imported both pools into the fresh install.

Now to figure out how to get Plex running again, hmm..
 

phast1

Cadet
Joined
Jan 6, 2019
Messages
5
Ok, that was a pain, but I finally got Plex running now in a new iocage jail and all data restored from the old legacy jail, whew.

I think I should be all set now on this new install of 11.2 on the SSD drive, once I verify shares and backups are still working as expected, but they should be easy enough to fix if they aren't working already.

Thanks again for the help!
 