Recover data from ZFS pool

reznor244

Cadet
Joined
Dec 16, 2023
Messages
5
Hello,

Background:
First off, let me say thanks in advance and admit that I am a noob to NAS and ZFS, and undoubtedly broke rules in configuring things. I have TrueNAS running as a virtual machine in Proxmox, and one day my server went down hard. I believe I had the pool imported in both Proxmox and TrueNAS at the same time (now I know better and should have used shares). After the crash, TrueNAS would not boot because it fails to import the pool. At this point I believe the pool is completely broken and beyond repair, but I would like to recover as many files as I can.

What I've tried:
I followed the steps here: https://www.truenas.com/community/threads/zfs-has-failed-you.11951/ to boot TrueNAS using Escape to Loader and drop to a shell. I then imported the pool using
Code:
zpool import -o readonly=on -fR /mnt/pool pool
. The import succeeds, but I can't find any of the files in
Code:
/mnt/pool
where I would expect them.
Code:
zpool status -v
returns the following:
Code:
  pool: pool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 240K in 00:18:59 with 0 errors on Sun Nov 12 02:18:59 2023
config:

        NAME        STATE     READ WRITE CKSUM
        pool        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdb2    ONLINE       0     0     4
            sda2    ONLINE       0     0     4

errors: Permanent errors have been detected in the following files:

        pool:<0x800e>
        pool/.system/rrd-b85043590a434da692c02c2416a40e36:<0x22>
        pool/.system/services:<0x0>


Attempting to import without
Code:
-o readonly=on
causes the zpool command to hang, and nothing helps short of rebooting. I have probably killed the pool, but is there any chance to get my files back? I am confused about why the pool imports and reports ONLINE, yet the files are not there.
 
Joined
Oct 22, 2019
Messages
3,641
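With the pool imported read-only, let's first see whether the datasets themselves are still intact. The commands below are safe against a read-only import:

```shell
# List every filesystem in the pool with a space breakdown;
# if the datasets and their USED values look sane, the data
# is probably still on disk.
zfs list -t filesystem -r -o space

# The files may be present but simply not mounted under your
# altroot. Check mount status and mountpoints, then try
# mounting everything:
zfs get -r mounted,mountpoint pool
zfs mount -a
```

If `zfs list` errors out partway through, that points at damaged pool metadata rather than missing data.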

reznor244

Cadet
Joined
Dec 16, 2023
Messages
5
Thank you for the response. I just executed the suggested command and get "cannot iterate filesystems: I/O error" but then it does show some information, including the datasets I had created.

Code:
zfs list -t filesystem -r -o space
cannot iterate filesystems: I/O error
NAME                       AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
pool                       6.77T   378G        0B    210G             0B       168G
pool/.system               6.77T  1.09G        0B    854M             0B       258M
pool/.system/cores         1023M  1.09M        0B   1.09M             0B         0B
pool/.system/webui         6.77T   200K        0B    200K             0B         0B
pool/iocage                6.77T  10.2M        0B   8.98M             0B      1.17M
pool/iocage/download       6.77T   200K        0B    200K             0B         0B
pool/iocage/images         6.77T   200K        0B    200K             0B         0B
pool/iocage/jails          6.77T   200K        0B    200K             0B         0B
pool/iocage/log            6.77T   200K        0B    200K             0B         0B
pool/iocage/releases       6.77T   200K        0B    200K             0B         0B
pool/iocage/templates      6.77T   200K        0B    200K             0B         0B
pool/time-machine          6.77T   167G        0B    208K             0B       167G
pool/time-machine/brandon  6.77T   167G     2.76G    165G             0B         0B
 
Joined
Oct 22, 2019
Messages
3,641
I believe that by having the ZFS pool imported and used by both Proxmox and TrueNAS simultaneously, you might have hosed some important metadata, and hence killed your pool. :confused:

You might try to do a "recovery" import. YMMV.
 

reznor244

Cadet
Joined
Dec 16, 2023
Messages
5
Thank you for all your help. I am giving up on fixing this pool or even reading the data. Fortunately I do have copies of everything that was on the pool, just not all in one spot.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
@reznor244 I'm happy to hear that you do have copies, but if you want to take a last shot at this before sending it to the bit-bucket in the sky:

Set these two sysctls to speed up pool rewind on import:
Code:
sysctl vfs.zfs.spa.load_verify_metadata=0
sysctl vfs.zfs.spa.load_verify_data=0


then run zpool import -FXfn pool, and after some time it should kick back something like:

Code:
Would be able to return pool to its state as of (INSERT DATE AND TIME HERE)
Would discard approximately N seconds of transactions.


Make your judgement call if you can live with that, then issue
zpool import -FXf pool

If it imports, reboot the system through the webUI, make sure your pool re-imports automatically, and fire off a scrub.

Did you by any chance pass your HBA/storage controller through to Proxmox? ;)
 

reznor244

Cadet
Joined
Dec 16, 2023
Messages
5
@HoneyBadger Thank you very much, but I've already blown away the pool and created a new one. I wish I would've waited long enough to try your comment, mostly out of curiosity as I hadn't seen some of those options in other threads. I've already found all of the data on backups and am copying to the new pool, so no loss.

Yes, I am passing my drives to my TrueNAS VM by ID / serial number in Proxmox. The bad thing I did was importing the zfs pool directly in both TrueNAS and Proxmox and then sharing into VMs/containers. Now I am doing things better (hopefully) by only importing in TrueNAS and then setting up NFS shares which are used for other VMs/containers to access.
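For anyone curious, the consumer side is just an ordinary NFS mount inside the VM or container (the hostname and paths below are placeholders, not my exact setup):

```shell
# Example only: mount a TrueNAS NFS export inside a Linux VM.
# "truenas.local" and the paths are placeholders.
sudo mkdir -p /mnt/media
sudo mount -t nfs truenas.local:/mnt/pool/media /mnt/media

# Or make it persistent via /etc/fstab:
# truenas.local:/mnt/pool/media  /mnt/media  nfs  defaults,_netdev  0  0
```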

I plan to do a proper backup, too, after restoring everything, rather than having my files spread across multiple platforms.
 

victort

Guru
Joined
Dec 31, 2021
Messages
973
Yes, I am passing my drives to my TrueNAS VM by ID / serial number in Proxmox.
I’m going to jump in here and say you should not be passing the drives through individually. It should always be done with a HBA card.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
@HoneyBadger
Yes, I am passing my drives to my TrueNAS VM by ID / serial number in Proxmox.
I suggest you read the link under the Useful Links section of my signature, specifically point #4. That's a recipe for another disaster in the future.

For the record, I too am virtualizing TrueNAS CORE under Proxmox, but I am not passing my drives individually like that.
 
Last edited:

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
As mentioned by some of the other users, the correct method is to use IOMMU/PCI Passthrough to assign an HBA or storage controller exclusively to the VM itself. See the blog article "Yes, you can (still) virtualize TrueNAS" for some additional details.
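Assuming IOMMU is enabled in the BIOS and on the kernel command line, handing a whole HBA to the VM on Proxmox looks roughly like this (the VM ID and PCI address below are examples; substitute your own):

```shell
# Find the HBA's PCI address (e.g. a common LSI SAS card):
lspci | grep -i -e sas -e lsi

# Pass the device at 0000:01:00.0 through to VM 100.
# Both the VM ID and the address are examples.
qm set 100 -hostpci0 0000:01:00.0
```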

Proxmox presents an additional challenge over VMware because it also understands ZFS and will occasionally attempt to import the pool on its own. I've seen this pop up more often of late, so I may need to do some experimentation; behavior may have changed in a newer Proxmox release that causes it to import ZFS pools on boot. If that's the case, even PCI passthrough alone may not be sufficient: on a cold boot of the host, Proxmox would mount the pool, then have it forcibly yanked away when the storage controller is passed through to the TrueNAS VM as it boots. That's a great way to end up with an inconsistent pool state, or at the very least unnecessary ZIL replays.
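Until that's confirmed, one possible workaround (untested by me, and assuming a stock Debian-based Proxmox with the usual OpenZFS systemd units) is to stop the host from scanning attached disks for importable pools at boot:

```shell
# OpenZFS normally imports pools at boot via these two units;
# disabling the scan service keeps the host from grabbing pools
# it merely discovers on attached disks. Do NOT do this if
# Proxmox itself boots from, or stores VM disks on, ZFS.
systemctl disable zfs-import-scan.service

# Cache-based imports still apply, but a data pool that was
# never imported on the host shouldn't be in the host's
# /etc/zfs/zpool.cache, so this unit should leave it alone.
systemctl status zfs-import-cache.service
```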
 

reznor244

Cadet
Joined
Dec 16, 2023
Messages
5
This is a deep pool to jump into (pun intended), so I appreciate all the help. It seems I read some bad advice (on other forums) that said to pass the drives individually.

I will look at HBA cards. Will I have to start my pool over again or can I just pop the card / drives in and point TrueNAS at the new location of the pool? Either way I will be doing a proper backup of all the data before disaster strikes this time.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
I will look at HBA cards. Will I have to start my pool over again or can I just pop the card / drives in and point TrueNAS at the new location of the pool? Either way I will be doing a proper backup of all the data before disaster strikes this time.
This is actually the point of doing it this way (passing the whole controller through). If disaster strikes, you can just put the drives into another functioning computer, boot it up, load up TrueNAS on it, restore your config, and you should be back online as if nothing happened. Or you could drop the hypervisor entirely and run bare metal on the same computer, and it should function as normal. It makes your setup more portable and robust. The link I mentioned above actually points this out.

As to whether or not you have to restart the pool, I'm not sure, because honestly I've never passed the drives individually like that, so I don't have enough experience to say whether the import will succeed.
 