Corrupted Pool - KDB: enter:panic - can only restore ReadOnly

Mirdain

Cadet
Joined
Apr 25, 2023
Messages
7

System #1 - Original (During Failure)

ASRock E3C224D4I-14S (onboard LSI in IT mode, 8x HDDs)
32 GB Crucial 1600 ECC RAM
TrueNAS 12-U4 (I believe), booting from USB
System running ~4-5 years (just normal HDD failures, no huge drama)

Failure:

A few days ago my system began to reboot constantly after an apparent HDD failure.
A kernel panic was causing it to cycle.
I had assumed it was a hardware problem (RAM/mobo/USB drive/HDD etc.).
I was able to boot into TrueNAS and began a scrub, and I believe it rebooted during it (looks like 39.44%, based on the status below).
I had plans to upgrade anyway, so I moved everything over and reinstalled TrueNAS on some unused gear.
At this point I had upgraded to TrueNAS 13.0-U4, hoping it was a USB issue.

System #2 - Troubleshooting

Asus Crosshair Formula VIII
AMD 3900x
LSI 1920-8i firmware 20.x in IT mode.
TrueNAS 13.0-U4 Samsung 840 SSD

The same symptoms persisted with the new system, which helped eliminate a lot of components.
I thought I had a bad drive, so I replaced the suspect drive (spare on hand) and also tried unplugging one drive at a time (no change).

This thread was helpful to try to get my pool back.
I have uploaded some screen caps of the results and the panic I am getting... only 1200 data errors...

20230425_225612.jpg


If I did a normal import I would get the panic shown below.
20230425_193541.jpg


So I can load the pool read-only just fine, but I'm not sure what the next steps are, or if I have any options left.
I have snapshots but have never rolled one back from the console. I did try, but it says it is unable to because the pool is read-only.
The network is also not connected, as I can't get a full boot without a panic.
I also don't want to make any changes to the pool if there is any chance of saving this data.
I was able to get it to try to load through the GUI once, but of course it panicked, no different than from the console (attached).
Looking for help with next steps on this one.

Thanks for your time -
 

Attachments

  • Capture.PNG (40.5 KB)

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222

Mirdain

Cadet
Joined
Apr 25, 2023
Messages
7
Before doing anything I would back up any important data.
I would completely agree but this is where I am having issues.

I can only seem to mount it under single user mode and in read-only.

Because of this - I cannot access the information as usual from what I can tell.

What would be the best mode to back the data up/retrieve it?

When I try to mount it from the shell in the GUI, it panics even with read-only, so my SMB shares are not available as usual.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Not looking good.
Maybe you could try a resync with an external HDD.

Possible line of action (just laying it out, not suggesting you run it yet):
zpool scrub -s [poolname] stops the scrub, save config, export pool, reinstall OS, import pool, upload config.
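
The sequence above could be sketched roughly as follows. This is only a dry run that prints the commands rather than executing them. The pool name `tank` and the `APPLY` switch are assumptions made for illustration, and the config save/upload and OS reinstall happen in the TrueNAS GUI and installer, so they appear only as comments.

```shell
#!/bin/sh
# Dry-run sketch: prints each command instead of running it.
# Set APPLY=1 to really execute. POOL is a placeholder pool name.
POOL="${POOL:-tank}"

run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

plan() {
    run zpool scrub -s "$POOL"   # stop the in-progress scrub
    # (save the TrueNAS config from the GUI here)
    run zpool export "$POOL"     # export the pool before reinstalling
    # (reinstall the OS, then boot the fresh install)
    run zpool import "$POOL"     # import the pool on the new install
    # (upload the saved config from the GUI here)
}

plan
```

Reviewing the printed commands before setting `APPLY=1` fits the "not suggesting to run it yet" spirit of the post.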
 
Joined
Jun 15, 2022
Messages
674
"TrueNAS 13.0-U4 Samsung 840 SSD" "I can only seem to mount it under single user mode and in read-only."

Correct me if I'm wrong, but doesn't this sound like the OS boot drive is an SSD (flash memory) that has failed and therefore gone into read-only mode?

If this is the case, the solution is perhaps something like:
  1. add a new SSD,
  2. boot a TrueNAS ISO from a USB stick and install it on the new SSD,
  3. boot the new SSD and copy the config off the bad SSD to the good SSD,
  4. remove the bad SSD,
  5. boot the new SSD,
  6. check things didn't break.
 

Mirdain

Cadet
Joined
Apr 25, 2023
Messages
7
"TrueNAS 13.0-U4 Samsung 840 SSD" "I can only seem to mount it under single user mode and in read-only."

I was originally running from a USB drive (System #1) when the issue happened.
Thinking the same thing (possible corruption), I migrated to another box, a move I had been planning to make anyway.
The second machine has that SSD, which I did a fresh install on (same result: panic when mounting the pool).
 

Mirdain

Cadet
Joined
Apr 25, 2023
Messages
7
Not looking good.
Maybe you could try a resync with an external HDD.

Possible line of action (just laying it out, not suggesting you run it yet):
zpool scrub -s [poolname] stops the scrub, save config, export pool, reinstall OS, import pool, upload config.
I'm wondering if stopping that scrub would allow the pool to be mounted, or whether I should start it again and let it finish with the new drive?

I'm not sure how dangerous that is and am being cautious trying to find the best steps on how to recover the pool.
 

Mirdain

Cadet
Joined
Apr 25, 2023
Messages
7
If you're looking to rewind to checkpoint or rollback to a TXG with -T, you need to do that as part of the import command:
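
A minimal sketch of what a rewind at import time can look like, assuming a pool named `tank` (a placeholder) and a hypothetical `TXG` variable. It only prints the command unless `APPLY=1` is set. `--rewind-to-checkpoint` only helps if a checkpoint was taken earlier, and `-T` is documented as hazardous, so both are combined with `readonly=on` here.

```shell
#!/bin/sh
# Dry-run sketch of rewinding a pool as part of the import.
# POOL and TXG are placeholders; set APPLY=1 to really execute.
POOL="${POOL:-tank}"
TXG="${TXG:-}"   # leave empty to use the checkpoint instead of a TXG

run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

rewind_import() {
    if [ -n "$TXG" ]; then
        # Roll back to a specific transaction group, read-only.
        run zpool import -o readonly=on -f -T "$TXG" "$POOL"
    else
        # Open the pool read-only at its checkpointed state.
        run zpool import -o readonly=on --rewind-to-checkpoint "$POOL"
    fi
}

rewind_import
```

Either way the rewind is an option of `zpool import` itself, so the pool has to be exported (not imported) when the command is run.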
Do I need to have the feature flags enabled to use checkpoints?

I currently do not have some feature flags enabled as I don't think I upgraded my pool.

How do I view the TXGs to know which one to roll back to?
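
One way to list candidate TXGs is to read the uberblocks stored in a member disk's labels with `zdb -ul`. The sketch below is a small filter for that output, assuming it contains lines of the form `txg = N` and `timestamp = N UTC = ...`. The helper name `list_txgs` and the device path in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Reads `zdb -ul <device>` output on stdin and prints one
# "<txg> <unix timestamp>" line per distinct uberblock, oldest first.
list_txgs() {
    awk '$1 == "txg" && $2 == "=" { txg = $3 }
         $1 == "timestamp" && $2 == "=" { print txg, $3 }' | sort -n -u
}

# Usage (placeholder device path):
#   zdb -ul /dev/da0p2 | list_txgs
if [ -n "${1:-}" ]; then
    zdb -ul "$1" | list_txgs
fi
```

Picking a TXG from shortly before the failure time is the usual idea. The timestamps are Unix epoch seconds and can be compared against the pool history.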

I attached a history of the pool but not sure if that is of any help or not.

It looks like I did a scrub on the day of the failure (automatic at 8am Sunday) but it seemed fine and failed later in the day (10:29).

FYI - I was able to unplug the HDD, log in through the GUI, then do a read-only import.

Unfortunately the pool just gives me an error (my guess is that this is due to read-only?)
 

Attachments

  • 20230426_172104.jpg (662 KB)
  • GUI Pool Error.png (14.6 KB)

Mirdain

Cadet
Joined
Apr 25, 2023
Messages
7
Ok - I was able to offer a blood sacrifice and it appears that I have been granted an audience with my Pool under the grounds of read-only......

1682641196941.png


The SMB and other functions are not able to start.

What would be the best thing to do to retrieve my data?
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222

Mirdain

Cadet
Joined
Apr 25, 2023
Messages
7
Just an update for anyone that runs into the same issue.

The critical thing I did not figure out for a while is that after you import your pool in read-only mode via the CLI (in single-user mode), you have to type "exit" to then complete booting.

If I rebooted, it apparently did not keep the read-only flag and would always panic.
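
A rough sketch of that sequence at the single-user prompt. The exact command used in this thread is not shown, so the pool name `tank`, the `-f` flag and the `/mnt` altroot are assumptions, and the script only prints the command unless `APPLY=1` is set.

```shell
#!/bin/sh
# Dry-run sketch of a read-only import from single-user mode.
# POOL is a placeholder; set APPLY=1 to really execute.
POOL="${POOL:-tank}"

run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

readonly_import() {
    # -o readonly=on : import without allowing any writes to the pool
    # -f             : force the import (the pool was never cleanly exported)
    # -R /mnt        : altroot, the directory TrueNAS mounts pools under
    run zpool import -o readonly=on -f -R /mnt "$POOL"
}

readonly_import
# Then type `exit` at the single-user prompt so the boot continues
# into multi-user mode with the pool already imported read-only.
```

Because the read-only setting belongs to that one import, it has to be repeated after every reboot.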

Once I got to the GUI, I was able to run replication tasks to a new pool I made and get all my data back.
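
For reference, replication is built on `zfs send` and `zfs recv`, so a command-line equivalent might look roughly like the sketch below. The snapshot and dataset names are hypothetical. A read-only pool cannot take new snapshots, so the source has to be a snapshot that already exists. The script prints the pipeline unless `APPLY=1` is set.

```shell
#!/bin/sh
# Dry-run sketch of copying a dataset off the read-only pool.
# SRC and DST are hypothetical names; set APPLY=1 to really execute.
SRC="${SRC:-tank/data@auto-2023-04-23}"   # existing snapshot on the old pool
DST="${DST:-rescue/data}"                 # target dataset on the new pool

replicate() {
    if [ "${APPLY:-0}" = "1" ]; then
        # -R includes descendant datasets and their snapshots;
        # -u leaves the received copy unmounted.
        zfs send -R "$SRC" | zfs recv -u "$DST"
    else
        echo "+ zfs send -R $SRC | zfs recv -u $DST"
    fi
}

replicate
```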
 