SOLVED! Help: Kernel Panic Crash. Able to import as readonly but not regularly.

Haibane

Dabbler
Joined
Oct 22, 2023
Messages
18
Hi!
This is my first post, and I am in serious trouble with my dataset. Any help would be much appreciated.

I am on
Linux truenas 6.1.42-production+truenas #2 SMP PREEMPT_DYNAMIC Mon Aug 14 23:21:26 UTC 2023 x86_64

I am using
A Dell Precision desktop with a SAS card attached to 6x 6 TB Seagate SAS HDDs, plus two 8 TB HDDs (which work fine, so I will not discuss them further).

My 6x 6 TB pool named "Six6Tbs" cannot be imported at power-up because the import triggers a kernel panic. I physically disconnected the drives so the computer can boot and I can get a working shell. Here are the commands that work:

Code:
zpool import -o readonly=on Six6Tbs


sudo zfs load-key Six6Tbs
sudo zfs mount -o noxattr Six6Tbs
ls /Six6Tbs


sudo zfs mount -o noxattr "Six6Tbs/6t dataset"
ls '/Six6Tbs/6t dataset'


sudo zpool export Six6Tbs


zpool import -fFX -o readonly=on Six6Tbs


However, any import that is not read-only fails. I also cannot set readonly to off once the pool is already imported.

Any recommendations? Many thanks!
 

Haibane

I updated to the latest RC release. Attached are images showing what works:

1698013574745.png


1698013677699.png



Attached is what happens when I import it normally (not read-only): the import fails and the kernel panics:
1698014012171.png


There is no further terminal output. Below is the display (GPU) output as it crashes:

IMG_20231022_183315.jpg



After it reboots, it fails and gets stuck on this screen:

IMG_20231022_183532.jpg

It stayed like this for over two days with no disk activity.

The only way to get it to boot again is to unplug a drive in the pool, after which I regain a shell to tinker with.

Last notes: all six 6 TB drives are online and healthy. The pool is RAID-Z1, and the datasets also mount fine read-only when only 5 drives are connected.
 

Haibane

I'll Venmo 10 dollars if there's something I can do without having to copy everything off the disks :)
 

mav@

iXsystems
Joined
Sep 29, 2011
Messages
1,428
As far as I can see, the panic happens when ZFS tries to sync the error log for the pool and hits an error while trying to free some space. That is not supposed to happen, so it is not handled any other way. It is difficult to say what the cause is; it could be some earlier pool corruption. I may speculate that the error log it is trying to sync may actually have something to do with the corruption. It does make sense that a read-only import succeeds, since nothing is written to the pool in that mode. Unfortunately, I don't know a way to recover from this situation without offloading the data and recreating the pool.
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
You would need a set of drives to copy your data to, whatever fits it all, and then rebuild a new pool. But you should have backups; no RAID-Z level is a substitute for backups. If you do have backups, then destroy and rebuild the pool and restore from them. If you don't, you need to come up with a plan to make them.
 

Haibane

Thank you all very much! I tried setting readonly to off while the pool was imported, because someone on YouTube was able to do that and then export the pool properly, which seemed to fix the pool for the next import. Unfortunately, when I try to modify the readonly setting, I am told that it can only be changed at import time; but if I do that, the system freezes and the kernel panics.
 

Haibane

Update: I see the exact same kernel panic after updating to the latest release. I will start trying to copy the data off the disks.
I have 8 SAS connectors, 4x 8 TB drives, and 6x 6 TB drives.
Here is the plan:
1. Move as much as possible onto the existing 2x 8 TB pool.
2. Unplug a 6 TB drive, plug in a new 8 TB drive, and expand the existing 2x 8 TB pool.
3. Move some more data to the 3x 8 TB pool.
4. Unplug another 6 TB drive, plug in another 8 TB drive, and expand to a 4x 8 TB pool.
5. Wipe the 6 TB drives and make a new 4x 6 TB pool.
6. Move data there for backup.
7. Sell the remaining 2x 6 TB drives.
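
As a rough sanity check on the plan above, the usable capacity of each layout can be estimated with simple arithmetic. This is only a sketch: `raidz1_tb` and `stripe_tb` are hypothetical helper names, the figures are raw terabytes that ignore ZFS overhead, and the layouts assumed here for the 8 TB pool (a plain stripe) and for the new 4x 6 TB pool (RAID-Z1) are assumptions, not something stated in the plan.

```shell
#!/bin/sh
# Rough usable capacity in TB, ignoring ZFS metadata and padding overhead.

# RAID-Z1 keeps one disk's worth of parity: (disks - 1) * size.
raidz1_tb() {
    echo $(( ($1 - 1) * $2 ))
}

# A plain stripe has no redundancy: disks * size.
# ASSUMPTION: the 8 TB pool is a stripe.
stripe_tb() {
    echo $(( $1 * $2 ))
}

echo "current 6x 6 TB RAID-Z1:       $(raidz1_tb 6 6) TB"
echo "8 TB pool as a 2-disk stripe:  $(stripe_tb 2 8) TB"
echo "8 TB pool as a 4-disk stripe:  $(stripe_tb 4 8) TB"
# ASSUMPTION: the rebuilt 4x 6 TB pool would be RAID-Z1 again.
echo "new 4x 6 TB pool as RAID-Z1:   $(raidz1_tb 4 6) TB"
```

Under these assumptions, a nearly full old pool would fit on the 8 TB drives only once all four are in place, and would not fit back onto a 4-disk RAID-Z1 of 6 TB drives. Note also that RAID-Z1 tolerates only one missing disk, so everything needed from the old pool would have to be copied before the second 6 TB drive is unplugged.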


Screenshot 2023-12-04 173936.png
 

Haibane

Unable to copy, or the copy only lands on the boot drive.

Tree breakdown:
to
'Open_Stripe_8t/recovery_move'

from
'Six6Tbs/6t dataset/Secure'
'Six6Tbs/6t dataset/6tb level2 dataset'

The names above are as displayed in the web GUI under Datasets.

Detailed log
  1. Unplug the broken pool. Restart. Replug the broken pool.
  2. Use the previous commands to import read-only, unlock, and mount the dataset, all from the command line.
    1. Note: you cannot mount it under mnt/broken read-only. It only mounts directly under the root directory (/).
  3. Use rsync: sudo rsync -av --progress /source/path /destination/path
Code:
root@truenas[~]#
root@truenas[~]# sudo rsync -av --progress /source/path/Six6Tbs/6t\ dataset/Secure/Open_Stripe_8t/recovery_move /destination/path

sending incremental file list
rsync: [sender] change_dir "/source/path/Six6Tbs/6t dataset/Secure/Open_Stripe_8t" failed: No such file or directory (2)
rsync: [Receiver] change_dir#3 "/destination" failed: No such file or directory (2)
rsync error: errors selecting input/output files, dirs (code 3) at main.c(829) [Receiver=3.2.7]
root@truenas[~]# sudo rsync -av --progress 'Six6Tbs/6t dataset/Secure' 'Open_Stripe_8t/recovery_move'
sending incremental file list
rsync: [sender] change_dir "/root/Six6Tbs/6t dataset" failed: No such file or directory (2)
rsync: [Receiver] change_dir#3 "/root/Open_Stripe_8t" failed: No such file or directory (2)
rsync error: errors selecting input/output files, dirs (code 3) at main.c(829) [Receiver=3.2.7]
root@truenas[~]#
root@truenas[~]# sudo rsync -av --progress 'Six6Tbs/6t dataset/Secure' 'Open_Stripe_8t/recovery_move'
sending incremental file list
rsync: [sender] change_dir "/root/Six6Tbs/6t dataset" failed: No such file or directory (2)
rsync: [Receiver] change_dir#3 "/root/Open_Stripe_8t" failed: No such file or directory (2)
rsync error: errors selecting input/output files, dirs (code 3) at main.c(829) [Receiver=3.2.7]
root@truenas[~]# sudo rsync -av --progress 'Six6Tbs/6t dataset' 'Open_Stripe_8t/recovery_move'
sending incremental file list
rsync: [sender] change_dir "/root/Six6Tbs" failed: No such file or directory (2)
rsync: [Receiver] change_dir#3 "/root/Open_Stripe_8t" failed: No such file or directory (2)
rsync error: errors selecting input/output files, dirs (code 3) at main.c(829) [Receiver=3.2.7]
root@truenas[~]# sudo rsync -av --progress 'Six6Tbs/6t dataset' 'mnt/Open_Stripe_8t/recovery_move'
sending incremental file list
rsync: [sender] change_dir "/root/Six6Tbs" failed: No such file or directory (2)
rsync: [Receiver] change_dir#3 "/root/mnt/Open_Stripe_8t" failed: No such file or directory (2)
rsync error: errors selecting input/output files, dirs (code 3) at main.c(829) [Receiver=3.2.7]
root@truenas[~]# cd /
root@truenas[/]# sudo rsync -av --progress 'Six6Tbs/6t dataset' 'mnt/Open_Stripe_8t/recovery_move'
sending incremental file list
6t dataset/

sent 72 bytes  received 20 bytes  184.00 bytes/sec
total size is 0  speedup is 0.00
root@truenas[/]# sudo rsync -av --progress '/Six6Tbs/6t dataset' '/mnt/Open_Stripe_8t/recovery_move'
sending incremental file list

sent 73 bytes  received 17 bytes  180.00 bytes/sec
total size is 0  speedup is 0.00
root@truenas[/]# sudo rsync -av --progress '/mnt/Six6Tbs/6t dataset' '/mnt/Open_Stripe_8t/recovery_move'
sending incremental file list
rsync: [sender] change_dir "/mnt/Six6Tbs" failed: No such file or directory (2)

sent 19 bytes  received 12 bytes  62.00 bytes/sec
total size is 0  speedup is 0.00
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1338) [sender=3.2.7]
root@truenas[/]# sudo rsync -av --progress '/mnt/Six6Tbs/6t dataset' '/mnt/Open_Stripe_8t/recovery_move'


I have spent more than a week on this. Any idea what is wrong? What should the paths be?
 

Haibane

Here is the final solution.
1. Import the pool read-only
2. Load the key
3. Mount
4. Repeat for every dataset root
5. Open a tmux session
6. Copy to a healthy dataset
It is simple once you get the non-standard terminology (import vs. load-key vs. mount) and the path quirks: a path with a leading / is absolute, while a path without one is resolved relative to the current directory. Also, the default mount point is /mnt/DatasetName, but a pool imported manually as below mounts directly at /DatasetName. Finally, rsync creates a missing destination directory without warning. Good luck recovering!

Another tip: when you get a lot of errors like
"No such device or address (6) 6tb level2 dataset/G...exe failed verification -- update discarded (will try again). WARNING: 6tb level2 dataset...": Input/output error (5)"
just reboot the whole system and repeat the process; that cleared it for me. The likely cause is a failing source disk.

Code:
zpool import -o readonly=on Six6Tbs

sudo zfs load-key Six6Tbs
sudo zfs mount -o noxattr Six6Tbs
ls /Six6Tbs


sudo zfs mount -o noxattr "Six6Tbs/6t dataset"
ls '/Six6Tbs/6t dataset'


cd '/Six6Tbs/6t dataset/6tb level2 dataset'
# No need to load a key here, because it is inherited

sudo zfs mount 'Six6Tbs/6t dataset/6tb level2 dataset'

sudo zfs load-key 'Six6Tbs/6t dataset/Secure'

sudo zfs mount 'Six6Tbs/6t dataset/Secure'

# Done mounting!

tmux

sudo rsync -av --ignore-existing --ignore-errors --exclude='6tb level2 dataset/BASSEST all files/*' '/Six6Tbs/6t dataset/6tb level2 dataset' '/mnt/Open_Stripe_8t/try3_6tblv2'

tmux attach-session
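
Since most of the failed attempts earlier in this thread came down to relative paths or an unmounted (empty) source, a small pre-flight check before starting a long rsync run may help. The following is a minimal sketch, not part of the original solution: `preflight` is a hypothetical helper name, and it only checks that both paths are absolute, that both directories exist, and that the source is not empty.

```shell
#!/bin/sh
# Hypothetical helper: refuse to start a copy unless both paths are absolute,
# both directories exist, and the source actually contains something.
preflight() {
    src=$1
    dst=$2
    case $src in
        /*) ;;
        *) echo "source is not an absolute path: $src" >&2; return 1 ;;
    esac
    case $dst in
        /*) ;;
        *) echo "destination is not an absolute path: $dst" >&2; return 1 ;;
    esac
    [ -d "$src" ] || { echo "source directory missing: $src" >&2; return 1; }
    [ -d "$dst" ] || { echo "destination directory missing: $dst" >&2; return 1; }
    # An empty source often means the dataset is not mounted or its key is not loaded.
    [ -n "$(ls -A "$src")" ] || { echo "source is empty (dataset not mounted?): $src" >&2; return 1; }
    echo "ok"
}

# Example use (paths taken from the commands above):
#   preflight '/Six6Tbs/6t dataset/6tb level2 dataset' '/mnt/Open_Stripe_8t/try3_6tblv2' \
#       && sudo rsync -av --ignore-existing '/Six6Tbs/6t dataset/6tb level2 dataset' '/mnt/Open_Stripe_8t/try3_6tblv2'
```

Requiring the destination directory to exist is a deliberate choice here: it avoids a mistyped destination being created somewhere unexpected, such as on the boot drive.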
 

Haibane

For anybody who has experienced this issue (I am not the only one), here is some help:
1. Check whether the 5 V power rail for the SAS drives is failing or insufficient. That was the cause for me.
2. Replicated copies of the corrupted files will also cause a kernel panic. They cannot be saved unless you copy the files to a different system.
3. When corrupted files are mounted read-write, simply cd-ing to them in the web shell, or clicking on them over SMB, will trigger a kernel panic.

good luck.
 