Unhealthy pool

Papid1975

Dabbler
Joined
Jun 29, 2020
Messages
40
One of my pools is marked “ONLINE (Unhealthy)”. I expected some kind of detailed information nearby or as a tool tip. Couldn’t find anything. What exactly is wrong and how can I fix this? What does TrueNAS mean by “unhealthy”?

Screenshot 2021-05-11 at 16.00.36.png
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
If you have a look under Storage | Pools and click the cogwheel to the right and then Status, you can see what's going on.

Actually, since you captured that screenshot from the dashboard, you can just skip straight to the cogwheel on the same "card" to get to the status.
 

Papid1975

Thank you for your answer. I clicked on Status, and it shows me this:

Screenshot 2021-05-11 at 16.39.12.png


Now I know that there are actually 2 errors, but there are still no details.
 

sretalla

OK; so we'll need to switch to the CLI/shell to find out more.

If you go to the shell and type zpool status -v, we should see something similar, but hopefully a bit more useful.
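
A minimal sketch of that step, for reference. `zpool status -v` is the real command; `errors_section` is a hypothetical helper name (not part of ZFS or TrueNAS) that trims the output down to the `errors:` list. It assumes the list runs from the `errors:` line to the next `pool:` line or to the end of the output.

```shell
#!/bin/sh
# Print only the "errors:" section(s) of `zpool status -v` output read
# from stdin. Hypothetical helper, not a ZFS tool.
# Assumption: each error list starts at a line beginning with "errors:"
# and ends at the next "pool:" line or at end of input.
errors_section() {
    awk '
        /^[[:space:]]*pool:/ { in_errors = 0 }   # a new pool report begins
        /^errors:/           { in_errors = 1 }   # the error list begins
        in_errors && NF      { print }           # print it, skipping blank lines
    '
}

# Usage on a live system (typically needs root):
#   zpool status -v | errors_section
```

Piping through a filter like this only shortens what you read; the full `zpool status -v` output is still the thing to post when asking for help.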
 

Papid1975

This was helpful, thanks!

Code:
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 02:03:12 with 2 errors on Sun May  9 02:03:14 2021
errors: Permanent errors have been detected in the following files:
rootdataset/childdataset1/grandchilddataset2@auto-2021-05-04_05-37:<0x1181>
rootdataset/childdataset1/grandchilddataset2@auto-2021-05-04_05-37:<0x4d3>


Does that mean the whole snapshot/dataset is broken? (even though it says “files” above)
 

sretalla

Does that mean the whole snapshot/dataset is broken? (even though it says “files” above)
The errors appear to be metadata in the snapshot, so destroying the snapshot (both errors are in the same one) may be all that's needed.

zfs destroy rootdataset/childdataset1/grandchilddataset2@auto-2021-05-04_05-37 should do it
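
A hedged sketch of how the affected snapshot could be picked out of the error list before destroying anything. The function names are hypothetical; `zfs destroy -n -v` is the real dry-run form, which reports what would be destroyed without doing it. The sketch assumes entries of the form `dataset@snapshot:<0x...>` and names without whitespace.

```shell
#!/bin/sh
# List the unique snapshots named in "dataset@snapshot:<0x...>" entries
# read from stdin (the errors list of `zpool status -v`).
# Assumption: dataset and snapshot names contain no whitespace.
snapshots_with_errors() {
    sed -n 's/^[[:space:]]*\([^[:space:]]*@[^[:space:]]*\):<0x[0-9a-fA-F]*>$/\1/p' |
        sort -u
}

# Print (do not run) one dry-run destroy command per affected snapshot.
print_destroy_commands() {
    snapshots_with_errors | while IFS= read -r snap; do
        printf 'zfs destroy -n -v %s\n' "$snap"
    done
}

# Usage on a live system:
#   zpool status -v | print_destroy_commands
# Review the printed commands, then rerun one without -n to really destroy.
```

For the two entries posted above this prints a single command, since both errors sit in the same snapshot. As far as I know, `zpool status -v` may keep listing the old entries after the snapshot is gone until a later scrub (`zpool scrub <pool>`) has completed.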
 

hartmut

Cadet
Joined
Mar 12, 2017
Messages
3
Does "zfs destroy <filename>" also work if there is only a hex reference (like below)?

NAME                                            STATE     READ WRITE CKSUM
steinhart                                       ONLINE       0     0     0
  mirror-0                                      ONLINE       0     0     0
    gptid/7a40ba6f-6979-11ec-9984-1c872c6080f9  ONLINE       0     0    13
    gptid/7a4b1f3d-6979-11ec-9984-1c872c6080f9  ONLINE       0     0    10

errors: Permanent errors have been detected in the following files:
<0x24bbb>:<0x18>
 

sretalla

Does "zfs destroy <filename>" also work
No.

That's metadata and can only really be eliminated by recreating the pool.
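
To make the distinction between the cases in this thread concrete, here is a small sketch that sorts an error entry by its shape. The function name and the category labels are my own, not ZFS terminology. As I understand the output, an entry is either a file path or a `<dataset>:<object>` pair, with hex IDs shown where a name could not be resolved.

```shell
#!/bin/sh
# Classify one entry from the errors list of `zpool status -v`.
# Hypothetical helper; the printed labels are illustrative only.
classify_error_entry() {
    case $1 in
        /*)                 echo "file" ;;             # plain path: restore or delete the file
        "<metadata>:"*)     echo "pool-metadata" ;;    # pool-level metadata object
        "<0x"*">:<0x"*">")  echo "unnamed-object" ;;   # no dataset name shown, only hex IDs
        *@*":<0x"*">")      echo "snapshot-object" ;;  # object inside a named snapshot
        *":<0x"*">")        echo "dataset-object" ;;   # object inside a named dataset
        *)                  echo "other" ;;
    esac
}

# Usage:
#   classify_error_entry '<0x24bbb>:<0x18>'
```

Only the `snapshot-object` shape matches the earlier case where destroying one snapshot was suggested; an `unnamed-object` entry gives `zfs destroy` nothing to target.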
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Your first disk (gptid/7a40ba6f-6979-11ec-9984-1c872c6080f9) is reporting checksum (CKSUM) errors. These are usually, but not always, caused by a bad cable. I suggest first reseating the existing SATA cable or, preferably, replacing it, as they don't last well.

glabel status will tell you which disk is which.

Please post your hardware spec
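
A rough sketch of the lookup described above. Both function names are hypothetical. The first assumes the usual five-column device rows of `zpool status` with plain integer counts (abbreviated counts such as `1.2K` would be missed); the second assumes the three-column `Name Status Components` layout printed by `glabel status` on FreeBSD-based TrueNAS.

```shell
#!/bin/sh
# Print the names of device rows whose CKSUM column (5th field) is a
# nonzero integer. Reads `zpool status` output from stdin.
devices_with_cksum_errors() {
    awk 'NF >= 5 && $5 ~ /^[0-9]+$/ && $5 + 0 > 0 { print $1 }'
}

# Print the component (for example a partition name) that a label such as
# gptid/... maps to. Reads `glabel status` output from stdin.
component_for_label() {
    awk -v label="$1" '$1 == label { print $3 }'
}

# Usage on a live system (the pool name is taken from the post above):
#   zpool status steinhart | devices_with_cksum_errors
#   glabel status | component_for_label gptid/7a40ba6f-6979-11ec-9984-1c872c6080f9
```

The component name tells you which physical disk, and therefore which cable, to inspect.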
 

hartmut

Thanks to both of you.
I will check (replace) the SATA cables and recreate the pool.
That should work fine and bring back a healthy status.
 
dsx33222

Joined
Sep 16, 2022
Messages
1
Hello. I have the same problem. I cleared the errors on my pool with "zpool clear HUS15K" ("HUS15K" is the name of my pool), and it still shows the same status as in the OP.
I mounted the share in my OS and tried to access the "corrupted" files:
Code:
root@truenas-sc[~]# zpool status -v
  pool: HUS15K
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
config:

        NAME                                    STATE     READ WRITE CKSUM
        HUS15K                                  DEGRADED     0     0     0
          e7acf6ed-351b-4adc-bbf4-f2b02cdbae4f  DEGRADED     0     0     0  too many errors

errors: Permanent errors have been detected in the following files:

        /mnt/HUS15K/Share/Backups/Laptop/Pictures/2/2021/13.10.2021/nokia/DCIM/Pictures/RROnline/Screenshot_09-24-2021_15-23-05.png
        /mnt/HUS15K/Share/Backups/Laptop/Pictures/2/2021/13.10.2021/nokia/DCIM/Pictures/RROnline/Screenshot_09-24-2021_15-36-09.png
        /mnt/HUS15K/Share/Backups/Laptop/DMusic/videoplayback (18)kevin mcl.mp3
        /mnt/HUS15K/Share/Backups/Laptop/DMusic/Videoplayback (27)-1Dj Snake & Alesia - Bird Machine.mp3
        /mnt/HUS15K/Share/Backups/Laptop/DMusic/Videoplayback (14)-1DJ Snake and Lil Jon - Turn Down for What.mp3
        /mnt/HUS15K/Share/Backups/Laptop/DMusic/videoplayback (16).mp3
        /mnt/HUS15K/Share/Backups/Laptop/DMusic/videoplayback (17)Tobu - Candyland [NCS Release].mp3
        /mnt/HUS15K/Share/Backups/Laptop/DMusic/videoplayback (26)D1ofaquavibe - The Party Troll.mp3

And they open and play back fine, but the GUI still shows "Unhealthy". Can I just use "Scrub Pool" to erase those errors? Will it help?

Regards,
DSX.

P. S. It's not a problem with a SATA cable, because this pool is made of 4 SCSI disks. I already know what caused it: RAM.
P. P. S. These files are not important at all (you can see their names XD), so deleting them is an OK solution. But I tried that, and "zpool status -v HUS15K" then just showed a metadata name; the lines didn't disappear.
 

NugentS

@dsx33222 - you have a different problem.
1. Create your own thread
2. Post your hardware as per forum rules
 