ONLINE (Unhealthy),

Bolster3496

Cadet
Joined
Jun 13, 2022
Messages
8
Hi,

I'm fairly new to TrueNAS, and this is my first real issue with it.
My pool is two 6-disk raidz2 vdevs of 4TB HDDs.

Today I got an alert:

CRITICAL

Pool storage0 state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.


But when I looked into it, I couldn't find anything disk-related that looks wrong in the web UI. Everything seems OK and is color-coded green.

I ran zpool status to gather a bit more info, but the only thing that stands out is a single checksum error on one disk:

Code:
nas% zpool status -v storage0           
  pool: storage0
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
    attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
    using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 01:50:44 with 0 errors on Sun Aug  7 01:50:44 2022
config:

    NAME                                            STATE     READ WRITE CKSUM
    storage0                                        ONLINE       0     0     0
     raidz2-0                                      ONLINE       0     0     0
       gptid/7f229881-def1-11ec-a805-89b5c2ed1a1f  ONLINE       0     0     0
       gptid/7f17c89d-def1-11ec-a805-89b5c2ed1a1f  ONLINE       0     0     0
       gptid/7f0bf0fa-def1-11ec-a805-89b5c2ed1a1f  ONLINE       0     0     0
       gptid/7f4cbe2c-def1-11ec-a805-89b5c2ed1a1f  ONLINE       0     0     0
       gptid/7ebe0f6d-def1-11ec-a805-89b5c2ed1a1f  ONLINE       0     0     0
       gptid/7f2d3a93-def1-11ec-a805-89b5c2ed1a1f  ONLINE       0     0     0
     raidz2-1                                      ONLINE       0     0     0
       gptid/7f3d3b19-def1-11ec-a805-89b5c2ed1a1f  ONLINE       0     0     0
       gptid/7fa787fd-def1-11ec-a805-89b5c2ed1a1f  ONLINE       0     0     0
       gptid/7f90a2e4-def1-11ec-a805-89b5c2ed1a1f  ONLINE       0     0     1
       gptid/7f9e357b-def1-11ec-a805-89b5c2ed1a1f  ONLINE       0     0     0
       gptid/7f862fb5-def1-11ec-a805-89b5c2ed1a1f  ONLINE       0     0     0
       gptid/7f7bc100-def1-11ec-a805-89b5c2ed1a1f  ONLINE       0     0     0

errors: No known data errors


Should I worry? If so, how do I investigate the issue further, or should I just run zpool clear?
 

Bolster3496

Sorry, I forgot to mention that the pool is attached to an up-to-date TrueNAS Core 13.0-U2, running in a VM with 32 GB of RAM.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
That warning was ZFS telling you something happened. Keep an eye on that disk;
gptid/7f90a2e4-def1-11ec-a805-89b5c2ed1a1f ONLINE 0 0 1
Notice it has 1 checksum error. ZFS detected a problem, tried to fix it and was successful. Thus, you need to do nothing now.

If nothing else happens in the next week or two, do as ZFS says and run zpool clear storage0, and keep a record of that disk and the error elsewhere.
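One way to keep an eye on the counters from the shell is to filter the status output for rows with non-zero values. This is only a rough sketch: `flag_errors` is a made-up helper name (not a ZFS command), and it assumes the device table keeps the NAME / STATE / READ / WRITE / CKSUM column layout shown in the output above.

```shell
# flag_errors (hypothetical helper): read `zpool status` output on stdin and
# print every pool, vdev, or disk row whose READ, WRITE, or CKSUM counter
# is not "0". Counters are compared as strings because ZFS may print
# abbreviated values such as "1.2K".
flag_errors() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }  # header row of the device table
        /^errors:/                    { in_config = 0 }        # end of the device table
        in_config && NF >= 5 && ($3 != "0" || $4 != "0" || $5 != "0") {
            printf "%s READ=%s WRITE=%s CKSUM=%s\n", $1, $3, $4, $5
        }
    '
}

# Usage on the NAS (pool name taken from this thread):
#   zpool status storage0 | flag_errors
```

Running it now and again before clearing the counters would show whether the count on that one disk is growing or whether other disks start reporting errors too.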

That said, it's possible you have the drives passed to the TrueNAS Core VM incorrectly. Please list all your hardware and how you passed the drives to the VM.


These checksum errors are one of the key features of ZFS. All data and metadata are checksummed, and the checksum is verified on every read, including the reads a scrub performs. If a block has a bad checksum, the error is reported. If redundant data is available (as with mirrors, RAID-Zx or "copies=2/3"), ZFS automatically attempts to repair the block, which is what happened in your case.
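The verify-then-repair idea can be shown with a toy model. This is not how ZFS is implemented: two ordinary files stand in for redundant copies of one block, POSIX `cksum` stands in for the ZFS checksum, and the `read_block` function and file names are invented for the illustration.

```shell
# Toy model only: two files play the role of redundant copies of one block.
dir=$(mktemp -d)
printf 'hello zfs\n' > "$dir/copy_a"
cp "$dir/copy_a" "$dir/copy_b"

# Checksum recorded when the block was written.
stored=$(cksum < "$dir/copy_a")

# Simulate silent corruption of one copy (same length, different content).
printf 'hellO zfs\n' > "$dir/copy_a"

cksum_errors=0

# read_block PRIMARY OTHER: verify PRIMARY against the stored checksum.
# On a mismatch, count the error and rewrite PRIMARY from OTHER, provided
# OTHER itself verifies. Fails if no copy matches the checksum.
read_block() {
    primary=$1
    other=$2
    if [ "$(cksum < "$primary")" = "$stored" ]; then
        cat "$primary"
        return 0
    fi
    cksum_errors=$((cksum_errors + 1))
    if [ "$(cksum < "$other")" = "$stored" ]; then
        cp "$other" "$primary"   # repair the bad copy from the good one
        cat "$primary"
        return 0
    fi
    echo "unrecoverable: no copy matches the stored checksum" >&2
    return 1
}

# The read succeeds with correct data, but the error is still counted.
read_block "$dir/copy_a" "$dir/copy_b" > "$dir/out"
echo "CKSUM errors: $cksum_errors"
```

The read returns good data and the damaged copy is rewritten, yet the counter still records that something went wrong. That is roughly the situation in this thread: one CKSUM error on a disk, with "No known data errors".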

If it were a bad disk block, a spare block is used to restore full redundancy. If a spare block is not available, ZFS would list the error as unrecoverable, and you would replace the drive. An unrecoverable error is much clearer and is shown at the bottom of the output, in the "errors: XXX" line.
 