S.M.A.R.T. reporting/dashboard errors? Need info

eyocum

Dabbler
Joined
Mar 16, 2015
Messages
16
Hi, I have a few questions.

I have a pool that doesn't show any errored drives in the summary on the dashboard (pool status online, used space, disks with errors, etc):

1692673431684.png



but when I click on the pool status icon to the right, it opens a page that shows a bang (exclamation mark) next to health:


1692673509201.png


That page shows several extended tests that failed (I've attached a smartctl output of the drive below).

Also, the Alerts button in the top right of the GUI shows a summary of the problem:

1692673799653.png


My question is: why doesn't the dashboard page show an error warning? The reports and the smartctl output from the shell both seem to show errors. Are these 'harmless' SMART errors (is there any such thing?)? 16 uncorrectable errors doesn't seem harmless. Is there a configuration that I should set, an error threshold maybe, so that this shows as an error on the dashboard?

I've started a return on the drive, but I'm wondering about why it didn't show on the dashboard.

Thanks for any advice/pointers!
 

Attachments

  • sdg error.txt
    20.3 KB

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
My question is why doesn't the dashboard page show an error warning?
The pool status is shown (correctly) as healthy because none of the data stored in it is compromised.

Individual disks within your pool may be having issues that indicate they are in the process of failing (but are not yet dead), but as long as ZFS can get to the needed data on them (for now) nothing is wrong.

What you should do is pay attention to the early warning signs of that drive's failure, which the SMART report confirms, and consider replacing the disk (or at least watch that error count very closely and act fast if it goes up further).
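As a rough illustration of "watching that error count", here is a minimal Python sketch that pulls a few raw values out of the attribute table that `smartctl -A` prints for ATA drives and compares them with an earlier reading. The sample text, the choice of watched attributes, and the function names are assumptions made for this example; the values are made up and are not taken from the attached report.

```python
# Attributes often treated as early warning signs on ATA drives
# (this selection is an assumption for the sketch, not an official list).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")

# Illustrative text in the layout of `smartctl -A` output; values are made up.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       16
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       16
"""


def parse_raw_values(text):
    """Return {attribute_name: raw_value} for the watched attributes."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            raw = fields[9]
            if raw.isdigit():
                values[fields[1]] = int(raw)
    return values


def increased(previous, current):
    """Return {name: (old, new)} for every watched value that went up."""
    return {
        name: (previous.get(name, 0), value)
        for name, value in current.items()
        if value > previous.get(name, 0)
    }


current = parse_raw_values(SAMPLE)
# Hypothetical earlier reading, e.g. saved from last week's check.
previous = {"Offline_Uncorrectable": 8}
for name, (old, new) in increased(previous, current).items():
    print(f"{name}: {old} -> {new}")
```

In practice the text would come from running `smartctl -A` against the drive and saving each reading somewhere; any rise between readings is the cue to act.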
 

eyocum

Dabbler
Joined
Mar 16, 2015
Messages
16
So the 'Disk with Errors' field will only show when a disk actually fails/becomes so corrupt the data isn't redundantly safeguarded. And the errors the SMART report is showing (the unreadable sectors) show it's heading towards failure. I'm glad I didn't wait to start the RMA (it's still under warranty), lol.

Thanks for the answer and information!
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
"Disks with Errors" sits underneath the pool and is perhaps worded differently than I would word it. It refers to errors detected while reading from or writing to the pool, plus checksum errors. It merely reports ZFS-detected errors; ZFS does not measure other drive characteristics. SMART tracks a whole variety of data, such as temperature and many other attributes, and also runs self-tests. A given sector can be bad, but maybe ZFS never tried to read or write it, and a self-test found it instead. That's just an example, but the point is that they are measuring different things. Drives can also correct errors themselves at times, and if they do, ZFS may never know. Self-tests can slow down the system, so they are generally run on a schedule.

You should have your email alerts set up to email you when SMART tests fail. I want to know immediately, not have to check manually.
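To make the "measuring different things" point concrete, here is a small Python sketch that reads the READ/WRITE/CKSUM columns from `zpool status` style text and lists the rows with non-zero counters. The sample output, the pool and device names, and the function name are hypothetical, and whether the dashboard field is computed from exactly these counters is an assumption based on the description above.

```python
# Illustrative `zpool status` style text; pool and device names are made up.
SAMPLE_STATUS = """\
  pool: tank
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            sdf     ONLINE       0     0     0
            sdg     ONLINE       0     0     0

errors: No known data errors
"""


def rows_with_zfs_errors(status_text):
    """Return {row_name: (read, write, cksum)} for rows with non-zero counters."""
    flagged = {}
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_table = True  # header row found, device rows follow
            continue
        if not in_table:
            continue
        if not fields:
            in_table = False  # a blank line ends the device table
            continue
        if len(fields) >= 5:
            # Counters are kept as strings so abbreviated counts also compare as non-zero.
            counters = tuple(fields[2:5])
            if any(c != "0" for c in counters):
                flagged[fields[0]] = counters
    return flagged


print(rows_with_zfs_errors(SAMPLE_STATUS))
```

With the sample above the result is empty: a disk can fail SMART self-tests while its ZFS counters are all still zero, which is why the two views can disagree.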
 

eyocum

Dabbler
Joined
Mar 16, 2015
Messages
16
Sorry for the late reply. Thanks for the information. I've made changes to my alert emails based on your and sretalla's posts.
 