quick question about daily email reports

nclark · Nov 2, 2013

i run a fairly new installation (~2 months). unless i am misreading the literature, there is no way to simulate a S.M.A.R.T. failure. i have a raidz2 with 5x3tb seagate NAS drives. okay, so there's the intro.

when i get my daily email from freenas ("daily run output"), and it indicates that "all pools are healthy," can i take that to mean that all drives are healthy too? and none are reporting a S.M.A.R.T. failure? cause i set up S.M.A.R.T. notifications too, and i've yet to get one (presumably because no drives have failed yet).

any advice?

thanks!

-n

Yatti420 · Nov 2, 2013

With hard drive densities going up, the error rate seems to increase. SMART is a monitoring system for the drives. I don't believe it will accurately predict a failure, but it will give you as much of a heads-up as possible (usually not a lot) when a drive is failing. Scrubs are important as well when running ZFS.

http://smartmontools.sourceforge.net/man/smartctl.8.html
http://doc.freenas.org/index.php/ZFS_Scrubs

Dusan · Nov 2, 2013

nclark said:
when i get my daily email from freenas ("daily run output"), and it indicates that "all pools are healthy," can i take that to mean that all drives are healthy too? and none are reporting a S.M.A.R.T. failure?

No. It is possible for ZFS to report that all pools are healthy while some of the critical SMART attributes are increasing. For example, the number of reallocated sectors may be increasing, but ZFS will not notice any problem until the drive runs out of spare sectors. It is also possible for a SMART long test to fail, but if the unreadable sector is in a disk area not used by any data at the moment, ZFS will not notice ("all pools are healthy").
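
To make the point about attributes creeping up while the pool still looks healthy more concrete, here is a rough Python sketch that reads the attribute table printed by `smartctl -A` and flags non-zero raw values. The attribute names in `WATCHED` and the `SAMPLE` text are assumptions for illustration only: names and raw-value formats vary between drive vendors, and the sample is made up, not output from a real drive.

```python
# Attribute names below are common smartmontools labels, but they are an
# assumption here: check what your own drives actually report.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def parse_smart_attributes(smartctl_output):
    """Map attribute name -> integer raw value from `smartctl -A` style text."""
    attributes = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row starts with "ID#" and is skipped by the isdigit() check.
        if len(fields) >= 10 and fields[0].isdigit():
            raw = fields[9]
            if raw.isdigit():
                attributes[fields[1]] = int(raw)
    return attributes


def worrying_attributes(attributes):
    """Return the watched attributes whose raw value is above zero."""
    return {name: attributes[name] for name in WATCHED if attributes.get(name, 0) > 0}


# Made-up sample in the usual table layout (not from a real drive).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       8
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    print(worrying_attributes(parse_smart_attributes(SAMPLE)))
```

A drive like the one in the sample would still sit in a pool that reports healthy, which is why the SMART emails are worth having in addition to the daily report.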

nclark said:
cause i set up S.M.A.R.T. notifications too, and i've yet to get one (presumably because no drives have failed yet).

You can test the SMART notification functionality by setting the Critical temperature to a low enough value (e.g. 10) in Services->S.M.A.R.T.; smartd should notify you next time it polls the drive status.

DrKK · Nov 2, 2013

Being a brand new build, I, too, have never received any SMART warning, but this train of posts has me thinking...

The notifications from the smartd poll will be sent out the same way that the daily reports are, right? i.e., it will use the same SMTP information and so on, correct? So if I'm getting the daily reports (and I am), there is in principle little need to perform the smartd notification test that you suggested?

Dusan · Nov 2, 2013

Correct. The SMTP configuration is the same, the only difference is the To: address. The daily mails use root's e-mail address, smartd will use the address defined on the S.M.A.R.T. configuration screen.

nclark · Nov 3, 2013

Yatti420 said:
With hard drive densities going up, the error rate seems to increase. SMART is a monitoring system for the drives. I don't believe it will accurately predict a failure, but it will give you as much of a heads-up as possible (usually not a lot) when a drive is failing. Scrubs are important as well when running ZFS.

http://smartmontools.sourceforge.net/man/smartctl.8.html
http://doc.freenas.org/index.php/ZFS_Scrubs

thank you for the reading list. i want to be as educated about my NAS as possible.

nclark · Nov 3, 2013

Dusan said:
No. It is possible for ZFS to report that all pools are healthy while some of the critical SMART attributes are increasing. For example, the number of reallocated sectors may be increasing, but ZFS will not notice any problem until the drive runs out of spare sectors. It is also possible for a SMART long test to fail, but if the unreadable sector is in a disk area not used by any data at the moment, ZFS will not notice ("all pools are healthy").

You can test the SMART notification functionality by setting the Critical temperature to a low enough value (e.g. 10) in Services->S.M.A.R.T.; smartd should notify you next time it polls the drive status.

thanks Dusan! everything i read said there was no way to simulate a S.M.A.R.T. failure. i will try what you suggested tomorrow!

nclark · Nov 5, 2013

i finally got around to testing this and it works! now i know it's set up correctly and that i'll get the SMART notifications as expected.

awesome.

cyberjock · Nov 5, 2013

nclark,

Technically it's not a SMART failure. But for FreeNAS it triggers the temperature email, which proves that your server is able to monitor your disks via SMART and report problems to you via email. That pipeline is now validated to work for you, which is what matters most.

DrKK · Nov 5, 2013

Dumb question CJ, or whomever. There is an admonishment at the end of the critical temperature email that says (something like): "The system will send no further emails about this problem." I assume that this flag resets at some point, i.e., when the temperature returns to normal? And then it WILL send a further email if the condition reappears? I mean, that would only make sense. But what do I know?

cyberjock · Nov 5, 2013

DrKK said:
Dumb question CJ, or whomever. There is an admonishment at the end of the critical temperature email that says (something like): "The system will send no further emails about this problem." I assume that this flag resets at some point, i.e., when the temperature returns to normal? And then it WILL send a further email if the condition reappears? I mean, that would only make sense. But what do I know?

Correct. You get an email when you go from being below the setpoint to at or above it. It will 'reset' when it goes back below the setpoint (or on reboot, obviously). If you have a setpoint that your hard drive fluctuates around, you could potentially get a lot of emails in a single day.
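
The crossing behaviour described above can be sketched in a few lines of Python. This is only a model of the logic as described in this post, not smartd's actual code; the function name `temperature_alerts` and the readings are made up for illustration.

```python
def temperature_alerts(readings, setpoint):
    """Return the indexes of the polls that would trigger an email.

    An alert fires only when a reading moves from below the setpoint to at
    or above it; the alert re-arms once a reading drops below the setpoint.
    """
    alerts = []
    above = False  # whether the previous poll was at or above the setpoint
    for index, temperature in enumerate(readings):
        if temperature >= setpoint and not above:
            alerts.append(index)
        above = temperature >= setpoint
    return alerts


if __name__ == "__main__":
    # A drive hovering around a 40 C setpoint: two separate crossings.
    print(temperature_alerts([38, 40, 41, 39, 40, 42, 38], setpoint=40))
```

With these readings the drive crosses the setpoint twice, so two emails would go out. A drive that keeps bouncing around the setpoint would generate one email per crossing, which is the "a lot of emails in a single day" case.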

Dusan · Nov 5, 2013

DrKK said:
I assume that this flag resets at some point, i.e., when the temperature returns to normal? And then it WILL send a further email if the condition reappears? I mean, that would only make sense. But what do I know?

Yes, but you can modify the behavior if you want, by adding a -M directive to Storage->Volume->View Disks->S.M.A.R.T. extra options (you need to add it for every drive for which you want to modify the behavior). Possible options are:
-M once - send only one warning email for each type of disk problem detected. This is the default.
-M daily - send additional warning reminder emails, once per day, for each type of disk problem detected.
-M diminishing - send additional warning reminder emails, after a one-day interval, then a two-day interval, then a four-day interval, and so on for each type of disk problem detected. Each interval is twice as long as the previous interval.
This applies to all S.M.A.R.T. emails, not just the temperatures. If a problem is no longer detected, the internal email counter is reset. If the problem reappears a new warning email is sent immediately.
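
To show how the three -M modes differ in practice, here is a small Python sketch that lists the days on which an email would go out while a single problem persists. It only models the intervals as described above (day 0 is the first warning); `reminder_days` and `horizon_days` are hypothetical names, and this is not how smartd is implemented internally.

```python
def reminder_days(mode, horizon_days):
    """Days (0-based) on which an email is sent while a problem persists.

    mode is one of "once", "daily", "diminishing"; horizon_days is how many
    days the problem stays present (days 0 .. horizon_days - 1).
    """
    if mode not in ("once", "daily", "diminishing"):
        raise ValueError("unknown mode: " + mode)
    if horizon_days <= 0:
        return []
    if mode == "once":
        return [0]
    if mode == "daily":
        return list(range(horizon_days))
    # diminishing: gaps of 1, 2, 4, ... days, each twice the previous one
    days = []
    day, interval = 0, 1
    while day < horizon_days:
        days.append(day)
        day += interval
        interval *= 2
    return days


if __name__ == "__main__":
    for mode in ("once", "daily", "diminishing"):
        print(mode, reminder_days(mode, 16))
```

Over a 16-day problem, "once" sends a single email, "daily" sends sixteen, and "diminishing" sends five. If the problem clears and later comes back, the count starts again from day 0.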

cyberjock · Nov 6, 2013

I know that with 9.1.0 when my 24 drive server started to overheat I'd get 8-10 emails within a few seconds because 8-10 drives would hit the setpoint at the same time. Then 30 minutes later I'd get some more emails as a few more disks hit the threshold. So be careful where you set the setpoint. ;)
