Help please...

Status
Not open for further replies.

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Well, it is totally possible to have ZFS errors without SMART errors. The reverse is also possible. The easy diagnosis is when you have both ZFS errors and SMART errors.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Yep, but here the error is that ada1 has failed its SMART test, yet there's no evidence of any problem on any drive in the SMART data (except for the very high LCC).

Also, is it me, or have we been seeing a few cases of ZFS/FreeNAS reporting errors but SMART being fine lately?

This is the only case AFAIK.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Well, the raw read error rate tends to fluctuate. One of my Reds ended up at 2 during burn-in, later went back down to 1 and is now 2 again, I think. Never got any warnings, though, now that I think about it. Better check the SMART email.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Also, is it me, or have we been seeing a few cases of ZFS/FreeNAS reporting errors but SMART being fine lately?

I don't think it's just you. I don't see anything wrong here. Perhaps submit a bug report and reference this thread.
 

John Richardson

Explorer
Joined
May 17, 2014
Messages
60
Oh god, ada2 has a very high LCC, you should use WDIDLE3 to disable the timer ASAP (if it's not already too late), search for it on the forum ;)

Everything else is ok for every drive. I don't know why the GUI reports this error.
Thanks for the heads up.

Last night I shut the server down, I started to play with that software from WD, but the program ran and just came back with SMART passed.

I'll play some more when I have some more time and update my results.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
The original alert was not for a SMART test failure, but a failure to read the log, as the OP correctly surmised:
CRITICAL: Device: /dev/ada1, Read SMART Self-Test Log Failed
Not that I have any idea how one should respond to such a failure...
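
One low-risk way to respond, offered as a sketch rather than anything the alert itself prescribes, is to ask the drive for its self-test log by hand and see whether the read fails again. `smartctl -l selftest` is the smartmontools command that prints that log. The helper name `read_selftest_log` and the default device `/dev/ada1` (taken from the alert text) are only illustrative, and the function syntax assumes a POSIX `sh` rather than the default csh.

```shell
# Hypothetical helper: try to read the SMART self-test log once and
# report whether smartctl succeeded. Needs to run as root on the NAS.
read_selftest_log() {
    dev="${1:-/dev/ada1}"    # device name assumed from the alert text
    smartctl -l selftest "$dev"
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "self-test log read OK for $dev"
    else
        # smartctl's exit status is a bitmask; its man page lists the bits
        echo "smartctl returned status $status for $dev"
    fi
    return "$status"
}
```

Running `read_selftest_log /dev/ada1` a few times over a day or so should show whether the failure is repeatable or was a one-off.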
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Oh, good catch. I just read the error (too) quickly and thought "yet another SMART error thread"...

So, no problem of a SMART data/GUI mismatch. Now, I don't know what this error is, so I'll let a more experienced member answer ;)
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
As usual, I'm not going to let my lack of knowledge prevent me from devising a plausible hypothesis. I imagine the following conversation:
FreeNAS to ada1: "Hey, let me see your SMART log."
ada1 to FreeNAS: "Go away, I'm busy writing the results of yet another of these incessant short self-tests you keep hitting me with."
FreeNAS to human: "ada1 won't talk to me, WTF?"
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
As usual, I'm not going to let my lack of knowledge prevent me from devising a plausible hypothesis. I imagine the following conversation:
FreeNAS to ada1: "Hey, let me see your SMART log."
ada1 to FreeNAS: "Go away, I'm busy writing the results of yet another of these incessant short self-tests you keep hitting me with."
FreeNAS to human: "ada1 won't talk to me, WTF?"

Hilarious!

But seriously, since it isn't clear what the issue is, someone's probably got to go see if this is coming from FreeNAS code or if it's inside the SMART tool code.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
And, since I suppose that's like a calculus TA telling a bunch of middle school kids they need to solve an integral,

Code:
% find /usr/ -type f -print | xargs grep "Self-Test Log Failed"
Binary file /usr/local/sbin/smartd matches


So that's a SMART tool thing. It's trying to tell us something.

https://www.smartmontools.org/ticket/89

I interpret that to mean it's saying that it's having some sort of trouble reading the SMART log on the drive.

So, just for giggles, see if you can get on the system console at a CLI and type "less /var/log/messages" and then "/Self-Test Log Failed" to search for that error. What I'm wondering is if anything is logged after it (or possibly before it). Control-B goes back a page, space goes forward a page.
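
If paging through the file in less is awkward, the same search can be sketched non-interactively with grep, which prints each match together with a few lines before and after it. The helper name `show_selftest_context` and the default of 5 context lines are assumptions for illustration; the function syntax assumes a POSIX `sh` (type `sh` first if the login shell is csh).

```shell
# Hypothetical helper: print every "Self-Test Log Failed" line from a log
# file, with line numbers and N lines of context on either side.
show_selftest_context() {
    logfile="${1:-/var/log/messages}"   # log path mentioned above
    context="${2:-5}"                   # context lines (assumed default)
    grep -n -B "$context" -A "$context" "Self-Test Log Failed" "$logfile"
}
```

Calling `show_selftest_context` with no arguments searches /var/log/messages. If it prints nothing, the message is not in the current log file, and it may have been rotated out into an older one.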
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Well, the raw read error rate tends to fluctuate. One of my Reds ended up at 2 during burn-in, later went back down to 1 and is now 2 again, I think. Never got any warnings, though, now that I think about it. Better check the SMART email.

Raw read error rates, as a raw number, are not monitored by smartd because they are not indicative of anything. They are not a "counter" per se. They are a rate that can go up and down over time. smartd does monitor whether you've exceeded the threshold value, and I'd wager that 1 or 2 is not going to exceed that threshold.
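
To make the threshold comparison concrete, here is a rough sketch that filters the attribute table printed by `smartctl -A`. As far as I know, smartmontools counts an attribute as failed when its normalized VALUE is at or below THRESH, and a THRESH of 0 means the attribute cannot fail, so the raw value in the last column plays no part. The helper name `failed_attributes` is made up, and the column positions assume the usual `smartctl -A` layout (VALUE in column 4, THRESH in column 6).

```shell
# Hypothetical helper: read `smartctl -A` output on stdin and print any
# attribute whose normalized VALUE (col 4) is at or below THRESH (col 6).
# Rows with THRESH 0 are skipped; the raw value is deliberately ignored.
failed_attributes() {
    awk '$1 ~ /^[0-9]+$/ && $6 + 0 > 0 && $4 + 0 <= $6 + 0 {
        print $1, $2, "value=" $4, "thresh=" $6
    }'
}
```

Usage would be `smartctl -A /dev/ada1 | failed_attributes`. No output means no attribute is at or below its threshold, whatever the raw numbers say.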
 

John Richardson

Explorer
Joined
May 17, 2014
Messages
60
OK - so I used the WDIDLE3 to disable the timer on all my drives.

I rebooted and I'm still getting the errors.
Oh god, ada2 has a very high LCC, you should use WDIDLE3 to disable the timer ASAP (if it's not already too late), search for it on the forum ;)

Everything else is ok for every drive. I don't know why the GUI reports this error.
OK - so I used the WDIDLE3 to disable the timer on all my drives. Thanks for the catch - 3/4 drives were set to 300, but one was set to 8.
 

John Richardson

Explorer
Joined
May 17, 2014
Messages
60
And, since I suppose that's like a calculus TA telling a bunch of middle school kids they need to solve an integral,

Code:
% find /usr/ -type f -print | xargs grep "Self-Test Log Failed"
Binary file /usr/local/sbin/smartd matches


So that's a SMART tool thing. It's trying to tell us something.

https://www.smartmontools.org/ticket/89

I interpret that to mean it's saying that it's having some sort of trouble reading the SMART log on the drive.

So, just for giggles, see if you can get on the system console at a CLI and type "less /var/log/messages" and then "/Self-Test Log Failed" to search for that error. What I'm wondering is if anything is logged after it (or possibly before it). Control-B goes back a page, space goes forward a page.
Thanks for the input - What do you mean "console at CLI"? I typed the commands into PuTTY, entered down to the bottom and then typed /Self... it said "no pattern found" so I'm guessing it's not there.

I am seeing a ton of this though - not sure if it's related but I thought I'd post (it's showing that it's doing this every few minutes)

Code:
May 26 08:08:24 freenas smbd[54992]: [2015/05/26 08:08:24.585684,  0] ../source3/lib/util_sock.c:1199(get_remote_hostname)
May 26 08:08:24 freenas smbd[54992]:   matchname failed on 10.0.0.27
May 26 08:08:29 freenas smbd[54993]:   STATUS=daemon 'smbd' finished starting up and ready to serve connectionsmatchname: host name/name mismatch: 10.0.0.27 != (NULL)
 

John Richardson

Explorer
Joined
May 17, 2014
Messages
60
This wasn't a problem until I reformatted the USB drive and reinstalled FreeNAS - maybe I'll try that again to see if that fixes the problem?
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
CLI = command line interface so PuTTY is okay to do that ;)

The smbd messages are totally unrelated.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Hm, 11 minutes, not too shabby!

Heh.

Well, MY vote would be to acknowledge and release the warning in the GUI, and then see if it comes back. And keep an eye on it for a while. Sometimes spurious stuff happens. It could be that a firmware bug in the drive caused a temporary error trying to retrieve the log, and you'll never hit an error again because it was a one-in-a-million chance.

It is helpful to remember that the red light on the dashboard is very much akin to a "CHECK ENGINE" light on your car - it is a relatively vague thing that means one of any number of sensors has tripped, and it is not guaranteed to be a serious issue.
 

John Richardson

Explorer
Joined
May 17, 2014
Messages
60
Hey all -

Thanks for the help - I reinstalled FreeNAS to the thumb drive and ran SMART tests last night - no email alerts!

I guess I'll just watch and see.

Thanks again!
 