smartd indicates failure not reflected in attributes

Keno5net · Oct 31, 2013

I am getting the following error after expanding my mirrored volume by replacing two 1 TB disks with two 2 TB disks.

Oct 31 17:14:22 freenas smartd[1905]: Device: /dev/ada1, Failed SMART usage Attribute: 5 Reallocated_Sector_Ct.

This is only showing up on one disk. Here are the reallocated sector count attributes read from the disks.

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE

This is from ada0, which is not reporting errors:
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0

This is from ada1, which is reporting the error:
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
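For readers comparing rows like the two above, here is a minimal Python sketch of how such a line can be checked by hand. It assumes the ten-column layout shown in the header line and the rule documented in the smartctl man page, that an attribute counts as failed when its normalized VALUE is less than or equal to THRESH. The names `SmartAttr`, `parse_attr_line`, and `is_failing_now` are hypothetical helpers made up for this illustration; they are not part of smartmontools.

```python
from typing import NamedTuple


class SmartAttr(NamedTuple):
    """One row of the SMART attribute table (hypothetical helper type)."""
    attr_id: int
    name: str
    flag: str
    value: int
    worst: int
    thresh: int
    attr_type: str
    updated: str
    when_failed: str
    raw_value: str


def parse_attr_line(line: str) -> SmartAttr:
    """Split one attribute row into its ten whitespace-separated columns."""
    parts = line.split(None, 9)
    if len(parts) != 10:
        raise ValueError(f"expected 10 columns, got {len(parts)}: {line!r}")
    return SmartAttr(
        attr_id=int(parts[0]),
        name=parts[1],
        flag=parts[2],
        value=int(parts[3]),    # normalized value, e.g. "100"
        worst=int(parts[4]),
        thresh=int(parts[5]),   # "010" parses as decimal 10
        attr_type=parts[6],
        updated=parts[7],
        when_failed=parts[8],
        raw_value=parts[9].strip(),
    )


def is_failing_now(attr: SmartAttr) -> bool:
    """An attribute is failing when its normalized value is at or below the threshold."""
    return attr.value <= attr.thresh


# The ada1 row quoted in this post.
ada1 = parse_attr_line(
    "5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0"
)
print(ada1.name, "failing now:", is_failing_now(ada1))
```

By this check, neither row is failing: VALUE is 100, THRESH is 10, WHEN_FAILED is "-", and the raw count is 0, which is why the smartd message looks inconsistent with the table.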

I have run the short test on ada1 and it passed; I am in the process of running the long test. I think this may be a false positive, or it could be a problem with the new disk. I have a replacement coming tomorrow from a different vendor and manufacturer. Is it common for smartd to report errors that aren't reflected in smartctl results? I have seen at least one other post with a similar disparity.

The problem disks are both Seagate NAS HDDs, model ST2000VN000.

cyberjock · Oct 31, 2013

Usually it's user error in understanding what smartctl -a gives you for the output. Can you post the entire output of smartctl -a -q noserial /dev/ada1?

And for some reason the font keeps changing and it won't let me change it back. So sorry for the weird fonts. WTF.....

Keno5net · Oct 31, 2013

Here is the -a -q output in a file. As you can see I am still waiting on the long test to complete.

Keno5net · Oct 31, 2013

The long SMART test completed without error.

cyberjock · Oct 31, 2013

I'm at a loss to explain your problem. The disk appears to be in perfect health. Drive is nice and cool, low load cycle count, etc. I can't explain it.

Keno5net · Oct 31, 2013

It appears that there may not be a problem with the drive. I decided to try the old Dogbert tech support maxim "shut up and reboot" and there have been no failures for over an hour. Now that I think about it, when I updated the volume I hot-swapped the second disk, which may have gotten the error stuck in the system somehow. /shrug.

One last question: I have a WD Red disk arriving tomorrow, and I have read in several places that there is a good chance of two disks from the same manufacturer and batch failing within hours of one another. Would I be better off replacing one of my Seagate disks with the new WD disk to protect against that possibility? Thanks for your time.

cyberjock · Oct 31, 2013

Keno5net said:
It appears that there may not be a problem with the drive. I decided to try the old Dogbert tech support maxim "shut up and reboot" and there have been no failures for over an hour. Now that I think about it, when I updated the volume I hot-swapped the second disk, which may have gotten the error stuck in the system somehow. /shrug.

One last question: I have a WD Red disk arriving tomorrow, and I have read in several places that there is a good chance of two disks from the same manufacturer and batch failing within hours of one another. Would I be better off replacing one of my Seagate disks with the new WD disk to protect against that possibility? Thanks for your time.

I don't worry about such things, to be honest. The whole issue with failures within hours is that, in theory, all of the disks in your pool will "wear out" at a similar rate. If they were manufactured at the same time, they will likely have the same "time to failure". I don't worry about it too much myself. Judging by how many people lose their pools, you're far more likely to lose yours due to improper design, administration, and maintenance of your server.

Don't take this the wrong way, but money says if we had a conversation over Skype I'd find at least 2 things you've probably done wrong with your server that put you at higher risk for data loss than you should otherwise be. It's just a fact that a lot of people don't follow the FreeNAS manual's recommendations, read and follow the stickies, and actually take the advice from the forums to heart. It's one thing to read it, it's another to follow it.

Also keep in mind that RAID is not a substitute for backups. If you really want to worry about things like 2 disks from the same batch failing within hours of each other, you should be running a small backup server to protect your most important data. Even a cheaper, slower system can make an excellent backup server. You don't need 100 MB/sec transfer rates to/from your backup server, right?

Keno5net · Oct 31, 2013

I am already running a smaller backup server, which I power on once a week to back up the main system on Sunday night. Maybe I will use the new disk in that so the sizes match. I decided to do a cold backup system after hearing about the latest ransomware that has been going around. Not that I plan to practice unsafe computing, but better safe than sorry. The main NAS is used for backups of two systems and as a media server for a home theater. Now I can back that up once a week with rsync to the second server. The only thing I still need to work on is off-site backup, but for home use that probably won't happen unless I haul a disk to and from work once a month.

Thanks again..
