CurrentPendingSector & OfflineUncorrectableSector Emails

Joshdw · May 24, 2018

Hello,

First of all, thank you for taking the time to read this; I appreciate your help.

I woke up this morning to find two emails, the first titled "CurrentPendingSector" with the contents:

Code:

Device: /dev/sda [SAT], 2 Currently unreadable (pending) sectors

The second titled "OfflineUncorrectableSector" with the contents:

Code:

Device: /dev/sda [SAT], 2 Offline uncorrectable sectors

I believe these are from S.M.A.R.T. but what I can't figure out is, what is /dev/sda? Here's a list of my drives using

Code:

camcontrol devlist

command.

Code:

<TOSHIBA DT01ACA300 MX6OABB0>      at scbus0 target 0 lun 0 (pass0,ada0)
<TOSHIBA DT01ACA300 MX6OABB0>      at scbus2 target 0 lun 0 (pass1,ada1)
<TOSHIBA DT01ACA300 MX6OABB0>      at scbus3 target 0 lun 0 (pass2,ada2)
<Marvell Console 1.01>             at scbus7 target 0 lun 0 (pass3)
<Marvell Console 1.01>             at scbus16 target 0 lun 0 (pass4)
<KINGSTON SHSS37A120G SAFM02.H>    at scbus17 target 0 lun 0 (pass5,ada3)
<KINGSTON SHSS37A120G SAFM02.H>    at scbus18 target 0 lun 0 (pass6,ada4)
<TOSHIBA DT01ACA300 MX6OABB0>      at scbus19 target 0 lun 0 (pass7,ada5)
<TOSHIBA DT01ACA300 MX6OABB0>      at scbus20 target 0 lun 0 (pass8,ada6)
<TOSHIBA DT01ACA300 MX6OABB0>      at scbus21 target 0 lun 0 (pass9,ada7)
<TOSHIBA DT01ACA300 MX6OABB0>      at scbus22 target 0 lun 0 (pass10,ada8)
<SanDisk Cruzer Switch 1.00>       at scbus25 target 0 lun 0 (pass11,da0)

ada4 is a second SSD I'm not currently using. I planned to RAID it with the other SSD for FreeNAS/caching, but didn't find the option for it during setup.
ada8 is a spare HDD I have in the system right now. I actually have 4 more 3TB drives I want to add to the system when I start running out of space. I know the Toshiba drives aren't meant to run 24/7, but I can get them for about 50 euros, tax included, which makes them the cheapest option for me for a large amount of storage. Out of about 12 of them in 2 years, only 1 has failed on me (and it didn't even die, it just has a bad sector). Well, apart from this one that is now reporting bad sectors in my FreeNAS system too.

How do I go about finding which drive has the uncorrectable sectors? From the very limited experience I have, /dev/sda is supposed to mean "first internal drive", isn't it?

Here are some screenshots of the system:

Thanks a lot in advance!

Inxsible · May 24, 2018

You can view the SMART attributes/last run values for each disk from the shell or via ssh and see which disk shows you the 2 bad sectors.

Code:

smartctl -a /dev/adaX

where X is 0-8.
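
Rather than typing the command nine times, the device names can be pulled out of the `camcontrol devlist` output posted above. Here is a minimal Python sketch, assuming the output format shown in this thread; `ata_devices` and `smartctl_commands` are hypothetical helper names, not part of FreeNAS or smartmontools:

```python
import re

def ata_devices(devlist_output):
    """Return the adaN device names found in `camcontrol devlist` output."""
    names = []
    for line in devlist_output.splitlines():
        # The peripheral names sit in the trailing parentheses, e.g. (pass0,ada0).
        match = re.search(r"\(([^)]*)\)\s*$", line)
        if not match:
            continue
        for periph in match.group(1).split(","):
            periph = periph.strip()
            # Keep only ATA disks (adaN); skip passN and USB sticks (daN).
            if re.fullmatch(r"ada\d+", periph):
                names.append(periph)
    return names

def smartctl_commands(names):
    """Build one `smartctl -a` argument list per device name."""
    return [["smartctl", "-a", "/dev/" + name] for name in names]

# A few lines copied from the listing in the first post.
sample = """\
<TOSHIBA DT01ACA300 MX6OABB0>      at scbus0 target 0 lun 0 (pass0,ada0)
<Marvell Console 1.01>             at scbus7 target 0 lun 0 (pass3)
<KINGSTON SHSS37A120G SAFM02.H>    at scbus17 target 0 lun 0 (pass5,ada3)
<SanDisk Cruzer Switch 1.00>       at scbus25 target 0 lun 0 (pass11,da0)
"""

for cmd in smartctl_commands(ata_devices(sample)):
    print(" ".join(cmd))
```

On the NAS itself, each argument list could be handed to `subprocess.run` from a root shell instead of being printed.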

Joshdw · May 24, 2018

Inxsible said:
You can view the SMART attributes/last run values for each disk from the shell or via ssh and see which disk shows you the 2 bad sectors.

Code:
smartctl -a /dev/adaX
where X is 0-8.

Heya! Thanks for the reply. So I ran that command on all disks and not a single one reported any reallocated sectors or errors, as far as I can tell. Here's an example of one. What do I need to look for to tell which drive is failing?

Inxsible · May 24, 2018

You need to look at 3 attributes:

Reallocated_Sector_Ct
Current_Pending_Sector
Offline_Uncorrectable

If the raw values are anything above 0, you should keep a very close eye on that drive and replace it the minute the number goes up.

Also, for the drive that you posted the screenshot for, you haven't run a long SMART test in the last 21 logged runs. So either you run a huge number of short SMART tests in a relatively short time, or you don't run long SMART tests at all. Both are bad! If all your disks show similar SMART results, I would advise you to run a long test on each one manually and check the results. Also disable the short test jobs, or at least reduce their frequency, so that a short test doesn't interfere with a long test while it is running.
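
To make the attribute check concrete, here is a small Python sketch that scans the attribute table printed by `smartctl -A` (or `-a`) and reports any of the three attributes whose raw value is above 0. It assumes the usual ten-column table layout with RAW_VALUE last; the helper name `nonzero_watched` is hypothetical, and the sample numbers are made up for illustration rather than taken from any drive in this thread:

```python
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)

def nonzero_watched(smart_output):
    """Return {attribute_name: raw_value} for watched attributes above 0."""
    found = {}
    for line in smart_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns: ID, name, flag, value,
        # worst, thresh, type, updated, when_failed, raw value.
        if len(fields) >= 10 and fields[1] in WATCHED:
            raw = int(fields[9])
            if raw > 0:
                found[fields[1]] = raw
    return found

# Illustrative sample only (hypothetical values).
sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0012   098   098   000    Old_age   Always       -       14210
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       2
"""

print(nonzero_watched(sample))
```

An empty result for every disk means none of the three counters has moved. A long test can be started manually with `smartctl -t long /dev/adaX`, and the self-test log read back with `smartctl -l selftest /dev/adaX`.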

Joshdw · May 24, 2018

Inxsible said:
You need to look at 3 attributes:

Reallocated_Sector_Ct
Current_Pending_Sector
Offline_Uncorrectable

If the raw values are anything above 0, you should keep a very close eye on that drive and replace it the minute the number goes up.

Also, for the drive that you posted the screenshot for, you haven't run a long SMART test in the last 21 logged runs. So either you run a huge number of short SMART tests in a relatively short time, or you don't run long SMART tests at all. Both are bad! If all your disks show similar SMART results, I would advise you to run a long test on each one manually and check the results. Also disable the short test jobs, or at least reduce their frequency, so that a short test doesn't interfere with a long test while it is running.

Awesome, thanks for the reply.
I just checked all drives and none of them had any of those values above 0.

I also didn't realize that about SMART. I had it scheduled to run a short test once a day. How often do you recommend running the short test? And how often should a long test run?

I'm running a zpool scrub on all pools right now to see if anything else is wrong. Not sure why I got those emails, to be honest.

Inxsible · May 24, 2018

I run short tests once a week and long tests once every 2 weeks. I schedule them so that they never run simultaneously.
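
One quick way to sanity-check a schedule like this is to write down the days of the month for each test type and confirm the two sets never overlap. The Python sketch below assumes the schedule is expressed as days of the month; the specific day numbers are illustrative assumptions, not my actual settings, and `clashes` is a hypothetical helper name:

```python
# Hypothetical schedule: roughly weekly short tests, long tests about
# every two weeks, expressed as days of the month.
short_days = {5, 12, 19, 26}
long_days = {8, 22}

def clashes(first, second):
    """Return the sorted days on which both test types would run."""
    return sorted(set(first) & set(second))

overlap = clashes(short_days, long_days)
if overlap:
    print("Short and long tests collide on days:", overlap)
else:
    print("No day has both a short and a long test.")
```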

danb35 · May 25, 2018

Joshdw said:
I woke up this morning to find two emails, the first titled "CurrentPendingSector" with the contents:

Code:
Device: /dev/sda [SAT], 2 Currently unreadable (pending) sectors

That email didn't come from FreeNAS, as FreeNAS would never call a disk sda. It'd be either adaN (as yours are) or daN.

Inxsible said:
I run short tests once a week and long tests once every 2 weeks. I schedule them so that they never run simultaneously.

Do you mean that the long and the short tests don't run simultaneously, or that you've scheduled each disk to be tested at a different time? If the latter, that's a needlessly complicated and fragile solution.

Inxsible · May 25, 2018

danb35 said:
Do you mean that the long and the short tests don't run simultaneously, or that you've scheduled each disk to be tested at a different time? If the latter, that's a needlessly complicated and fragile solution.

I meant that I have scheduled the short and long tests so that they don't run on the same dates.
