CurrentPendingSector & OfflineUncorrectableSector Emails

Status
Not open for further replies.

Joshdw

Cadet
Joined
May 24, 2018
Messages
3
Hello,

First of all, thank you for taking the time to read this; I appreciate your help.

I woke up this morning to find two emails, the first titled "CurrentPendingSector" with the contents:
Code:
Device: /dev/sda [SAT], 2 Currently unreadable (pending) sectors


The second titled "OfflineUncorrectableSector" with the contents:
Code:
Device: /dev/sda [SAT], 2 Offline uncorrectable sectors


I believe these are from S.M.A.R.T., but what I can't figure out is: what is /dev/sda? Here's a list of my drives, from the camcontrol devlist command:
Code:
<TOSHIBA DT01ACA300 MX6OABB0>	  at scbus0 target 0 lun 0 (pass0,ada0)		
<TOSHIBA DT01ACA300 MX6OABB0>	  at scbus2 target 0 lun 0 (pass1,ada1)		
<TOSHIBA DT01ACA300 MX6OABB0>	  at scbus3 target 0 lun 0 (pass2,ada2)		
<Marvell Console 1.01>			 at scbus7 target 0 lun 0 (pass3)			 
<Marvell Console 1.01>			 at scbus16 target 0 lun 0 (pass4)			
<KINGSTON SHSS37A120G SAFM02.H>	at scbus17 target 0 lun 0 (pass5,ada3)	   
<KINGSTON SHSS37A120G SAFM02.H>	at scbus18 target 0 lun 0 (pass6,ada4)	   
<TOSHIBA DT01ACA300 MX6OABB0>	  at scbus19 target 0 lun 0 (pass7,ada5)	   
<TOSHIBA DT01ACA300 MX6OABB0>	  at scbus20 target 0 lun 0 (pass8,ada6)	   
<TOSHIBA DT01ACA300 MX6OABB0>	  at scbus21 target 0 lun 0 (pass9,ada7)	   
<TOSHIBA DT01ACA300 MX6OABB0>	  at scbus22 target 0 lun 0 (pass10,ada8)	 
<SanDisk Cruzer Switch 1.00>	   at scbus25 target 0 lun 0 (pass11,da0)   


ada4 is a second SSD I'm not currently using. I had planned to RAID it with the other SSD for FreeNAS/caching, but I didn't find the option for that during setup.
ada8 is a spare HDD that's in the system right now. I also have 4 more 3TB drives I want to add when I start running out of space. I know the Toshiba drives aren't meant to run 24/7, but I can get them for about 50 euros, tax included, which makes them my cheapest option for a large amount of storage. Out of about 12 of them over 2 years, only 1 has failed on me (and it didn't actually die, it just has a bad sector). Well, apart from this one in my FreeNAS system that is now reporting bad sectors too.

How do I go about finding which drive has the uncorrectable sectors? From the very limited experience I have, /dev/sda is supposed to mean "first internal drive", right?

Here are some screenshots of the system:
WCFxANn.png

hiciFyA.png

wDWAjNG.png

BtxUiV6.png


Thanks a lot in advance!
 

Inxsible

Guru
Joined
Aug 14, 2017
Messages
1,123
You can view the SMART attributes/last run values for each disk from the shell or via ssh and see which disk shows you the 2 bad sectors.

Code:
smartctl -a /dev/adaX
where X is 0-8.
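Checking nine disks one at a time gets tedious, so a small loop can help. The following is only a minimal sketch, assuming the disks are named ada0 through ada8 as in the camcontrol devlist output earlier in the thread. The helper names ada_devices and smart_report_all are made up for illustration.

```shell
# Print /dev/ada0 .. /dev/adaN, where N is the highest disk index (first argument).
ada_devices() {
    last=$1
    i=0
    while [ "$i" -le "$last" ]; do
        echo "/dev/ada$i"
        i=$((i + 1))
    done
}

# Run "smartctl -a" on every disk from ada0 up to the given index.
# Hypothetical helper; it needs root and an installed smartctl.
smart_report_all() {
    for dev in $(ada_devices "$1"); do
        echo "=== $dev ==="
        smartctl -a "$dev"
    done
}
```

Running `smart_report_all 8 | less` would then page through the reports for ada0 to ada8 in one go.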
 

Joshdw

Cadet
Joined
May 24, 2018
Messages
3
Heya! Thanks for the reply. I ran that command on all the disks, and as far as I can tell not a single one reported any reallocated sectors or errors. Here's the output for one of them as an example; what do I need to look for to tell which drive is failing?

xxbbdEe.png
 

Inxsible

Guru
Joined
Aug 14, 2017
Messages
1,123
You need to look at 3 attributes:
  1. Reallocated_Sector_Ct
  2. Current_Pending_Sector
  3. Offline_Uncorrectable
If the values are anything above 0, you should keep a very close eye on that drive and replace it the minute the number goes up.
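To avoid reading each attribute table by eye, the three attributes can be filtered out of the smartctl output. This is a rough sketch, assuming the usual `smartctl -A` table layout in which the attribute name is the second column and the raw value is the tenth. The function name nonzero_sector_attrs is hypothetical.

```shell
# Read "smartctl -A" output on stdin and print any of the three
# sector-health attributes whose raw value is above 0.
# Assumes the raw value is in column 10 of the attribute table.
nonzero_sector_attrs() {
    awk '$2 == "Reallocated_Sector_Ct" ||
         $2 == "Current_Pending_Sector" ||
         $2 == "Offline_Uncorrectable" {
        if ($10 + 0 > 0) print $2 "=" $10
    }'
}
```

For example, `smartctl -A /dev/ada0 | nonzero_sector_attrs` prints nothing for a disk where all three raw values are 0.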

Also, for the drive in your screenshot, there isn't a single long SMART test among the last 21 runs. So either you run a huge number of short SMART tests in a relatively short time, or you don't run long SMART tests at all. Both are bad! If all your disks show similar SMART results, I would advise you to run a long test on each one manually and check the results. Also disable the short test jobs, or at least reduce their frequency, so that a short test doesn't interfere with a long test while it is running.
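A manual long test can be started with `smartctl -t long`, and the self-test log can be read back with `smartctl -l selftest`. The sketch below assumes ATA disks, where long tests appear in that log as "Extended offline" entries. The helper names start_long_test and count_long_tests are invented for this example.

```shell
# Start a long (extended) self-test on one disk, e.g. /dev/ada0; needs root.
start_long_test() {
    smartctl -t long "$1"
}

# Read "smartctl -l selftest" output on stdin and print how many
# entries are long tests (assumed to be logged as "Extended offline").
count_long_tests() {
    awk '/Extended offline/ { n++ } END { print n + 0 }'
}
```

A count of 0 from `smartctl -l selftest /dev/ada0 | count_long_tests` would mean no long test is present in the log entries the drive still keeps.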
 

Joshdw

Cadet
Joined
May 24, 2018
Messages
3
Awesome, thanks for the reply.
I just checked all the drives and none of them had any of those values above 0.

I also didn't realize that about SMART. I had it scheduled to run a short test once a day. How often do you recommend scheduling the short test, and how often the long test?

I'm running a zpool scrub on all pools right now to see if anything else is wrong. Not sure why I got those emails, to be honest.
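For reference, a scrub is started per pool with `zpool scrub`, and `zpool status -x` prints the single line "all pools are healthy" when nothing is wrong. The following is a small sketch around those two commands. The helper names scrub_pool and all_pools_healthy are hypothetical.

```shell
# Start a scrub on one pool (pool name is the first argument); needs root.
scrub_pool() {
    zpool scrub "$1"
}

# Read "zpool status -x" output on stdin; succeed only if it is
# exactly the one-line all-clear message.
all_pools_healthy() {
    [ "$(cat)" = "all pools are healthy" ]
}
```

Once the scrub has finished, `zpool status -x | all_pools_healthy && echo OK` gives a quick yes/no answer.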
 

Inxsible

Guru
Joined
Aug 14, 2017
Messages
1,123
I run short tests once a week and long tests once every 2 weeks. I schedule them so that they never run simultaneously.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I woke up this morning to find two emails, the first titled "CurrentPendingSector" with the contents:
Code:
Device: /dev/sda [SAT], 2 Currently unreadable (pending) sectors
That email didn't come from FreeNAS, as FreeNAS would never call a disk sda; it'd be either adaN (as they are for you) or daN.
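The naming difference can be checked mechanically. Names like sda follow the Linux convention, while FreeBSD (and so FreeNAS) uses adaN for ATA/SATA disks and daN for SCSI or USB disks. This is an illustrative sketch only. The function classify_disk_name and its output labels are made up for this example.

```shell
# Classify a device path by its naming convention.
classify_disk_name() {
    case "$1" in
        /dev/sd[a-z]*)  echo "linux" ;;             # e.g. /dev/sda
        /dev/ada[0-9]*) echo "freebsd-ata" ;;       # e.g. /dev/ada0
        /dev/da[0-9]*)  echo "freebsd-scsi-usb" ;;  # e.g. /dev/da0
        *)              echo "unknown" ;;
    esac
}
```

With this, `classify_disk_name /dev/sda` reports a Linux-style name, which matches none of the devices in the camcontrol devlist output.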
I run short tests once a week and long tests once every 2 weeks. I schedule them so that they never run simultaneously.
Do you mean that the long and the short tests don't run simultaneously, or that you've scheduled each disk's tests to run at a different time? If the latter, that's a needlessly complicated and fragile solution.
 

Inxsible

Guru
Joined
Aug 14, 2017
Messages
1,123
I meant that I have scheduled the short and long tests so that they don't run on the same dates.
 