Offline unreadable/uncorrectable sectors

Status
Not open for further replies.

adrianwi

Guru
Joined
Oct 15, 2013
Messages
1,231
Received an e-mail from my system about this last night, but when I checked the pool everything was showing as online.

Starting to see this message in the console now:

Code:
May 19 08:15:36 freenas1 smartd[2894]: Device: /dev/da6 [SAT], 8 Currently unreadable (pending) sectors
May 19 08:15:36 freenas1 smartd[2894]: Device: /dev/da6 [SAT], 8 Offline uncorrectable sectors
May 19 08:15:36 freenas1 smartd[2894]: Device: /dev/da6 [SAT], 8 Currently unreadable (pending) sectors
May 19 08:15:36 freenas1 smartd[2894]: Device: /dev/da6 [SAT], 8 Offline uncorrectable sectors
May 19 08:45:36 freenas1 smartd[2894]: Device: /dev/da6 [SAT], 8 Currently unreadable (pending) sectors
May 19 08:45:36 freenas1 smartd[2894]: Device: /dev/da6 [SAT], 8 Offline uncorrectable sectors
May 19 08:45:36 freenas1 smartd[2894]: Device: /dev/da6 [SAT], 8 Currently unreadable (pending) sectors
May 19 08:45:36 freenas1 smartd[2894]: Device: /dev/da6 [SAT], 8 Offline uncorrectable sectors


This drive has only been running for 830 hours. Should I be looking to replace it?
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
@Lox is probably right, but the experts will want to see the full smartctl output for the drive. In the meantime, you might want to run a scrub.
 

adrianwi

Guru
Joined
Oct 15, 2013
Messages
1,231
Having scanned through the other threads, here's the output of smartctl -x on the drive in question. I also checked the Seagate website and the drive is in warranty until 2017, so I guess I just need to understand whether it's worth trying to get an RMA now or later.

http://pastebin.com/JfpaUm42
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
Having scanned through the other threads, here's the output of smartctl -x on the drive in question.
I think what I would do is just keep an eye on the SMART output for now. If #197 and #198 keep growing, or #5 starts racking up, then I'd be concerned. And if any short or long self-test fails, I'd RMA it.

Hopefully someone more knowledgeable can chime in.
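
The rule of thumb above can be sketched as a comparison between two snapshots of raw SMART values. This is only an illustration: `assess`, `WATCHED` and the verdict strings are hypothetical names, and "any growth is a concern" is an assumption standing in for the judgement call described in the post, not vendor guidance.

```python
# Hypothetical helper mirroring the rule of thumb in the post above.
# Keys are SMART attribute IDs, values are the raw counts from smartctl.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}


def assess(previous, current, self_test_failed=False):
    """Return (verdict, reasons) for two snapshots of raw attribute values."""
    if self_test_failed:
        # A failed short or long self-test is treated as grounds for an RMA.
        return "rma", ["a self-test failed"]

    reasons = []
    for attr_id, name in WATCHED.items():
        before = previous.get(attr_id, 0)
        after = current.get(attr_id, 0)
        if after > before:  # assumption: any growth counts as a concern
            reasons.append(f"#{attr_id} {name} grew from {before} to {after}")

    return ("concerned" if reasons else "keep watching"), reasons


if __name__ == "__main__":
    earlier = {5: 0, 197: 8, 198: 8}
    later = {5: 0, 197: 16, 198: 16}
    print(assess(earlier, later))
```

Feeding it one snapshot per SMART report would flag the drive as soon as any of the three counts rises between reports.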
 

russnas

Contributor
Joined
May 31, 2013
Messages
113
Aren't these signs of a failing drive?
I had the same issue, 16 unreadable and uncorrectable, 18 CRC error count
This was with a non-NAS Seagate 4 TB drive, just over a year old.
I backed up and checked the sectors in Windows, which found 3 bad blocks. The replacement took about 3 weeks.

Try this app too

www.seagate.com/au/en/support/downloads/item/seatools-win-master/
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
Aren't these signs of a failing drive?
I had the same issue, 16 unreadable and uncorrectable, 18 CRC error count
Maybe, but the OP has a sensible SMART testing schedule in place, email notification working correctly, and a RAIDZ2 pool. It's unlikely anything catastrophic will happen, even if the disk stops spinning tomorrow. And unless I'm mistaken, it isn't "the same issue," since there are no CRC errors.
 

adrianwi

Guru
Joined
Oct 15, 2013
Messages
1,231
Interestingly, the console messages have stopped and the e-mail SMART report I received this morning is not showing these errors anymore:

Code:
########## SMART status report for da6 drive (Seagate NAS HDD: S300YW83) ##########

SMART overall-health self-assessment test result: PASSED

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   115   099   006    Pre-fail  Always       -       92456672
  3 Spin_Up_Time            0x0003   092   092   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       18
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   052   052   030    Pre-fail  Always       -       846144062547
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1200
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       18
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   072   059   045    Old_age   Always       -       28 (Min/Max 27/33)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       18
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       18
194 Temperature_Celsius     0x0022   028   041   000    Old_age   Always       -       28 (0 19 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

No Errors Logged

Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
Extended offline    Interrupted (host reset)      20%      1181         -
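
For anyone who wants to track these values between reports rather than eyeball them, here is a minimal Python sketch that pulls the raw values out of an attribute table like the one above. It assumes the ten-column layout shown in this report; `parse_attributes` is a hypothetical helper name and not part of smartmontools.

```python
def parse_attributes(report):
    """Map SMART attribute ID -> dict of name, value, worst, thresh, raw."""
    attrs = {}
    for line in report.splitlines():
        # Split into at most 10 fields so RAW_VALUE keeps any trailing
        # text such as "28 (Min/Max 27/33)".
        fields = line.split(None, 9)
        if len(fields) != 10:
            continue
        # Attribute rows start with a numeric ID and have a hex FLAG column.
        if not fields[0].isdigit() or not fields[2].startswith("0x"):
            continue
        attrs[int(fields[0])] = {
            "name": fields[1],
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "raw": fields[9].strip(),
        }
    return attrs


# A few rows copied from the report above.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1200
190 Airflow_Temperature_Cel 0x0022   072   059   045    Old_age   Always       -       28 (Min/Max 27/33)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    for attr_id, info in parse_attributes(SAMPLE).items():
        print(attr_id, info["name"], info["raw"])
```

The header row and the self-test log lines are skipped because their first field is not a number.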
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
I don't understand how #s 197 and 198 went from 8 to 0 without anything showing up in #5. Probably my understanding is flawed.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
If I ignore #198 then it's possible, because a pending sector is not remapped until it is written to and is still unreadable after that write (to avoid remapping a sector if it's just a hiccup). Now either #198 isn't what I think or there's some black magic happening...

Edit: OK, after reading the #198 attribute description in the wiki article, it makes sense, no black magic... The uncorrectable sectors are the sum of the pending sectors (not readable sectors) and the "not writable sectors". So if the pending sectors attribute is decremented then this one is too.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
The uncorrectable sectors are the sum of the pending sectors (not readable sectors) and the "not writable sectors". So if the pending sectors attribute is decremented then this one is too.
This implies that a sector can change from not readable to readable, which only seems possible (according to the description of #s 197 and 198) if a sector that had a read error was later successfully overwritten.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Yes, exactly.
  • 1. read error
  • 2. pending sector++
  • 3. write
  • 4. verification with a read
  • 4.a if failed --> pending sector--, reallocated sector++, sector remapped
  • 4.b if successful --> pending sector--, sector not remapped
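
The steps above can be sketched as a toy model of the two counters. This is only a simulation of the sequence as described in this thread, not drive firmware behaviour; `DriveCounters` and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DriveCounters:
    """Toy model of attributes #197 (pending) and #5 (reallocated)."""

    pending: int = 0      # Current_Pending_Sector
    reallocated: int = 0  # Reallocated_Sector_Ct

    def read_error(self):
        # Steps 1 and 2: a failed read marks the sector as pending.
        self.pending += 1

    def write_then_verify(self, verify_ok):
        # Steps 3 and 4: the sector is written, then read back.
        if self.pending == 0:
            return  # nothing pending, nothing to resolve
        self.pending -= 1
        if not verify_ok:
            # Step 4.a: still unreadable, so the sector is remapped.
            self.reallocated += 1
        # Step 4.b: readable again, so no remap and #5 is unchanged.


if __name__ == "__main__":
    drive = DriveCounters()
    for _ in range(8):
        drive.read_error()
    for _ in range(8):
        drive.write_then_verify(verify_ok=True)
    print(drive)
```

With eight read errors followed by eight successful rewrites, both counters end at zero, which matches what the OP saw: #197 dropping from 8 to 0 with nothing showing up in #5.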
 