
1 Currently unreadable (pending) sectors

Status
Not open for further replies.

Brezlord

Contributor
Joined
Jan 7, 2017
Messages
189
Hi all,
I have got my first HDD error: CRITICAL: Oct. 26, 2017, 11:18 a.m. - Device: /dev/da4 [SAT], 1 Currently unreadable (pending) sectors. I ran a scrub, which repaired the data. When I run smartctl -a /dev/da4, I see that 1 sector was unreadable. The alert will not clear in the FreeNAS GUI, I assume because of the SMART error. Is the drive still safe to use? Should I just keep an eye on it? I have 8 x 4TB drives in a RAIDZ2, all WD Reds.
Code:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       464
  3 Spin_Up_Time            0x0027   183   176   021    Pre-fail  Always       -       7808
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       69
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   092   092   000    Old_age   Always       -       5903
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       68
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       66
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       247
194 Temperature_Celsius     0x0022   124   111   000    Old_age   Always       -       28
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure        90%             5903  112953656
 

Ericloewe

Not-very-passive-but-aggressive
Moderator
Joined
Feb 15, 2014
Messages
18,145
Yeah, it's well on the way to its maker.

A single short test and it fails? Nope, not trustworthy.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,415
Code:
# 1 Short offline	 Completed: read failure	 90%	 5903		 112953656


The SMART test failing is what is important. Replace the drive.
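
For anyone who wants to script this check, here is a minimal Python sketch that picks the status out of a self-test log row like the one quoted above. The column layout is assumed from the output in this thread, and the names `SELFTEST_ROW` and `parse_selftest_row` are made up for illustration; they are not part of smartctl. Treating any status containing "failure" as a failed test is a simplification.

```python
import re

# One row of the "SMART Self-test log" table, as printed in this thread.
# Assumption: the test description is two words (e.g. "Short offline").
SELFTEST_ROW = re.compile(
    r"^#\s*(?P<num>\d+)\s+"
    r"(?P<test>\w+ \w+)\s+"
    r"(?P<status>.+?)\s+"
    r"(?P<remaining>\d+)%\s+"
    r"(?P<hours>\d+)\s+"
    r"(?P<lba>\d+|-)$"
)

def parse_selftest_row(line):
    """Return the row as a dict with an added 'failed' flag, or None if it does not match."""
    match = SELFTEST_ROW.match(line.strip())
    if match is None:
        return None
    row = match.groupdict()
    # Simplification: any status mentioning "failure" counts as a failed test.
    row["failed"] = "failure" in row["status"].lower()
    return row

row = parse_selftest_row(
    "# 1  Short offline       Completed: read failure       90%      5903         112953656"
)
print(row["status"], row["failed"], row["lba"])
# prints: Completed: read failure True 112953656
```

A row that reads "Completed without error" comes back with `failed` set to False, and lines that are not log rows return None.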
 

Brezlord

Contributor
Joined
Jan 7, 2017
Messages
189
I'll order a new drive and replace it. This drive is not old; it's not even a year old. I have Red drives in this pool that are over 4 years old with no issues. I'll have to RMA it.
 
Joined
Apr 9, 2015
Messages
1,258
Yep, it's one of those things; sometimes a bad one will creep in there. Just RMA'd an HGST NAS drive that was about two years old.
 

Brezlord

Contributor
Joined
Jan 7, 2017
Messages
189
What does the following line mean in the output of smartctl -a /dev/da4?
Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 1

This is definitely not good:
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 464
up from 0

How long does an RMA usually take?
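
As a rough aid for reading those rows: each attribute line has ten whitespace-separated fields, matching the header ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE. The Python sketch below splits a row into those fields. `SmartAttr`, `parse_attribute_row` and `at_or_below_threshold` are illustrative names, not smartctl features, and the sketch assumes the layout shown in this thread. For Current_Pending_Sector, the raw value is the count of sectors waiting to be remapped, so the 1 at the end is the single pending sector from the alert; the normalized VALUE of 200 is still far above THRESH, which is why WHEN_FAILED shows "-".

```python
from collections import namedtuple

# Field names follow the header row of the attribute table.
SmartAttr = namedtuple(
    "SmartAttr",
    "id name flag value worst thresh type updated when_failed raw",
)

def parse_attribute_row(line):
    """Split one attribute row into named fields; return None for other lines."""
    parts = line.split(None, 9)  # at most 10 fields; RAW_VALUE keeps any trailing text
    if len(parts) != 10 or not parts[0].isdigit():
        return None
    ident, name, flag, value, worst, thresh, kind, updated, failed, raw = parts
    return SmartAttr(
        int(ident), name, flag,
        int(value), int(worst), int(thresh),
        kind, updated, failed, raw.strip(),
    )

def at_or_below_threshold(attr):
    """True when the normalized VALUE has fallen to THRESH or lower."""
    return attr.value <= attr.thresh

pending = parse_attribute_row(
    "197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1"
)
print(pending.name, pending.raw, at_or_below_threshold(pending))
# prints: Current_Pending_Sector 1 False
```

The raw value is kept as a string because its meaning is vendor-specific for some attributes, Raw_Read_Error_Rate among them.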
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
13,551
...and when you get the replacement burned in and installed, set up regular SMART tests for all your drives. This drive has only seen one in nearly 6000 hours of use.
 

Brezlord

Contributor
Joined
Jan 7, 2017
Messages
189
That drive was deselected by accident in the SMART test schedule. I run the schedule below. Is this excessive?

 

Attachments

  • Screen Shot 2017-10-27 at 6.19.50 pm.png

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
13,551
I run the schedule below. Is this excessive?
Not at all. I run them a bit more often--short tests daily, long tests every week. I'd say your schedule is fine, just make sure to put the replacement drive on it once you get it installed, burned in, and replaced into the pool.
 

Brezlord

Contributor
Joined
Jan 7, 2017
Messages
189
As an experiment, I've taken the drive offline and am wiping it by filling it with zeros, which should force the drive to reallocate the bad sector. I'll do some tests, and if they pass I'll resilver the drive and see what happens.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,248
Yep, it's one of those things; sometimes a bad one will creep in there. Just RMA'd an HGST NAS drive that was about two years old.

Yeppers. Just had a 2yo 3TB Red Pro fail a long test. It’s only got 2 power cycles on it :)
 

Brezlord

Contributor
Joined
Jan 7, 2017
Messages
189
I'll post updates when the wipe finishes. Probably going to take a few hours.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,415
I'll post updates when the wipe finishes. Probably going to take a few hours.
This doesn't do anything. How do you think it will fix things? People do this all the time, the sector is still going to be bad and more will go bad.

Make sure to burn in your new drive when you get it; that should take about 3 extra days. I keep 2 extra drives on the shelf at all times that have already been burned in, so I can replace a bad drive instantly.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
13,551
This doesn't do anything. How do you think it will fix things?
Because forcing a write to the bad sector will cause the drive to reallocate the bad sector. It will no longer be a pending sector, and the SMART test will (probably) pass, so it will look like everything's OK.
the sector is still going to be bad and more will go bad.
...but this is the problem, and why I don't trust the "force a write to the bad block" method. A single bad block wouldn't greatly bother me, but a failing SMART test is a big red flag.
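
To make that reasoning concrete, here is a small Python sketch of the rule of thumb, with the caveat that `assess_drive` and its three verdict strings are invented for illustration and the cut-offs are assumptions, not a standard. It looks at the whole self-test history rather than only the latest result, so a drive that fails a test and later passes after a forced write is still flagged.

```python
def assess_drive(pending, reallocated, selftest_statuses):
    """Coarse verdict from two raw counters and the self-test log statuses.

    pending           -- raw value of Current_Pending_Sector
    reallocated       -- raw value of Reallocated_Sector_Ct
    selftest_statuses -- status strings from the self-test log, newest first
    """
    # Any failed self-test in the history is treated as the red flag.
    if any("failure" in status.lower() for status in selftest_statuses):
        return "replace"
    # Bad sectors without a failed test: keep using the drive, but watch it.
    if pending > 0 or reallocated > 0:
        return "watch"
    return "ok"

# The drive as posted in this thread.
print(assess_drive(1, 0, ["Completed: read failure"]))
# prints: replace

# Hypothetical counters after a forced write: the pending sector is gone and
# a later test passes, but the earlier failure is still in the log.
print(assess_drive(0, 1, ["Completed without error", "Completed: read failure"]))
# prints: replace

# Hypothetical drive with one pending sector and no failed test.
print(assess_drive(1, 0, ["Completed without error"]))
# prints: watch
```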
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,415
Because forcing a write to the bad sector will cause the drive to reallocate the bad sector. It will no longer be a pending sector, and the SMART test will (probably) pass, so it will look like everything's OK.

...but this is the problem, and why I don't trust the "force a write to the bad block" method. A single bad block wouldn't greatly bother me, but a failing SMART test is a big red flag.
Yeah, that's what I don't get. Let the drive mark it as reallocated when it normally does. Why force it? Just to make the red light in FreeNAS go away?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
13,551

Brezlord

Contributor
Joined
Jan 7, 2017
Messages
189
The next question I have is: when you get a drive replaced under warranty, do drive manufacturers issue you a new drive, or is it just a reconditioned drive?
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,415
It's remanufactured.
 

Brezlord

Contributor
Joined
Jan 7, 2017
Messages
189
...which is trivially easy anyway--just uncheck the box.

I realize this, but every time a SMART test is run a new alert is triggered. Doing this is a learning experience, and I have initiated an RMA anyway.

I have ordered a new drive and will do as suggested with the warranty replacement drive: burn it in and keep it as a spare.
 

Brezlord

Contributor
Joined
Jan 7, 2017
Messages
189