Changing HDD Order/Currently unreadable (pending) sectors

Status
Not open for further replies.

Gamer0126

Dabbler
Joined
Mar 24, 2015
Messages
25
My specific question is will my storage become corrupt if I change the order of my hard drives?

I have the following Hardware configuration:
Supermicro MDB-X10SL7-F-O with Intel Core i3-4160 and 2*8GB (16GB) Crucial ECC DDR3 1600MHz
8 * WD60EFRX WD Red 6TB in RAIDZ and a Kingston USB 64GB Thumbdrive (Boot)
FreeNAS-9.10.1-U2

I received the following error:

Code:
Oct. 30, 2016, 8:17 a.m. - Device: /dev/da5 [SAT], 1 Currently unreadable (pending) sectors 


So I ran a long SMART self-test and I did get the following error:

Code:
Oct. 30, 2016, 1:32 p.m. - Device: /dev/da5 [SAT], Self-Test Log error count increased from 0 to 1 


I also ran the script to investigate the state of the drive, but the raw value for Current_Pending_Sector is zero. That confuses me, because every case I have read about shows a raw value of 1, 2, or some other nonzero number.

Code:
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%       817         135986864

197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0 


I followed a Hard Drive Troubleshooting Guide (swapped out all the cables), but I have not been able to pinpoint the issue to anything except the hard drive. However, the SMART output does not seem to point to the hard drive being the issue, but rather to a cable or some other interconnect. I am wondering whether changing the order of the hard drives and then running a SMART test, to see if the issue follows the drive, would tell me more, but I worry about the data currently stored on my machine. That is what brought me to my question: would my data become corrupt if I reorder the hard drives?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
My specific question is will my storage become corrupt if I change the order of my hard drives?
No. Literally no semi-usable RAID solution has that problem. Not enterprise Hardware RAID controllers, not ZFS, not even Intel fakeRAID.
However the SMART output does not point to the Hard Drive being an issue rather a cable or some other inter-connect.
You should probably let us be the judge of that. Fortunately, the tiny bit you did post is enough to say that the hard drive is failing:
Code:
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%       817         135986864
The drive is failing SMART tests. Replace it. It's well within the infant mortality period, so I'm left with the impression that you didn't burn in your drives.
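For anyone who wants to script this check, here is a minimal Python sketch that pulls failed entries out of the self-test log section of `smartctl -a` output. The names `SelfTest`, `parse_self_tests`, and `failed_tests` are hypothetical, the column layout is assumed to match the log quoted in this thread, and treating any status that contains "failure" as a failed test is a simple heuristic rather than anything smartmontools defines. The sample text reuses the log entries posted in this thread.

```python
import re
from typing import List, NamedTuple, Optional


class SelfTest(NamedTuple):
    num: int
    description: str
    status: str
    remaining_pct: int
    lifetime_hours: int
    lba: Optional[int]  # None when smartctl prints "-" (no error LBA)


# Assumed row layout: "# N  <description>  <status>  NN%  <hours>  <LBA or ->"
ROW = re.compile(
    r"^#\s*(\d+)\s+(.+?)(?:\s{2,}|\t)(.+?)\s+(\d+)%\s+(\d+)\s+(\S+)\s*$"
)


def parse_self_tests(text: str) -> List[SelfTest]:
    """Return every self-test log row found in the given smartctl text."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # header lines and unrelated output are skipped
        num, desc, status, remaining, hours, lba = m.groups()
        rows.append(
            SelfTest(
                int(num),
                desc.strip(),
                status.strip(),
                int(remaining),
                int(hours),
                int(lba) if lba.isdigit() else None,
            )
        )
    return rows


def failed_tests(rows: List[SelfTest]) -> List[SelfTest]:
    """Heuristic (an assumption): a status containing 'failure' is a failed test."""
    return [r for r in rows if "failure" in r.status.lower()]


SAMPLE = """\
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%       961         105232104
# 2  Extended offline    Completed: read failure       90%       956         109150000
# 3  Extended offline    Completed: read failure       90%       817         135986864
"""

if __name__ == "__main__":
    for t in failed_tests(parse_self_tests(SAMPLE)):
        print(f"test #{t.num} at {t.lifetime_hours}h: {t.status}, first bad LBA {t.lba}")
```

On a live system the same functions could be fed the text captured from `smartctl -a /dev/da5` instead of `SAMPLE`.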
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
My specific question is will my storage become corrupt if I change the order of my hard drives?
The simple answer is no. It does not matter with ZFS. But given that the SMART long test failed, there is no need to move your drives around. This is not a communications problem; it's strictly a problem internal to the drive.

It appears that you have a mechanical issue. Please post the entire output of smartctl -a /dev/da5.

Also, a pending sector can be recovered, so a change from 1 to 0 is possible. It looks like you had an issue and it was corrected. However, when you ran the long test you discovered that the drive does have read problems.

Are you running a RAIDZ1 (aka RAIDZ) or RAIDZ2?

It looks like an RMA is in this drive's future.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Supermicro MDB-X10SL7-F-O with Intel Core i3-6300 and 2*16GB Crucial ECC DDR3 1600MHz
Also, none of that makes sense:
  • An X10SL7-F has no chance in hell of accepting an i3-6300
  • The i3-6300 is almost always used with DDR4, because why wouldn't you
  • 16GB DDR3 UDIMMs are expensive as hell - if they're DDR4, they don't fit in the X10SL7-F
 

Gamer0126

Dabbler
Joined
Mar 24, 2015
Messages
25
Thanks for the feedback. The output of smartctl -a /dev/da5 is shown below.

Regarding the burn-in test, I ran the tests here step by step: https://forums.freenas.org/index.php?threads/how-to-hard-drive-burn-in-testing.21451/
Since I am new to this, I may have made a mistake. I am currently in the middle of moving my files, so I still have a backup copy of everything on the current FreeNAS system. I can run through the tests again if need be, and based on the feedback it sounds like I should, at least on da5. How can I tell if I was successful on the other drives?

Thanks again for all the help.

Code:
Conveyance self-test routine																										
recommended polling time:		(   5) minutes.																					
SCT capabilities:			  (0x303d) SCT Status supported.																	  
										SCT Error Recovery Control supported.													  
										SCT Feature Control supported.															
										SCT Data Table supported.																  
																																	
SMART Attributes Data Structure revision number: 16																				
Vendor Specific SMART Attributes with Thresholds:																				  
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   196   191   051    Pre-fail  Always       -       202
  3 Spin_Up_Time            0x0027   197   197   021    Pre-fail  Always       -       9116
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       19
  5 Reallocated_Sector_Ct   0x0033   179   179   140    Pre-fail  Always       -       630
  7 Seek_Error_Rate         0x002e   188   182   000    Old_age   Always       -       273
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       990
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       19
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       17
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       16
194 Temperature_Celsius     0x0022   103   098   000    Old_age   Always       -       49
196 Reallocated_Event_Count 0x0032   176   176   000    Old_age   Always       -       24
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       52
																																	
SMART Error Log Version: 1																										
No Errors Logged																													
																																	
SMART Self-test log structure revision number 1																					
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%       961         105232104
# 2  Extended offline    Completed: read failure       90%       956         109150000
# 3  Extended offline    Completed: read failure       90%       817         135986864
																																	
SMART Selective self-test log data structure revision number 1																	
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS																						
	1		0		0  Not_testing																								
	2		0		0  Not_testing																								
	3		0		0  Not_testing																								
	4		0		0  Not_testing																								
	5		0		0  Not_testing																								
Selective self-test flags (0x0):																									
  After scanning selected spans, do NOT read-scan remainder of disk.																
If Selective self-test is pending on power-up, resume after 0 minute delay. 
 
Joined
Jul 10, 2016
Messages
521
Based on the pre-fail indicators below, I would replace this drive:
Code:
  5 Reallocated_Sector_Ct   0x0033   179   179   140    Pre-fail  Always       -       630
196 Reallocated_Event_Count 0x0032   176   176   000    Old_age   Always       -       24
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       52
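If you want to run the same check against every drive, a rough Python sketch along these lines could parse the attribute table and flag the rows worth a closer look. Everything named here (`Attr`, `parse_attributes`, `flag_attributes`, `WATCHED_IDS`) is hypothetical. The watch list is an assumption: IDs 5, 196, and 200 are the ones quoted above, 197 is the attribute from the original alert, and 198 is added because it is commonly checked alongside them. "Nonzero raw value" is a rule of thumb, not a vendor-defined failure criterion.

```python
from typing import List, NamedTuple


class Attr(NamedTuple):
    id: int
    name: str
    value: int
    worst: int
    thresh: int
    raw: int


# Assumption: attribute IDs to watch (see the note above the code).
WATCHED_IDS = {5, 196, 197, 198, 200}


def parse_attributes(text: str) -> List[Attr]:
    """Parse rows of the 'Vendor Specific SMART Attributes' table."""
    attrs = []
    for line in text.splitlines():
        parts = line.split(None, 9)  # 10 columns; the last one is RAW_VALUE
        if len(parts) < 10 or not parts[0].isdigit() or not parts[2].startswith("0x"):
            continue  # not an attribute row (header, blank line, other output)
        raw_token = parts[9].split()[0]
        if not raw_token.isdigit():
            continue  # skip raw values that are not plain integers
        attrs.append(
            Attr(int(parts[0]), parts[1], int(parts[3]),
                 int(parts[4]), int(parts[5]), int(raw_token))
        )
    return attrs


def flag_attributes(attrs: List[Attr]) -> List[Attr]:
    """Return watched attributes whose raw value is nonzero."""
    return [a for a in attrs if a.id in WATCHED_IDS and a.raw > 0]


# Rows copied from the smartctl output posted in this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   179   179   140    Pre-fail  Always       -       630
194 Temperature_Celsius     0x0022   103   098   000    Old_age   Always       -       49
196 Reallocated_Event_Count 0x0032   176   176   000    Old_age   Always       -       24
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       52
"""

if __name__ == "__main__":
    for a in flag_attributes(parse_attributes(SAMPLE)):
        print(f"{a.id:>3} {a.name}: raw={a.raw}")
```

For this drive the sketch flags the same three attributes quoted above; a healthy drive would be expected to produce an empty list.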

Also: The temperature seems high. Not sure if that is a consequence of the drive going bad, or a contributing factor. I would recommend checking the temps of your other drives to make sure you don't have an airflow/cooling issue.
Code:
194 Temperature_Celsius	0x0022  103  098  000   Old_age  Always	-	49 
 

melloa

Wizard
Joined
May 22, 2016
Messages
1,749
Also: The temperature seems high. Not sure if that is a consequence of the drive going bad, or a contributing factor. I would recommend checking the temps of your other drives to make sure you don't have an airflow/cooling issue.

Agreed. Either the drive is about to die or there is not enough cooling, which, if that's the case, will kill the other ones too.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
OUCH! Hot drive and very bad drive! Fix your cooling. Also, you might as well post the smartctl -a results for your other drives, or at least look at them and make sure nothing else is going on.
 

Gamer0126

Dabbler
Joined
Mar 24, 2015
Messages
25
I have looked at the other drives, and their temperatures are around 37, which I am assuming is okay. The drive in question (da5) is still around 48, so I guess I will definitely RMA it. Is there anything specific I should look for in the smartctl output of the other drives that would tell me whether I made the same mistake of not completing a burn-in test?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Take a look at the Hard Drive Troubleshooting Guide (see link below). It will help you out with items to look for.
 

melloa

Wizard
Joined
May 22, 2016
Messages
1,749
Take a look at the Hard Drive Troubleshooting Guide (see link below). It will help you out with items to look for.

Thanks for directing us to it! Yesterday I was reading up on smartmontools to catch up on that.
 