Critical - Unrecoverable Error

Status
Not open for further replies.

NAS-Plus

Explorer
Joined
Apr 15, 2017
Messages
73
I received an error message on one of our FreeNAS servers. The error says:
CRITICAL: May 30, 2017, 1:36 a.m. - The volume vol_xx (ZFS) state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.

The only additional information that I can see is under Volume Status / Scrub. One of the 4 raidz1 drives has a Checksum value of 33. The other 3 drives all show a Checksum value of 0. All 4 of the drives show 0 Read, 0 Write, and show a Status of Online.
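
The per-device counters described above correspond to the READ, WRITE and CKSUM columns that `zpool status` prints at the command line. As a rough illustration only, the Python sketch below picks out devices with nonzero counters from that kind of text. The function name `devices_with_errors` and the sample text are assumptions: only the pool name `vol_xx`, the 4-drive raidz1 layout, and the checksum count of 33 come from the post, and the device names are placeholders.

```python
# Hypothetical sample laid out like the device table of `zpool status`.
# Device names are placeholders; the counts follow the post above.
SAMPLE_STATUS = """\
        NAME        STATE     READ WRITE CKSUM
        vol_xx      ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            disk0   ONLINE       0     0     0
            disk1   ONLINE       0     0    33
            disk2   ONLINE       0     0     0
            disk3   ONLINE       0     0     0
"""


def devices_with_errors(status_text):
    """Return {device: (read, write, cksum)} for rows with a nonzero counter.

    Only rows whose three counters are plain integers are considered, so the
    header row and any abbreviated counts are skipped.
    """
    flagged = {}
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and all(f.isdigit() for f in fields[2:5]):
            read, write, cksum = (int(f) for f in fields[2:5])
            if read or write or cksum:
                flagged[fields[0]] = (read, write, cksum)
    return flagged


flagged_devices = devices_with_errors(SAMPLE_STATUS)
for name, (read, write, cksum) in flagged_devices.items():
    print(f"{name}: read={read} write={write} cksum={cksum}")
```

With the sample above, only the one drive carrying the checksum count is reported, which matches what the Volume Status screen shows.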

How serious is this? Do I need to replace the drive with the Checksum value of 33? Is there anything else that I need to do?

Thanks,
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
Run a SMART test on the drive that shows errors and post the output, please.
Use code tags so the formatting doesn't get all wonky.
 

NAS-Plus

Explorer
Joined
Apr 15, 2017
Messages
73
Thanks! Will do.

I started the following check:
smartctl -t long /dev/da1

It says that it will take 355 minutes to complete.
 

NAS-Plus

Explorer
Joined
Apr 15, 2017
Messages
73
I'm still new to FreeNAS so if I did this incorrectly don't hesitate to tell me.

The da1p2 drive is the one that now has 39 in the checksum column. All the other drives have 0 in the columns.

I extracted the SMART data with the following command: smartctl -a /dev/da1 | more

Code:
Extended self-test routine   
recommended polling time:  ( 355) minutes.   
Conveyance self-test routine   
recommended polling time:  (  5) minutes.   
SCT capabilities:  (0x3035) SCT Status supported.   
  SCT Feature Control supported.   
  SCT Data Table supported.   
   
SMART Attributes Data Structure revision number: 16   
Vendor Specific SMART Attributes with Thresholds:   
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   189   172   021    Pre-fail  Always       -       5541
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       61
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   063   063   000    Old_age   Always       -       27403
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       59
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       68
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       1357642
194 Temperature_Celsius     0x0022   120   108   000    Old_age   Always       -       30
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       13
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1
   
SMART Error Log Version: 1   
No Errors Logged

SMART Self-test log structure revision number 1   
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline  Completed without error  00%  27400  -  
# 2  Short offline  Completed without error  00%  27385  -  
# 3  Short offline  Completed without error  00%  27361  -  
# 4  Short offline  Completed without error  00%  27337  -  
# 5  Extended offline  Completed without error  00%  27298  -  
# 6  Short offline  Completed without error  00%  27266  -  
# 7  Short offline  Completed without error  00%  27241  -  
# 8  Short offline  Completed without error  00%  27217  -  
# 9  Short offline  Completed without error  00%  27193  -  
#10  Short offline  Completed without error  00%  27169  -  
#11  Extended offline  Completed without error  00%  27126  -  
#12  Short offline  Completed without error  00%  27099  -  
#13  Short offline  Completed without error  00%  27075  -  
#14  Short offline  Completed without error  00%  27053  -  
#15  Short offline  Completed without error  00%  27027  -  
#16  Short offline  Completed without error  00%  27002  -  
#17  Extended offline  Completed without error  00%  26971  -  
#18  Short offline  Completed without error  00%  26930  -  
#19  Short offline  Completed without error  00%  26906  -  
#20  Short offline  Completed without error  00%  26882  -  
#21  Short offline  Completed without error  00%  26859  -  
   
SMART Selective self-test log data structure revision number 1   
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS   
  1  0  0  Not_testing   
  2  0  0  Not_testing   
  3  0  0  Not_testing   
  4  0  0  Not_testing   
  5  0  0  Not_testing   
Selective self-test flags (0x0):   
  After scanning selected spans, do NOT read-scan remainder of disk.   
If Selective self-test is pending on power-up, resume after 0 minute delay
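
For anyone reading a dump like this, a short script can pull the raw values out of the attribute table instead of scanning it by eye. The following is a minimal Python sketch, not a standard tool: the names `parse_smart_attributes`, `nonzero_watched` and `WATCH` are made up for illustration, the watch list is an assumption about which attributes to check first, and the sample rows are copied from the output above in the single-line layout that smartctl normally prints.

```python
import re

# One attribute row: ID, name, flag, value, worst, thresh, type, updated,
# when_failed, then the raw value (only its leading integer is captured).
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)

# Assumed watch list: attributes to look at first (an illustrative choice).
WATCH = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
)

# Rows copied from the SMART output posted above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       1357642
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       13
"""


def parse_smart_attributes(text):
    """Map attribute name to its normalized value, worst, threshold and raw value."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            attrs[m.group(2)] = {
                "value": int(m.group(4)),
                "worst": int(m.group(5)),
                "thresh": int(m.group(6)),
                "raw": int(m.group(10)),
            }
    return attrs


def nonzero_watched(attrs, names=WATCH):
    """Return {name: raw} for watched attributes whose raw value is not zero."""
    return {n: attrs[n]["raw"] for n in names if n in attrs and attrs[n]["raw"] != 0}


attrs = parse_smart_attributes(SAMPLE)
flagged = nonzero_watched(attrs)
print(flagged)
```

For the drive in this thread, the sketch reports only UDMA_CRC_Error_Count with a raw value of 13. The reallocated, pending and uncorrectable sector counts are all 0.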
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Not the cause of the checksum errors, but the load cycle count is through the roof. What kind of drive is this? You need to use wdidle3 on it.
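
To put "through the roof" in numbers: the SMART output above lists Load_Cycle_Count at 1,357,642 against 27,403 power-on hours. A quick back-of-the-envelope calculation in Python (the function name `load_cycle_rate` is just for illustration, and the result is an average over the drive's life, not a measurement of its current behavior):

```python
def load_cycle_rate(load_cycles, power_on_hours):
    """Return (average cycles per hour, average seconds between cycles)."""
    per_hour = load_cycles / power_on_hours
    seconds_between = 3600 / per_hour
    return per_hour, seconds_between


# Raw values taken from the SMART attribute table posted above.
per_hour, gap = load_cycle_rate(1357642, 27403)
print(f"{per_hour:.1f} load cycles per hour, one every {gap:.0f} s on average")
# prints "49.5 load cycles per hour, one every 73 s on average"
```

That works out to the heads parking roughly once every 73 seconds of powered-on time, averaged over the drive's life.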

 

NAS-Plus

Explorer
Joined
Apr 15, 2017
Messages
73
Here is the drive information:
Code:
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green (AF)
Device Model:     WDC WD20EARS-00MVWB0
Serial Number:    WD-WCAZA1349441
LU WWN Device Id: 5 0014ee 20503e92b
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Thu Jun  1 01:29:17 2017 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Status not supported: Incomplete response, ATA output registers missing
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Read up on wdidle3 and WD Green drives. You should run it on all of them.

 