Error when unlocking pool

Status
Not open for further replies.

warriorcookie

Explorer
Joined
Apr 17, 2017
Messages
67
So FreeNAS sent me an error saying the administrator had taken DA5 offline and the pool was degraded. I'm the only admin, and I definitely didn't do that....

So I looked at the disk listing and DA5 was missing. I removed it from its bay and re-inserted it, with no change. I moved it down to an empty bay and refreshed the list of disks, but the web interface stopped responding. So I rebooted; all disks are now showing, but when I try unlocking the pool I get the following error:

Code:
Environment:

Software Version: FreeNAS-11.0-U2 (e417d8aa5)
Request Method: POST
Request URL: http://freenas.bailey.com/storage/volume/2/unlock/?X-Progress-ID=e9a87134-7833-4168-886b-ae68e9363a3c

Traceback:

File "/usr/local/lib/python3.6/site-packages/django/core/handlers/exception.py" in inner
  39. response = get_response(request)
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py" in _legacy_get_response
  249. response = self._get_response(request)
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py" in _get_response
  178. response = middleware_method(request, callback, callback_args, callback_kwargs)
File "./freenasUI/freeadmin/middleware.py" in process_view
  162. return login_required(view_func)(request, *view_args, **view_kwargs)
File "/usr/local/lib/python3.6/site-packages/django/contrib/auth/decorators.py" in _wrapped_view
  23. return view_func(request, *args, **kwargs)
File "./freenasUI/storage/views.py" in volume_unlock
  1190. form.done(volume=volume)
File "./freenasUI/storage/forms.py" in done
  2616. raise MiddlewareError(msg)

Exception Type: MiddlewareError at /storage/volume/2/unlock/
Exception Value: [MiddlewareError: b'Volume could not be imported']


So I shut down, put the DA5 drive back in its original bay, started up, and tried to unlock. Now the error has changed:
Code:
Environment:

Software Version: FreeNAS-11.0-U2 (e417d8aa5)
Request Method: POST
Request URL: http://freenas.bailey.com/storage/volume/2/unlock/?X-Progress-ID=f0e7ee20-d55f-4286-8191-f9ae2cf25849

Traceback:

File "/usr/local/lib/python3.6/site-packages/django/core/handlers/exception.py" in inner
  39. response = get_response(request)
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py" in _legacy_get_response
  249. response = self._get_response(request)
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py" in _get_response
  178. response = middleware_method(request, callback, callback_args, callback_kwargs)
File "./freenasUI/freeadmin/middleware.py" in process_view
  162. return login_required(view_func)(request, *view_args, **view_kwargs)
File "/usr/local/lib/python3.6/site-packages/django/contrib/auth/decorators.py" in _wrapped_view
  23. return view_func(request, *args, **kwargs)
File "./freenasUI/storage/views.py" in volume_unlock
  1190. form.done(volume=volume)
File "./freenasUI/storage/forms.py" in done
  2616. raise MiddlewareError(msg)

Exception Type: MiddlewareError at /storage/volume/2/unlock/
Exception Value: [MiddlewareError: b'Volume could not be imported: 2 devices failed to decrypt']


I've tried unlocking with the key file and it gives the same error.
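
The two tracebacks above are identical apart from the last line, so the only part worth comparing is the exception value. A minimal Python sketch for pulling that out of the pasted text is below. The function name `summarize_traceback` and the abridged `sample` string are made up for illustration; this is not part of FreeNAS, just plain regex matching on the text as posted.

```python
import re

def summarize_traceback(text: str) -> dict:
    """Extract the key fields from the pasted FreeNAS/Django error text."""
    version = re.search(r"Software Version:\s*(\S+)", text)
    exc_type = re.search(r"Exception Type:\s*(\w+)", text)
    # The message is wrapped as [MiddlewareError: b'...'] in the paste.
    exc_value = re.search(r"Exception Value:\s*\[\w+:\s*b'(.*)'\]", text)
    failed = re.search(r"(\d+) devices? failed to decrypt", text)
    return {
        "version": version.group(1) if version else None,
        "exception_type": exc_type.group(1) if exc_type else None,
        "message": exc_value.group(1) if exc_value else None,
        "devices_failed_to_decrypt": int(failed.group(1)) if failed else None,
    }

# Abridged copy of the second error from this thread (middle frames omitted).
sample = (
    "Environment: Software Version: FreeNAS-11.0-U2 (e417d8aa5) "
    "Request Method: POST "
    "Exception Type: MiddlewareError at /storage/volume/2/unlock/ "
    "Exception Value: [MiddlewareError: b'Volume could not be imported: "
    "2 devices failed to decrypt']"
)

print(summarize_traceback(sample))
```

Run against the first error, the same function would leave `devices_failed_to_decrypt` as `None`, which is the difference between the two attempts.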


I'm not sure what happened, or what I did wrong.

What would you suggest I do next? Should I try detaching and then importing?
 

warriorcookie

Explorer
Joined
Apr 17, 2017
Messages
67
Down the rabbit hole.

I detached and then imported the pool successfully, but.....

Now I get this error: The volume Data1 state is ONLINE: One or more devices has experienced an error resulting in data corruption. Applications may be affected.

zpool status -v:
Code:
  pool: Data1
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: resilvered 19.2M in 0h0m with 0 errors on Tue Aug 15 17:59:57 2017
config:

        NAME                                                STATE     READ WRITE CKSUM
        Data1                                               ONLINE       0     0     0
          mirror-0                                          ONLINE       0     0     0
            gptid/9e2ab870-4fea-11e7-a521-000c29306b8a.eli  ONLINE       0     0     0
            gptid/f67f8879-47fe-11e7-b036-000c29306b8a.eli  ONLINE       0     0     0
          mirror-1                                          ONLINE       0     0     0
            gptid/31ad87a6-7f7f-11e7-894e-000c29306b8a.eli  ONLINE       0     0     1
            gptid/70c0b404-7f7b-11e7-894e-000c29306b8a.eli  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        Data1/Jails/plexmediaserver_1:<0x0>
        /mnt/Data1/Jails/plexmediaserver_1/var/db/plexdata/Plex Media Server/Logs/Plex DLNA Server.log
        /mnt/Data1/Jails/plexmediaserver_1/var/db/plexdata/Plex Media Server/Preferences.xml
        /mnt/Data1/Jails/plexmediaserver_1/var/db/plexdata/Plex Media Server
        Data1/.system/rrd-7c35bc62b22f460fb3766e1c156d5c44:<0x0>
        /var/db/system/rrd-7c35bc62b22f460fb3766e1c156d5c44/freenas.bailey.com/aggregation-cpu-sum/cpu-system.rrd

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Mon Jul 24 03:45:26 2017
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2       ONLINE       0     0     0

errors: No known data errors						  
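
The one nonzero counter in that output (CKSUM 1 on a member of mirror-1) is easy to miss. Here is a rough Python sketch that scans `zpool status` text for device rows with nonzero READ/WRITE/CKSUM counts. The function name `devices_with_errors` is hypothetical, and the parsing assumes the plain five-column layout shown above; counters printed in abbreviated form (for example with a K suffix) are skipped rather than interpreted.

```python
def devices_with_errors(status_text: str) -> list:
    """Return (pool, device, read, write, cksum) for rows with a nonzero counter."""
    flagged = []
    pool = None
    in_config = False
    for line in status_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "pool:":
            pool = parts[1]
            in_config = False
        elif parts[0] == "config:":
            in_config = True          # device table starts after this line
        elif parts[0] == "errors:":
            in_config = False         # file list follows, not device rows
        elif in_config and len(parts) == 5 and parts[0] != "NAME":
            name, _state, read, write, cksum = parts
            counts = (read, write, cksum)
            # Only plain integers are handled in this sketch.
            if all(c.isdigit() for c in counts) and any(int(c) for c in counts):
                flagged.append((pool, name, int(read), int(write), int(cksum)))
    return flagged

# Excerpt of the output posted in this thread.
sample = """\
  pool: Data1
 state: ONLINE
config:

        NAME                                                STATE     READ WRITE CKSUM
        Data1                                               ONLINE       0     0     0
          mirror-1                                          ONLINE       0     0     0
            gptid/31ad87a6-7f7f-11e7-894e-000c29306b8a.eli  ONLINE       0     0     1
            gptid/70c0b404-7f7b-11e7-894e-000c29306b8a.eli  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:
"""

for row in devices_with_errors(sample):
    print(row)
```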



And smartctl on each drive:
Drive 1 (WD RED 3TB):
Code:
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE									
  1 Raw_Read_Error_Rate	 0x002f   200   200   051	Pre-fail  Always	   -	   0											
  3 Spin_Up_Time			0x0027   185   181   021	Pre-fail  Always	   -	   5708										
  4 Start_Stop_Count		0x0032   100   100   000	Old_age   Always	   -	   13										  
  5 Reallocated_Sector_Ct   0x0033   200   200   140	Pre-fail  Always	   -	   0											
  7 Seek_Error_Rate		 0x002e   200   200   000	Old_age   Always	   -	   0											
  9 Power_On_Hours		  0x0032   098   098   000	Old_age   Always	   -	   1810										
10 Spin_Retry_Count		0x0032   100   253   000	Old_age   Always	   -	   0											
11 Calibration_Retry_Count 0x0032   100   253   000	Old_age   Always	   -	   0											
12 Power_Cycle_Count	   0x0032   100   100   000	Old_age   Always	   -	   13										  
192 Power-Off_Retract_Count 0x0032   200   200   000	Old_age   Always	   -	   11										  
193 Load_Cycle_Count		0x0032   200   200   000	Old_age   Always	   -	   177										
194 Temperature_Celsius	 0x0022   119   113   000	Old_age   Always	   -	   31										  
196 Reallocated_Event_Count 0x0032   200   200   000	Old_age   Always	   -	   0											
197 Current_Pending_Sector  0x0032   200   200   000	Old_age   Always	   -	   0											
198 Offline_Uncorrectable   0x0030   100   253   000	Old_age   Offline	  -	   0											
199 UDMA_CRC_Error_Count	0x0032   200   200   000	Old_age   Always	   -	   0											
200 Multi_Zone_Error_Rate   0x0008   200   200   000	Old_age   Offline	  -	   0											
																																	
SMART Error Log Version: 1																										
No Errors Logged																													
																																	
SMART Self-test log structure revision number 1																					
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error									
# 1  Short offline	   Completed without error	   00%	   242		 -													
# 2  Short offline	   Completed without error	   00%	   194		 -													
# 3  Extended offline	Completed without error	   00%	   178		 -													
# 4  Short offline	   Completed without error	   00%	   146		 -													
# 5  Extended offline	Completed without error	   00%		43		 -													
# 6  Conveyance offline  Completed without error	   00%		 0		 -													
# 7  Short offline	   Completed without error	   00%		 0		 -		  



Drive 2 (WD RED 3TB):
Code:
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE									
  1 Raw_Read_Error_Rate	 0x002f   200   200   051	Pre-fail  Always	   -	   0											
  3 Spin_Up_Time			0x0027   183   178   021	Pre-fail  Always	   -	   5833										
  4 Start_Stop_Count		0x0032   100   100   000	Old_age   Always	   -	   13										  
  5 Reallocated_Sector_Ct   0x0033   200   200   140	Pre-fail  Always	   -	   0											
  7 Seek_Error_Rate		 0x002e   200   200   000	Old_age   Always	   -	   0											
  9 Power_On_Hours		  0x0032   098   098   000	Old_age   Always	   -	   1810										
10 Spin_Retry_Count		0x0032   100   253   000	Old_age   Always	   -	   0											
11 Calibration_Retry_Count 0x0032   100   253   000	Old_age   Always	   -	   0											
12 Power_Cycle_Count	   0x0032   100   100   000	Old_age   Always	   -	   13										  
192 Power-Off_Retract_Count 0x0032   200   200   000	Old_age   Always	   -	   11										  
193 Load_Cycle_Count		0x0032   200   200   000	Old_age   Always	   -	   186										
194 Temperature_Celsius	 0x0022   121   117   000	Old_age   Always	   -	   29										  
196 Reallocated_Event_Count 0x0032   200   200   000	Old_age   Always	   -	   0											
197 Current_Pending_Sector  0x0032   200   200   000	Old_age   Always	   -	   0											
198 Offline_Uncorrectable   0x0030   100   253   000	Old_age   Offline	  -	   0											
199 UDMA_CRC_Error_Count	0x0032   200   200   000	Old_age   Always	   -	   0											
200 Multi_Zone_Error_Rate   0x0008   200   200   000	Old_age   Offline	  -	   0											
																																	
SMART Error Log Version: 1																										
No Errors Logged																													
																																	
SMART Self-test log structure revision number 1																					
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error									
# 1  Short offline	   Completed without error	   00%	   242		 -													
# 2  Short offline	   Completed without error	   00%	   194		 -													
# 3  Extended offline	Completed without error	   00%	   179		 -													
# 4  Short offline	   Completed without error	   00%	   146		 -													
# 5  Extended offline	Aborted by host			   10%		45		 -													
# 6  Conveyance offline  Completed without error	   00%		 0		 -													
# 7  Short offline	   Completed without error	   00%		 0		 -													



Drive 3 (WD RED 1TB):
Code:
Vendor Specific SMART Attributes with Thresholds:																				  
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE									
  1 Raw_Read_Error_Rate	 0x002f   200   200   051	Pre-fail  Always	   -	   0											
  3 Spin_Up_Time			0x0027   140   136   021	Pre-fail  Always	   -	   4000										
  4 Start_Stop_Count		0x0032   100   100   000	Old_age   Always	   -	   74										  
  5 Reallocated_Sector_Ct   0x0033   200   200   140	Pre-fail  Always	   -	   0											
  7 Seek_Error_Rate		 0x002e   200   200   000	Old_age   Always	   -	   0											
  9 Power_On_Hours		  0x0032   064   064   000	Old_age   Always	   -	   26568										
10 Spin_Retry_Count		0x0032   100   253   000	Old_age   Always	   -	   0											
11 Calibration_Retry_Count 0x0032   100   253   000	Old_age   Always	   -	   0											
12 Power_Cycle_Count	   0x0032   100   100   000	Old_age   Always	   -	   74										  
192 Power-Off_Retract_Count 0x0032   200   200   000	Old_age   Always	   -	   73										  
193 Load_Cycle_Count		0x0032   200   200   000	Old_age   Always	   -	   0											
194 Temperature_Celsius	 0x0022   115   109   000	Old_age   Always	   -	   28										  
196 Reallocated_Event_Count 0x0032   200   200   000	Old_age   Always	   -	   0											
197 Current_Pending_Sector  0x0032   200   200   000	Old_age   Always	   -	   0											
198 Offline_Uncorrectable   0x0030   100   253   000	Old_age   Offline	  -	   0											
199 UDMA_CRC_Error_Count	0x0032   200   200   000	Old_age   Always	   -	   0											
200 Multi_Zone_Error_Rate   0x0008   100   253   000	Old_age   Offline	  -	   0											
																																	
SMART Error Log Version: 1																										
No Errors Logged																													
																																	
SMART Self-test log structure revision number 1																					
No self-tests have been logged.  [To run self-tests, use: smartctl -t]



Drive 4 (WD RED 1TB):
Code:
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE									
  1 Raw_Read_Error_Rate	 0x002f   200   200   051	Pre-fail  Always	   -	   0											
  3 Spin_Up_Time			0x0027   178   136   021	Pre-fail  Always	   -	   2091										
  4 Start_Stop_Count		0x0032   100   100   000	Old_age   Always	   -	   74										  
  5 Reallocated_Sector_Ct   0x0033   200   200   140	Pre-fail  Always	   -	   0											
  7 Seek_Error_Rate		 0x002e   200   200   000	Old_age   Always	   -	   0											
  9 Power_On_Hours		  0x0032   064   064   000	Old_age   Always	   -	   26569										
10 Spin_Retry_Count		0x0032   100   253   000	Old_age   Always	   -	   0											
11 Calibration_Retry_Count 0x0032   100   253   000	Old_age   Always	   -	   0											
12 Power_Cycle_Count	   0x0032   100   100   000	Old_age   Always	   -	   74										  
192 Power-Off_Retract_Count 0x0032   200   200   000	Old_age   Always	   -	   73										  
193 Load_Cycle_Count		0x0032   200   200   000	Old_age   Always	   -	   0											
194 Temperature_Celsius	 0x0022   115   109   000	Old_age   Always	   -	   28										  
196 Reallocated_Event_Count 0x0032   200   200   000	Old_age   Always	   -	   0											
197 Current_Pending_Sector  0x0032   200   200   000	Old_age   Always	   -	   0											
198 Offline_Uncorrectable   0x0030   100   253   000	Old_age   Offline	  -	   0											
199 UDMA_CRC_Error_Count	0x0032   200   200   000	Old_age   Always	   -	   0											
200 Multi_Zone_Error_Rate   0x0008   100   253   000	Old_age   Offline	  -	   0											
																																	
SMART Error Log Version: 1																										
No Errors Logged																													
																																	
SMART Self-test log structure revision number 1																					
No self-tests have been logged.  [To run self-tests, use: smartctl -t] 
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,367
You should configure SMART testing... you haven't tested your drives in about 1600 hours.

Try running a scrub over the pool. Sometimes those corruption errors do go away after a scrub.

Other times they do not. Looks to me like nothing super critical got corrupted though. Just some system logs, your plex log, and your plex prefs.
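
For reference, the figure of roughly 1600 hours can be checked from the smartctl output posted earlier: Power_On_Hours is 1810 and the newest self-test on the 3TB drives was logged at 242 lifetime hours, so the gap is 1810 - 242 = 1568 hours. A small Python sketch of that arithmetic follows. The function name `hours_since_last_selftest` is made up, and it assumes the raw Power_On_Hours value is a plain integer, as it is in this thread.

```python
def hours_since_last_selftest(smart_text: str):
    """Power_On_Hours minus the newest self-test's LifeTime(hours), or None."""
    power_on_hours = None
    test_hours = []
    for line in smart_text.splitlines():
        parts = line.split()
        if len(parts) >= 10 and parts[1] == "Power_On_Hours":
            power_on_hours = int(parts[9])       # RAW_VALUE column
        elif parts and parts[0].startswith("#") and len(parts) >= 3:
            # Self-test rows end with: Remaining, LifeTime(hours), LBA_of_first_error
            test_hours.append(int(parts[-2]))
    if power_on_hours is None or not test_hours:
        return None                              # e.g. "No self-tests have been logged"
    return power_on_hours - max(test_hours)

# Lines copied from the first WD RED 3TB drive in this thread.
sample = """\
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1810
# 1  Short offline       Completed without error       00%       242         -
# 2  Short offline       Completed without error       00%       194         -
# 3  Extended offline    Completed without error       00%       178         -
"""

print(hours_since_last_selftest(sample))
```

For the two 1TB drives the function returns `None`, since their logs show no self-tests at all.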
 

warriorcookie

Explorer
Joined
Apr 17, 2017
Messages
67
Yeah, I have SMART testing configured, but the drives haven't been installed long enough for it to trigger. These were pulled from my old PowerEdge 2900, where they were running on a hardware RAID PERC 5/i. I couldn't do SMART testing on that card. You can see Drive 1 and Drive 2 have both had several SMART tests run now.

I ran smartctl -t long on the drive with the checksum error, and it tested good. So I cleared the errors, and all has been fine since. I redid the passphrase, key, and recovery key for the pool just for good measure, and it is mounting just fine again.
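
For anyone landing here later, the steps discussed in this thread can be written down as a command list. The Python sketch below only builds and prints the commands; it does not run anything. The helper name `recovery_commands` is hypothetical, and `/dev/da5` is a placeholder, since the thread never says which device node backs the gptid that showed the checksum error. The scrub is the step suggested earlier in the thread rather than something reported as done.

```python
import shlex

def recovery_commands(pool: str, device: str) -> list:
    """Build (but do not run) the commands discussed in this thread."""
    return [
        ["smartctl", "-t", "long", device],  # start an extended self-test (takes hours)
        ["smartctl", "-a", device],          # read the results once the test has finished
        ["zpool", "clear", pool],            # reset the pool's error counters
        ["zpool", "scrub", pool],            # re-read and verify every block
        ["zpool", "status", "-v", pool],     # check counters and the file error list
    ]

# Pool name from this thread; the device path is an assumption.
for argv in recovery_commands("Data1", "/dev/da5"):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The commands are not meant to be run back to back: the extended self-test and the scrub each need to finish before their results mean anything.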
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
You don't seem to have very many drives. Can they not be attached directly to mobo SATA ports?
 

warriorcookie

Explorer
Joined
Apr 17, 2017
Messages
67
I'm running a hypervisor (ESXi 6.5).
Enabling passthrough for the SATA ports is an all-or-nothing deal for this mobo. That would leave me with nothing for the rest of my VMs.

That, and I've got 2 more drives I'll be bringing online right away.
 