Hard Drive Troubleshooting - Massive Failures - Need Help Isolating the Problem(s)

arameen

Contributor
Joined
Sep 4, 2014
Messages
145
sas2flash -listall will show you everything about your hba.

Also, I think all of your drives are dead except for da6. You need to schedule SMART tests to run automatically, and you don't have to understand anything about the SMART output. You will be emailed if something fails or is bad; then you can post the error message if it doesn't make sense.

FreeNAS is very hands-off. After setting up emails, UPS, scrubs, SMART tests and snapshots, I have not touched mine in years. I'll do an upgrade every now and then, and I get a SMART email about drive temps in the summer, but those are about the only things I have done.

You also need to read every link in my signature before using FreeNAS again.


That is a lot of guides; I will surely read them :) (some I have actually read before).

I did read a guide earlier today, and some other sites, about SMART values.

I checked the following values for all the drives in the troubled pool (the SMART data is on the first page of this thread). As I understand it, the other SMART values are of no interest:
ID 5 Reallocated_Sector_Ct seems OK
ID 197 Current_Pending_Sector seems OK
ID 198 Offline_Uncorrectable seems OK

ID 199 UDMA_CRC_Error_Count is above zero for several disks and really high for one drive: da7, the one that is dying.
According to that guide, those drives should be OK, but you are telling me they are all bad.
Is there a specific value you are looking at when you write that they are all bad?

Because after seeing those OK values above, I started suspecting something else:
the PSU or the M1015.
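
The check described in this post can be sketched in a few lines of Python. This is only an illustration, not a FreeNAS tool: the names `WATCHED_IDS` and `watched_nonzero` are made up here, and the function assumes it is given attribute rows in the layout that `smartctl -A` prints. The sample rows are copied from one of the drive reports in this thread. A high UDMA_CRC_Error_Count usually points at the link (cable, backplane, controller) rather than the platters, which would fit the PSU/M1015 suspicion, but that is a rule of thumb and not a diagnosis.

```python
import re

# Attribute IDs discussed in the post above (hypothetical watch list).
WATCHED_IDS = {5, 197, 198, 199}

# One row of the smartctl attribute table:
# ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+\d+\s+\d+\s+\d+"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)"
)

def watched_nonzero(smart_text):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    found = {}
    for line in smart_text.splitlines():
        m = ROW.match(line)
        if m and int(m.group(1)) in WATCHED_IDS and int(m.group(3)) > 0:
            found[m.group(2)] = int(m.group(3))
    return found

SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   190   000    Old_age   Always       -       63498
"""

print(watched_nonzero(SAMPLE))
```

For the sample rows only the CRC counter is reported, since the other three raw values are zero.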
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Stop trying to figure it out and just run a SMART extended test. It's super simple: either it passes or it doesn't.
 

arameen

Contributor
Joined
Sep 4, 2014
Messages
145
OK, you mean that if it passes there is no need to read those values? Is that a general rule for the future too?
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Yes, completely ignore those values. Just run a SMART test; that is what they are for. Make your life easy by letting the tool tell you if something is broken. Don't interpret the results yourself.
 

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
dmesg might help. But perhaps someone who actually uses this HBA could tell us the current driver and firmware and how to tell which firmware you've got?
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
I gave the command already, and I think the version is 20.00.07.00; 20.00.04.00 might also be OK.
 

arameen

Contributor
Joined
Sep 4, 2014
Messages
145
Now using FreeNAS-11.0-U4 (54848d13b)

Code:
[root@freenas ~]# sas2flash -listall																								
LSI Corporation SAS2 Flash Utility																								
Version 16.00.00.00 (2013.03.01)																									
Copyright (c) 2008-2013 LSI Corporation. All rights reserved																		
																																	
		Adapter Selected is a LSI SAS: SAS2308_1(D1)																				
																																	
Num   Ctlr			FW Ver		NVDATA		x86-BIOS		 PCI Addr														
----------------------------------------------------------------------------														
																																	
0  SAS2308_1(D1)   20.00.07.00	14.01.30.16	07.39.02.00	 00:02:00:00														
1  SAS2008(B2)	 20.00.07.00	14.01.00.08	  No Image	  00:07:00:00														
																																	
		Finished Processing Commands Successfully.																				
		Exiting SAS2Flash.


SMART long test results coming later :)
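
For anyone who wants to pull the firmware versions out of that listing with a script, here is a rough Python sketch. It assumes the table layout shown in the `sas2flash -listall` output above (adapter index, controller name, then a four-part firmware version); the function name `firmware_versions` is invented for this example.

```python
def firmware_versions(listall_text):
    """Return [(controller, fw_version)] from `sas2flash -listall` style output."""
    rows = []
    for line in listall_text.splitlines():
        parts = line.split()
        # Data rows start with the adapter index, then the controller name,
        # then a dotted four-part firmware version such as 20.00.07.00.
        if len(parts) >= 3 and parts[0].isdigit() and parts[2].count(".") == 3:
            rows.append((parts[1], parts[2]))
    return rows

SAMPLE = """\
Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
----------------------------------------------------------------------------

0  SAS2308_1(D1)   20.00.07.00    14.01.30.16    07.39.02.00     00:02:00:00
1  SAS2008(B2)     20.00.07.00    14.01.00.08      No Image      00:07:00:00
"""

for ctlr, fw in firmware_versions(SAMPLE):
    print(ctlr, fw)
```

Both adapters in the sample report 20.00.07.00, the version mentioned earlier in the thread.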
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
You need to have regular snapshots set up in case crap hits the fan.
It's something to consider, but not nearly as hard a requirement as the others.
You ABSOLUTELY must set up the email in FreeNAS so you can be informed of the various things that FreeNAS is doing.
...and also set it up for the SMART service. They're separate configurations.
 

arameen

Contributor
Joined
Sep 4, 2014
Messages
145
OK guys, I am back. It was not a funeral for any lost data; I had issues with my PC instead, and in the meantime I finished backing up the failing pool.
So all data is safe now.

Now some updates.
The pool did resilver and finish resilvering, then restarted several times, as someone mentioned earlier in the thread.
Here is the current status of the pool:
Code:
[root@freenas ~]# zpool status -v Secondary_Raidz3
  pool: Secondary_Raidz3												  														  
state: DEGRADED														  														  
status: One or more devices could not be opened.  Sufficient replicas exist for													 
  	   the pool to continue functioning in a degraded state.			  														  
action: Attach the missing device and online it using 'zpool online'.		    													
   see: http://illumos.org/msg/ZFS-8000-2Q							    														  
  scan: resilvered 42.7G in 51h55m with 0 errors on Sun Oct  8 22:12:03 2017														
config:  																		 													
  																		 														  
  	   NAME											STATE	 READ WRITE CKSUM												 
  	   Secondary_Raidz3								DEGRADED	 0     0	 0												 
  		 raidz3-0									  DEGRADED	 0     0	 0												 
  		   17620392916775898278						UNAVAIL	  0     0	 0  was /dev/gptid/8275e396-a83c-11e7-9cee-002590f5b804
  		   gptid/3a44142c-931c-11e7-b895-002590f5b804  ONLINE	   0     0	 0												 
  		   gptid/33c047e7-2292-11e7-9626-002590f5b804  ONLINE	   0     0	 0												 
  		   gptid/34749735-2292-11e7-9626-002590f5b804  ONLINE	   0     0	 0												 
  		   6370505857967419013						 OFFLINE	  0     0	 0  was /dev/gptid/3536bf51-2292-11e7-9626-002590f5b804
  		   gptid/35e2d6ec-2292-11e7-9626-002590f5b804  ONLINE	   0     0	 0												 
  		   gptid/368b679d-2292-11e7-9626-002590f5b804  ONLINE	   0     0	 0												 
  		   gptid/3730ee56-2292-11e7-9626-002590f5b804  ONLINE	   0     0	 0												 
  		   gptid/37de7e53-2292-11e7-9626-002590f5b804  ONLINE	   0     0	 0												 
  		   replacing-9								 UNAVAIL	  0    61	 0												 
  			 5660221525628801207					   UNAVAIL	  0     0	 0  was /dev/da8p2								 
  			 da0p2									 FAULTED	  0 6.77K	 0  too many errors								 
  		   gptid/39778368-2292-11e7-9626-002590f5b804  ONLINE	   0     0	 0												 
  																		 														  
errors: No known data errors																					 
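
To make the status above easier to scan, the non-ONLINE rows can be filtered out with a small script. This is a sketch only: `unhealthy_devices` and `STATES` are names invented for this example, and the parser assumes the `config:` ... `errors:` layout that `zpool status` prints above, with each device row on a single line.

```python
# Device states that can appear in the STATE column (assumed list).
STATES = {"ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "UNAVAIL", "REMOVED"}

def unhealthy_devices(status_text):
    """Return [(name, state)] for config rows whose state is not ONLINE."""
    rows = []
    in_config = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("config:"):
            in_config = True
            continue
        if stripped.startswith("errors:"):
            break
        parts = stripped.split()
        if in_config and len(parts) >= 2 and parts[1] in STATES and parts[1] != "ONLINE":
            rows.append((parts[0], parts[1]))
    return rows

SAMPLE = """\
config:

        NAME                        STATE     READ WRITE CKSUM
        Secondary_Raidz3            DEGRADED     0     0     0
          raidz3-0                  DEGRADED     0     0     0
            17620392916775898278    UNAVAIL      0     0     0  was /dev/gptid/8275e396-a83c-11e7-9cee-002590f5b804
            gptid/3a44142c-931c-11e7-b895-002590f5b804  ONLINE  0  0  0
            6370505857967419013     OFFLINE      0     0     0  was /dev/gptid/3536bf51-2292-11e7-9626-002590f5b804
            replacing-9             UNAVAIL      0    61     0
              5660221525628801207   UNAVAIL      0     0     0  was /dev/da8p2
              da0p2                 FAULTED      0 6.77K     0  too many errors

errors: No known data errors
"""

for name, state in unhealthy_devices(SAMPLE):
    print(f"{state:9} {name}")
```

The pool and vdev rows are reported along with the individual disks, because they also carry a non-ONLINE state.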


I ran long tests on all drives, as someone requested. Here are the results for the drives:

Code:
=== START OF INFORMATION SECTION ===																								
Model Family:	 Seagate NAS HDD																								  
Device Model:	 ST4000VN000-1H4168																								
Serial Number:	Z301NHXV																										
LU WWN Device Id: 5 000c50 066dce71a																								
Firmware Version: SC44																											
User Capacity:	4,000,787,030,016 bytes [4.00 TB]																				
Sector Sizes:	 512 bytes logical, 4096 bytes physical																			
Rotation Rate:	5900 rpm																										
Form Factor:	  3.5 inches																										
Device is:		In smartctl database [for details use: -P show]																  
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b																			  
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)																			
Local Time is:	Wed Oct 11 23:05:43 2017 CEST																					
SMART support is: Available - device has SMART capability.																		
SMART support is: Enabled																										  
																																	
=== START OF READ SMART DATA SECTION ===																							
SMART overall-health self-assessment test result: PASSED																			
																																	
General SMART Values:																											  
Offline data collection status:  (0x82) Offline data collection activity															
										was completed without error.																
										Auto Offline Data Collection: Enabled.													
Self-test execution status:	  (  39) The self-test routine was interrupted													  
										by the host with a hard or soft reset.													
Total time to complete Offline																									
data collection:				(  107) seconds.																					
Offline data collection																											
capabilities:					(0x7b) SMART execute Offline immediate.															
										Auto Offline data collection on/off support.												
										Suspend Offline collection upon new														
										command.																					
										Offline surface scan supported.															
										Self-test supported.																		
										Conveyance Self-test supported.															
										Selective Self-test supported.															
SMART capabilities:			(0x0003) Saves SMART data before entering															
										power-saving mode.																		
										Supports SMART auto save timer.															
Error logging capability:		(0x01) Error logging supported.																	
										General Purpose Logging supported.														
Short self-test routine																											
recommended polling time:		(   1) minutes.																					
Extended self-test routine																										
recommended polling time:		( 508) minutes.								
Conveyance self-test routine																										
recommended polling time:		(   2) minutes.																					
SCT capabilities:			  (0x10bd) SCT Status supported.																	   
										SCT Error Recovery Control supported.													   
										SCT Feature Control supported.															 
										SCT Data Table supported.																   
																																	
SMART Attributes Data Structure revision number: 10																				 
Vendor Specific SMART Attributes with Thresholds:																				   
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE									
  1 Raw_Read_Error_Rate	 0x000f   117   099   006	Pre-fail  Always	   -	   127788264									
  3 Spin_Up_Time			0x0003   093   091   000	Pre-fail  Always	   -	   0											
  4 Start_Stop_Count		0x0032   100   100   020	Old_age   Always	   -	   262										 
  5 Reallocated_Sector_Ct   0x0033   100   100   010	Pre-fail  Always	   -	   0											
  7 Seek_Error_Rate		 0x000f   087   060   030	Pre-fail  Always	   -	   642979137									
  9 Power_On_Hours		  0x0032   074   074   000	Old_age   Always	   -	   23251										
10 Spin_Retry_Count		0x0013   100   100   097	Pre-fail  Always	   -	   0											
12 Power_Cycle_Count	   0x0032   100   100   020	Old_age   Always	   -	   242										 
184 End-to-End_Error		0x0032   100   100   099	Old_age   Always	   -	   0											
187 Reported_Uncorrect	  0x0032   100   100   000	Old_age   Always	   -	   0											
188 Command_Timeout		 0x0032   100   100   000	Old_age   Always	   -	   5											
189 High_Fly_Writes		 0x003a   100   100   000	Old_age   Always	   -	   0											
190 Airflow_Temperature_Cel 0x0022   071   055   045	Old_age   Always	   -	   29 (Min/Max 25/32)						   
191 G-Sense_Error_Rate	  0x0032   100   100   000	Old_age   Always	   -	   0											
192 Power-Off_Retract_Count 0x0032   100   100   000	Old_age   Always	   -	   76										   
193 Load_Cycle_Count		0x0032   100   100   000	Old_age   Always	   -	   257										 
194 Temperature_Celsius	 0x0022   029   045   000	Old_age   Always	   -	   29 (0 18 0 0 0)							 
197 Current_Pending_Sector  0x0012   100   100   000	Old_age   Always	   -	   0											
198 Offline_Uncorrectable   0x0010   100   100   000	Old_age   Offline	  -	   0											
199 UDMA_CRC_Error_Count	0x003e   200   200   000	Old_age   Always	   -	   4											
																																	
SMART Error Log Version: 1																										 
No Errors Logged																													
																																	
SMART Self-test log structure revision number 1																					 
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error									 
# 1  Extended offline	Interrupted (host reset)	  70%	 23240		 -													 
# 2  Extended offline	Completed without error	   00%	  9424		 -													 
# 3  Short offline	   Completed without error	   00%	  9410		 -													 
																																	
SMART Selective self-test log data structure revision number 1																	 
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS																						
	1		0		0  Not_testing																								
	2		0		0  Not_testing																								
	3		0		0  Not_testing																								
	4		0		0  Not_testing																								
	5		0		0  Not_testing																								
Selective self-test flags (0x0):																									
  After scanning selected spans, do NOT read-scan remainder of disk.																
If Selective self-test is pending on power-up, resume after 0 minute delay.			

Code:
=== START OF INFORMATION SECTION ===																								
Model Family:	 Seagate NAS HDD																								   
Device Model:	 ST4000VN000-1H4168																								
Serial Number:	S300VSWF																										 
LU WWN Device Id: 5 000c50 0753d5db1																								
Firmware Version: SC44																											 
User Capacity:	4,000,787,030,016 bytes [4.00 TB]																				 
Sector Sizes:	 512 bytes logical, 4096 bytes physical																			
Rotation Rate:	5900 rpm																										 
Form Factor:	  3.5 inches																										
Device is:		In smartctl database [for details use: -P show]																   
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b																			   
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)																			
Local Time is:	Wed Oct 11 23:12:58 2017 CEST																					 
SMART support is: Available - device has SMART capability.																		 
SMART support is: Enabled																										   
																																	
=== START OF READ SMART DATA SECTION ===																							
SMART overall-health self-assessment test result: PASSED																			
																																	
General SMART Values:																											   
Offline data collection status:  (0x82) Offline data collection activity															
										was completed without error.																
										Auto Offline Data Collection: Enabled.													 
Self-test execution status:	  (  39) The self-test routine was interrupted													   
										by the host with a hard or soft reset.													 
Total time to complete Offline																									 
data collection:				(  128) seconds.																					
Offline data collection																											 
capabilities:					(0x7b) SMART execute Offline immediate.															
										Auto Offline data collection on/off support.												
										Suspend Offline collection upon new														 
										command.																					
										Offline surface scan supported.															 
										Self-test supported.																		
										Conveyance Self-test supported.															 
										Selective Self-test supported.															 
SMART capabilities:			(0x0003) Saves SMART data before entering															
										power-saving mode.																		 
										Supports SMART auto save timer.															 
Error logging capability:		(0x01) Error logging supported.																	
										General Purpose Logging supported.														 
Short self-test routine																											 
recommended polling time:		(   1) minutes.																					
Extended self-test routine																										 
recommended polling time:		( 532) minutes.	 
Conveyance self-test routine																										
recommended polling time:		(   2) minutes.																					
SCT capabilities:			  (0x10bd) SCT Status supported.																	   
										SCT Error Recovery Control supported.													   
										SCT Feature Control supported.															 
										SCT Data Table supported.																   
																																	
SMART Attributes Data Structure revision number: 10																				 
Vendor Specific SMART Attributes with Thresholds:																				   
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE									
  1 Raw_Read_Error_Rate	 0x000f   118   099   006	Pre-fail  Always	   -	   172554120									
  3 Spin_Up_Time			0x0003   093   092   000	Pre-fail  Always	   -	   0											
  4 Start_Stop_Count		0x0032   100   100   020	Old_age   Always	   -	   258										 
  5 Reallocated_Sector_Ct   0x0033   100   100   010	Pre-fail  Always	   -	   0											
  7 Seek_Error_Rate		 0x000f   087   060   030	Pre-fail  Always	   -	   641236446									
  9 Power_On_Hours		  0x0032   074   074   000	Old_age   Always	   -	   23251										
10 Spin_Retry_Count		0x0013   100   100   097	Pre-fail  Always	   -	   0											
12 Power_Cycle_Count	   0x0032   100   100   020	Old_age   Always	   -	   240										 
184 End-to-End_Error		0x0032   100   100   099	Old_age   Always	   -	   0											
187 Reported_Uncorrect	  0x0032   100   100   000	Old_age   Always	   -	   0											
188 Command_Timeout		 0x0032   100   099   000	Old_age   Always	   -	   5											
189 High_Fly_Writes		 0x003a   100   100   000	Old_age   Always	   -	   0											
190 Airflow_Temperature_Cel 0x0022   071   053   045	Old_age   Always	   -	   29 (Min/Max 25/32)						   
191 G-Sense_Error_Rate	  0x0032   100   100   000	Old_age   Always	   -	   0											
192 Power-Off_Retract_Count 0x0032   100   100   000	Old_age   Always	   -	   78										   
193 Load_Cycle_Count		0x0032   100   100   000	Old_age   Always	   -	   258										 
194 Temperature_Celsius	 0x0022   029   047   000	Old_age   Always	   -	   29 (0 18 0 0 0)							 
197 Current_Pending_Sector  0x0012   100   100   000	Old_age   Always	   -	   0											
198 Offline_Uncorrectable   0x0010   100   100   000	Old_age   Offline	  -	   0											
199 UDMA_CRC_Error_Count	0x003e   200   200   000	Old_age   Always	   -	   1											
																																	
SMART Error Log Version: 1																										 
No Errors Logged																													
																																	
SMART Self-test log structure revision number 1																					 
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error									 
# 1  Extended offline	Interrupted (host reset)	  00%	 23239		 -													 
# 2  Extended offline	Completed without error	   00%	  9433		 -													 
																																	
SMART Selective self-test log data structure revision number 1																	 
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS																						
	1		0		0  Not_testing																								
	2		0		0  Not_testing																								
	3		0		0  Not_testing																								
	4		0		0  Not_testing																								
	5		0		0  Not_testing																								
Selective self-test flags (0x0):																									
  After scanning selected spans, do NOT read-scan remainder of disk.																
If Selective self-test is pending on power-up, resume after 0 minute delay.

Code:
=== START OF INFORMATION SECTION ===																								
Model Family:	 Seagate NAS HDD																								   
Device Model:	 ST4000VN000-1H4168																								
Serial Number:	S3019679																										 
LU WWN Device Id: 5 000c50 0802a8a02																								
Firmware Version: SC46																											 
User Capacity:	4,000,787,030,016 bytes [4.00 TB]																				 
Sector Sizes:	 512 bytes logical, 4096 bytes physical																			
Rotation Rate:	5900 rpm																										 
Form Factor:	  3.5 inches																										
Device is:		In smartctl database [for details use: -P show]																   
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b																			   
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)																			
Local Time is:	Wed Oct 11 23:14:22 2017 CEST																					 
SMART support is: Available - device has SMART capability.																		 
SMART support is: Enabled																										   
																																	
=== START OF READ SMART DATA SECTION ===																							
SMART overall-health self-assessment test result: PASSED																			
																																	
General SMART Values:																											   
Offline data collection status:  (0x82) Offline data collection activity															
										was completed without error.																
										Auto Offline Data Collection: Enabled.													 
Self-test execution status:	  (  39) The self-test routine was interrupted													   
										by the host with a hard or soft reset.													 
Total time to complete Offline																									 
data collection:				(  107) seconds.																					
Offline data collection																											 
capabilities:					(0x7b) SMART execute Offline immediate.															
										Auto Offline data collection on/off support.												
										Suspend Offline collection upon new														 
										command.																					
										Offline surface scan supported.															 
										Self-test supported.																		
										Conveyance Self-test supported.															 
										Selective Self-test supported.															 
SMART capabilities:			(0x0003) Saves SMART data before entering															
										power-saving mode.																		 
										Supports SMART auto save timer.															 
Error logging capability:		(0x01) Error logging supported.																	
										General Purpose Logging supported.														 
Short self-test routine																											 
recommended polling time:		(   1) minutes.																					
Extended self-test routine																										 
recommended polling time:		( 485) minutes.														 
Conveyance self-test routine																										
recommended polling time:		(   2) minutes.																					
SCT capabilities:			  (0x10bd) SCT Status supported.																	   
										SCT Error Recovery Control supported.													   
										SCT Feature Control supported.															 
										SCT Data Table supported.																   
																																	
SMART Attributes Data Structure revision number: 10																				 
Vendor Specific SMART Attributes with Thresholds:																				   
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE									
  1 Raw_Read_Error_Rate	 0x000f   108   099   006	Pre-fail  Always	   -	   19314632									 
  3 Spin_Up_Time			0x0003   092   091   000	Pre-fail  Always	   -	   0											
  4 Start_Stop_Count		0x0032   100   100   020	Old_age   Always	   -	   122										 
  5 Reallocated_Sector_Ct   0x0033   100   100   010	Pre-fail  Always	   -	   0											
  7 Seek_Error_Rate		 0x000f   085   060   030	Pre-fail  Always	   -	   391527898									
  9 Power_On_Hours		  0x0032   085   085   000	Old_age   Always	   -	   13734										
10 Spin_Retry_Count		0x0013   100   100   097	Pre-fail  Always	   -	   0											
12 Power_Cycle_Count	   0x0032   100   100   020	Old_age   Always	   -	   103										 
184 End-to-End_Error		0x0032   100   100   099	Old_age   Always	   -	   0											
187 Reported_Uncorrect	  0x0032   100   100   000	Old_age   Always	   -	   0											
188 Command_Timeout		 0x0032   100   001   000	Old_age   Always	   -	   472453766984								 
189 High_Fly_Writes		 0x003a   100   100   000	Old_age   Always	   -	   0											
190 Airflow_Temperature_Cel 0x0022   072   064   045	Old_age   Always	   -	   28 (Min/Max 24/31)						   
191 G-Sense_Error_Rate	  0x0032   100   100   000	Old_age   Always	   -	   0											
192 Power-Off_Retract_Count 0x0032   100   100   000	Old_age   Always	   -	   88										   
193 Load_Cycle_Count		0x0032   100   100   000	Old_age   Always	   -	   128										 
194 Temperature_Celsius	 0x0022   028   040   000	Old_age   Always	   -	   28 (0 17 0 0 0)							 
197 Current_Pending_Sector  0x0012   100   100   000	Old_age   Always	   -	   0											
198 Offline_Uncorrectable   0x0010   100   100   000	Old_age   Offline	  -	   0											
199 UDMA_CRC_Error_Count	0x003e   200   190   000	Old_age   Always	   -	   63498										
																																	
SMART Error Log Version: 1																										 
No Errors Logged																													
																																	
SMART Self-test log structure revision number 1																					 
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error									 
# 1  Extended offline	Interrupted (host reset)	  00%	 13723		 -													 
# 2  Extended offline	Completed without error	   00%	 10326		 -													 
# 3  Short offline	   Completed without error	   00%	 10308		 -													 
# 4  Extended offline	Completed without error	   00%	   236		 -													 
# 5  Short offline	   Completed without error	   00%	   229		 -													 
																																	
SMART Selective self-test log data structure revision number 1																	 
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS																						
	1		0		0  Not_testing																								
	2		0		0  Not_testing																								
	3		0		0  Not_testing																								
	4		0		0  Not_testing																								
	5		0		0  Not_testing																								
Selective self-test flags (0x0):																									
  After scanning selected spans, do NOT read-scan remainder of disk.																
If Selective self-test is pending on power-up, resume after 0 minute delay.
 

Inxsible

Guru
Joined
Aug 14, 2017
Messages
1,123
@arameen do you have a habit of interrupting SMART tests? All three drives show that the latest SMART test was interrupted.
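
Spotting this across many drives is easy to script. The Python sketch below assumes it is fed the self-test log section as printed by `smartctl -l selftest` (or the same section of a full report); `interrupted_tests` is a name made up for this example, and the sample rows are taken from the first drive posted above. Note that "Interrupted (host reset)" can also come from a reboot or a controller reset, so it does not by itself prove that anyone aborted the test by hand.

```python
import re

# A log row looks like:
# "# 1  Extended offline  Interrupted (host reset)  70%  23240  -"
INTERRUPTED_ROW = re.compile(r"^#\s*(\d+)\s+.*Interrupted.*?(\d+)%\s+(\d+)")

def interrupted_tests(smart_text):
    """Return [(test_number, percent_remaining, lifetime_hours)] for interrupted tests."""
    hits = []
    for line in smart_text.splitlines():
        m = INTERRUPTED_ROW.match(line)
        if m:
            hits.append((int(m.group(1)), int(m.group(2)), int(m.group(3))))
    return hits

SAMPLE = """\
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Interrupted (host reset)      70%     23240         -
# 2  Extended offline    Completed without error       00%      9424         -
# 3  Short offline       Completed without error       00%      9410         -
"""

print(interrupted_tests(SAMPLE))
```

Here only test #1 is flagged: it stopped with 70% remaining at 23240 power-on hours.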
 

arameen

Contributor
Joined
Sep 4, 2014
Messages
145
Code:
=== START OF INFORMATION SECTION ===																								
Model Family:	 Seagate NAS HDD																									
Device Model:	 ST4000VN000-1H4168																								
Serial Number:	S3019PGW																										 
LU WWN Device Id: 5 000c50 0804fa3a4																								
Firmware Version: SC46																											 
User Capacity:	4,000,787,030,016 bytes [4.00 TB]																				 
Sector Sizes:	 512 bytes logical, 4096 bytes physical																			
Rotation Rate:	5900 rpm																										 
Form Factor:	  3.5 inches																										
Device is:		In smartctl database [for details use: -P show]																	
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b																				
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)																			
Local Time is:	Wed Oct 11 23:15:33 2017 CEST																					 
SMART support is: Available - device has SMART capability.																		 
SMART support is: Enabled																											
																																	
=== START OF READ SMART DATA SECTION ===																							
SMART overall-health self-assessment test result: PASSED																			
																																	
General SMART Values:																												
Offline data collection status:  (0x82) Offline data collection activity															
										 was completed without error.																
										 Auto Offline Data Collection: Enabled.													
Self-test execution status:	  (   0) The previous self-test routine completed													
										 without error or no self-test has ever													
										 been run.																					
Total time to complete Offline																									 
data collection:				(  107) seconds.																					
Offline data collection																											 
capabilities:					 (0x7b) SMART execute Offline immediate.															
										 Auto Offline data collection on/off support.												
										 Suspend Offline collection upon new														
										 command.																					
										 Offline surface scan supported.															 
										 Self-test supported.																		
										 Conveyance Self-test supported.															 
										 Selective Self-test supported.															 
SMART capabilities:			(0x0003) Saves SMART data before entering															
										 power-saving mode.																		 
										 Supports SMART auto save timer.															 
Error logging capability:		(0x01) Error logging supported.																	
										 General Purpose Logging supported.														
Short self-test routine																											 
recommended polling time:		(   1) minutes.																					
Extended self-test routine																			
recommended polling time:		( 501) minutes.																					
Conveyance self-test routine																										
recommended polling time:		(   2) minutes.																					
SCT capabilities:			  (0x10bd) SCT Status supported.																		
										 SCT Error Recovery Control supported.													  
										 SCT Feature Control supported.															 
										 SCT Data Table supported.																	
																																	
SMART Attributes Data Structure revision number: 10																				 
Vendor Specific SMART Attributes with Thresholds:																					
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE									
  1 Raw_Read_Error_Rate	 0x000f   119   099   006	Pre-fail  Always		-	   206266648									
  3 Spin_Up_Time			0x0003   093   092   000	Pre-fail  Always		-	   0											
  4 Start_Stop_Count		0x0032   100   100   020	Old_age   Always		-	   174										
  5 Reallocated_Sector_Ct   0x0033   100   100   010	Pre-fail  Always		-	   0											
  7 Seek_Error_Rate		 0x000f   087   060   030	Pre-fail  Always		-	   499427988									
  9 Power_On_Hours		  0x0032   079   079   000	Old_age   Always		-	   18480										
10 Spin_Retry_Count		0x0013   100   100   097	Pre-fail  Always	   -	   0											
12 Power_Cycle_Count	   0x0032   100   100   020	Old_age   Always	   -	   165										
184 End-to-End_Error		0x0032   100   100   099	Old_age   Always		-	   0											
187 Reported_Uncorrect	  0x0032   100   100   000	Old_age   Always		-	   0											
188 Command_Timeout		 0x0032   100   099   000	Old_age   Always		-	   1											
189 High_Fly_Writes		 0x003a   100   100   000	Old_age   Always		-	   0											
190 Airflow_Temperature_Cel 0x0022   069   058   045	Old_age   Always		-	   31 (Min/Max 27/34)						  
191 G-Sense_Error_Rate	  0x0032   100   100   000	Old_age   Always		-	   0											
192 Power-Off_Retract_Count 0x0032   100   100   000	Old_age   Always		-	   60										  
193 Load_Cycle_Count		0x0032   100   100   000	Old_age   Always		-	   180										
194 Temperature_Celsius	 0x0022   031   042   000	Old_age   Always		-	   31 (0 19 0 0 0)							
197 Current_Pending_Sector  0x0012   100   100   000	Old_age   Always		-	   0											
198 Offline_Uncorrectable   0x0010   100   100   000	Old_age   Offline	  -	   0											
199 UDMA_CRC_Error_Count	0x003e   200   200   000	Old_age   Always		-	   5											
																																	
SMART Error Log Version: 1																										 
No Errors Logged																													
																																	
SMART Self-test log structure revision number 1																					 
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error									
# 1  Extended offline	Completed without error	   00%	 18480		  -													
# 2  Extended offline	Completed without error	   00%	  4677		  -													
# 3  Short offline	   Completed without error	   00%	  4663		  -													
																																	
SMART Selective self-test log data structure revision number 1																	 
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS																						
	1		0		0  Not_testing																								
	2		0		0  Not_testing																								
	3		0		0  Not_testing																								
	4		0		0  Not_testing																								
	5		0		0  Not_testing																								
Selective self-test flags (0x0):																									
  After scanning selected spans, do NOT read-scan remainder of disk.																
If Selective self-test is pending on power-up, resume after 0 minute delay.

Code:
=== START OF INFORMATION SECTION ===																								
Model Family:	 Seagate NAS HDD																								   
Device Model:	 ST4000VN000-1H4168																								
Serial Number:	S3019PAC																										 
LU WWN Device Id: 5 000c50 0804fa622																								
Firmware Version: SC46																											 
User Capacity:	4,000,787,030,016 bytes [4.00 TB]																				 
Sector Sizes:	 512 bytes logical, 4096 bytes physical																			
Rotation Rate:	5900 rpm																										 
Form Factor:	  3.5 inches																										
Device is:		In smartctl database [for details use: -P show]																   
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b																			   
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)																			
Local Time is:	Wed Oct 11 23:21:39 2017 CEST																					 
SMART support is: Available - device has SMART capability.																		 
SMART support is: Enabled																										   
																																	
=== START OF READ SMART DATA SECTION ===																							
SMART overall-health self-assessment test result: PASSED																			
																																	
General SMART Values:																											   
Offline data collection status:  (0x82) Offline data collection activity															
										was completed without error.																
										Auto Offline Data Collection: Enabled.													 
Self-test execution status:	  (   0) The previous self-test routine completed													
										without error or no self-test has ever													 
										been run.																				   
Total time to complete Offline																									 
data collection:				(  117) seconds.																					
Offline data collection																											 
capabilities:					(0x7b) SMART execute Offline immediate.															
										Auto Offline data collection on/off support.												
										Suspend Offline collection upon new														 
										command.																					
										Offline surface scan supported.															 
										Self-test supported.																		
										Conveyance Self-test supported.															 
										Selective Self-test supported.															 
SMART capabilities:			(0x0003) Saves SMART data before entering															
										power-saving mode.																		 
										Supports SMART auto save timer.															 
Error logging capability:		(0x01) Error logging supported.																	
										General Purpose Logging supported.														 
Short self-test routine																											 
recommended polling time:		(   1) minutes.																					
Extended self-test routine																	 
recommended polling time:		( 500) minutes.																					
Conveyance self-test routine																										
recommended polling time:		(   2) minutes.																					
SCT capabilities:			  (0x10bd) SCT Status supported.																	   
										SCT Error Recovery Control supported.													   
										SCT Feature Control supported.															 
										SCT Data Table supported.																   
																																	
SMART Attributes Data Structure revision number: 10																				 
Vendor Specific SMART Attributes with Thresholds:																				   
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE									
  1 Raw_Read_Error_Rate	 0x000f   120   099   006	Pre-fail  Always	   -	   1558960									 
  3 Spin_Up_Time			0x0003   091   091   000	Pre-fail  Always	   -	   0											
  4 Start_Stop_Count		0x0032   098   098   020	Old_age   Always	   -	   2214										 
  5 Reallocated_Sector_Ct   0x0033   100   100   010	Pre-fail  Always	   -	   0											
  7 Seek_Error_Rate		 0x000f   085   060   030	Pre-fail  Always	   -	   384219232									
  9 Power_On_Hours		  0x0032   085   085   000	Old_age   Always	   -	   13266										
10 Spin_Retry_Count		0x0013   100   100   097	Pre-fail  Always	   -	   0											
12 Power_Cycle_Count	   0x0032   100   100   020	Old_age   Always	   -	   116										 
184 End-to-End_Error		0x0032   100   100   099	Old_age   Always	   -	   0											
187 Reported_Uncorrect	  0x0032   100   100   000	Old_age   Always	   -	   0											
188 Command_Timeout		 0x0032   100   099   000	Old_age   Always	   -	   4295032833								   
189 High_Fly_Writes		 0x003a   100   100   000	Old_age   Always	   -	   0											
190 Airflow_Temperature_Cel 0x0022   072   061   045	Old_age   Always	   -	   28 (Min/Max 24/31)						   
191 G-Sense_Error_Rate	  0x0032   100   100   000	Old_age   Always	   -	   0											
192 Power-Off_Retract_Count 0x0032   099   099   000	Old_age   Always	   -	   2213										 
193 Load_Cycle_Count		0x0032   099   099   000	Old_age   Always	   -	   2216										 
194 Temperature_Celsius	 0x0022   028   040   000	Old_age   Always	   -	   28 (0 16 0 0 0)							 
197 Current_Pending_Sector  0x0012   100   100   000	Old_age   Always	   -	   0											
198 Offline_Uncorrectable   0x0010   100   100   000	Old_age   Offline	  -	   0											
199 UDMA_CRC_Error_Count	0x003e   200   200   000	Old_age   Always	   -	   4											
																																	
SMART Error Log Version: 1																										 
No Errors Logged																													
																																	
SMART Self-test log structure revision number 1																					 
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error									 
# 1  Extended offline	Completed without error	   00%	 13266		 -													 
# 2  Short offline	   Completed without error	   00%	  9194		 -													 
# 3  Extended offline	Completed without error	   00%	  8946		 -													 
# 4  Short offline	   Completed without error	   00%	  8937		 -													 
# 5  Extended offline	Completed without error	   00%	  8737		 -													 
# 6  Short offline	   Completed without error	   00%	  8729		 -													 
# 7  Extended offline	Completed without error	   00%	   483		 -													 
# 8  Short offline	   Completed without error	   00%	   475		 -													 
																																	
SMART Selective self-test log data structure revision number 1																	 
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS																						
	1		0		0  Not_testing																								
	2		0		0  Not_testing																								
	3		0		0  Not_testing																								
	4		0		0  Not_testing																								
	5		0		0  Not_testing																								
Selective self-test flags (0x0):																									
  After scanning selected spans, do NOT read-scan remainder of disk.																
If Selective self-test is pending on power-up, resume after 0 minute delay.   

Code:
=== START OF INFORMATION SECTION ===																								
Device Model:	 ST4000VN008-2DR166																								
Serial Number:	WDH304RQ																										 
LU WWN Device Id: 5 000c50 0aac831c5																								
Firmware Version: SC60																											 
User Capacity:	4,000,787,030,016 bytes [4.00 TB]																				 
Sector Sizes:	 512 bytes logical, 4096 bytes physical																			
Rotation Rate:	5980 rpm																										 
Form Factor:	  3.5 inches																										
Device is:		Not in smartctl database [for details use: -P showall]															
ATA Version is:   ACS-3 T13/2161-D revision 5																					   
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)																			
Local Time is:	Wed Oct 11 23:22:34 2017 CEST																					 
SMART support is: Available - device has SMART capability.																		 
SMART support is: Enabled																										   
																																	
=== START OF READ SMART DATA SECTION ===																							
SMART overall-health self-assessment test result: PASSED																			
																																	
General SMART Values:																											   
Offline data collection status:  (0x82) Offline data collection activity															
										was completed without error.																
										Auto Offline Data Collection: Enabled.													 
Self-test execution status:	  (   0) The previous self-test routine completed													
										without error or no self-test has ever													 
										been run.																				   
Total time to complete Offline																									 
data collection:				(  591) seconds.																					
Offline data collection																											 
capabilities:					(0x7b) SMART execute Offline immediate.															
										Auto Offline data collection on/off support.												
										Suspend Offline collection upon new														 
										command.																					
										Offline surface scan supported.															 
										Self-test supported.																		
										Conveyance Self-test supported.															 
										Selective Self-test supported.															 
SMART capabilities:			(0x0003) Saves SMART data before entering															
										power-saving mode.																		 
										Supports SMART auto save timer.															 
Error logging capability:		(0x01) Error logging supported.																	
										General Purpose Logging supported.														 
Short self-test routine																											 
recommended polling time:		(   1) minutes.																					
Extended self-test routine																										 
recommended polling time:		( 645) minutes.											 
Conveyance self-test routine																										
recommended polling time:		(   2) minutes.																					
SCT capabilities:			  (0x50bd) SCT Status supported.																	   
										SCT Error Recovery Control supported.													   
										SCT Feature Control supported.															 
										SCT Data Table supported.																   
																																	
SMART Attributes Data Structure revision number: 10																				 
Vendor Specific SMART Attributes with Thresholds:																				   
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE									
  1 Raw_Read_Error_Rate	 0x000f   079   067   044	Pre-fail  Always	   -	   88696432									 
  3 Spin_Up_Time			0x0003   093   093   000	Pre-fail  Always	   -	   0											
  4 Start_Stop_Count		0x0032   100   100   020	Old_age   Always	   -	   8											
  5 Reallocated_Sector_Ct   0x0033   100   100   010	Pre-fail  Always	   -	   0											
  7 Seek_Error_Rate		 0x000f   072   060   045	Pre-fail  Always	   -	   17698599									 
  9 Power_On_Hours		  0x0032   100   100   000	Old_age   Always	   -	   184 (84 230 0)							   
10 Spin_Retry_Count		0x0013   100   100   097	Pre-fail  Always	   -	   0											
12 Power_Cycle_Count	   0x0032   100   100   020	Old_age   Always	   -	   8											
184 End-to-End_Error		0x0032   100   100   099	Old_age   Always	   -	   0											
187 Reported_Uncorrect	  0x0032   100   100   000	Old_age   Always	   -	   0											
188 Command_Timeout		 0x0032   097   097   000	Old_age   Always	   -	   4295032860								   
189 High_Fly_Writes		 0x003a   100   100   000	Old_age   Always	   -	   0											
190 Airflow_Temperature_Cel 0x0022   071   069   040	Old_age   Always	   -	   29 (Min/Max 24/31)						   
191 G-Sense_Error_Rate	  0x0032   100   100   000	Old_age   Always	   -	   0											
192 Power-Off_Retract_Count 0x0032   100   100   000	Old_age   Always	   -	   0											
193 Load_Cycle_Count		0x0032   100   100   000	Old_age   Always	   -	   21										   
194 Temperature_Celsius	 0x0022   029   040   000	Old_age   Always	   -	   29 (0 24 0 0 0)							 
197 Current_Pending_Sector  0x0012   100   100   000	Old_age   Always	   -	   0											
198 Offline_Uncorrectable   0x0010   100   100   000	Old_age   Offline	  -	   0											
199 UDMA_CRC_Error_Count	0x003e   200   200   000	Old_age   Always	   -	   0											
240 Head_Flying_Hours	   0x0000   100   253   000	Old_age   Offline	  -	   133 (105 201 0)							 
241 Total_LBAs_Written	  0x0000   100   253   000	Old_age   Offline	  -	   6838456988								   
242 Total_LBAs_Read		 0x0000   100   253   000	Old_age   Offline	  -	   1365619070								   
																																	
SMART Error Log Version: 1																										 
No Errors Logged																													
																																	
SMART Self-test log structure revision number 1																					 
No self-tests have been logged.  [To run self-tests, use: smartctl -t]															 
																																	
SMART Selective self-test log data structure revision number 1																	 
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS																						
	1		0		0  Not_testing																								
	2		0		0  Not_testing																								
	3		0		0  Not_testing																								
	4		0		0  Not_testing																								
	5		0		0  Not_testing																								
Selective self-test flags (0x0):																									
  After scanning selected spans, do NOT read-scan remainder of disk.																
If Selective self-test is pending on power-up, resume after 0 minute delay. 
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
I suggest you try zpool online Secondary_Raidz3 6370505857967419013 (pool name, then the numeric ID of the missing device) to see if you can get redundancy restored.

Did the computer reboot itself to interrupt the SMART long tests, or did you reboot it?
 

Inxsible

Guru
Joined
Aug 14, 2017
Messages
1,123
@arameen The self-test logs for ada3 and ada4 show no interrupted tests, but no SMART self-test has ever been run on ada5. It looks like a very new drive. You should fix that. Have you enabled scheduled SMART short and long tests yet?
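
If it helps, here is a minimal Python sketch for reading the "SMART Self-test log" part of the smartctl output posted in this thread. It is not part of FreeNAS or smartmontools. The function name summarize_self_tests and the strings it returns are made up for illustration, and the regular expression assumes test descriptions such as "Short offline" or "Extended offline" as they appear in the dumps here; other smartctl versions may lay the rows out differently.

```python
import re

# Assumed row layout, taken from the dumps in this thread, e.g.:
# "# 1  Extended offline  Interrupted (host reset)  10%  23194  -"
ROW = re.compile(
    r"^#\s*(\d+)\s+(\w+ (?:offline|captive))\s+(.+?)\s+(\d+)%\s+(\d+)\s+(\S+)\s*$"
)


def summarize_self_tests(smartctl_output):
    """Return a short verdict based on the most recent logged self-test.

    smartctl lists the newest test first (# 1), so only the first
    matching row decides the verdict.
    """
    rows = []
    for line in smartctl_output.splitlines():
        match = ROW.match(line.strip())
        if match:
            num, kind, status, remaining, hours, lba = match.groups()
            rows.append(
                {
                    "num": int(num),
                    "type": kind,
                    "status": status,
                    "remaining_pct": int(remaining),
                    "lifetime_hours": int(hours),
                    "first_error_lba": lba,
                }
            )

    if not rows:
        # Covers "No self-tests have been logged." as well as empty input.
        return "never tested"

    latest = rows[0]
    if latest["status"].startswith("Completed without error"):
        return "ok"
    return "check: " + latest["status"]
```

You would feed it the text of smartctl -a for one drive. A drive whose log has no rows comes back as "never tested", and a log whose newest entry is anything other than "Completed without error" comes back with that status so you can look at it by hand.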
 

arameen

Contributor
Joined
Sep 4, 2014
Messages
145
Code:
=== START OF INFORMATION SECTION ===									    														
Device Model:	 ST4000VN008-2DR166										    													
Serial Number:	ZDH217CP													  													
LU WWN Device Id: 5 000c50 0a49ce7e6									    														
Firmware Version: SC60														    												  
User Capacity:	4,000,787,030,016 bytes [4.00 TB]						  														
Sector Sizes:	 512 bytes logical, 4096 bytes physical					    													
Rotation Rate:	5980 rpm													    												  
Form Factor:	  3.5 inches												  													  
Device is:		Not in smartctl database [for details use: -P showall]	    													
ATA Version is:   ACS-3 T13/2161-D revision 5							    														
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)					  													  
Local Time is:	Wed Oct 11 23:24:18 2017 CEST							    													  
SMART support is: Available - device has SMART capability.				    													  
SMART support is: Enabled													  													  
  																		 														  
=== START OF READ SMART DATA SECTION ===								    														
SMART overall-health self-assessment test result: PASSED					  													  
  																		 														  
General SMART Values:														    													
Offline data collection status:  (0x00) Offline data collection activity	    													
  									   was never started.			    														  
  									   Auto Offline Data Collection: Disabled.													 
Self-test execution status:	  (   0) The previous self-test routine completed  												  
  									   without error or no self-test has ever													 
  									   been run.						  														  
Total time to complete Offline											    													  
data collection:				(  591) seconds.							  													  
Offline data collection														  													
capabilities:  				   (0x73) SMART execute Offline immediate.			    											
  									   Auto Offline data collection on/off support.												
  									   Suspend Offline collection upon new														 
  									   command.						  														  
  									   No Offline surface scan supported.														 
  									   Self-test supported.			  														  
  									   Conveyance Self-test supported.    														  
  									   Selective Self-test supported.    														  
SMART capabilities:			(0x0003) Saves SMART data before entering	  													  
  									   power-saving mode.			    														  
  									   Supports SMART auto save timer.    														  
Error logging capability:		(0x01) Error logging supported.			  													  
  									   General Purpose Logging supported.														 
Short self-test routine													    													  
recommended polling time:		(   1) minutes.								    												
Extended self-test routine													    												  
recommended polling time:		( 657) minutes.															
Conveyance self-test routine													    												
recommended polling time:		(   2) minutes.								    												
SCT capabilities:			  (0x50bd) SCT Status supported.			    														
  									   SCT Error Recovery Control supported.													   
  									   SCT Feature Control supported.    														  
  									   SCT Data Table supported.		  														  
  																		 														  
SMART Attributes Data Structure revision number: 10						    													  
Vendor Specific SMART Attributes with Thresholds:							  													  
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE									
  1 Raw_Read_Error_Rate	 0x000f   082   067   044	Pre-fail  Always  	  -	   163449898									
  3 Spin_Up_Time			0x0003   098   096   000	Pre-fail  Always  	  -	   0											
  4 Start_Stop_Count		0x0032   100   100   020	Old_age   Always  	  -	   24										   
  5 Reallocated_Sector_Ct   0x0033   100   100   010	Pre-fail  Always  	  -	   0											
  7 Seek_Error_Rate		 0x000f   068   060   045	Pre-fail  Always  	  -	   7030325									 
  9 Power_On_Hours		  0x0032   100   100   000	Old_age   Always  	  -	   134 (146 145 0)							 
10 Spin_Retry_Count		0x0013   100   100   097	Pre-fail  Always	   -	   0											
12 Power_Cycle_Count	   0x0032   100   100   020	Old_age   Always	   -	   7											
184 End-to-End_Error		0x0032   100   100   099	Old_age   Always    	-	   0											
187 Reported_Uncorrect	  0x0032   100   100   000	Old_age   Always    	-	   0											
188 Command_Timeout		 0x0032   100   100   000	Old_age   Always    	-	   0											
189 High_Fly_Writes		 0x003a   100   100   000	Old_age   Always    	-	   0											
190 Airflow_Temperature_Cel 0x0022   075   069   040	Old_age   Always    	-	   25 (Min/Max 20/31)						   
191 G-Sense_Error_Rate	  0x0032   100   100   000	Old_age   Always    	-	   0											
192 Power-Off_Retract_Count 0x0032   100   100   000	Old_age   Always    	-	   23										   
193 Load_Cycle_Count		0x0032   100   100   000	Old_age   Always    	-	   1529										 
194 Temperature_Celsius	 0x0022   025   040   000	Old_age   Always    	-	   25 (0 20 0 0 0)							 
197 Current_Pending_Sector  0x0012   100   100   000	Old_age   Always    	-	   0											
198 Offline_Uncorrectable   0x0010   100   100   000	Old_age   Offline      -	   0											
199 UDMA_CRC_Error_Count	0x003e   200   200   000	Old_age   Always    	-	   0											
240 Head_Flying_Hours	   0x0000   100   253   000	Old_age   Offline      -	   130 (134 40 0)							   
241 Total_LBAs_Written	  0x0000   100   253   000	Old_age   Offline      -	   163065165									
242 Total_LBAs_Read		 0x0000   100   253   000	Old_age   Offline      -	   384733									   
  																		 														  
SMART Error Log Version: 1												    													  
No Errors Logged														    														
  																		 														  
SMART Self-test log structure revision number 1							    													  
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error									 
# 1  Extended offline	Completed without error	   00%	   119	  	-													 
  																		 														  
SMART Selective self-test log data structure revision number 1			    													  
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS							  														  
    1		0		0  Not_testing									  														  
    2		0		0  Not_testing									  														  
    3		0		0  Not_testing									  														  
    4		0		0  Not_testing									  														  
    5		0		0  Not_testing									  														  
Selective self-test flags (0x0):												  												  
  After scanning selected spans, do NOT read-scan remainder of disk.	  														  
If Selective self-test is pending on power-up, resume after 0 minute delay. 

Code:
=== START OF INFORMATION SECTION ===																								
Model Family:	 Western Digital Red																							   
Device Model:	 WDC WD40EFRX-68WT0N0																							 
Serial Number:	WD-WCC4E0020880																								   
LU WWN Device Id: 5 0014ee 25e559406																								
Firmware Version: 80.00A80																										 
User Capacity:	4,000,787,030,016 bytes [4.00 TB]																				 
Sector Sizes:	 512 bytes logical, 4096 bytes physical																			
Rotation Rate:	5400 rpm																										 
Device is:		In smartctl database [for details use: -P show]																   
ATA Version is:   ACS-2 (minor revision not indicated)																			 
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)																			
Local Time is:	Wed Oct 11 23:27:03 2017 CEST																					 
SMART support is: Available - device has SMART capability.																		 
SMART support is: Enabled																										   
																																	
=== START OF READ SMART DATA SECTION ===																							
SMART overall-health self-assessment test result: PASSED																			
																																	
General SMART Values:																											   
Offline data collection status:  (0x00) Offline data collection activity															
										was never started.																		 
										Auto Offline Data Collection: Disabled.													 
Self-test execution status:	  (  33) The self-test routine was interrupted													   
										by the host with a hard or soft reset.													 
Total time to complete Offline																									 
data collection:				(55440) seconds.																					
Offline data collection																											 
capabilities:					(0x7b) SMART execute Offline immediate.															
										Auto Offline data collection on/off support.												
										Suspend Offline collection upon new														 
										command.																					
										Offline surface scan supported.															 
										Self-test supported.																		
										Conveyance Self-test supported.															 
										Selective Self-test supported.															 
SMART capabilities:			(0x0003) Saves SMART data before entering															
										power-saving mode.																		 
										Supports SMART auto save timer.															 
Error logging capability:		(0x01) Error logging supported.																	
										General Purpose Logging supported.														 
Short self-test routine																											 
recommended polling time:		(   2) minutes.																					
Extended self-test routine																										 
recommended polling time:		( 554) minutes.																					
Conveyance self-test routine																					
recommended polling time:		(   5) minutes.																					
SCT capabilities:			  (0x703d) SCT Status supported.																	   
										SCT Error Recovery Control supported.													   
										SCT Feature Control supported.															 
										SCT Data Table supported.																   
																																	
SMART Attributes Data Structure revision number: 16																				 
Vendor Specific SMART Attributes with Thresholds:																				   
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE									
  1 Raw_Read_Error_Rate	 0x002f   200   200   051	Pre-fail  Always	   -	   0											
  3 Spin_Up_Time			0x0027   170   170   021	Pre-fail  Always	   -	   8458										 
  4 Start_Stop_Count		0x0032   100   100   000	Old_age   Always	   -	   247										 
  5 Reallocated_Sector_Ct   0x0033   200   200   140	Pre-fail  Always	   -	   0											
  7 Seek_Error_Rate		 0x002e   100   253   000	Old_age   Always	   -	   0											
  9 Power_On_Hours		  0x0032   069   069   000	Old_age   Always	   -	   23206										
10 Spin_Retry_Count		0x0032   100   100   000	Old_age   Always	   -	   0											
11 Calibration_Retry_Count 0x0032   100   100   000	Old_age   Always	   -	   0											
12 Power_Cycle_Count	   0x0032   100   100   000	Old_age   Always	   -	   244										 
192 Power-Off_Retract_Count 0x0032   200   200   000	Old_age   Always	   -	   105										 
193 Load_Cycle_Count		0x0032   199   199   000	Old_age   Always	   -	   3592										 
194 Temperature_Celsius	 0x0022   124   108   000	Old_age   Always	   -	   28										   
196 Reallocated_Event_Count 0x0032   200   200   000	Old_age   Always	   -	   0											
197 Current_Pending_Sector  0x0032   200   200   000	Old_age   Always	   -	   0											
198 Offline_Uncorrectable   0x0030   100   253   000	Old_age   Offline	  -	   0											
199 UDMA_CRC_Error_Count	0x0032   200   200   000	Old_age   Always	   -	   0											
200 Multi_Zone_Error_Rate   0x0008   200   200   000	Old_age   Offline	  -	   0											
																																	
SMART Error Log Version: 1																										 
No Errors Logged																													
																																	
SMART Self-test log structure revision number 1																					 
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error									 
# 1  Extended offline	Interrupted (host reset)	  10%	 23194		 -													 
# 2  Extended offline	Completed without error	   00%	 22764		 -													 
# 3  Short offline	   Completed without error	   00%	 22754		 -													 
# 4  Short offline	   Completed without error	   00%	  8596		 -													 
# 5  Extended offline	Completed without error	   00%	  6926		 -													 
																																	
SMART Selective self-test log data structure revision number 1																	 
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS																						
	1		0		0  Not_testing																								
	2		0		0  Not_testing																								
	3		0		0  Not_testing																								
	4		0		0  Not_testing																								
	5		0		0  Not_testing																								
Selective self-test flags (0x0):																									
  After scanning selected spans, do NOT read-scan remainder of disk.																
If Selective self-test is pending on power-up, resume after 0 minute delay.									 

Code:
=== START OF INFORMATION SECTION ===																								
Device Model:	 ST8000VN0022-2EL112																							   
Serial Number:	ZA161YG1																										 
LU WWN Device Id: 5 000c50 0a1ca94f6																								
Firmware Version: SC61																											 
User Capacity:	8,001,563,222,016 bytes [8.00 TB]																				 
Sector Sizes:	 512 bytes logical, 4096 bytes physical																			
Rotation Rate:	7200 rpm																										 
Form Factor:	  3.5 inches																										
Device is:		Not in smartctl database [for details use: -P showall]															
ATA Version is:   ACS-3 T13/2161-D revision 5																					   
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)																			
Local Time is:	Wed Oct 11 23:27:52 2017 CEST																					 
SMART support is: Available - device has SMART capability.																		 
SMART support is: Enabled																										   
																																	
=== START OF READ SMART DATA SECTION ===																							
SMART overall-health self-assessment test result: FAILED!																		   
Drive failure expected in less than 24 hours. SAVE ALL DATA.																		
See vendor-specific Attribute list for failed Attributes.																		   
																																	
General SMART Values:																											   
Offline data collection status:  (0x82) Offline data collection activity															
										was completed without error.																
										Auto Offline Data Collection: Enabled.													 
Self-test execution status:	  (  73) The previous self-test completed having													 
										a test element that failed and the test													 
										element that failed is not known.														   
Total time to complete Offline																									 
data collection:				(  567) seconds.																					
Offline data collection																											 
capabilities:					(0x7b) SMART execute Offline immediate.															
										Auto Offline data collection on/off support.												
										Suspend Offline collection upon new														 
										command.																					
										Offline surface scan supported.															 
										Self-test supported.																		
										Conveyance Self-test supported.															 
										Selective Self-test supported.															 
SMART capabilities:			(0x0003) Saves SMART data before entering															
										power-saving mode.																		 
										Supports SMART auto save timer.															 
Error logging capability:		(0x01) Error logging supported.																	
										General Purpose Logging supported.														 
Short self-test routine																											 
recommended polling time:		(   1) minutes.										 
Extended self-test routine																										 
recommended polling time:		( 718) minutes.																					
Conveyance self-test routine																										
recommended polling time:		(   2) minutes.																					
SCT capabilities:			  (0x50bd) SCT Status supported.																	   
										SCT Error Recovery Control supported.													   
										SCT Feature Control supported.															 
										SCT Data Table supported.																   
																																	
SMART Attributes Data Structure revision number: 10																				 
Vendor Specific SMART Attributes with Thresholds:																				   
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE									
  1 Raw_Read_Error_Rate	 0x000f   081   064   044	Pre-fail  Always	   -	   140385552									
  3 Spin_Up_Time			0x0003   085   085   000	Pre-fail  Always	   -	   0											
  4 Start_Stop_Count		0x0032   100   100   020	Old_age   Always	   -	   30										   
  5 Reallocated_Sector_Ct   0x0033   100   100   010	Pre-fail  Always	   -	   0											
  7 Seek_Error_Rate		 0x000f   082   060   045	Pre-fail  Always	   -	   159013861									
  9 Power_On_Hours		  0x0032   100   100   000	Old_age   Always	   -	   822 (198 36 0)							   
 10 Spin_Retry_Count		0x0013   090   090   097	Pre-fail  Always   FAILING_NOW 0
12 Power_Cycle_Count	   0x0032   100   100   020	Old_age   Always	   -	   31										   
184 End-to-End_Error		0x0032   100   100   099	Old_age   Always	   -	   0											
187 Reported_Uncorrect	  0x0032   100   100   000	Old_age   Always	   -	   0											
188 Command_Timeout		 0x0032   100   100   000	Old_age   Always	   -	   0											
189 High_Fly_Writes		 0x003a   100   100   000	Old_age   Always	   -	   0											
190 Airflow_Temperature_Cel 0x0022   070   057   040	Old_age   Always	   -	   30 (Min/Max 26/38)						   
191 G-Sense_Error_Rate	  0x0032   100   100   000	Old_age   Always	   -	   212										 
192 Power-Off_Retract_Count 0x0032   100   100   000	Old_age   Always	   -	   23										   
193 Load_Cycle_Count		0x0032   100   100   000	Old_age   Always	   -	   1185										 
194 Temperature_Celsius	 0x0022   030   043   000	Old_age   Always	   -	   30 (0 26 0 0 0)							 
195 Hardware_ECC_Recovered  0x001a   003   001   000	Old_age   Always	   -	   140385552									
197 Current_Pending_Sector  0x0012   100   100   000	Old_age   Always	   -	   0											
198 Offline_Uncorrectable   0x0010   100   100   000	Old_age   Offline	  -	   0											
199 UDMA_CRC_Error_Count	0x003e   200   200   000	Old_age   Always	   -	   1											
240 Head_Flying_Hours	   0x0000   100   253   000	Old_age   Offline	  -	   805 (94 190 0)							   
241 Total_LBAs_Written	  0x0000   100   253   000	Old_age   Offline	  -	   7294319420								   
242 Total_LBAs_Read		 0x0000   100   253   000	Old_age   Offline	  -	   110015273104								 
																																	
SMART Error Log Version: 1																										 
No Errors Logged																													
																																	
SMART Self-test log structure revision number 1																					 
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error									 
# 1  Extended offline	Completed: unknown failure	90%	   799		 0													 
# 2  Extended offline	Completed without error	   00%	   385		 -													 
# 3  Short offline	   Completed without error	   00%	   372		 -													 
																																																		 
SMART Selective self-test log data structure revision number 1																	 
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS																						
	1		0		0  Not_testing																								
	2		0		0  Not_testing																								
	3		0		0  Not_testing																								
	4		0		0  Not_testing																								
	5		0		0  Not_testing																								
Selective self-test flags (0x0):																									
  After scanning selected spans, do NOT read-scan remainder of disk.																
If Selective self-test is pending on power-up, resume after 0 minute delay.				 
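
On the question of which value to look at in a dump like the one above: smartctl treats an attribute as failed when its normalized VALUE is at or below THRESH, which is the case for Spin_Retry_Count here (090 against 097). A minimal Python sketch of that comparison follows. The function name and the three pasted sample rows are illustrative assumptions, and it only handles the attribute-table layout shown in this thread.

```python
import re

# One row of the smartctl attribute table:
# ID, name, flag, VALUE, WORST, THRESH, type, updated, WHEN_FAILED, raw value.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(.*)$"
)


def failing_attributes(smart_text):
    """Return (id, name, value, thresh) for rows whose VALUE <= THRESH."""
    failing = []
    for line in smart_text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # not an attribute row (headers, blank lines, logs)
        attr_id, name = int(m.group(1)), m.group(2)
        value, thresh = int(m.group(3)), int(m.group(5))
        if value <= thresh:
            failing.append((attr_id, name, value, thresh))
    return failing


# Hypothetical sample: three rows copied from the FAILED drive's table.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
 10 Spin_Retry_Count        0x0013   090   090   097    Pre-fail  Always   FAILING_NOW 0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       1
"""

print(failing_attributes(SAMPLE))
```

Note that this only looks at normalized values. Raw counters such as UDMA_CRC_Error_Count still need to be read by eye, since a nonzero raw value there does not move VALUE below THRESH.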
 

arameen

Contributor
Joined
Sep 4, 2014
Messages
145
Code:
=== START OF INFORMATION SECTION ===																								
Model Family:	 Seagate NAS HDD																								  
Device Model:	 ST4000VN000-1H4168																								
Serial Number:	W30124AF																										
LU WWN Device Id: 5 000c50 08ffe78a9																								
Firmware Version: SC46																											
User Capacity:	4,000,787,030,016 bytes [4.00 TB]																				
Sector Sizes:	 512 bytes logical, 4096 bytes physical																			
Rotation Rate:	5900 rpm																										
Form Factor:	  3.5 inches																										
Device is:		In smartctl database [for details use: -P show]																  
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b																			  
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)																			
Local Time is:	Wed Oct 11 23:30:24 2017 CEST																					
SMART support is: Available - device has SMART capability.																		
SMART support is: Enabled																										  
																																	
=== START OF READ SMART DATA SECTION ===																							
SMART overall-health self-assessment test result: PASSED																			
																																	
General SMART Values:																											  
Offline data collection status:  (0x82) Offline data collection activity															
										was completed without error.																
										Auto Offline Data Collection: Enabled.													
Self-test execution status:	  (   0) The previous self-test routine completed													
										without error or no self-test has ever													
										been run.																				  
Total time to complete Offline																									
data collection:				(  107) seconds.																					
Offline data collection																											
capabilities:					(0x7b) SMART execute Offline immediate.															
										Auto Offline data collection on/off support.												
										Suspend Offline collection upon new														
										command.																					
										Offline surface scan supported.															
										Self-test supported.																		
										Conveyance Self-test supported.															
										Selective Self-test supported.															
SMART capabilities:			(0x0003) Saves SMART data before entering															
										power-saving mode.																		
										Supports SMART auto save timer.															
Error logging capability:		(0x01) Error logging supported.																	
										General Purpose Logging supported.														
Short self-test routine																											
recommended polling time:		(   1) minutes.																					
Extended self-test routine																  
recommended polling time:		( 509) minutes.																					
Conveyance self-test routine																										
recommended polling time:		(   2) minutes.																					
SCT capabilities:			  (0x10bd) SCT Status supported.																	  
										SCT Error Recovery Control supported.													  
										SCT Feature Control supported.															
										SCT Data Table supported.																  
																																	
SMART Attributes Data Structure revision number: 10																				
Vendor Specific SMART Attributes with Thresholds:																				  
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE									
  1 Raw_Read_Error_Rate	 0x000f   114   099   006	Pre-fail  Always	   -	   79165448									
  3 Spin_Up_Time			0x0003   091   091   000	Pre-fail  Always	   -	   0											
  4 Start_Stop_Count		0x0032   100   100   020	Old_age   Always	   -	   134										
  5 Reallocated_Sector_Ct   0x0033   100   100   010	Pre-fail  Always	   -	   0											
  7 Seek_Error_Rate		 0x000f   081   060   030	Pre-fail  Always	   -	   142071295									
  9 Power_On_Hours		  0x0032   092   092   000	Old_age   Always	   -	   7387										
 10 Spin_Retry_Count		0x0013   100   100   097	Pre-fail  Always	   -	   0
12 Power_Cycle_Count	   0x0032   100   100   020	Old_age   Always	   -	   120										
184 End-to-End_Error		0x0032   100   100   099	Old_age   Always	   -	   0											
187 Reported_Uncorrect	  0x0032   100   100   000	Old_age   Always	   -	   0											
188 Command_Timeout		 0x0032   100   100   000	Old_age   Always	   -	   0											
189 High_Fly_Writes		 0x003a   100   100   000	Old_age   Always	   -	   0											
190 Airflow_Temperature_Cel 0x0022   070   063   045	Old_age   Always	   -	   30 (Min/Max 26/35)						  
191 G-Sense_Error_Rate	  0x0032   100   100   000	Old_age   Always	   -	   0											
192 Power-Off_Retract_Count 0x0032   100   100   000	Old_age   Always	   -	   106										
193 Load_Cycle_Count		0x0032   100   100   000	Old_age   Always	   -	   145										
194 Temperature_Celsius	 0x0022   030   040   000	Old_age   Always	   -	   30 (0 20 0 0 0)							
197 Current_Pending_Sector  0x0012   100   100   000	Old_age   Always	   -	   0											
198 Offline_Uncorrectable   0x0010   100   100   000	Old_age   Offline	  -	   0											
199 UDMA_CRC_Error_Count	0x003e   200   200   000	Old_age   Always	   -	   3											
																																	
SMART Error Log Version: 1																										
No Errors Logged																													
																																	
SMART Self-test log structure revision number 1																					
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error									
# 1  Extended offline	Completed without error	   00%	  7387		 -													
# 2  Extended offline	Completed without error	   00%	  6935		 -													
# 3  Short offline	   Completed without error	   00%	  6927		 -													
# 4  Extended offline	Interrupted (host reset)	  00%	  5722		 -													
# 5  Extended offline	Interrupted (host reset)	  00%	  5713		 -													
# 6  Extended offline	Interrupted (host reset)	  00%	  5701		 -													
# 7  Extended offline	Interrupted (host reset)	  00%	  5683		 -													
# 8  Extended offline	Interrupted (host reset)	  00%	  5667		 -													
# 9  Short offline	   Completed without error	   00%	  5666		 -													
																																																			
SMART Selective self-test log data structure revision number 1																	
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS																						
	1		0		0  Not_testing																								
	2		0		0  Not_testing																								
	3		0		0  Not_testing																								
	4		0		0  Not_testing																								
	5		0		0  Not_testing																								
Selective self-test flags (0x0):																									
  After scanning selected spans, do NOT read-scan remainder of disk.																
If Selective self-test is pending on power-up, resume after 0 minute delay.			

Code:
=== START OF INFORMATION SECTION ===																								
Device Model:	 ST8000VN0022-2EL112																							  
Serial Number:	ZA158SD6																										
LU WWN Device Id: 5 000c50 09350e589																								
Firmware Version: SC61																											
User Capacity:	8,001,563,222,016 bytes [8.00 TB]																				
Sector Sizes:	 512 bytes logical, 4096 bytes physical																			
Rotation Rate:	7200 rpm																										
Form Factor:	  3.5 inches																										
Device is:		Not in smartctl database [for details use: -P showall]															
ATA Version is:   ACS-3 T13/2161-D revision 5																					  
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)																			
Local Time is:	Wed Oct 11 23:31:16 2017 CEST																					
SMART support is: Available - device has SMART capability.																		
SMART support is: Enabled																										  
																																	
=== START OF READ SMART DATA SECTION ===																							
SMART overall-health self-assessment test result: PASSED																			
																																	
General SMART Values:																											  
Offline data collection status:  (0x82) Offline data collection activity															
										was completed without error.																
										Auto Offline Data Collection: Enabled.													
Self-test execution status:	  (   0) The previous self-test routine completed													
										without error or no self-test has ever													
										been run.																				  
Total time to complete Offline																									
data collection:				(  567) seconds.																					
Offline data collection																											
capabilities:					(0x7b) SMART execute Offline immediate.															
										Auto Offline data collection on/off support.												
										Suspend Offline collection upon new														
										command.																					
										Offline surface scan supported.															
										Self-test supported.																		
										Conveyance Self-test supported.															
										Selective Self-test supported.															
SMART capabilities:			(0x0003) Saves SMART data before entering															
										power-saving mode.																		
										Supports SMART auto save timer.															
Error logging capability:		(0x01) Error logging supported.																	
										General Purpose Logging supported.														
Short self-test routine																											
recommended polling time:		(   1) minutes.																					
Extended self-test routine																										
recommended polling time:		( 730) minutes.										
Conveyance self-test routine																										
recommended polling time:		(   2) minutes.																					
SCT capabilities:			  (0x50bd) SCT Status supported.																	  
										SCT Error Recovery Control supported.													  
										SCT Feature Control supported.															
										SCT Data Table supported.																  
																																	
SMART Attributes Data Structure revision number: 10																				
Vendor Specific SMART Attributes with Thresholds:																				  
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE									
  1 Raw_Read_Error_Rate	 0x000f   100   064   044	Pre-fail  Always	   -	   164758									  
  3 Spin_Up_Time			0x0003   096   084   000	Pre-fail  Always	   -	   0											
  4 Start_Stop_Count		0x0032   100   100   020	Old_age   Always	   -	   195										
  5 Reallocated_Sector_Ct   0x0033   100   100   010	Pre-fail  Always	   -	   0											
  7 Seek_Error_Rate		 0x000f   078   060   045	Pre-fail  Always	   -	   63078997									
  9 Power_On_Hours		  0x0032   100   100   000	Old_age   Always	   -	   423 (2 187 0)								
 10 Spin_Retry_Count		0x0013   100   100   097	Pre-fail  Always	   -	   0
12 Power_Cycle_Count	   0x0032   100   100   020	Old_age   Always	   -	   203										
184 End-to-End_Error		0x0032   100   100   099	Old_age   Always	   -	   0											
187 Reported_Uncorrect	  0x0032   100   100   000	Old_age   Always	   -	   0											
188 Command_Timeout		 0x0032   100   100   000	Old_age   Always	   -	   0											
189 High_Fly_Writes		 0x003a   100   100   000	Old_age   Always	   -	   0											
190 Airflow_Temperature_Cel 0x0022   069   049   040	Old_age   Always	   -	   31 (Min/Max 26/31)						  
191 G-Sense_Error_Rate	  0x0032   099   099   000	Old_age   Always	   -	   2393										
192 Power-Off_Retract_Count 0x0032   100   100   000	Old_age   Always	   -	   147										
193 Load_Cycle_Count		0x0032   099   099   000	Old_age   Always	   -	   2673										
194 Temperature_Celsius	 0x0022   031   051   000	Old_age   Always	   -	   31 (0 23 0 0 0)							
195 Hardware_ECC_Recovered  0x001a   100   001   000	Old_age   Always	   -	   164758									  
197 Current_Pending_Sector  0x0012   100   100   000	Old_age   Always	   -	   0											
198 Offline_Uncorrectable   0x0010   100   100   000	Old_age   Offline	  -	   0											
199 UDMA_CRC_Error_Count	0x003e   200   200   000	Old_age   Always	   -	   0											
240 Head_Flying_Hours	   0x0000   100   253   000	Old_age   Offline	  -	   379 (203 7 0)								
241 Total_LBAs_Written	  0x0000   100   253   000	Old_age   Offline	  -	   33911085117								
242 Total_LBAs_Read		 0x0000   100   253   000	Old_age   Offline	  -	   30375106372								
																																	
SMART Error Log Version: 1																										
No Errors Logged																													
																																	
SMART Self-test log structure revision number 1																					
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error									
# 1  Extended offline	Interrupted (host reset)	  00%	   400		 -													
# 2  Extended offline	Interrupted (host reset)	  00%	   233		 -													
# 3  Short offline	   Completed without error	   00%	   232		 -													
																																	
SMART Selective self-test log data structure revision number 1																	
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS																						
	1		0		0  Not_testing																								
	2		0		0  Not_testing																								
	3		0		0  Not_testing																								
	4		0		0  Not_testing																								
	5		0		0  Not_testing																								
Selective self-test flags (0x0):																									
  After scanning selected spans, do NOT read-scan remainder of disk.																
If Selective self-test is pending on power-up, resume after 0 minute delay.		
 

Inxsible

Guru
Joined
Aug 14, 2017
Messages
1,123
@arameen da1 has been interrupted again. da2 has had some "unknown failure". You should run SMART on all your drives and re-check the output. da4 has been interrupted too.
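
When re-checking the output across many drives, it may help to pull out only the self-test log rows that did not finish cleanly. A rough Python sketch, assuming the self-test log layout pasted in this thread. The function name and the two sample strings are illustrative, and test types printed as a single word are not handled.

```python
import re

# One row of the smartctl self-test log:
# "# N  <two-word description>  <status>  NN%  <lifetime hours>  <LBA or ->"
LOG_ROW = re.compile(
    r"^#\s*(\d+)\s+(\S+ \S+)\s+(.+?)\s+(\d+)%\s+(\d+)\s+(\S+)$"
)


def problem_tests(log_text):
    """Return (num, description, status, hours) for tests that did not pass."""
    problems = []
    for line in log_text.splitlines():
        m = LOG_ROW.match(line.strip())
        if not m:
            continue  # header or unrelated line
        status = m.group(3)
        if status != "Completed without error":
            problems.append((int(m.group(1)), m.group(2), status, int(m.group(5))))
    return problems


# Hypothetical samples: rows copied from two of the dumps in this thread.
FAILED_DRIVE_LOG = """\
# 1  Extended offline    Completed: unknown failure    90%       799         0
# 2  Extended offline    Completed without error       00%       385         -
# 3  Short offline       Completed without error       00%       372         -
"""

INTERRUPTED_LOG = """\
# 1  Extended offline    Interrupted (host reset)      00%       400         -
# 2  Extended offline    Interrupted (host reset)      00%       233         -
# 3  Short offline       Completed without error       00%       232         -
"""

print(problem_tests(FAILED_DRIVE_LOG))
print(problem_tests(INTERRUPTED_LOG))
```

Comparing the reported lifetime hours against Power_On_Hours shows how recently each interruption happened.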
 

arameen

Contributor
Joined
Sep 4, 2014
Messages
145
I suggest you try zpool online Secondary_Raidz3 6370505857967419013 to see if you can get redundancy restored.

Did the computer reboot itself to interrupt the SMART long tests, or did you reboot it?

I ran zpool online Secondary_Raidz3 6370505857967419013, and this was the output a few seconds ago:
Code:
warning: device '6370505857967419013' onlined, but remains in faulted state														
use 'zpool replace' to replace devices that are no longer present


and this is how the pool looks now:
Code:
																																	
		NAME											STATE	 READ WRITE CKSUM												 
		Secondary_Raidz3								DEGRADED	 0	 0	 0												 
		  raidz3-0									  DEGRADED	 0	 0	 0												 
			gptid/8275e396-a83c-11e7-9cee-002590f5b804  ONLINE	   0	 0	 0  (resilvering)								   
			gptid/3a44142c-931c-11e7-b895-002590f5b804  ONLINE	   0	 0	 0												 
			gptid/33c047e7-2292-11e7-9626-002590f5b804  ONLINE	   0	 0	 0												 
			gptid/34749735-2292-11e7-9626-002590f5b804  ONLINE	   0	 0	 0												 
			6370505857967419013						 UNAVAIL	  0	 0	 0  was /dev/gptid/3536bf51-2292-11e7-9626-002590f5b804
			gptid/35e2d6ec-2292-11e7-9626-002590f5b804  ONLINE	   0	 0	 0												 
			gptid/368b679d-2292-11e7-9626-002590f5b804  ONLINE	   0	 0	 0												 
			gptid/3730ee56-2292-11e7-9626-002590f5b804  ONLINE	   0	 0	 0												 
			gptid/37de7e53-2292-11e7-9626-002590f5b804  ONLINE	   0	 0	 0												 
			replacing-9								 DEGRADED	 0	61	 0												 
			  5660221525628801207					   UNAVAIL	  0	 0	 0  was /dev/da8p2								 
			  da0p2									 ONLINE	   0 6.77K	 0  (resilvering)								   
			gptid/39778368-2292-11e7-9626-002590f5b804  ONLINE	   0	 0	 0		   
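
To keep track of which members are still not ONLINE while the resilver runs, a status table like the one above can be filtered with a few lines of Python. This is only a sketch: it assumes the NAME/STATE/READ/WRITE/CKSUM column layout shown here, and the function name and the trimmed sample text are illustrative.

```python
# Device states that zpool status can report in the STATE column.
STATES = {"ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "UNAVAIL", "REMOVED"}


def not_online(status_text):
    """Return (name, state) for every row whose state is not ONLINE."""
    rows = []
    for line in status_text.splitlines():
        parts = line.split()
        # A device row has at least NAME, STATE, READ, WRITE, CKSUM.
        if len(parts) >= 5 and parts[1] in STATES and parts[1] != "ONLINE":
            rows.append((parts[0], parts[1]))
    return rows


# Hypothetical sample: a trimmed copy of the pool status from this post.
SAMPLE = """\
        NAME                      STATE     READ WRITE CKSUM
        Secondary_Raidz3          DEGRADED     0     0     0
          raidz3-0                DEGRADED     0     0     0
            6370505857967419013   UNAVAIL      0     0     0  was /dev/gptid/3536bf51-2292-11e7-9626-002590f5b804
            replacing-9           DEGRADED     0    61     0
              5660221525628801207 UNAVAIL      0     0     0  was /dev/da8p2
              da0p2               ONLINE       0 6.77K     0  (resilvering)
"""

for name, state in not_online(SAMPLE):
    print(name, state)
```

The sketch ignores the error counters, so a device such as da0p2 that is ONLINE but accumulating write errors still has to be spotted separately.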
 

arameen

Contributor
Joined
Sep 4, 2014
Messages
145
@arameen da1 has been interrupted again. da2 has had some "unknown failure". You should run SMART on all your drives and re-check the output. da4 has been interrupted too.

I ran a long test on all drives, starting 2 days ago. I have not interrupted anything, and I don't know how to interrupt a test or what would interrupt one.
@Inxsible Do you suggest running the SMART tests again? :)

I would like to mention that one of my boot USBs is failing. I tried to replace it, but FreeNAS gives me various error messages. Not sure if it has something to do with this: https://forums.freenas.org/index.php?threads/unable-to-gpt-format-the-disk.49362/
I am mentioning this because FreeNAS may have renamed the drives once I inserted a new USB while trying to replace the failing one.
Several times I have had issues with FreeNAS not showing drives in the GUI. Just a theory, maybe irrelevant.
But I can assure you that I ran a long test on every drive I found in the GUI :confused:
As for whether the machine rebooted, it does not seem so. Uptime is 5 days and I started the tests less than 2 days ago, so something else is interrupting some of the tests :confused:
 

Inxsible

Guru
Joined
Aug 14, 2017
Messages
1,123
so something else is interrupting some tests
You need to figure out what that something is.

It could be the drives going to sleep, or some setting that puts the machine into sleep/hibernate/power-save mode after a certain period of time. Check everything: BIOS settings, controller card settings, etc.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
That is very difficult given the umpteen options that are available on every server. For example, some server boards have IPMI, some don't. Some have multiple processors, some don't. Some have multiple FreeNAS systems using replication, some don't. And so on and so forth...

But the basic guideline (once the server is built and is up and running) is pretty clear.
  1. You need to have SMART tests running at regular intervals (I run a short test every day, a long test every 3 days)
  2. You need to have regular scrubs of your pool. (My boot pool is every 5 days, tank is every 12 days)
  3. You need to have regular snapshots set up in case crap hits the fan.
  4. You ABSOLUTELY must set up the email in FreeNAS so you can be informed of the various things that FreeNAS is doing.
  5. Set up ssh, so you have access to the box in case the GUI doesn't work for some reason
Once all that is done, you probably won't even need to login to the GUI or into the freenas box.

Generally agree, except I think you're running smart/longs too often :)
 