SOLVED Needing some assistance in troubleshooting hard drive burn in/possible RMA


MorkaiTheWolf

Dabbler
Joined
Aug 8, 2018
Messages
32
So, to preface: I have 6x 6TB WD Red PRO drives that I bought from eBay, and I have been trying to run some burn-in tests (following this guide) before I integrate them into any kind of zpool and start putting data on them.

I've run the smartctl short and long tests, and all the drives passed without a problem. The issue begins when I try to run badblocks to make sure everything is clear. (The plan was to run the smartctl long test again afterwards, but since I got stumped on the badblocks step, I'm hesitant to proceed.)

First, the machine information:
FreeNAS version: FreeNAS-11.1-RELEASE
Motherboard: ASRock Motherboard ATX DDR3 1066 Intel LGA 2011 EP2C602-4L/D16
CPU: 2x Xeon E5-2680 v2 @ 2.80GHz
CPU Cooler: 2x Noctua i4
RAM: 56 GB (two 4GB sticks went bad that I still need to replace)
PSU: EVGA SuperNOVA 850 T2
Case: Phanteks Enthoo Pro
Storage:
6x WD Red PRO 6TB
2x Samsung 850 EVO 250GB (I have these in a current zpool and they have been working flawlessly but they were also bought brand new)

While running badblocks on all my larger drives, I found that 4 of them report output similar to the following:
Code:
1094112190
Too many bad blocks, aborting test
done
Reading and comparing: Too many bad blocks, aborting test1073741823/0 errors)
done
Testing with pattern 0x55: Too many bad blocks, aborting test1073741823/0 errors)
done
Reading and comparing: Too many bad blocks, aborting test1073741823/0 errors)
done
Testing with pattern 0xff: Too many bad blocks, aborting test1073741823/0 errors)
done
Reading and comparing: Too many bad blocks, aborting test1073741823/0 errors)
done
Testing with pattern 0x00: Too many bad blocks, aborting test1073741823/0 errors)
done
Reading and comparing: Too many bad blocks, aborting test1073741823/0 errors)
done


My concern is that this is a sign of bad drives, and I'm wondering whether I should pursue the RMA process for them. (I verified with WD shortly after buying them that there are still 3 years left on their warranty, so if need be, it should be a simple process to get them replaced.)

My drives are labeled ada0 through ada5. Here are the smartctl -a results for each:

/dev/ada0:
Code:
root@freenas:~ # smartctl -a /dev/ada0
smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:	 Western Digital Red Pro
Device Model:	 WDC WD6002FFWX-68TZ4N0
Serial Number:	K1GA0U5B
LU WWN Device Id: 5 000cca 255c48ec2
Firmware Version: 83.H0A83
User Capacity:	6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:	 512 bytes logical, 4096 bytes physical
Rotation Rate:	7200 rpm
Form Factor:	  3.5 inches
Device is:		In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:	Mon Sep 10 02:12:51 2018 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
										was completed without error.
										Auto Offline Data Collection: Enabled.
Self-test execution status:	  (   0) The previous self-test routine completed
										without error or no self-test has ever
										been run.
Total time to complete Offline
data collection:				(  113) seconds.
Offline data collection
capabilities:					(0x5b) SMART execute Offline immediate.
										Auto Offline data collection on/off support.
										Suspend Offline collection upon new
										command.
										Offline surface scan supported.
										Self-test supported.
										No Conveyance Self-test supported.
										Selective Self-test supported.
SMART capabilities:			(0x0003) Saves SMART data before entering
										power-saving mode.
										Supports SMART auto save timer.
Error logging capability:		(0x01) Error logging supported.
										General Purpose Logging supported.
Short self-test routine
recommended polling time:		(   2) minutes.
Extended self-test routine
recommended polling time:		( 813) minutes.
SCT capabilities:			  (0x003d) SCT Status supported.
										SCT Error Recovery Control supported.
										SCT Feature Control supported.
										SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate	 0x000b   100   100   016	Pre-fail  Always	   -	   0
  2 Throughput_Performance  0x0005   137   137   054	Pre-fail  Offline	  -	   104
  3 Spin_Up_Time			0x0007   144   144   024	Pre-fail  Always	   -	   443 (Average 471)
  4 Start_Stop_Count		0x0012   100   100   000	Old_age   Always	   -	   47
  5 Reallocated_Sector_Ct   0x0033   100   100   005	Pre-fail  Always	   -	   0
  7 Seek_Error_Rate		 0x000b   100   100   067	Pre-fail  Always	   -	   0
  8 Seek_Time_Performance   0x0005   128   128   020	Pre-fail  Offline	  -	   18
  9 Power_On_Hours		  0x0012   098   098   000	Old_age   Always	   -	   15270
 10 Spin_Retry_Count		0x0013   100   100   060	Pre-fail  Always	   -	   0
 12 Power_Cycle_Count	   0x0032   100   100   000	Old_age   Always	   -	   47
192 Power-Off_Retract_Count 0x0032   097   097   000	Old_age   Always	   -	   4569
193 Load_Cycle_Count		0x0012   097   097   000	Old_age   Always	   -	   4569
194 Temperature_Celsius	 0x0002   166   166   000	Old_age   Always	   -	   36 (Min/Max 23/47)
196 Reallocated_Event_Count 0x0032   100   100   000	Old_age   Always	   -	   0
197 Current_Pending_Sector  0x0022   100   100   000	Old_age   Always	   -	   0
198 Offline_Uncorrectable   0x0008   100   100   000	Old_age   Offline	  -	   0
199 UDMA_CRC_Error_Count	0x000a   200   200   000	Old_age   Always	   -	   0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline	Completed without error	   00%	 15251		 -
# 2  Short offline	   Completed without error	   00%	 15238		 -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
	1		0		0  Not_testing
	2		0		0  Not_testing
	3		0		0  Not_testing
	4		0		0  Not_testing
	5		0		0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.



/dev/ada1:
Code:
root@freenas:~ # smartctl -a /dev/ada1
smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:	 Western Digital Red Pro
Device Model:	 WDC WD6002FFWX-68TZ4N0
Serial Number:	K1G9YGJB
LU WWN Device Id: 5 000cca 255c485f6
Firmware Version: 83.H0A83
User Capacity:	6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:	 512 bytes logical, 4096 bytes physical
Rotation Rate:	7200 rpm
Form Factor:	  3.5 inches
Device is:		In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:	Mon Sep 10 02:14:45 2018 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
										was completed without error.
										Auto Offline Data Collection: Enabled.
Self-test execution status:	  (   0) The previous self-test routine completed
										without error or no self-test has ever
										been run.
Total time to complete Offline
data collection:				(  113) seconds.
Offline data collection
capabilities:					(0x5b) SMART execute Offline immediate.
										Auto Offline data collection on/off support.
										Suspend Offline collection upon new
										command.
										Offline surface scan supported.
										Self-test supported.
										No Conveyance Self-test supported.
										Selective Self-test supported.
SMART capabilities:			(0x0003) Saves SMART data before entering
										power-saving mode.
										Supports SMART auto save timer.
Error logging capability:		(0x01) Error logging supported.
										General Purpose Logging supported.
Short self-test routine
recommended polling time:		(   2) minutes.
Extended self-test routine
recommended polling time:		( 847) minutes.
SCT capabilities:			  (0x003d) SCT Status supported.
										SCT Error Recovery Control supported.
										SCT Feature Control supported.
										SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate	 0x000b   100   100   016	Pre-fail  Always	   -	   0
  2 Throughput_Performance  0x0005   137   137   054	Pre-fail  Offline	  -	   104
  3 Spin_Up_Time			0x0007   141   141   024	Pre-fail  Always	   -	   451 (Average 480)
  4 Start_Stop_Count		0x0012   100   100   000	Old_age   Always	   -	   47
  5 Reallocated_Sector_Ct   0x0033   100   100   005	Pre-fail  Always	   -	   0
  7 Seek_Error_Rate		 0x000b   100   100   067	Pre-fail  Always	   -	   0
  8 Seek_Time_Performance   0x0005   128   128   020	Pre-fail  Offline	  -	   18
  9 Power_On_Hours		  0x0012   098   098   000	Old_age   Always	   -	   15271
 10 Spin_Retry_Count		0x0013   100   100   060	Pre-fail  Always	   -	   0
 12 Power_Cycle_Count	   0x0032   100   100   000	Old_age   Always	   -	   47
192 Power-Off_Retract_Count 0x0032   095   095   000	Old_age   Always	   -	   6157
193 Load_Cycle_Count		0x0012   095   095   000	Old_age   Always	   -	   6157
194 Temperature_Celsius	 0x0002   176   176   000	Old_age   Always	   -	   34 (Min/Max 23/45)
196 Reallocated_Event_Count 0x0032   100   100   000	Old_age   Always	   -	   0
197 Current_Pending_Sector  0x0022   100   100   000	Old_age   Always	   -	   0
198 Offline_Uncorrectable   0x0008   100   100   000	Old_age   Offline	  -	   0
199 UDMA_CRC_Error_Count	0x000a   200   200   000	Old_age   Always	   -	   0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline	Completed without error	   00%	 15252		 -
# 2  Short offline	   Completed without error	   00%	 15239		 -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
	1		0		0  Not_testing
	2		0		0  Not_testing
	3		0		0  Not_testing
	4		0		0  Not_testing
	5		0		0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


/dev/ada2:
Code:
root@freenas:~ # smartctl -a /dev/ada2
smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:	 Western Digital Red Pro
Device Model:	 WDC WD6002FFWX-68TZ4N0
Serial Number:	K1G9GLUB
LU WWN Device Id: 5 000cca 255c44e2c
Firmware Version: 83.H0A83
User Capacity:	6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:	 512 bytes logical, 4096 bytes physical
Rotation Rate:	7200 rpm
Form Factor:	  3.5 inches
Device is:		In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:	Mon Sep 10 02:15:42 2018 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
										was completed without error.
										Auto Offline Data Collection: Enabled.
Self-test execution status:	  (   0) The previous self-test routine completed
										without error or no self-test has ever
										been run.
Total time to complete Offline
data collection:				(  113) seconds.
Offline data collection
capabilities:					(0x5b) SMART execute Offline immediate.
										Auto Offline data collection on/off support.
										Suspend Offline collection upon new
										command.
										Offline surface scan supported.
										Self-test supported.
										No Conveyance Self-test supported.
										Selective Self-test supported.
SMART capabilities:			(0x0003) Saves SMART data before entering
										power-saving mode.
										Supports SMART auto save timer.
Error logging capability:		(0x01) Error logging supported.
										General Purpose Logging supported.
Short self-test routine
recommended polling time:		(   2) minutes.
Extended self-test routine
recommended polling time:		( 813) minutes.
SCT capabilities:			  (0x003d) SCT Status supported.
										SCT Error Recovery Control supported.
										SCT Feature Control supported.
										SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate	 0x000b   100   100   016	Pre-fail  Always	   -	   0
  2 Throughput_Performance  0x0005   136   136   054	Pre-fail  Offline	  -	   108
  3 Spin_Up_Time			0x0007   146   146   024	Pre-fail  Always	   -	   435 (Average 464)
  4 Start_Stop_Count		0x0012   100   100   000	Old_age   Always	   -	   47
  5 Reallocated_Sector_Ct   0x0033   100   100   005	Pre-fail  Always	   -	   0
  7 Seek_Error_Rate		 0x000b   100   100   067	Pre-fail  Always	   -	   0
  8 Seek_Time_Performance   0x0005   128   128   020	Pre-fail  Offline	  -	   18
  9 Power_On_Hours		  0x0012   098   098   000	Old_age   Always	   -	   15271
 10 Spin_Retry_Count		0x0013   100   100   060	Pre-fail  Always	   -	   0
 12 Power_Cycle_Count	   0x0032   100   100   000	Old_age   Always	   -	   47
192 Power-Off_Retract_Count 0x0032   092   092   000	Old_age   Always	   -	   9916
193 Load_Cycle_Count		0x0012   092   092   000	Old_age   Always	   -	   9916
194 Temperature_Celsius	 0x0002   166   166   000	Old_age   Always	   -	   36 (Min/Max 23/46)
196 Reallocated_Event_Count 0x0032   100   100   000	Old_age   Always	   -	   0
197 Current_Pending_Sector  0x0022   100   100   000	Old_age   Always	   -	   0
198 Offline_Uncorrectable   0x0008   100   100   000	Old_age   Offline	  -	   0
199 UDMA_CRC_Error_Count	0x000a   200   200   000	Old_age   Always	   -	   0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline	Completed without error	   00%	 15251		 -
# 2  Short offline	   Completed without error	   00%	 15238		 -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
	1		0		0  Not_testing
	2		0		0  Not_testing
	3		0		0  Not_testing
	4		0		0  Not_testing
	5		0		0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


/dev/ada3:
Code:
root@freenas:~ # smartctl -a /dev/ada3
smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:	 Western Digital Red Pro
Device Model:	 WDC WD6002FFWX-68TZ4N0
Serial Number:	K1GA0GTB
LU WWN Device Id: 5 000cca 255c48d80
Firmware Version: 83.H0A83
User Capacity:	6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:	 512 bytes logical, 4096 bytes physical
Rotation Rate:	7200 rpm
Form Factor:	  3.5 inches
Device is:		In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:	Mon Sep 10 02:16:19 2018 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
										was completed without error.
										Auto Offline Data Collection: Enabled.
Self-test execution status:	  (   0) The previous self-test routine completed
										without error or no self-test has ever
										been run.
Total time to complete Offline
data collection:				(  113) seconds.
Offline data collection
capabilities:					(0x5b) SMART execute Offline immediate.
										Auto Offline data collection on/off support.
										Suspend Offline collection upon new
										command.
										Offline surface scan supported.
										Self-test supported.
										No Conveyance Self-test supported.
										Selective Self-test supported.
SMART capabilities:			(0x0003) Saves SMART data before entering
										power-saving mode.
										Supports SMART auto save timer.
Error logging capability:		(0x01) Error logging supported.
										General Purpose Logging supported.
Short self-test routine
recommended polling time:		(   2) minutes.
Extended self-test routine
recommended polling time:		( 808) minutes.
SCT capabilities:			  (0x003d) SCT Status supported.
										SCT Error Recovery Control supported.
										SCT Feature Control supported.
										SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate	 0x000b   100   100   016	Pre-fail  Always	   -	   0
  2 Throughput_Performance  0x0005   137   137   054	Pre-fail  Offline	  -	   104
  3 Spin_Up_Time			0x0007   141   141   024	Pre-fail  Always	   -	   452 (Average 481)
  4 Start_Stop_Count		0x0012   100   100   000	Old_age   Always	   -	   47
  5 Reallocated_Sector_Ct   0x0033   100   100   005	Pre-fail  Always	   -	   0
  7 Seek_Error_Rate		 0x000b   100   100   067	Pre-fail  Always	   -	   0
  8 Seek_Time_Performance   0x0005   128   128   020	Pre-fail  Offline	  -	   18
  9 Power_On_Hours		  0x0012   098   098   000	Old_age   Always	   -	   15270
 10 Spin_Retry_Count		0x0013   100   100   060	Pre-fail  Always	   -	   0
 12 Power_Cycle_Count	   0x0032   100   100   000	Old_age   Always	   -	   47
192 Power-Off_Retract_Count 0x0032   093   093   000	Old_age   Always	   -	   8741
193 Load_Cycle_Count		0x0012   093   093   000	Old_age   Always	   -	   8741
194 Temperature_Celsius	 0x0002   166   166   000	Old_age   Always	   -	   36 (Min/Max 23/45)
196 Reallocated_Event_Count 0x0032   100   100   000	Old_age   Always	   -	   0
197 Current_Pending_Sector  0x0022   100   100   000	Old_age   Always	   -	   0
198 Offline_Uncorrectable   0x0008   100   100   000	Old_age   Offline	  -	   0
199 UDMA_CRC_Error_Count	0x000a   200   200   000	Old_age   Always	   -	   0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline	Completed without error	   00%	 15251		 -
# 2  Short offline	   Completed without error	   00%	 15238		 -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
	1		0		0  Not_testing
	2		0		0  Not_testing
	3		0		0  Not_testing
	4		0		0  Not_testing
	5		0		0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


/dev/ada4:
Code:
root@freenas:~ # smartctl -a /dev/ada4
smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:	 Western Digital Red Pro
Device Model:	 WDC WD6002FFWX-68TZ4N0
Serial Number:	K1GA2N9B
LU WWN Device Id: 5 000cca 255c495ad
Firmware Version: 83.H0A83
User Capacity:	6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:	 512 bytes logical, 4096 bytes physical
Rotation Rate:	7200 rpm
Form Factor:	  3.5 inches
Device is:		In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:	Mon Sep 10 02:16:53 2018 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
										was completed without error.
										Auto Offline Data Collection: Enabled.
Self-test execution status:	  (   0) The previous self-test routine completed
										without error or no self-test has ever
										been run.
Total time to complete Offline
data collection:				(  113) seconds.
Offline data collection
capabilities:					(0x5b) SMART execute Offline immediate.
										Auto Offline data collection on/off support.
										Suspend Offline collection upon new
										command.
										Offline surface scan supported.
										Self-test supported.
										No Conveyance Self-test supported.
										Selective Self-test supported.
SMART capabilities:			(0x0003) Saves SMART data before entering
										power-saving mode.
										Supports SMART auto save timer.
Error logging capability:		(0x01) Error logging supported.
										General Purpose Logging supported.
Short self-test routine
recommended polling time:		(   2) minutes.
Extended self-test routine
recommended polling time:		( 813) minutes.
SCT capabilities:			  (0x003d) SCT Status supported.
										SCT Error Recovery Control supported.
										SCT Feature Control supported.
										SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate	 0x000b   100   100   016	Pre-fail  Always	   -	   0
  2 Throughput_Performance  0x0005   137   137   054	Pre-fail  Offline	  -	   104
  3 Spin_Up_Time			0x0007   143   143   024	Pre-fail  Always	   -	   445 (Average 475)
  4 Start_Stop_Count		0x0012   100   100   000	Old_age   Always	   -	   47
  5 Reallocated_Sector_Ct   0x0033   100   100   005	Pre-fail  Always	   -	   0
  7 Seek_Error_Rate		 0x000b   100   100   067	Pre-fail  Always	   -	   0
  8 Seek_Time_Performance   0x0005   128   128   020	Pre-fail  Offline	  -	   18
  9 Power_On_Hours		  0x0012   098   098   000	Old_age   Always	   -	   15000
 10 Spin_Retry_Count		0x0013   100   100   060	Pre-fail  Always	   -	   0
 12 Power_Cycle_Count	   0x0032   100   100   000	Old_age   Always	   -	   47
192 Power-Off_Retract_Count 0x0032   100   100   000	Old_age   Always	   -	   187
193 Load_Cycle_Count		0x0012   100   100   000	Old_age   Always	   -	   187
194 Temperature_Celsius	 0x0002   166   166   000	Old_age   Always	   -	   36 (Min/Max 23/45)
196 Reallocated_Event_Count 0x0032   100   100   000	Old_age   Always	   -	   0
197 Current_Pending_Sector  0x0022   100   100   000	Old_age   Always	   -	   0
198 Offline_Uncorrectable   0x0008   100   100   000	Old_age   Offline	  -	   0
199 UDMA_CRC_Error_Count	0x000a   200   200   000	Old_age   Always	   -	   0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline	Completed without error	   00%	 14981		 -
# 2  Short offline	   Completed without error	   00%	 14967		 -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
	1		0		0  Not_testing
	2		0		0  Not_testing
	3		0		0  Not_testing
	4		0		0  Not_testing
	5		0		0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


/dev/ada5:
Code:
root@freenas:~ # smartctl -a /dev/ada5
smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:	 Western Digital Red Pro
Device Model:	 WDC WD6002FFWX-68TZ4N0
Serial Number:	K1G9XRGB
LU WWN Device Id: 5 000cca 255c4832b
Firmware Version: 83.H0A83
User Capacity:	6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:	 512 bytes logical, 4096 bytes physical
Rotation Rate:	7200 rpm
Form Factor:	  3.5 inches
Device is:		In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:	Mon Sep 10 02:17:35 2018 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
										was completed without error.
										Auto Offline Data Collection: Enabled.
Self-test execution status:	  (   0) The previous self-test routine completed
										without error or no self-test has ever
										been run.
Total time to complete Offline
data collection:				(  113) seconds.
Offline data collection
capabilities:					(0x5b) SMART execute Offline immediate.
										Auto Offline data collection on/off support.
										Suspend Offline collection upon new
										command.
										Offline surface scan supported.
										Self-test supported.
										No Conveyance Self-test supported.
										Selective Self-test supported.
SMART capabilities:			(0x0003) Saves SMART data before entering
										power-saving mode.
										Supports SMART auto save timer.
Error logging capability:		(0x01) Error logging supported.
										General Purpose Logging supported.
Short self-test routine
recommended polling time:		(   2) minutes.
Extended self-test routine
recommended polling time:		( 797) minutes.
SCT capabilities:			  (0x003d) SCT Status supported.
										SCT Error Recovery Control supported.
										SCT Feature Control supported.
										SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate	 0x000b   100   100   016	Pre-fail  Always	   -	   0
  2 Throughput_Performance  0x0005   138   138   054	Pre-fail  Offline	  -	   100
  3 Spin_Up_Time			0x0007   147   147   024	Pre-fail  Always	   -	   435 (Average 462)
  4 Start_Stop_Count		0x0012   100   100   000	Old_age   Always	   -	   47
  5 Reallocated_Sector_Ct   0x0033   100   100   005	Pre-fail  Always	   -	   0
  7 Seek_Error_Rate		 0x000b   100   100   067	Pre-fail  Always	   -	   0
  8 Seek_Time_Performance   0x0005   128   128   020	Pre-fail  Offline	  -	   18
  9 Power_On_Hours		  0x0012   098   098   000	Old_age   Always	   -	   15271
 10 Spin_Retry_Count		0x0013   100   100   060	Pre-fail  Always	   -	   0
 12 Power_Cycle_Count	   0x0032   100   100   000	Old_age   Always	   -	   47
192 Power-Off_Retract_Count 0x0032   096   096   000	Old_age   Always	   -	   5421
193 Load_Cycle_Count		0x0012   096   096   000	Old_age   Always	   -	   5421
194 Temperature_Celsius	 0x0002   157   157   000	Old_age   Always	   -	   38 (Min/Max 23/43)
196 Reallocated_Event_Count 0x0032   100   100   000	Old_age   Always	   -	   0
197 Current_Pending_Sector  0x0022   100   100   000	Old_age   Always	   -	   0
198 Offline_Uncorrectable   0x0008   100   100   000	Old_age   Offline	  -	   0
199 UDMA_CRC_Error_Count	0x000a   200   200   000	Old_age   Always	   -	   0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline	Completed without error	   00%	 15252		 -
# 2  Short offline	   Completed without error	   00%	 15239		 -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
	1		0		0  Not_testing
	2		0		0  Not_testing
	3		0		0  Not_testing
	4		0		0  Not_testing
	5		0		0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


Sorry for what feels like a wall of text for my first major post, but I figured more information is always better than less.

I'm open to any thoughts or suggestions. Figured I could try to exhaust any other leads before pursuing the RMA process.
 
Joined
Jul 3, 2015
Messages
926
From a quick glance at your SMART stats, your drives look fine. Have you already made a zpool out of these disks? I've noticed before that if the disks are in a pool when you run badblocks, you can get errors. If so, and assuming you don't have data on your pool (which you shouldn't if you're doing a burn-in with badblocks), then destroy your pool and test again.
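
As a rough illustration of that quick glance, here is a minimal Python sketch that pulls the raw values out of a smartctl attribute table and lists any watched attribute that is not zero. The shortlist of IDs (5, 197, 198, 199) and the helper names are assumptions chosen for illustration, not something prescribed in this thread; the sample rows are copied from the ada0 report above.

```python
# Attribute IDs to watch. This shortlist is an assumption for illustration,
# not something prescribed in this thread.
WATCHED_IDS = {5, 197, 198, 199}

# Rows copied from the ada0 report in the first post.
ADA0_ROWS = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0012   098   098   000    Old_age   Always       -       15270
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
"""


def parse_attributes(report):
    """Map attribute ID -> (name, raw value) for each attribute row."""
    attributes = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and start with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        try:
            raw = int(fields[9])
        except ValueError:
            continue  # skip raw values that are not plain integers
        attributes[int(fields[0])] = (fields[1], raw)
    return attributes


def nonzero_watched(attributes):
    """Return {name: raw} for watched attributes whose raw value is not 0."""
    return {
        name: raw
        for attr_id, (name, raw) in attributes.items()
        if attr_id in WATCHED_IDS and raw != 0
    }


# An empty result means none of the watched counters have moved.
print(nonzero_watched(parse_attributes(ADA0_ROWS)))
```

An empty dictionary only says that these particular counters are at zero. It does not replace the self-test log or a full surface test.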
 

MorkaiTheWolf

Dabbler
Joined
Aug 8, 2018
Messages
32
I did make a zpool with them shortly after completing my build, because I didn't read all the documentation I should have. I detached the zpool completely and rebooted before running the SMART tests and attempting badblocks. I figured it would be prudent to do some burn-in testing before putting any data on them.
 
Joined
Jul 3, 2015
Messages
926
When you say you 'detached the zpool', did you destroy it and mark the disks as new?
 

MorkaiTheWolf

Dabbler
Joined
Aug 8, 2018
Messages
32
Yes, I did.
 
Joined
Jul 3, 2015
Messages
926
OK, fair enough. I'm not sure then.
 
Joined
Jul 3, 2015
Messages
926
Did you run this command before you started the badblocks test?

sysctl kern.geom.debugflags=0x10
 

MorkaiTheWolf

Dabbler
Joined
Aug 8, 2018
Messages
32
I had seen that referenced in the hard drive burn-in thread, but it seemed to be unnecessary. I can certainly try again on one of the drives with that flag set.

For reference, this is what made me think it wasn't necessary:
"To summarize, this option should generally not be needed. It only makes it possible to harm data. Any disk you are going to overwrite with data should not be mounted or have anything you wish to keep. In fact, best practice is to not be erasing or stress-testing drives on a system that has actual data on it. Since those disks will not have mounted filesystems, this sysctl will not affect being able to write to them. In fact, it will only make it possible to blow away things that are in use."
 
Joined
Jul 3, 2015
Messages
926
You may be right and tbh I'm clutching at straws but worth a try on one.
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
All that sysctl does is make it possible to overwrite data that is in use (mounted partitions) on a drive under test. But if it's being tested, nothing should be mounted, and if something is mounted, it shouldn't be under test and definitely should not be overwritten.
 

MorkaiTheWolf

Dabbler
Joined
Aug 8, 2018
Messages
32
I will say that it's been running for over 11 hours with no errors so far.

@wblock, can you explain a bit more about what you mean by 'mounted partitions'? I have the drives connected but they are not in use by anything, and I have detached the volume that I had previously.
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
The FreeBSD disk GEOM system has a safety to prevent writing to partitions which are in use. Generally, that means a partition with a filesystem that has been mounted, but can mean other things, like a plain partition being used by another GEOM device like geli(8).

In any case, the idea is to provide a safety: it is generally a bad idea to allow direct writes to a disk device which is already being used. When a user tries to write to one of these devices without going through the thing that has it open, the write is prevented and an error is shown. An example would be having a filesystem partition mounted and trying to write to the partition directly with dd. That will likely corrupt the filesystem.

Setting the sysctl to that value disables the safety. After that, data can be written to in-use disks and partitions. GEOM has been told to not protect that data, so it doesn't.

That sysctl is rarely needed. If something is already using a disk or partition and you try to write directly to it, that safety can save your data and give you an indicator that you should unmount whatever is using that disk or partition before proceeding to overwrite it.
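
As a rough illustration of the point above, the current value of the sysctl can be read before a test to confirm the safety is still on. This is only a minimal sketch: the helper name geom_safety_disabled is hypothetical, and it assumes the 0x10 bit is the only flag of interest.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when bit 0x10 is set in the given
# kern.geom.debugflags value, i.e. when GEOM's write protection is disabled.
geom_safety_disabled() {
    [ $(( $1 & 0x10 )) -ne 0 ]
}

# On a FreeBSD/FreeNAS host, read the live value; elsewhere fall back to 0.
flags=$(sysctl -n kern.geom.debugflags 2>/dev/null || echo 0)

if geom_safety_disabled "$flags"; then
    echo "GEOM safety is OFF: writes to in-use disks are allowed"
else
    echo "GEOM safety is ON: writes to in-use disks will be refused"
fi
```

If the safety is on and badblocks is refused access to a disk, that is a hint that something still has the disk open.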
 

MorkaiTheWolf

Dabbler
Joined
Aug 8, 2018
Messages
32
Thanks for the info @wblock, that makes it a bit easier to understand what's going on.
Secondary question: how would I check to see what could be using those partitions? I had assumed that detaching the volume and marking the disks as new would have unmounted them.
 

MorkaiTheWolf

Dabbler
Joined
Aug 8, 2018
Messages
32
Badblocks test finished on the first of 6 disks. Finished with 0 errors and all tests passed.
Time to move on to the next one and make sure they all come back clean.
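
The burn-in plan described at the top of the thread calls for another long SMART test after badblocks. Here is a minimal sketch of checking the attributes afterwards, assuming smartctl -A prints its usual attribute table with the raw value in the tenth column. The function name nonzero_error_attrs is made up for illustration and the device name is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: reads `smartctl -A` output on stdin and prints the
# name and raw value of the error-related attributes that are non-zero.
nonzero_error_attrs() {
    awk '
        $2 == "Reallocated_Sector_Ct" ||
        $2 == "Current_Pending_Sector" ||
        $2 == "Offline_Uncorrectable" {
            if ($10 + 0 != 0) print $2, $10
        }
    '
}

# Typical use on the NAS itself (placeholder device name):
#   smartctl -t long /dev/ada1      # start the long self-test
#   smartctl -A /dev/ada1 | nonzero_error_attrs
```

No output means all three counters are zero. Anything printed is worth a closer look before the drive goes into a pool.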
 
Joined
Jul 3, 2015
Messages
926
:D
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
how would I check to see what could be using those partitions?
mount will usually be enough to see which partitions are mounted. But there are GEOM commands that can help, like gpart list and gpart status (these are both just special-case versions of geom part list and geom part status). There are other classes that can be seen. On FreeNAS, the ELI class would be worth looking at if there were encrypted partitions. See geom(8).
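
The checks mentioned above can be gathered into a small sketch. The commands mount, gpart status, and geom eli status are the ones named in this post; the helper mounts_on_disk and the disk name ada1 are placeholders for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: reads `mount` output on stdin and prints the lines
# whose device is the given disk or one of its partitions (e.g. ada1, ada1p2).
mounts_on_disk() {
    grep -E "^/dev/$1([ps][0-9]+)?[a-h]? " || true
}

# Typical use on the NAS (placeholder disk name):
#   mount | mounts_on_disk ada1     # mounted filesystems on the disk
#   gpart status                    # state of every partition table
#   geom eli status                 # any attached geli (encrypted) providers
```

ZFS datasets appear in mount by pool name rather than device name, so zpool status is the place to check whether a pool still holds the disk.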
 

MorkaiTheWolf

Dabbler
Joined
Aug 8, 2018
Messages
32
I'm of the mindset that if someone posts seeking assistance and gets it resolved that they should share what fixed their problems.
I found that all of the SATA ports on my motherboard I was using are controlled by a Marvell controller (Marvell SE9230 specifically).

Digging into it some more, it appears that FreeNAS has plenty of issues with this controller, which was causing the weird errors I saw when running badblocks because FreeNAS would lose communication and unmount the device (see similar). I had previously purchased an LSI HBA (LSI 9210-8i, flashed to IT mode) for use in a different build, and to rule out whether it was indeed the controller or something else, I installed it and attached all 6 drives through it.

Sure enough, FreeNAS has been running with 0 errors with those drives on the HBA since September 17th. I'm doing some script testing before migrating my data and deploying this to my production environment.
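
For anyone wanting to check their own board, one way to see whether a Marvell chip sits behind the SATA ports is to look at the PCI device list. This is a sketch under assumptions: it relies on pciconf -lv printing a device line followed by indented "vendor = ..." lines, and the function name marvell_controllers is made up here.

```shell
#!/bin/sh
# Hypothetical helper: reads `pciconf -lv` output on stdin and prints the
# driver name (e.g. ahci0) of every device whose vendor line mentions Marvell.
marvell_controllers() {
    awk '
        /^[^ \t]/ { dev = $1; sub(/@.*/, "", dev) }   # unindented line: new device
        /vendor/ && /Marvell/ { print dev }            # indented vendor line
    '
}

# Typical use on the NAS:
#   pciconf -lv | marvell_controllers
#   camcontrol devlist -v           # shows which bus each disk is attached to
```

Matching the controller name against the bus shown by camcontrol devlist -v should indicate which disks are attached to it.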

I will gladly bite the bullet and say this was my own problem and I should have read the documentation in much more detail (like this one and this one).

Best of luck to anyone who sees similar symptoms. And a huge thank you to these forums. I'll try to contribute as much as I can because this system is really robust once you get over the steep(ish) learning curve.

Consider this one resolved.
 

pro lamer

Guru
Joined
Feb 16, 2018
Messages
626

MorkaiTheWolf

Dabbler
Joined
Aug 8, 2018
Messages
32
Why only one at a time? Have I missed or misread something important?

I was only testing one drive at a time because I wasn't as familiar with tmux and kept terminating my sessions rather than detaching from them.

Much more familiar now though. :)
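
For anyone in the same spot, here is a rough sketch of testing several drives in parallel with one tmux window per disk. The helper burnin_plan, the session name burnin, and the disk names are all placeholders, and the -b 4096 block size is an assumption for large drives rather than something stated in this thread. The function only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the tmux commands that would
# start one destructive badblocks window per disk in a session named "burnin".
burnin_plan() {
    echo "tmux new-session -d -s burnin"
    for disk in "$@"; do
        echo "tmux new-window -t burnin -n $disk 'badblocks -b 4096 -ws /dev/$disk'"
    done
    echo "tmux attach -t burnin"
}

# Review the plan first; badblocks -w erases everything on the disk.
burnin_plan ada0 ada1
```

Once attached, Ctrl-b then d detaches and leaves the tests running, and tmux attach -t burnin reconnects later. A window closes when its command exits, so redirecting the output to a log file may be worthwhile to keep the result visible.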
 