Do I have to change these HDs?

Ahmed Badr · Oct 24, 2016

Hi,

I have a white box FreeNAS system with 6 SATA drives configured in RAIDZ2. The alert system just surprised me with critical errors on 2 drives. I'm officially panicking. Some of the data on the system is not backed up, but I'm afraid to offload it to another computer. Will the load of copying a couple of TB from a degraded system push it over the edge? That happened to me in the past with Linux MD.

But the important questions are:

Do I need to replace these 2 drives, or is there a way to fix these errors? smartctl output says both drives PASSED.
If I have to replace them, do I replace them both at once or one at a time? Which one do I replace first, the one with 1 uncorrectable sector or the one with 48 pending and 41 uncorrectable sectors?

I've read in another post that I can use the command "dd" to write over the bad sectors and that ZFS will rebuild the data and write it on a new sector. Any advice or write-ups on this process? The links in the post are long gone.

Here are the alerts and the smartctl outputs for both drives.

Alert System:

Code:

CRITICAL: Oct. 23, 2016, 2:22 p.m. - Device: /dev/ada4, 1 Offline uncorrectable sectors
CRITICAL: Oct. 23, 2016, 2:22 p.m. - Device: /dev/ada5, 48 Currently unreadable (pending) sectors
CRITICAL: Oct. 23, 2016, 2:22 p.m. - Device: /dev/ada5, 41 Offline uncorrectable sectors

smartctl 6.5 2016-05-07 r4318 [FreeBSD 10.3-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Green
Device Model: WDC WD30EZRX-00SPEB0
Firmware Version: 80.00A80
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2 (minor revision not indicated)
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sun Oct 23 13:38:23 2016 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (43200) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 433) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x7035) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 40
3 Spin_Up_Time 0x0027 178 176 021 Pre-fail Always - 8100
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 54
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 100 253 000 Old_age Always - 0
9 Power_On_Hours 0x0032 088 088 000 Old_age Always - 9293
10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 54
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 31
193 Load_Cycle_Count 0x0032 001 001 000 Old_age Always - 878843
194 Temperature_Celsius 0x0022 112 103 000 Old_age Always - 40
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 1
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 13

SMART Error Log Version: 1
ATA Error Count: 13 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 13 occurred at disk power-on lifetime: 181 hours (7 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 00 00 00 00 40 Device Fault; Error: ABRT

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ef aa 00 00 00 00 40 00 02:08:03.666 SET FEATURES [Enable read look-ahead]
ef 03 46 00 00 00 40 00 02:08:03.663 SET FEATURES [Set transfer mode]
ef 03 46 00 00 00 40 00 02:08:03.661 SET FEATURES [Set transfer mode]
ec 00 00 00 00 00 40 00 02:08:03.660 IDENTIFY DEVICE

Error 12 occurred at disk power-on lifetime: 181 hours (7 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 46 00 00 00 40 Device Fault; Error: ABRT

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ef 03 46 00 00 00 40 00 02:08:03.663 SET FEATURES [Set transfer mode]
ef 03 46 00 00 00 40 00 02:08:03.661 SET FEATURES [Set transfer mode]
ec 00 00 00 00 00 40 00 02:08:03.660 IDENTIFY DEVICE

Error 11 occurred at disk power-on lifetime: 181 hours (7 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 46 00 00 00 40 Device Fault; Error: ABRT

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ef 03 46 00 00 00 40 00 02:08:03.661 SET FEATURES [Set transfer mode]
ec 00 00 00 00 00 40 00 02:08:03.660 IDENTIFY DEVICE
ca 00 08 a0 87 eb 42 08 02:06:32.335 WRITE DMA

Error 10 occurred at disk power-on lifetime: 175 hours (7 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 e0 a0 00 40 40 Device Fault; Error: ABRT 224 sectors at LBA = 0x004000a0 = 4194464

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 e0 a0 00 40 40 00 00:04:43.100 READ DMA
ef aa 00 00 00 00 40 00 00:04:43.098 SET FEATURES [Enable read look-ahead]
ef 03 46 00 00 00 40 00 00:04:43.095 SET FEATURES [Set transfer mode]
ef 03 46 00 00 00 40 00 00:04:43.093 SET FEATURES [Set transfer mode]
ec 00 00 00 00 00 40 00 00:04:43.092 IDENTIFY DEVICE

Error 9 occurred at disk power-on lifetime: 175 hours (7 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 00 00 00 00 40 Device Fault; Error: ABRT

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ef aa 00 00 00 00 40 00 00:04:43.098 SET FEATURES [Enable read look-ahead]
ef 03 46 00 00 00 40 00 00:04:43.095 SET FEATURES [Set transfer mode]
ef 03 46 00 00 00 40 00 00:04:43.093 SET FEATURES [Set transfer mode]
ec 00 00 00 00 00 40 00 00:04:43.092 IDENTIFY DEVICE
c8 00 e0 a0 00 40 40 08 00:03:11.767 READ DMA

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 5941 -
# 2 Short offline Completed without error 00% 5940 -
# 3 Short offline Completed without error 00% 5939 -
# 4 Short offline Completed without error 00% 5938 -
# 5 Short offline Completed without error 00% 5937 -
# 6 Short offline Completed without error 00% 5936 -
# 7 Short offline Completed without error 00% 5935 -
# 8 Short offline Completed without error 00% 5934 -
# 9 Short offline Completed without error 00% 5933 -
#10 Short offline Completed without error 00% 5932 -
#11 Short offline Completed without error 00% 5931 -
#12 Short offline Completed without error 00% 5930 -
#13 Short offline Completed without error 00% 5929 -
#14 Short offline Completed without error 00% 5928 -
#15 Short offline Completed without error 00% 5927 -
#16 Short offline Completed without error 00% 5926 -
#17 Short offline Completed without error 00% 5925 -
#18 Short offline Completed without error 00% 5924 -
#19 Short offline Completed without error 00% 5923 -
#20 Short offline Completed without error 00% 5922 -
#21 Short offline Completed without error 00% 5921 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-05-07 r4318 [FreeBSD 10.3-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Green
Device Model: WDC WD30EZRX-00SPEB0
Firmware Version: 80.00A80
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2 (minor revision not indicated)
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sun Oct 23 13:38:37 2016 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (40200) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 403) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x7035) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 197 197 051 Pre-fail Always - 7249
3 Spin_Up_Time 0x0027 178 175 021 Pre-fail Always - 8058
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 52
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 100 253 000 Old_age Always - 0
9 Power_On_Hours 0x0032 088 088 000 Old_age Always - 9289
10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 52
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 29
193 Load_Cycle_Count 0x0032 001 001 000 Old_age Always - 882361
194 Temperature_Celsius 0x0022 114 105 000 Old_age Always - 38
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 48
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 41
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 2
200 Multi_Zone_Error_Rate 0x0008 184 184 000 Old_age Offline - 6600

SMART Error Log Version: 1
ATA Error Count: 3
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 3 occurred at disk power-on lifetime: 130 hours (5 days + 10 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 00 00 00 00 40 Device Fault; Error: ABRT

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ef aa 00 00 00 00 40 00 00:21:00.289 SET FEATURES [Enable read look-ahead]
ef 03 46 00 00 00 40 00 00:21:00.286 SET FEATURES [Set transfer mode]
ef 03 46 00 00 00 40 00 00:21:00.284 SET FEATURES [Set transfer mode]
ec 00 00 00 00 00 40 00 00:21:00.284 IDENTIFY DEVICE

Error 2 occurred at disk power-on lifetime: 130 hours (5 days + 10 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 46 00 00 00 40 Device Fault; Error: ABRT

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ef 03 46 00 00 00 40 00 00:21:00.286 SET FEATURES [Set transfer mode]
ef 03 46 00 00 00 40 00 00:21:00.284 SET FEATURES [Set transfer mode]
ec 00 00 00 00 00 40 00 00:21:00.284 IDENTIFY DEVICE

Error 1 occurred at disk power-on lifetime: 130 hours (5 days + 10 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 46 00 00 00 40 Device Fault; Error: ABRT

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ef 03 46 00 00 00 40 00 00:21:00.284 SET FEATURES [Set transfer mode]
ec 00 00 00 00 00 40 00 00:21:00.284 IDENTIFY DEVICE

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 285 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

I appreciate your advice.
Thanks

skyline65 · Oct 24, 2016

I'm sure the experts will help you out.
One thing that strikes me is the:
Load_Cycle_Count 0x0032 001 001 000 Old_age Always - 882361 after 9289 hours

My WD Greens have a Load_Cycle_Count of 2300 and have been powered 24000 hours

Did you run the WDIDLE on the WD Greens?

https://forums.freenas.org/index.php?threads/wd-green-load-cycle-question.16912/

I wouldn’t use/do this until you have been advised about your disk problem first!
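
To put the comparison above in numbers, here is a rough Python sketch that divides the raw Load_Cycle_Count by Power_On_Hours. The function name is made up for illustration; the figures are the ones quoted in this thread.

```python
def load_cycles_per_hour(load_cycle_count: int, power_on_hours: int) -> float:
    """Average head load/unload cycles per powered-on hour."""
    if power_on_hours <= 0:
        raise ValueError("power_on_hours must be positive")
    return load_cycle_count / power_on_hours


# Figures quoted in this thread.
ada5_rate = load_cycles_per_hour(882361, 9289)     # the drive with 48 pending sectors
greens_rate = load_cycles_per_hour(2300, 24000)    # skyline65's WD Greens

print(f"ada5:      {ada5_rate:.1f} cycles/hour")
print(f"skyline65: {greens_rate:.3f} cycles/hour")
```

On these numbers the drive parks its heads roughly 95 times per powered-on hour, against about one cycle every ten hours on the other WD Greens.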

Bidule0hm · Oct 24, 2016

ada4 is ok-ish but we need the full output of smartctl -a /dev/ada4 to be sure.

ada5 is failing, you should replace it asap.

NB: RAID doesn't replace backups. You should already have backups of any important data; back up your data now!
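
For anyone comparing the two drives by hand, here is a small Python sketch that pulls the sector-related counters out of pasted `smartctl -a` text and reports the ones with a non-zero raw value. It is only an illustration: the function name is hypothetical, the choice of attributes 5, 197 and 198 as the ones to watch is an assumption, and the sample rows are copied from the second smartctl output in the first post.

```python
# Attribute IDs assumed to be the interesting ones for failing sectors.
WATCHED_IDS = {5, 197, 198}


def nonzero_sector_counters(smartctl_text: str) -> dict[str, int]:
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    found = {}
    for line in smartctl_text.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns: ID, name, flag, value,
        # worst, thresh, type, updated, when_failed, raw value.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        if not fields[2].startswith("0x"):
            continue  # not an attribute row (e.g. an error-log register dump)
        if int(fields[0]) in WATCHED_IDS and fields[9].isdigit():
            raw = int(fields[9])
            if raw > 0:
                found[fields[1]] = raw
    return found


SAMPLE = """\
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 48
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 41
"""

print(nonzero_sector_counters(SAMPLE))
```

A drive can report PASSED overall and still show non-zero counters here, which is the situation described in the first post.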

Ahmed Badr · Oct 24, 2016

Bidule0hm said:
ada4 is ok-ish but we need the full output of smartctl -a /dev/ada4 to be sure.

Please find it in the first post of this thread for both drives (click show)

Ahmed Badr · Oct 24, 2016

skyline65 said:
I'm sure the experts will help you out.
One thing that strikes me is the:
Load_Cycle_Count 0x0032 001 001 000 Old_age Always - 882361 after 9289 hours

My WD Greens have a Load_Cycle_Count of 2300 and have been powered 24000 hours

Did you run the WDIDLE on the WD Greens?

https://forums.freenas.org/index.php?threads/wd-green-load-cycle-question.16912/

I wouldn’t use/do this until you have been advised about your disk problem first!

I was not aware of this whole WDIDLE thing at all. I found a few posts and links about it and will start reading now. A quick question, however:
Since I'll be replacing the drives should I run WDIDLE on the new drives on another pc before installing them in FreeNAS?

Thanks

danb35 · Oct 24, 2016

Ahmed Badr said:
Since I'll be replacing the drives should I run WDIDLE on the new drives on another pc before installing them in FreeNAS?

It would be a good idea.

Edit: I'd replace the disk with 48 errors first (after thoroughly burning in the replacement), then the other one. And if you have a way to connect the replacement disk without first removing the old one, so much the better--you can resilver without having to compromise your redundancy. The manual has click-by-click instructions on replacing drives; follow them and you should be fine.

And then take a look at your SMART test schedule. Right now, ada4 is running SMART short tests every hour*, while ada5 has only run one short test during its lifetime. I run short tests daily and long weekly, which is on the aggressive side of normal; you could go as infrequently as a short test weekly and a long test monthly if you really wanted to. But you should be running both on a regular schedule.

*Edit 2: No, it isn't. It was for a while, but that's been 4000 hours ago.
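
The gap between the last logged self-test and the current power-on time can be worked out from the posted output. The Python sketch below is one way to do it, assuming the self-test log rows look like the ones in the first post; the function name and the regular expression are illustrative, not part of smartmontools.

```python
import re
from typing import Optional

# Matches rows such as "# 1 Short offline Completed without error 00% 5941 -".
SELFTEST_ROW = re.compile(r"^#\s*(\d+)\s+(.*?)\s+(\d+)%\s+(\d+)\s+(\S+)$")


def hours_since_last_selftest(selftest_log: str, power_on_hours: int) -> Optional[int]:
    """Power-on hours elapsed since the most recent logged self-test."""
    lifetimes = []
    for line in selftest_log.splitlines():
        match = SELFTEST_ROW.match(line.strip())
        if match:
            lifetimes.append(int(match.group(4)))  # LifeTime(hours) column
    if not lifetimes:
        return None  # no self-test rows found
    return power_on_hours - max(lifetimes)


ADA4_LOG = """\
# 1 Short offline Completed without error 00% 5941 -
# 2 Short offline Completed without error 00% 5940 -
"""
ADA5_LOG = "# 1 Short offline Completed without error 00% 285 -"

print(hours_since_last_selftest(ADA4_LOG, 9293))
print(hours_since_last_selftest(ADA5_LOG, 9289))
```

With the Power_On_Hours values from the first post, this gives 3352 hours for ada4 and 9004 hours for ada5, so neither drive has logged a self-test in a long time.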

skyline65 · Oct 24, 2016

Once you have sorted your disks out, it may be worth looking at this. I found it very useful.

https://forums.freenas.org/index.ph...d-identification-and-backup-the-config.27365/

Stux · Oct 24, 2016

Ahmed Badr said:
I was not aware of this whole WDIDLE thing at all. I found a few posts and links about it and will start reading now. A quick question, however:
Since I'll be replacing the drives should I run WDIDLE on the new drives on another pc before installing them in FreeNAS?

Thanks

I would replace the drives with WD Reds or Seagate NAS HDs. No need for WDIDLE then either.

Btw the drive with one error doesn't need to be replaced yet unless it starts throwing more errors. Consider it on probation ;)

Your short tests are happening too often, which means we can't see the results of the long test.

Basically the long test will stop at the lba of a failed block. You overwrite it with zeros. That should 'fix' the block.

danb35 · Oct 24, 2016

Stux said:
Your short tests are happening too often, which means we can't see the results of the long test.

...assuming there ever were long tests. But since the last of those short tests was 4000 hours ago, they aren't particularly meaningful today.

Stux · Oct 24, 2016

Which leads to what I think is a defect in FreeNAS. When a drive changes name, it drops out of the SMART test schedule.

danb35 · Oct 25, 2016

Stux said:
When a drive changes name it drops out of smart.

I thought I'd seen that it uses internal drive IDs which are based on the serial numbers. So if the same drive changes names from ada2 to ada3 it would still be included, but if you replace ada2 with a different drive (different s/n), it now won't be in the SMART test schedule. And yes, I'd agree that this behavior is buggy, or at least not expected. Perhaps, at a minimum, a bug against the docs (that point should be made explicit in the drive replacement instructions)...

Stux · Oct 25, 2016

danb35 said:
I thought I'd seen that it uses internal drive IDs which are based on the serial numbers. So if the same drive changes names from ada2 to ada3 it would still be included, but if you replace ada2 with a different drive (different s/n), it now won't be in the SMART test schedule. And yes, I'd agree that this behavior is buggy, or at least not expected. Perhaps, at a minimum, a bug against the docs (that point should be made explicit in the drive replacement instructions)...

I can reproduce it quite easily by shuffling drives between the motherboard SATA ports and an HBA. Do it enough times and you don't have any SMART tests running anymore.

This means that the names change from ada? to da?.

Bidule0hm · Oct 25, 2016

Just to add something that was ignored: you should keep your drives under 40 °C at all times. My guess is that the high temperature plus the very high LCC have led to a premature death of the drive(s).

Robert Trevellyan · Oct 25, 2016

Ahmed Badr said:
I've read in another post that I can use the command "dd" to write over the bad sectors

This is not a reliable solution. Flaky sectors that deliver read failures often appear fine when writing. This very month I RMAed one of my drives that was doing exactly this. I pulled it from the system and ran badblocks on it, and each write pass came through without errors, followed by errors on the read pass. The pending sectors kept falling to 0, then rising again, with the read error rate rising rapidly too. At no time during the badblocks run did any sectors get reallocated.

Stux · Oct 25, 2016

Exactly, if the single smart error comes back I'd replace the drive, but if it's old, and it doesn't, I wouldn't.

Ahmed Badr · Oct 25, 2016

Thanks for your valuable info. I appreciate it.

I have now replaced the ada5 drive, the one with 48 pending sectors. The zpool has been resilvered and is working properly.

I will be replacing ada4, the drive with 1 uncorrectable sector, in the next couple of days and will keep you updated.

skyline65 · Oct 26, 2016

What make of drive did you replace it with?
Also, do your other drives have such a high load/unload count? If so, the other drives could fail as well. It may be worth looking at the SMART logs for them as well.

Ahmed Badr · Oct 30, 2016

skyline65 said:
What make of drive did you replace it with?
Also, do your other drives have such a high load/unload count? If so, the other drives could fail as well. It may be worth looking at the SMART logs for them as well.

I replaced the drive with a WD Blue drive. I will eventually replace all drives with WD Red drives, but I cannot seem to find them where I live.
The load_cycle_count for all drives is around 88000.

skyline65 · Oct 31, 2016

That is a lot. I would ask someone whether there is a way to stop them unloading so much; at least you may get some more life out of them.

Stux · Oct 31, 2016

wdidle
