rmccullough
Patron
- Joined: May 17, 2018
- Messages: 269
I suspect I have a failed drive, but want to be sure before I "replace" it with a hot backup.
I received an alert the other day:
* Pool Tank state is DEGRADED: One or more devices are faulted in response to persistent errors. Sufficient replicas exist for the pool to continue functioning in a degraded state.
The following devices are not healthy:
- Disk 14681939649493252335 is FAULTED
So I logged into shell and ran a couple of commands:
Code:
root@freenas[~]# zpool status
  pool: Tank
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: scrub repaired 2.63M in 04:50:57 with 0 errors on Sat Jan 1 04:51:00 2022
config:
        NAME                                            STATE     READ WRITE CKSUM
        Tank                                            DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     0     0     0
            gptid/3aceec98-a7d1-11e8-b311-0cc47a303600  ONLINE       0     0     0
            gptid/850946ad-bd0e-11e8-9a6f-0cc47a303600  ONLINE       0     0     0
            gptid/3fe310b6-a7d1-11e8-b311-0cc47a303600  FAULTED    147     0     0  too many errors
            gptid/40f281b9-a7d1-11e8-b311-0cc47a303600  ONLINE       0     0     0
            gptid/447abee3-a7d1-11e8-b311-0cc47a303600  ONLINE       0     0     0
            gptid/458f64b0-a7d1-11e8-b311-0cc47a303600  ONLINE       0     0     0
            gptid/491ca6b0-a7d1-11e8-b311-0cc47a303600  ONLINE       0     0     0
            gptid/4a41f4ea-a7d1-11e8-b311-0cc47a303600  ONLINE       0     0     0
            gptid/4dcf9801-a7d1-11e8-b311-0cc47a303600  ONLINE       0     0     0
errors: No known data errors
I read some posts here that also suggested getting the output of smartctl:
Code:
root@freenas[~]# smartctl -a /dev/da1
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p10 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: HGST
Product: HUS724020ALS641
Revision: MS04
Compliance: SPC-4
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Logical block size: 512 bytes
LU is fully provisioned
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000cca06d0c350c
Serial number: P5G6R3RV
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Sat Jan 1 10:30:55 2022 MST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 23 C
Drive Trip Temperature: 55 C
Accumulated power on time, hours:minutes 29488:15
Manufactured in week 06 of year 2015
Specified cycle count over device lifetime: 50000
Accumulated start-stop cycles: 26
Specified load-unload count over device lifetime: 600000
Accumulated load-unload cycles: 1255
Elements in grown defect list: 571
Vendor (Seagate Cache) information
Blocks sent to initiator = 22796998390317056
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:      12348      600         0     12948    4737783      52691.712           4
write:         0        0         0         0     547287      28130.329           0
verify:        0        0         0         0     171204          0.000           0
Non-medium error count: 0
SMART Self-test log
Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ]
Description number (hours)
# 1 Background short Completed - 29382 - [- - -]
# 2 Background short Completed - 29214 - [- - -]
# 3 Background long Completed - 29050 - [- - -]
# 4 Background short Completed - 28878 - [- - -]
# 5 Background short Completed - 28662 - [- - -]
# 6 Background short Completed - 28494 - [- - -]
# 7 Background long Completed - 28330 - [- - -]
# 8 Background short Completed - 28157 - [- - -]
# 9 Background short Completed - 27917 - [- - -]
#10 Background short Completed - 27749 - [- - -]
#11 Background long Completed - 27588 - [- - -]
#12 Background short Completed - 27412 - [- - -]
#13 Background short Completed - 27197 - [- - -]
#14 Background short Completed - 27029 - [- - -]
#15 Background long Completed - 26865 - [- - -]
#16 Background short Completed - 26693 - [- - -]
#17 Background short Completed - 26453 - [- - -]
#18 Background short Completed - 26285 - [- - -]
#19 Background long Completed - 26122 - [- - -]
#20 Background short Completed - 25949 - [- - -]
Long (extended) Self-test duration: 6 seconds [0.1 minutes]
I hate to just assume the drive is bad, but it looks like it is. Anything else I should try before I start the replacement process?
Is this the right process to replace the disk: Replacement
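For what it's worth, here is a rough Python sketch of how the numbers I was squinting at could be pulled out of the SAS-style smartctl text. The sample lines are copied from the output above. The function names `summarize_scsi_smart` and `looks_suspect` are made up for illustration, and the "any grown defects or uncorrected errors means suspect" rule is my own assumption, not an official threshold.

```python
import re

# Sample lines copied from the smartctl -a output above (SAS/SCSI layout).
SAMPLE = """\
SMART Health Status: OK
Elements in grown defect list: 571
read: 12348 600 0 12948 4737783 52691.712 4
write: 0 0 0 0 547287 28130.329 0
verify: 0 0 0 0 171204 0.000 0
Non-medium error count: 0
"""


def summarize_scsi_smart(text):
    """Pull a few failure indicators out of SCSI-style smartctl -a text."""
    summary = {"health": None, "grown_defects": None, "uncorrected": {}}
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"SMART Health Status:\s*(.+)", line)
        if m:
            summary["health"] = m.group(1).strip()
            continue
        m = re.match(r"Elements in grown defect list:\s*(\d+)", line)
        if m:
            summary["grown_defects"] = int(m.group(1))
            continue
        # Error counter rows: "<op>:" followed by seven numeric columns;
        # the last column is "Total uncorrected errors".
        m = re.match(r"(read|write|verify):\s+(.+)", line)
        if m:
            cols = m.group(2).split()
            if len(cols) == 7:
                summary["uncorrected"][m.group(1)] = int(cols[-1])
    return summary


def looks_suspect(summary):
    """Assumed rule of thumb: any grown defects or uncorrected errors."""
    return bool(summary["grown_defects"]) or any(summary["uncorrected"].values())


if __name__ == "__main__":
    s = summarize_scsi_smart(SAMPLE)
    print(s)
    print("suspect:", looks_suspect(s))
```

On my output this flags the drive because of the 571 grown defects and the 4 uncorrected read errors, even though the health status line itself still says OK. It only understands the SAS/SCSI layout shown here; SATA drives print an attribute table instead and would need different parsing.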