Burlumpu Bumpu
Dabbler
- Joined: Jun 23, 2018
- Messages: 20
Dear community
It's still a learning process for me, as I shall demonstrate now:
I am running a RAIDZ2 pool on a Dell R510 with a crossflashed H200. Today the pool was marked degraded; there were some write errors on one disk.
My plan was (and is) to get a replacement for this disk and resilver the pool (as I have done before). To make sure (now comes the beginner part :) ) there's nothing wrong with the backplane or the cabling, I shut down the server and took a look at the failing hard drive, the backplane and the server in general.
I then restarted, and now there's a problem: FreeNAS keeps trying to resilver my pool, and all the disks (including the faulty one) are now online. The resilver works its way to 1% and then starts again from 0%.
What should I do now in order to replace the failed disk? Can I replace a disk while the pool is resilvering? There are no SMART errors on any of the disks.
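To confirm that the resilver really restarts (rather than the percentage merely being recalculated), one option is to save `zpool status` output a few times and compare the "resilver in progress since ..." timestamp. The following is a minimal sketch under that assumption; the function names `resilver_started_at` and `count_restarts` are hypothetical helpers, not part of FreeNAS or ZFS, and treating a changed timestamp as a restart is my own assumption.

```python
import re

# Pattern for the resilver start time on the "scan:" line of `zpool status`,
# e.g. "resilver in progress since Wed May 1 00:48:46 2019".
_SINCE = re.compile(
    r"resilver in progress since\s+"
    r"(\w{3}\s+\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\d{4})"
)


def resilver_started_at(status_text):
    """Return the resilver start timestamp as a string, or None if none is running."""
    match = _SINCE.search(status_text)
    if match is None:
        return None
    # Collapse runs of whitespace so "May  1" and "May 1" compare equal.
    return " ".join(match.group(1).split())


def count_restarts(snapshots):
    """Count how often the start timestamp changes between consecutive snapshots.

    `snapshots` is a list of saved `zpool status` outputs (hypothetical input).
    Assumption: a changed "since" timestamp means the resilver started over.
    """
    starts = [resilver_started_at(text) for text in snapshots]
    starts = [s for s in starts if s is not None]
    return sum(1 for earlier, later in zip(starts, starts[1:]) if earlier != later)
```

Feeding it a handful of outputs captured a few minutes apart would show whether the start time keeps moving forward.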
Code:
[root@freenas ~]# zpool status
  pool: RAIDZ2
 state: ONLINE
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed May 1 00:48:46 2019
        78.7G scanned at 167M/s, 50.9G issued at 108M/s, 16.4T total
        5.96G resilvered, 0.30% done, 1 days 20:11:54 to go
config:

NAME                                            STATE     READ WRITE CKSUM
RAIDZ2                                          ONLINE       0     0     0
  raidz2-0                                      ONLINE       0     0     0
    gptid/fbbfea97-78cb-11e8-8f71-842b2b4f1e06  ONLINE       0     0     0
    gptid/ff414dca-78cb-11e8-8f71-842b2b4f1e06  ONLINE       0     0     0
    gptid/02bb0a27-78cc-11e8-8f71-842b2b4f1e06  ONLINE       0     0     0
    gptid/0645a295-78cc-11e8-8f71-842b2b4f1e06  ONLINE       0     0     0
    gptid/7f2b5a7c-f102-11e8-84c2-842b2b4f1e06  ONLINE       0  3667
    gptid/0d636c3e-78cc-11e8-8f71-842b2b4f1e06  ONLINE       0     0     0
    gptid/10eaa592-78cc-11e8-8f71-842b2b4f1e06  ONLINE       0     0     0
    gptid/14762089-78cc-11e8-8f71-842b2b4f1e06  ONLINE       0     0     0
cache
  gptid/73396a40-b812-11e8-8f7e-842b2b4f1e06    ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:59:13 with 0 errors on Wed Apr 24 04:44:13 2019
config:

NAME                                            STATE     READ WRITE CKSUM
freenas-boot                                    ONLINE       0     0     0
  mirror-0                                      ONLINE       0     0     0
    da9p2                                       ONLINE       0     0     0
    da8p2                                       ONLINE       0     0     0

errors: No known data errors
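To pick the affected member out of a long `zpool status` listing, the device rows can be filtered for non-zero error counters. This is a minimal sketch, assuming the usual `NAME STATE READ WRITE CKSUM` row layout; `devices_with_errors`, `DEVICE_STATES` and `SAMPLE` are hypothetical names for illustration, and the sample rows are copied from the output above (including the row that shows only two counters).

```python
# Device states that can appear in the STATE column of `zpool status`.
DEVICE_STATES = {"ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "UNAVAIL", "REMOVED"}


def devices_with_errors(status_text):
    """Return {device_name: [counters]} for rows with any non-zero counter."""
    flagged = {}
    for line in status_text.splitlines():
        parts = line.split()
        # A device row needs a name, a known state and at least one counter.
        if len(parts) < 3 or parts[1] not in DEVICE_STATES:
            continue
        counters = parts[2:5]  # READ, WRITE, CKSUM (as many as are present)
        # Counters are compared as strings so abbreviated values also count.
        if any(value != "0" for value in counters):
            flagged[parts[0]] = counters
    return flagged


# A few rows taken from the pool listing in this post.
SAMPLE = """\
NAME STATE READ WRITE CKSUM
RAIDZ2 ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
gptid/0645a295-78cc-11e8-8f71-842b2b4f1e06 ONLINE 0 0 0
gptid/7f2b5a7c-f102-11e8-84c2-842b2b4f1e06 ONLINE 0 3667
gptid/0d636c3e-78cc-11e8-8f71-842b2b4f1e06 ONLINE 0 0 0
"""

if __name__ == "__main__":
    for name, counters in devices_with_errors(SAMPLE).items():
        print(name, counters)
```

The header row and the `state:` or `errors:` lines are skipped because their second field is not a device state.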
da7 is the failing drive:
Code:
[root@freenas ~]# smartctl -a /dev/da7
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.2-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               IBM-XIV
Product:              ST4000NM0043 C1
Revision:             EC5C
Compliance:           SPC-4
User Capacity:        4,000,787,030,016 bytes [4.00 TB]
Logical block size:   512 bytes
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c50083c28b3f
Serial number:        Z1Z8TF6D0000R546K5WD
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Wed May 1 00:57:44 2019 CEST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     38 C
Drive Trip Temperature:        65 C

Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 0

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 23000.42
  number of minutes until next internal SMART test = 4

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   3928830803        0         0  3928830803          0     600550.787           0
write:           0        0         0           0          0     224974.613           0
verify:  988837325        0         0   988837325          0       6966.842           0

Non-medium error count:      161

SMART Self-test log
Num  Test              Status                     segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                                  number   (hours)
# 1  Background short  Completed                        -    23000              - [-   -    -]
# 2  Background short  Completed                        -    22975              - [-   -    -]
# 3  Background short  Completed                        -    22951              - [-   -    -]
# 4  Background short  Completed                        -    22927              - [-   -    -]
# 5  Background short  Completed                        -    22903              - [-   -    -]
# 6  Background short  Completed                        -    22879              - [-   -    -]
# 7  Background short  Completed                        -    22855              - [-   -    -]
# 8  Background short  Completed                        -    22831              - [-   -    -]
# 9  Background short  Completed                        -    22807              - [-   -    -]
#10  Background short  Completed                        -      147              - [-   -    -]
#11  Background short  Completed                        -      113              - [-   -    -]
#12  Background long   Completed                        -       56              - [-   -    -]
#13  Background short  Completed                        -       50              - [-   -    -]
#14  Background short  Aborted (by user command)        -       37              - [-   -    -]

Long (extended) Self Test duration: 32700 seconds [545.0 minutes]
Thanks and kind regards
Alex