I've been trying to cancel (or finish) a replace for 4 days now :(
We have: see footer
What happened?
1. Replaced /dev/ada2 (WD20) with a WD40 because it showed SMART errors. I did this by shutting down and plugging the new one in (I know now that I should have attached it separately, run the replace, and then detached the old one..)
1.1 Replaced /dev/ada3 WD20 (not failed) with a new WD40
-> Resilvering went fine for both.
2. Replaced /dev/ada1 WD20 (also no errors on this one) with a WD40 because I wanted to replace all of them one by one to expand storage anyway.
The replacement WD40 failed completely during the resilver, to the point of being dead on the SATA port.
F***!
3. Then I may have made a mistake: I replaced the failed WD40 on ada1 with another new WD40 without detaching the failed one first, so I had 3 disks in state "replacing".
This ran for over 2 days and ended in some kind of "loop": the resilver succeeded, but "zpool status" never left the "replacing" state. The pool was ONLINE, but after every reboot the resilver started over.
4. Detached the first WD40, which was previously on ada1 (offline, failed).
5. Again the resilver seemed to go fine, but now ada2 started throwing SMART errors: another failure. f***
zpool status -v says (currently running a scrub):
Code:
  pool: zvol-wd20
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub in progress since Mon Feb 20 18:57:20 2017
        914G scanned out of 5.12T at 249M/s, 4h56m to go
        218G repaired, 17.44% done
config:

        NAME                                                     STATE   READ WRITE CKSUM
        zvol-wd20                                                ONLINE     0     0     6
          raidz1-0                                               ONLINE     0     0    12
            (ada0) gptid/81d8ade0-ac29-11e6-8e79-941882377300    ONLINE     0     0     0
            replacing-1                                          ONLINE     0     0     1
              (ada1) gptid/82d21e89-ac29-11e6-8e79-941882377300  ONLINE     0     0     0  (repairing)
              (ada4) gptid/7474530a-f5d6-11e6-9b8a-941882377300  ONLINE     0     0     0  (repairing)
            (ada2) gptid/072c9083-f3c6-11e6-9dba-941882377300    ONLINE     6     0     0  (repairing)
            (ada3) gptid/93e2855f-f43b-11e6-b05b-941882377300    ONLINE     0     0     0

errors: Permanent errors have been detected in the following files:

        <0x1350>:<0x13936>
        <0x1396>:<0x29da11>
        <0x1396>:<0x29d9f1>
        zvol-wd20/backup@auto-20170124.0300-1m:/v3750/duplicity-full.20161028T010314Z.vol2900.difftar.gpg
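To make it easier to see which entries in that config section carry non-zero error counters, here is a rough Python sketch that filters them out. It is only an illustration: the function name devices_with_errors and the regex are my own, the sample lines are copied from the output above (the "(adaN)" labels are my annotations, not something zpool prints), and it assumes the counters are plain integers, so abbreviated values like "1.2K" would be skipped.

```python
import re

# Config lines copied from the zpool status output above.
# The "(adaN)" prefixes are hand-added annotations, not zpool output.
STATUS_CONFIG = """\
NAME STATE READ WRITE CKSUM
zvol-wd20 ONLINE 0 0 6
raidz1-0 ONLINE 0 0 12
(ada0) gptid/81d8ade0-ac29-11e6-8e79-941882377300 ONLINE 0 0 0
replacing-1 ONLINE 0 0 1
(ada1) gptid/82d21e89-ac29-11e6-8e79-941882377300 ONLINE 0 0 0 (repairing)
(ada4) gptid/7474530a-f5d6-11e6-9b8a-941882377300 ONLINE 0 0 0 (repairing)
(ada2) gptid/072c9083-f3c6-11e6-9dba-941882377300 ONLINE 6 0 0 (repairing)
(ada3) gptid/93e2855f-f43b-11e6-b05b-941882377300 ONLINE 0 0 0
"""

# Optional "(label)" prefix, then name, state and three integer counters.
LINE_RE = re.compile(
    r"^\s*(?:\((?P<label>\w+)\)\s+)?(?P<name>\S+)\s+(?P<state>[A-Z]+)\s+"
    r"(?P<read>\d+)\s+(?P<write>\d+)\s+(?P<cksum>\d+)"
)


def devices_with_errors(config_text):
    """Return (name, read, write, cksum) for every entry with a non-zero counter."""
    flagged = []
    for line in config_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # header line, blank line, or non-integer counters
        counts = tuple(int(m.group(k)) for k in ("read", "write", "cksum"))
        if any(counts):
            # Prefer the short "(adaN)" label when one is present.
            flagged.append((m.group("label") or m.group("name"), *counts))
    return flagged


for entry in devices_with_errors(STATUS_CONFIG):
    print(entry)
```

On the sample above this flags the pool, raidz1-0, replacing-1 (checksum errors) and ada2 (read errors), while ada1 and ada4 themselves show all zeros.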
-> I don't want to replace ada2 now while ada1/ada4 (two disks without any errors!) are stuck in this "replacing" loop.
Any ideas? Should I detach the "old" ada1 WD20 (still alive, re-attach to a fifth SATA port) or the new one?
Will it help?
-> I don't care about the files (snapshots) that are reported as errors; they are unimportant and could be deleted or restored from backup.
I think replacing ada2 (again) makes little sense until ada1 has been replaced cleanly(?)
Hope that was explained somewhat clearly..
best regards, Michael
Edit: 2 of 4 brand-new WD40 Reds being nearly DOA is another sad part of the story..