Resilvering issues

nemisisak

Explorer
Joined
Jun 19, 2015
Messages
69
Hello,

I had a disk failure, no big deal, and started a resilver with a fresh disk. I then noticed that I had ended up in a resilvering loop: each time the resilver completed, it started again. Another disk was throwing write and checksum errors, so I replaced that one too, and then my pool went offline.

I reinserted the disk I had just taken out (the one throwing errors) and the pool returned in a degraded state with a couple of disks unavailable. Then, while I was starting to troubleshoot, another disk was removed by the system, and now my pool is unavailable. It is still resilvering, though with 3 disks down, shouldn't my pool be dead?

I'm a little lost on the best way to proceed. The resilver is not really doing anything, and I'm likely to be dead by the time it completes, if it ever does.

The zpool status -v output below shows the pool status and the data errors:

Code:
  pool: AllDiskVolume
 state: UNAVAIL
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sat Jan 30 10:44:53 2021
        1.84T scanned at 512M/s, 201M issued at 54.6K/s, 26.4T total
        0 resilvered, 0.00% done, no estimated completion time
config:

        NAME                                            STATE     READ WRITE CKSUM
        AllDiskVolume                                   UNAVAIL      1     0     0
          raidz2-0                                      UNAVAIL      2     0     0
            gptid/97d632a5-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            3071377810018719211                         UNAVAIL      0     0     0  was /dev/gptid/98968966-c1b7-11e5-86a6-d050995bee6c
            17494332646787331714                        UNAVAIL      0     0     0  was /dev/gptid/9951a8ad-601c-11eb-a9be-d050995bee6c
            gptid/9a46ed13-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/62251196-d9de-11e6-8f47-d050995bee6c  ONLINE       0     0     0
            16332921663242939337                        REMOVED      0     0     0  was /dev/gptid/9c778901-c1b7-11e5-86a6-d050995bee6c
            gptid/89295a56-1d10-11e7-809c-d050995bee6c  ONLINE       0     0     0
            gptid/9e0820a8-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/9ec371f0-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/9f7f41f2-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x0>
        <metadata>:<0x1>
        <metadata>:<0x114>
        <metadata>:<0x4d>
        <metadata>:<0x50>
        <metadata>:<0x260>
        <metadata>:<0x6b>
        <metadata>:<0x185>
        <metadata>:<0x290>
        <metadata>:<0xa1>
        <metadata>:<0xc6>
        <metadata>:<0xc8>
        <metadata>:<0xd5>
        <metadata>:<0xed>
        <metadata>:<0x1f5>
        <metadata>:<0x1fc>
        AllDiskVolume/iocage/releases/11.3-RELEASE/root:<0x0>
        AllDiskVolume/iocage/releases/11.3-RELEASE/root:<0x2b6>
        AllDiskVolume/MShare:<0x11db>
        AllDiskVolume/.system/rrd-c0708eef149b49a396f66530a6f1b80a:<0xf1>
        AllDiskVolume/iocage/jails/MinecraftS/root:<0x2460a>

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:46 with 0 errors on Sat Jan 30 03:45:46 2021
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2     ONLINE       0     0     0

errors: No known data errors
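
On the question of whether three missing disks should kill the pool: a raidz2 vdev carries two disks' worth of parity, so it can lose at most two members and still be readable. The following is a minimal Python sketch, not part of any ZFS tooling, that counts the members in the output above that are not ONLINE. The rows are copied from that output (trailing "was ..." notes dropped), and the function name is purely illustrative.

```python
# Sketch: count raidz2 members that are not ONLINE and compare the count with
# the two-disk tolerance of raidz2. Rows are copied from the status output
# above; missing_members() is a hypothetical helper, not a ZFS API.

RAIDZ2_PARITY = 2  # raidz2 survives at most two missing members

CONFIG_ROWS = """\
gptid/97d632a5-c1b7-11e5-86a6-d050995bee6c  ONLINE   0 0 0
3071377810018719211                         UNAVAIL  0 0 0
17494332646787331714                        UNAVAIL  0 0 0
gptid/9a46ed13-c1b7-11e5-86a6-d050995bee6c  ONLINE   0 0 0
gptid/62251196-d9de-11e6-8f47-d050995bee6c  ONLINE   0 0 0
16332921663242939337                        REMOVED  0 0 0
gptid/89295a56-1d10-11e7-809c-d050995bee6c  ONLINE   0 0 0
gptid/9e0820a8-c1b7-11e5-86a6-d050995bee6c  ONLINE   0 0 0
gptid/9ec371f0-c1b7-11e5-86a6-d050995bee6c  ONLINE   0 0 0
gptid/9f7f41f2-c1b7-11e5-86a6-d050995bee6c  ONLINE   0 0 0
"""


def missing_members(rows):
    """Return the names of member devices whose state is not ONLINE."""
    missing = []
    for line in rows.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] != "ONLINE":
            missing.append(fields[0])
    return missing


missing = missing_members(CONFIG_ROWS)
total = len(CONFIG_ROWS.splitlines())
print(f"{len(missing)} of {total} members missing; raidz2 tolerates {RAIDZ2_PARITY}")
if len(missing) > RAIDZ2_PARITY:
    print("More members are missing than raidz2 can tolerate")
```

With three members missing, the count exceeds what raidz2 can tolerate, which is consistent with the UNAVAIL state and with the scan line reporting 0 resilvered.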
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Unfortunately, there may not be any way to recover your pool. What models of disks do you have constituting your pool? You may have used SMR drives, which are known to behave badly on resilvers.

 

nemisisak

Hi, they are all CMR, unless, you know, the manufacturers lied.

All are 3 TB drives, a mixture of Seagate and Toshiba. They were running fine and have resilvered before without issues.
 

Samuel Tai

OK, thanks for confirming. Try replacing the SATA cables on the disks that've become unavailable, and the one you're resilvering. Sometimes this will get them back online, and you may be able to complete the resilver.
 

nemisisak

I have added an 11th disk to the system: the new one plus the one giving errors (from memory I didn't think my motherboard supported 11 disks, but I guess it does). I rebooted, and this has brought the pool back online. The resilver has kicked in again. It still doesn't look very happy, though the permanent errors are no longer reported and the resilver actually appears to be progressing.

Code:
  pool: AllDiskVolume
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sat Jan 30 15:00:05 2021
        2.02T scanned at 3.24G/s, 243M issued at 871K/s, 26.4T total
        0 resilvered, 0.00% done, no estimated completion time
config:

        NAME                                            STATE     READ WRITE CKSUM
        AllDiskVolume                                   DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     0     0     0
            gptid/97d632a5-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/98968966-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0    27
            ada7p2                                      FAULTED      1   141    10  too many errors
            gptid/9a46ed13-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/62251196-d9de-11e6-8f47-d050995bee6c  ONLINE       0     0     0
            gptid/9c778901-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/89295a56-1d10-11e7-809c-d050995bee6c  ONLINE       0     0     0
            gptid/9e0820a8-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/9ec371f0-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/9f7f41f2-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:46 with 0 errors on Sat Jan 30 03:45:46 2021
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2     ONLINE       0     0     0

errors: No known data errors
 

nemisisak

#UPDATE#

So I forced ada7 back online from FAULTED; however, I'm still in a resilvering loop.

It won't let me take it offline or replace it with my other spare, I assume because it's resilvering? Should I force the disk offline so I can resilver using the new one?

Code:
  pool: AllDiskVolume
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Jan 31 22:44:52 2021
        6.18T scanned at 3.20G/s, 1.74T issued at 922M/s, 26.1T total
        169G resilvered, 6.65% done, 0 days 07:42:09 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        AllDiskVolume                                   ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/97d632a5-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/98968966-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0    27
            ada7p2                                      ONLINE       1   141    10
            gptid/9a46ed13-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/62251196-d9de-11e6-8f47-d050995bee6c  ONLINE       0     0     0
            gptid/9c778901-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/89295a56-1d10-11e7-809c-d050995bee6c  ONLINE       0     0     0
            gptid/9e0820a8-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/9ec371f0-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/9f7f41f2-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:46 with 0 errors on Sat Jan 30 03:45:46 2021
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2     ONLINE       0     0     0

errors: No known data errors
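
The "to go" figure on the AllDiskVolume scan line above can be sanity-checked by hand. A small Python sketch follows, assuming that T and M in the output are binary units (TiB and MiB) and that the estimate is simply the data not yet issued divided by the current issue rate. Both are assumptions about how the figure is derived, and the helper names are illustrative.

```python
# Sketch: reproduce the resilver time estimate from the scan line above.
# Assumption: estimate = (total - issued) / issue rate, with binary units.

TIB = 1024 ** 4
MIB = 1024 ** 2


def eta_seconds(total_tib, issued_tib, rate_mib_s):
    """Data still to be issued divided by the issue rate, in seconds."""
    remaining_bytes = (total_tib - issued_tib) * TIB
    return remaining_bytes / (rate_mib_s * MIB)


def as_hms(seconds):
    """Format a duration in seconds as HH:MM:SS."""
    s = int(seconds)
    return f"{s // 3600:02d}:{s % 3600 // 60:02d}:{s % 60:02d}"


# Figures from the scan line: 26.1T total, 1.74T issued at 922M/s.
eta = eta_seconds(total_tib=26.1, issued_tib=1.74, rate_mib_s=922)
print(as_hms(eta))
```

This comes out at a little under 7 h 42 min, close to the reported 07:42:09; the small gap is expected because the printed figures are rounded. The estimate only holds if the run is allowed to finish, which is not the case in a resilver loop.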
 

Samuel Tai

Unfortunately, you'll have to wait for the resilver to fail before putting ada7p2 offline to replace it. It will eventually fail.
 

nemisisak

There must be a way to safely stop the resilver or force the disk offline. I don't think the disk is going to fail anytime soon. I'm currently on my 4th pool resilver since my last post. I don't need my other disks dying from the strain of multiple resilvers.

Is it worth using the zpool clear command, seeing as zpool status -v is not bringing up any specific errors? I have read on the BSD forums that this can resolve resilvering loops.
 

nemisisak

Still in a resilvering loop. Does anybody have any ideas on how to resolve it?

Code:
  pool: AllDiskVolume
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu Feb  4 19:59:37 2021
        9.69T scanned at 1.63G/s, 5.27T issued at 907M/s, 26.1T total
        514G resilvered, 20.18% done, 0 days 06:41:54 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        AllDiskVolume                                   ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/97d632a5-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/98968966-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            ada7p2                                      ONLINE       0     0     0
            gptid/9a46ed13-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/62251196-d9de-11e6-8f47-d050995bee6c  ONLINE       0     0     0
            gptid/9c778901-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/89295a56-1d10-11e7-809c-d050995bee6c  ONLINE       0     0     0
            gptid/9e0820a8-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/9ec371f0-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0
            gptid/9f7f41f2-c1b7-11e5-86a6-d050995bee6c  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:46 with 0 errors on Sat Jan 30 03:45:46 2021
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2     ONLINE       0     0     0

errors: No known data errors
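
One way to confirm the loop from the outputs posted so far is to compare the "resilver in progress since" timestamps: a later start time in each snapshot means a new run began rather than the earlier one continuing. The Python sketch below does that with the four scan lines copied from this thread. The helper name is illustrative, and the timestamp format string is an assumption based on how the lines are printed here.

```python
# Sketch: detect resilver restarts by comparing the start timestamps from
# successive zpool status outputs. Scan lines are copied from this thread;
# resilver_start() is a hypothetical helper, not part of any ZFS tooling.
from datetime import datetime

SCAN_LINES = [
    "scan: resilver in progress since Sat Jan 30 10:44:53 2021",
    "scan: resilver in progress since Sat Jan 30 15:00:05 2021",
    "scan: resilver in progress since Sun Jan 31 22:44:52 2021",
    "scan: resilver in progress since Thu Feb  4 19:59:37 2021",
]

PREFIX = "scan: resilver in progress since "


def resilver_start(scan_line):
    """Parse the start time out of a 'resilver in progress since' line."""
    stamp = scan_line.strip()[len(PREFIX):]
    # Whitespace in the format matches one or more spaces, so "Feb  4" parses.
    return datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")


starts = [resilver_start(line) for line in SCAN_LINES]
restarts = sum(1 for earlier, later in zip(starts, starts[1:]) if later > earlier)
print(f"{len(set(starts))} distinct start times, {restarts} restarts observed")
```

The first restart coincides with the reboot after the extra disk was added, so it is expected; the later ones are the loop itself. Only the runs that happened to be captured in a post are counted, so the real number of restarts is higher.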
 

nemisisak

Just an update: the pool is still in a resilvering loop. I'm in the process of building a second server and will migrate the data.

I believe the reason behind the loop is a small amount of metadata corruption which, for some reason, the system cannot resolve. I'm not sure why this is, as I have 9/10 disks in a RAID-Z2 array. Once I have migrated the data, I will destroy the pool and rebuild it.

On a side note, I will get a 10 Gb card to speed up the migration. Does anyone have any recommendations? My switch supports RJ45 and SFP, so either is fine. I'll probably go with SFP though, to be honest, as it's generally cheaper and has a much lower heat output.

Cheers.
 