Resilvering Repeating after Reboot

Status
Not open for further replies.

Joseph Sharbutt

Dabbler
Joined
Apr 12, 2014
Messages
28
I have been using FreeNAS for a couple of years now and have been very happy with the initial setup and performance the entire time.

I recently decided that I was going to expand a RAID-Z2 pool by replacing 6x2TB drives with 6x4TB to double my capacity.

I followed all the instructions in the manual to the letter while replacing the first drive. The resilver seemed to go off without a hitch. The problem is that every time I reboot, it attempts to resilver the new drive again. I am reluctant to replace any more drives until I know that the first one is 100% "accepted" into the new pool after a resilver. Until I can reboot without the resilver process restarting, I am not comfortable continuing with the drive replacements.

I do have one data error on a file that is not important. I have since deleted that file and all snapshots that contain that file.

Will performing a scrub after the resilver resolve this issue? Each of these tasks requires 12-24 hours to complete, and this is just the first of six drives that I need to replace. This is the first time that I have needed to do any kind of drive replacement with this pool, and although it still seems solid, I feel like I'm missing something.

After the first resilver, the pool was no longer degraded, but the resilver still restarts after every reboot. I'm afraid that once the resilver starts, I have to let it finish before doing anything else. This is taking quite some time. I need a better system for this.

Thanks in advance for the assistance.
 
dlavigne

Guest
Which version of FreeNAS? Please post the output of zpool status in code tags.
 

Joseph Sharbutt

Dabbler
Joined
Apr 12, 2014
Messages
28
FreeNAS-9.3-STABLE-201503070129

Code:
[root@fnas] ~# zpool status
  pool: fNASRaidZ2Vol
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Mar  6 22:57:57 2015
        3.89T scanned out of 8.44T at 180M/s, 7h21m to go
        660G resilvered, 46.03% done
config:

        NAME                                            STATE     READ WRITE CKSUM
        fNASRaidZ2Vol                                   ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/ee89d42b-66ea-11e3-9fc0-ac220b822227  ONLINE       0     0     0
            gptid/eee7c89f-66ea-11e3-9fc0-ac220b822227  ONLINE       0     0     0
            gptid/ef45ef3f-66ea-11e3-9fc0-ac220b822227  ONLINE       0     0     0
            gptid/efa70e35-66ea-11e3-9fc0-ac220b822227  ONLINE       0     0     0
            gptid/06f2f984-c299-11e4-a2a6-0015175bfcfe  ONLINE       0     0     0  (resilvering)
            gptid/f0bd0eb4-66ea-11e3-9fc0-ac220b822227  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0h34m with 0 errors on Thu Feb  5 04:21:23 2015
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2     ONLINE       0     0     0

errors: No known data errors
[root@fnas] ~#
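
For reference, the "7h21m to go" figure in the scan section can be reproduced from the scanned, total, and rate values on the same line. The following is only a minimal Python sketch, not part of FreeNAS or ZFS. The helper names `to_bytes` and `resilver_eta` are made up for illustration, and it assumes the K/M/G/T suffixes are powers of 1024.

```python
import re

# Assumption: zpool status size suffixes are binary (powers of 1024).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(text):
    """Convert a size such as '3.89T' or '180M' to a number of bytes."""
    return float(text[:-1]) * UNITS[text[-1]]


def resilver_eta(status_text):
    """Return (percent_scanned, hours, minutes) from the 'scanned out of' line."""
    match = re.search(
        r"([\d.]+[KMGT]) scanned out of ([\d.]+[KMGT]) at ([\d.]+[KMGT])/s",
        status_text,
    )
    if match is None:
        raise ValueError("no resilver progress line found")
    scanned, total, rate = (to_bytes(g) for g in match.groups())
    seconds = (total - scanned) / rate          # remaining bytes / bytes per second
    hours, rem = divmod(int(seconds), 3600)
    return 100 * scanned / total, hours, rem // 60


# The scan section copied from the zpool status output above.
sample = """\
  scan: resilver in progress since Fri Mar  6 22:57:57 2015
        3.89T scanned out of 8.44T at 180M/s, 7h21m to go
        660G resilvered, 46.03% done
"""

pct, hrs, mins = resilver_eta(sample)
print(f"{pct:.1f}% scanned, about {hrs}h{mins:02d}m remaining")
```

With the values above, the remaining time works out to 7h21m, matching the reported estimate. The computed percentage (about 46.1%) differs slightly from the reported 46.03%, most likely because the displayed sizes are rounded to two decimals.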
 

Joseph Sharbutt

Dabbler
Joined
Apr 12, 2014
Messages
28
This is the third or fourth time that I have tried to restart the server and every time it has restarted the resilver process. I deleted the files that it was giving me errors on, so I think I have that taken care of now. Am I required to scrub after the resilver for it to accept this new drive permanently?

Also, I am reluctant to do this, but if I attempt to replace another drive prior to rebooting, will that make a difference?

After the resilver completes, everything goes green and healthy until I reboot.

I also have another question. When the resilver process restarts, does it wipe the drive that is being resilvered? The reason I ask is that the pool is not reporting as degraded as it did in the beginning, which makes me think that the correct data is already on the new drive and it is just repeating the process for some reason, almost like it is checking that the previous resilver was correct. It also seems to be going a bit faster than the first time around.
 
sweeze

Dabbler
Joined
Sep 23, 2013
Messages
24
I'm chiming in with this happening to me as well.

I have a two-vdev mirrored pool. In my case I had a file with an error, but I didn't care about the file, so I removed it and killed the snapshot it was in. Since then I have had repeated resilvering on two reboots.

I am a little wary of this because of how I got here. I originally had a raidz with an erroring disk and figured it was time to move to mirroring. I bought more disks, made a two-disk stripe, used zfs send/recv to copy the raidz to the new pool, destroyed the old pool, and attached two of the old disks to their mates in the new one. They resilvered fine, but when I rebooted to take the failing disk out, it resilvered again. It then resilvered yet again after another reboot to install an SSD.

The erroring disk never faulted.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
ZFS will "resilver" a drive if things look to be totally and horribly out of place with that particular device. This is pretty common on RAIDZ1 pools because you end up with some corruption and suddenly ZFS loses its mind.

With RAIDZ2 it's less common, so you might want to ask yourself what the problem is. Joseph, can you post a debug? I've got some ideas, but the debug will confirm or deny them.
 