Resilver time seems excessive.

AVB

Contributor
Joined
Apr 29, 2012
Messages
174
I had to replace a drive for the first time since my last size upgrade about 2 years ago. Right now the pool is 2 RaidZ2 volumes of 8x4TB and 8x8TB, and of course it was one of the 8TB drives that failed. Luckily I had bought a spare, so after going through the motions to replace the drive it started to resilver. The pool is about 55% full, or 32TB of data. So far I'm 30 hours into the resilver with easily another 6 to go, and perhaps more since the estimated time left is pretty worthless.

The question is whether anyone thinks this amount of time is excessive. All the drives are 6Gb/s 7200 rpm, and with a 16 core processor and 64GB of RAM I thought I had a fair amount of horsepower. I know when I did the upgrade, doing it one drive at a time, it was only taking about 10-11 hours per drive, but I only had 21TB of data back then too. Doing a scrub only takes about 6 hours, or it did last week.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
The estimated time is only an estimate. To actually figure it, you'd really need to traverse the pool and look at how many seek operations remain, which is a very I/O heavy thing to do and is actually a large part of what needs to happen for the resilver anyways, so the estimate tends not to be very accurate.

I don't see any sign offhand that you have SMR drives, but SMR drives are not advisable and can cause really weird performance issues, especially if you are replacing an old CMR drive with a new SMR drive. SMR drives should be considered nearly incompatible with ZFS.

Resilvering a pool of 11 8TB drives used to take about a day and a half to two days here, for large-ish file storage on RAIDZ3.

How long it takes on your pool is going to be a function of the types of data you're storing, how much load you're putting on it, and how bad the fragmentation is. For example, this is kinda pathetic:

Code:
 state: ONLINE
  scan: scrub in progress since Sat Nov 20 22:00:03 2021
        30.7T scanned at 20.0M/s, 29.0T issued at 18.9M/s, 61.3T total
        0 repaired, 47.40% done, 20 days 16:02:03 to go


where the pool is under constant read workloads of 100-200MBytes/sec, and never really gets a chance to "catch its breath". If I stopped the read workload, it'd be done in less than a day, I think. So there is no "right" number for how long a resilver should take.
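As a rough illustration of where a "to go" figure like that can come from, here is a minimal Python sketch that divides the data still to be issued by the current issue rate, using the numbers from the status output above. It assumes the T and M suffixes are binary units (powers of 1024) and that the rate stays constant, which is exactly the part that doesn't hold on a busy pool. The function names are made up for this example and this is not how ZFS itself computes the estimate.

```python
# Hypothetical helper names; a back-of-envelope sketch, not ZFS's own logic.
# Assumption: size suffixes in the status output are powers of 1024.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(text):
    """Convert a size such as '29.0T' or '18.9M' to a number of bytes."""
    return float(text[:-1]) * UNITS[text[-1]]


def remaining_seconds(total, issued, rate_per_sec):
    """Naive ETA: bytes still to issue divided by the current issue rate."""
    return (to_bytes(total) - to_bytes(issued)) / to_bytes(rate_per_sec)


def format_eta(seconds):
    """Render a duration in whole days and hours."""
    days, rem = divmod(int(seconds), 86400)
    return f"{days} days {rem // 3600} hours"


# Values taken from the scrub status shown above.
eta = remaining_seconds("61.3T", "29.0T", "18.9M")
print(format_eta(eta))
```

With those inputs it lands at a bit under 21 days, in the same ballpark as the reported figure. The only input that moves much is the rate, so the estimate swings with whatever else the pool is doing at that moment.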
 

AVB

Contributor
Joined
Apr 29, 2012
Messages
174
Thanks for the input. I don't have any SMR drives; they are all "Enterprise" class in that pool. Your example of 11x8TB drives in a RaidZ3 setup is only 8TB smaller than what I have in my pool, so if our other hardware is similar then it looks like I have about another 8 hours to go, or about 38-40 total.
 