Slow scrubs and resilvers on a FreeNAS mini

Status
Not open for further replies.

redoak42

Dabbler
Joined
Jan 10, 2016
Messages
19
Hi all,

We have a pair of FreeNAS Minis, made by iXsystems. One is primary, and the other one gets replicated to nightly . The current storage config on them is 4x6tb WD RED in a RAIDZ1, lz4 compression, no dedupe. We're running FreeNAS-9.3-STABLE-201512121950, and the Minis have 32GB of ram, and an Atom C2750. We had a disk go bad in one (the backup) and the resilver is taking a very long time (about five days so far):

zpool status
pool: backup
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Tue Jan 5 17:45:18 2016
8.45T scanned out of 8.91T at 1.72M/s, 78h18m to go
2.11T resilvered, 94.80% done

Short SMART checks show the disks are good, and there have been no r/w/c errors during the rebuild. The CPU is loafing, and the load (as reported by top) is a pretty steady .5 . The system is using 6.5 TB and has 8.1 TB left.

Running iostat -xc 10 shows the %b (time busy?) on the three "good" disks as 100% a lot of the time, and the disk the is rebuilding is much lower. Does this indicate what I hope it doesn't, which is my disks are just working as fast they can, and there's nothing much I can do?

This isn't the first time a scrub or resilver has taken many days on one of these systems. Anyone have any insight into why, and what can be done to speed things up?

Warmest regards,

Jordan
 
Last edited:

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Is being actively used? ZFS will throttle the resilver to allow for client requests to be handled.

That's also a rather full pool. That's never good for performance. You might want to upgrade (at least the primary system) to something larger that can handle more drives, to allow for a larger pool. The backup should be able to move along with the shingled 8TB drives.
 

redoak42

Dabbler
Joined
Jan 10, 2016
Messages
19
The system is not being actively used. Right now, the only thing it should be doing is rebuilding. I don't think the pool is that full either (probably could have been clearer in how I described it) . There is 6.5 TB used and 8.5. TB free, meaning I'm less than 50% utilized at the moment.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The system is not being actively used. Right now, the only thing it should be doing is rebuilding. I don't think the pool is that full either (probably could have been clearer in how I described it) . There is 6.5 TB used and 8.5. TB free, meaning I'm less than 50% utilized at the moment.
Ah, misread that as 6.5 free out of 8.5...

Well, the next likeliest option is a second dying drive. Have you monitored the drives properly with regular long tests?
 

redoak42

Dabbler
Joined
Jan 10, 2016
Messages
19
No, I'm doing short only at this point (they report fine). I'll start doing weekly longs.
 

BigDave

FreeNAS Enthusiast
Joined
Oct 6, 2013
Messages
2,479
I had one question. Did you buy the machines as diskless?
The reason I ask is that perhaps the mini's have the motherboards that have the Marvel controllers
and if you are using those SATA ports for your disks, this may have something to do with your issues
of slow speeds.
 

redoak42

Dabbler
Joined
Jan 10, 2016
Messages
19
I'm not quite sure if this is the answer to the question you're asking. I bought the systems from iXsystems with disks included, but that just means the disks (wd red) are in the box, and we put them in the caddies, an loaded them into the system to use. So we have 4 WD RED 6TB disks (provided by iXsystems) in the system currently.
 

redoak42

Dabbler
Joined
Jan 10, 2016
Messages
19
the %b on iostat makes me think that I'm just at the limit of what the system can do. There are (I'm told) lots of small files on these systems, but I don't know enough about scrubs and resilvers to be able to tell if file size has anything to do with rebuild and scan time.
 

BigDave

FreeNAS Enthusiast
Joined
Oct 6, 2013
Messages
2,479
I'm not quite sure if this is the answer to the question you're asking. I bought the systems from iXsystems with disks included, but that just means the disks (wd red) are in the box, and we put them in the caddies, an loaded them into the system to use. So we have 4 WD RED 6TB disks (provided by iXsystems) in the system currently.
Yes, that tells me that you did not switch the data cables (inside the box) to different SATA ports before installing your drives.
Have you contacted iX Systems for help, or has the warranty expired?
 

redoak42

Dabbler
Joined
Jan 10, 2016
Messages
19
Both systems are under warranty. I was under the impression that on the Minis the warranty was pretty bare bones (they'll replace components when they go bad), but you are right, I should give it a shot. I will, and will report back.

The Minis are really nice for our application, which is an office on the West coast that has very few people, moderate performance needs, but quite a bit of data that grows slowly, but steadily. I can even zfs send back to our freenas install on the east coast nightly, (got to love zfs send/recv).
 

redoak42

Dabbler
Joined
Jan 10, 2016
Messages
19
I got in contact with support, and they were competent and friendly, and checked for hardware issues. They weren't able to find any, and that is about as far as freenas mini support goes. When I run a systat -vm, it shows the three healthy disks at about 100% utilization, and the rebuilding disk is pretty much idling. The system isn't really doing anything except rebuilding at this point, so my current conclusion (I hope I'm wrong and there is a config setting I can tinker with, but I can't find it) is that I've reached some sort of performance limit on this system config with the type of data we have on disk, and that rebuilds are going to take 6-7 days. I'm stuck with 6tb disks is a RAIDz1 for now, but when the 8tb WD RED come out (dunno when) I'll rebuild as a striped mirror (mirrored strip?) or raidz2, since 6-7 days rebuild is asking for trouble on a raidz. At least I have the two systems replicating to each other.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
the %b on iostat makes me think that I'm just at the limit of what the system can do. There are (I'm told) lots of small files on these systems, but I don't know enough about scrubs and resilvers to be able to tell if file size has anything to do with rebuild and scan time.

Yes. If you have lots of small files, that's probably it.

The way a ZFS scrub or resilver works is based on a metadata traversal of the pool, which means that it is effectively crawling the entire pool. This is unlike a traditional RAID5 controller which just does a linear traversal of the LBA's on the disk, without any knowledge (or need-to-know) of the layout of the filesystem.

This allows ZFS to go very fast on pools with large files and pools that aren't particularly full, but it stinks for lots of small files.

Your next suggestion might be that this is a bug in ZFS. Yes and no. If you look at the way RAIDZ works, for example, it becomes clear that this is a much more complicated issue due to the clever way ZFS stores its RAIDZ data. It's necessary to have some mechanism to identify which sectors in use, where the data blocks are, and to compute the parity as needed. We get a lot of benefits from RAIDZ but this isn't one of them.
 

redoak42

Dabbler
Joined
Jan 10, 2016
Messages
19
Thanks for the explanation/confirmation jgreco. I'll take the slow rebuild and scrub times with small files and work around them for the other benefits ZFS provides.
 
Status
Not open for further replies.
Top