Defragging a Raid ZFS set of drives?

Status
Not open for further replies.

Piggie

Dabbler
Joined
Jul 2, 2011
Messages
26
Bit of a noob question, but I've not seen this mentioned before.

Many RAID arrays under ZFS are MASSIVE, holding terabytes of files, and many of those files may get moved around across the array of drives over days, weeks, and months.

What happens about fragmentation?

Under Windows you generally run a defrag program every so often to sort things out.

So how does a FreeNAS system deal with this issue?
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
Do a Google search on ZFS fragmentation; it appears that it's a problem that's being worked on. FreeNAS has not yet caught up to the current release of ZFS, and the features needed to implement defragmentation have not been developed even for the current version of ZFS, so it doesn't appear there's a version of ZFS that can be defragged...
 
Joined
May 27, 2011
Messages
566
You can't defrag ZFS, nor do you want to. ZFS is not designed to be a single-user file system; you want the file chunks spread out and full of holes. That reduces the latency of read and write operations and allows them to be interspersed well. Remember, a hard disk can only read or write at one position at a time. If you defrag, you'll have large areas of disk that cannot be written to. When you have many users all reading and writing at the same time, writes can't be anywhere near concurrent, and only one user will get good access at a time. It comes down to statistics: defragged data gives you great access for a single user but terrible access for multiple users.

I hope that all makes sense.
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
This is an interesting link about ZFS fragmentation affecting database performance in a commercial environment. It's a bit technical, but it talks about performance issues under heavy database usage (lots of users) on ZFS. That probably doesn't apply to most of the people here, but it does point out that fragmentation can be an issue on ZFS.

http://wildness.espix.org/index.php?post/2011/06/09/ZFS-Fragmentation-issue-examining-the-ZIL

@matt... your explanation makes sense, but there are ZFS applications where people are experiencing problems with fragmentation on ZFS that hadn't been anticipated or documented that well until recently.
 
Joined
May 27, 2011
Messages
566
ZFS was released in 2005, and by 2007 pages dedicated to tuning ZFS for databases started popping up, so it's not a new problem. ZFS is copy-on-write, so it's not exactly great for databases in general. You can tweak it to work better, but databases don't like COW file systems. There are better choices of file system, especially if you're expecting 50k write IOPS.
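
To illustrate why copy-on-write and database-style random overwrites don't mix well, here is a toy model in Python. It is only a sketch under loud assumptions: the "allocator" just hands out the next unused address, which is not how ZFS actually places blocks, and the names `simulate_cow` and `count_extents` are made up for this example. It only shows the general effect: a file that starts as one contiguous run ends up split into many pieces once blocks are rewritten somewhere else instead of in place.

```python
import random


def count_extents(block_map):
    """Count runs of physically contiguous blocks, walking the file in logical order."""
    extents = 1
    for prev, cur in zip(block_map, block_map[1:]):
        if cur != prev + 1:
            extents += 1
    return extents


def simulate_cow(num_blocks, num_overwrites, seed=0):
    """Toy copy-on-write: every overwrite lands at a fresh address, never in place."""
    rng = random.Random(seed)
    # Logical block i starts at physical address i, so the file is one extent.
    block_map = list(range(num_blocks))
    next_free = num_blocks  # hypothetical allocator: next unused address
    for _ in range(num_overwrites):
        logical = rng.randrange(num_blocks)
        block_map[logical] = next_free
        next_free += 1
    return block_map


before = count_extents(simulate_cow(1000, 0))
after = count_extents(simulate_cow(1000, 200))
print("extents before overwrites:", before)
print("extents after 200 random overwrites:", after)
```

A sequential read of the file after the overwrites has to seek once per extent, which is roughly the cost a database scan would pay in this simplified picture.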


But back to the original poster: FreeNAS, and more specifically ZFS, does not currently do any defragmentation. One thing to keep in mind: you want to keep about 20% of your pool free. Filling past that will result in decreased performance.
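
As a rough way to keep an eye on that 20% rule of thumb, something like the Python sketch below could flag pools that are getting too full. It assumes the input is the text printed by `zpool list -H -o name,capacity` (one pool per line, name and used percentage separated by whitespace); the function name `pools_over_threshold` and the sample pool names are made up for illustration, and the 80% default just mirrors the figure above.

```python
def pools_over_threshold(zpool_output, max_used_pct=80):
    """Return (pool, used_pct) pairs whose usage exceeds max_used_pct.

    zpool_output is assumed to look like the output of
    `zpool list -H -o name,capacity`, e.g. "tank\t85%".
    """
    flagged = []
    for line in zpool_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        name, capacity = fields
        used = int(capacity.rstrip("%"))
        if used > max_used_pct:
            flagged.append((name, used))
    return flagged


# Hypothetical sample output, not taken from a real system.
sample = "tank\t85%\nbackup\t42%\n"
print(pools_over_threshold(sample))  # prints [('tank', 85)]
```

On a live system the text would come from running the command itself, for example via `subprocess.run`, rather than from a hard-coded string.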
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
There are scenarios where you do want to defrag ZFS, and they're not just database apps. We've seen problems on most filesystems with Usenet news overview databases. Due to the nature of the task, only a small set of files is active for writing, but large quantities of read-only data are being read sequentially. It is desirable to avoid fragmentation on those reads, as it affects overall system performance. Periodically copying the data onto a fresh filesystem cures the problem for a while. Larger amounts of free space help on many filesystems.

Matthewowen's earlier post seems to suggest that defrag will move all your data into a single blob. Maybe MS-DOS did that, but that's clearly undesirable for most UNIX applications. A good UNIX defrag for UFS/FFS, ideally, would work to ensure that the data blocks for a file were in the same cylinder group (cg) as the directory in which they're held, and contiguous within reason. Such optimization would work well with the normal advice to make the stripe size equal to a cg, which allows a single hard drive to do the directory lookup AND the data accesses when accessing a file, and that increases the potential for many simultaneous accesses on a filesystem. Locality and all that.
 

Piggie

Dabbler
Joined
Jul 2, 2011
Messages
26
Many thanks to all for the interesting replies to my original question.
 