FRAG

Status
Not open for further replies.

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
Hi everyone,

For the last 200-some-odd days my FreeNAS has been reporting as up, and it's been rock solid. Today, while doing patches on my Windows host, I noticed slow logins.

I had a look at VMware and noticed read/write times above 10 where they had been around 3. Not a huge deal (or so I think, as I have seen worse).

I had a look at the output of zpool list and saw the Vmware pool is at 45% FRAG.
Code:
NAME           SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
OneTbMirror    928G  18.0G   910G         -     1%     1%  1.00x  ONLINE  /mnt
Vmware        4.34T   348G  4.00T         -    45%     7%  1.00x  ONLINE  /mnt
freenas-boot  14.8G  3.04G  11.7G         -      -    20%  1.00x  ONLINE  -

I do not use snapshots.
The FreeNAS server is connected via 10GbE links to each VMware host.
I have two VMware hosts connected to the FreeNAS server.
The FreeNAS server has 32 GB of RAM and an Intel i3-6100.

Honestly, I would like the FRAG % to be lower.

  • What are my options to lower the FRAG %?
  • At what % should one really be worried?
  • Is there an online process to defrag the disks?

Thanks!
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,079
I had a look at the output of zpool list and saw the Vmware pool is at 45% FRAG. [...] Honestly, I would like the FRAG % to be lower.
That fragmentation is very low; I would not expect it to affect your system performance. Can you show us your zpool status output? That might be more informative.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
That fragmentation is very low
I wouldn't call 45% "low", but it's certainly in line with the expectations of block storage workloads. It's an unfortunate effect that you have to deal with and mitigate as needed (more vdevs, more free space, zfs send/recv to a new dataset to defrag things a bit, ...).
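For the send/recv route, a rough sketch of the idea (dataset and snapshot names here are made up; the same approach applies whether the datastore sits on a filesystem dataset or a zvol):

Code:
# snapshot the existing dataset/zvol (names are hypothetical)
zfs snapshot Vmware/datastore1@defrag
# copy it to a new dataset on the same pool; the receive writes everything
# fresh into the pool's current free space
zfs send Vmware/datastore1@defrag | zfs receive Vmware/datastore1-new
# re-point the VMware hosts at the new dataset, then retire the old one
zfs destroy -r Vmware/datastore1

The catch is that you need enough free space to hold a second copy while both exist, and anything written after the snapshot has to be caught up with an incremental send before the cutover.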

Is there an online process to defrag the disks?
No, that's the holy grail of ZFS, Block Pointer Rewrite.

If you can think of a way of rewriting a block on disk in an atomic operation, even when it's referenced by multiple other blocks, while keeping the whole "find out which blocks reference this one" problem within a reasonable amount of temporary storage, you will get the attention of the ZFS community and be showered with money and other perks. Okay, the money part is a lie, but you'd be the hero of many people. Fighting crime with superpowers is easy; rewriting block pointers in ZFS is hard!
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,079
The thing that I found reduces fragmentation is to pull a disk out, clear it in another operating system, and then put it back in. When you resilver, the fragmentation is magically reduced.

 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The thing that I found reduces fragmentation is to pull a disk out, clear it in another operating system, and then put it back in. When you resilver, the fragmentation is magically reduced.
That can't be right... RAIDZ blocks are implicitly defined by the layout of the vdev, so they have to end up in the same place on disk. At least that's my understanding.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,079
That can't be right... RAIDZ blocks are implicitly defined by the layout of the vdev, so they have to end up in the same place on disk. At least that's my understanding.

I just replaced 7 of the 12 disks in my Emily-NAS because they had passed 5 years of age. In that system, that did not amount to replacing all the disks in either of the two vdevs, only the ones over the 5-year mark. Before I started, the fragmentation was around 50%. It took a little time to complete the project, as I replaced one disk a night over the course of a week, but when I was done the fragmentation was much lower and it is currently at 7%.
In that case I was replacing old 2TB disks with new 2TB disks.

In my Irene-NAS, I recently went through and replaced all the 2TB disks in one of the vdevs with 4TB disks. Again, I replaced one disk a day over the course of a week. The fragmentation had been around 50% before the process started and, on the Irene-NAS, it is now at 9%.

I don't pretend to know how it is supposed to work, because I have not looked at the source code, but I can tell you what I have seen it do.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Moving data between datasets (or, in the case of a zvol, replicating it) will defragment.

Curious about @Chris Moore's results.

If the OP has a RAIDZ2, he could replicate this by offlining a drive, wiping it, and replacing it with itself, doing all the drives one at a time (offline, wipe, replace).
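At the command line, one pass of that loop would look roughly like this. The gptid is one of the OP's from the zpool status further down, the new label is a placeholder, and on FreeNAS the GUI's Offline/Wipe/Replace buttons are the usual way to drive it:

Code:
# One disk at a time, and only while the vdev still has redundancy --
# with the OP's RAIDZ1 vdevs there is none left while the resilver runs.
zpool offline Vmware gptid/085308b8-edf9-11e6-b1ab-001b21a7c63c
# wipe the disk (GUI Wipe, or another OS), then replace it with itself;
# FreeNAS recreates the partition table and a new gptid label
zpool replace Vmware gptid/085308b8-edf9-11e6-b1ab-001b21a7c63c gptid/<new-label>
# wait for "resilvered ... with 0 errors" before moving to the next disk
zpool status Vmware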
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
So I just asked. In a nutshell, what I said was right: the data ends up exactly as it was before the resilver, so the real fragmentation is exactly as it was.

As for the fragmentation number... It's complicated. It's a weird heuristic that gives a sorta-useful number, based on free space fragmentation. A few possibilities suggested by Allan Jude (forgive the forum not linking to the user page):
User "error":
  • Deleting some stuff would quickly defragment free space (expiring snapshots included)
Less-than-ideal behavior:
  • Some weird interaction on pools created under older pool versions (particularly ones that predate the fragmentation number), later upgraded, that somehow ends up being fixed by the resilver
  • A real, current bug
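For anyone who wants to look at the number being discussed: it is the pool's fragmentation property (a free-space metric, not file fragmentation), and plain zpool commands will show it per pool and, on most builds, per vdev:

Code:
# the FRAG column in zpool list is the "fragmentation" property
zpool get fragmentation Vmware
# per-vdev breakdown, to see whether one vdev's free space is more
# chopped up than the other's
zpool list -v Vmware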
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,079
Some weird interaction on pools created under older pool versions (particularly ones that predate the fragmentation number), later upgraded, that somehow ends up being fixed by the resilver
If I recall correctly, I created these pools under FreeNAS 9.3. It was around Feb or March of 2016 when I changed the vdev layout and it has been running continuously since then with just upgrades to the OS.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Well, that rules out the ancient pool option.
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
I think the frag number was discussed in a prior thread. The conclusion was that the frag number can't be what you would assume it to be. In Chris Moore's case, a resilver can't defrag a disk, but it might be doing something else that shifts how that frag number is computed for the pool at that moment.
 

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
That fragmentation is very low; I would not expect it to affect your system performance. Can you show us your zpool status output? That might be more informative.


Code:
[root@freenas] ~# zpool status
  pool: OneTbMirror
 state: ONLINE
  scan: scrub repaired 0 in 0h11m with 0 errors on Sun Aug 27 00:11:10 2017
config:

        NAME                                            STATE     READ WRITE CKSUM
        OneTbMirror                                     ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/bbfd7c35-ee29-11e6-9c87-001b21a7c63c  ONLINE       0     0     0
            gptid/bc9ae966-ee29-11e6-9c87-001b21a7c63c  ONLINE       0     0     0

errors: No known data errors

  pool: Vmware
 state: ONLINE
  scan: scrub repaired 0 in 5h16m with 0 errors on Sun Sep 3 05:16:52 2017
config:

        NAME                                            STATE     READ WRITE CKSUM
        Vmware                                          ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/085308b8-edf9-11e6-b1ab-001b21a7c63c  ONLINE       0     0     0
            gptid/08c4300a-edf9-11e6-b1ab-001b21a7c63c  ONLINE       0     0     0
            gptid/093035f1-edf9-11e6-b1ab-001b21a7c63c  ONLINE       0     0     0
            gptid/09a0b52f-edf9-11e6-b1ab-001b21a7c63c  ONLINE       0     0     0
          raidz1-1                                      ONLINE       0     0     0
            gptid/0a177a19-edf9-11e6-b1ab-001b21a7c63c  ONLINE       0     0     0
            gptid/0a8970c4-edf9-11e6-b1ab-001b21a7c63c  ONLINE       0     0     0
            gptid/0afcdd22-edf9-11e6-b1ab-001b21a7c63c  ONLINE       0     0     0
            gptid/0b6eb32d-edf9-11e6-b1ab-001b21a7c63c  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0h2m with 0 errors on Wed Aug 2 03:47:34 2017
config:

        NAME                                          STATE     READ WRITE CKSUM
        freenas-boot                                  ONLINE       0     0     0
          gptid/87474e74-b21a-11e6-a53f-001b21a7c63c  ONLINE       0     0     0

errors: No known data errors

 