Can the record size of a dataset be reduced on the fly?

vitaprimo

Dabbler
Joined
Jun 28, 2018
Messages
27
I've a tiny TrueNAS system used exclusively for VMDK storage over NFS. The inflated thin VMDKs don't go beyond 1TB, so it's only a basic single mirrored VDEV plus a matching flash mirrored VDEV for metadata. The flash pool is relatively massive compared to the spinning pool.

While searching for how to enable special_small_blocks storage on the metadata VDEV, I learned that I had another issue to fix first: the record size. I've been reading articles and blog posts for hours, and most of them point toward a 16K-ish record size as more appropriate for generic VMs (database storage is elsewhere, though there's a low-traffic Exchange Server with circular logging enabled). I dug around the settings for the pool and the datasets and found that the record size is not greyed out like some of the other dataset settings are.

And sure, I found it can be changed, but I don't know if it takes effect on the fly. I do know that it won't rewrite the records by itself: data must be rewritten. I'm not growing anything though, so I think I'm in the clear there, but I did find something saying that shrinking it would require rewriting the data through something like send/receive. There were also warnings involving zdb, but it wasn't all that clear how to proceed in the first place. I lost the page, buried in one of my browser windows with way too many tabs.

Oh, and [speaking of zdb] this one command string I found to size up metadata no longer works. I have a feeling it's because my only pool is named "z", as in /mnt/z/someDatasetOrZvol; I now get an error about some cache, or that the thing I'm referencing (z) doesn't exist. I'm not that concerned about that anyway; the record size thing is stealing my attention. :/

Can I safely reduce the record size of an existing dataset? If so, what are the next steps so data blocks are broken apart into smaller ones? I created a new dataset with the desired 16K size, but before moving files--thankfully only about 200GB worth of VMDKs--I thought it might be a good idea to ask first; hopefully I can offload the work from the network. It's isolated, but still. :)
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
The new recordsize value will only apply to new records written - so over time, the natural copy-on-write behavior will churn it into smaller pieces. You can always just do svMotion to force this issue, of course.
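To make that concrete, here's a minimal sketch of the change on a live dataset. The pool name "z" comes from this thread; the dataset names and file name are placeholders, and this assumes a shell on the TrueNAS host:

```shell
# recordsize can be changed at any time on a mounted dataset; it only
# applies to records written after the change -- existing blocks keep
# their current size until the data is rewritten.
zfs set recordsize=16K z/vmstore

# Confirm the new value (the SOURCE column shows local vs. inherited):
zfs get recordsize z/vmstore

# Existing data is re-chunked only when rewritten at the file level,
# e.g. a Storage vMotion or a plain copy into a dataset that already
# has the smaller recordsize:
zfs create -o recordsize=16K z/vmstore16k
cp /mnt/z/vmstore/guest.vmdk /mnt/z/vmstore16k/
```

Note that a zfs send/receive preserves the existing block layout in the stream, so it is the file-level rewrite (copy or svMotion), not replication, that actually breaks the data into smaller records.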

But recall that recordsize is also a maximum, so you may not have as many large records as you think. Running a specific zdb command from a shell will give you some more info about your record distribution; it should be zdb -LbbbA -U /data/zfs/zpool.cache poolname

Final thought, I've found 32K to work fairly well for generic VMFS as far as the balance between good performance and compression efficiency. Only you know your data though.
 

vitaprimo

Dabbler
Joined
Jun 28, 2018
Messages
27
OMG, you were one of the people involved in those posts I found! I recognized the green in the picture. So you've just helped me yet another of so many times today. Thanks!

I tried the command you gave and this time it worked (and I saved it in my notes for future reference); I guess I was doing it wrong before:

Code:
 2.78M   189G   86.7G   87.0G   31.3K    2.18   100.00  Total

Block Size Histogram

  block   psize                lsize                asize
   size   Count   Size   Cum.  Count   Size   Cum.  Count   Size   Cum.
    512:    739   370K   370K    739   370K   370K      0      0      0
     1K:    605   706K  1.05M    605   706K  1.05M      0      0      0
     2K:    681  1.86M  2.92M    681  1.86M  2.92M      0      0      0
     4K:   313K  1.22G  1.23G    393  1.98M  4.89M   308K  1.20G  1.20G
     8K:   816K  8.05G  9.27G    648  6.28M  11.2M   812K  7.99G  9.19G
    16K:   882K  15.4G  24.7G  1.46M  23.3G  23.3G   891K  15.6G  24.8G
    32K:   339K  14.6G  39.3G  17.0K   548M  23.9G   340K  14.7G  39.5G
    64K:   293K  24.1G  63.4G    176  15.5M  23.9G   292K  24.1G  63.6G
   128K:   187K  23.3G  86.7G  1.29M   165G   189G   187K  23.4G  86.9G
   256K:      0      0  86.7G      0      0   189G    324  91.9M  87.0G
   512K:      0      0  86.7G      0      0   189G      0      0  87.0G
     1M:    


According to you, in the post that I found and have now lost again, the ASIZE column is a good indicator to follow unless I'm on RAIDZ (which I'm not), where it's artificially inflated--I'm paraphrasing, but even so it was still good. I read it several times to make sure. It seems the 16K size might just be the right fit.

In any case, the fact that this can be set per dataset (I hope I didn't misunderstand), and that even the small-blocks thing on metadata VDEVs is on a per-dataset basis, makes all of it super flexible. I'm moving the biggest, busiest-on-disk, most problematic and most resource-intensive VM to this 16K dataset--a Windows Server (no surprise there) RDS RemoteApp server--to see how it performs there.
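Since both knobs are per-dataset, a hedged sketch of the special_small_blocks side as well (pool "z" is from this thread; the dataset name is a placeholder):

```shell
# special_small_blocks is per-dataset too: blocks at or below the
# threshold are stored on the special (metadata/flash) VDEV. Setting it
# equal to recordsize would send ALL data blocks to the special VDEV,
# so it is usually set below the dataset's recordsize.
zfs set special_small_blocks=8K z/vmstore16k

# Check both values together:
zfs get special_small_blocks,recordsize z/vmstore16k
```

The zdb histogram above is handy here: the cumulative ASIZE column shows how much data would land on the special VDEV at a given threshold, which matters when the flash mirror has to hold it all.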

I'm using Live Storage vMotion, so the machine that's always in the troublemaker-VM lists in Operations Manager is constrained even further. It has taken a long time, which doesn't speak well for 16K so far; your 32K prediction might actually be dead-on. Seeing Windows stats even against the most bloated of UNIX systems makes Windows seem almost endearing now, like remembering someone you knew as a child and how they would throw tantrums.

Anyway,
Thanks again for your help!
 