The general problem is that, for it to make a difference, the workload has to be one that primes the ARC with the data in a timely fashion, such as by having recently written the data to the pool, or via readahead. Using large record sizes such as 1M helps if the system has to retrieve the data from the pool, because it reads the entire record up front and then has the remainder of the record in ARC for rapid access. My gut says that adding RAM will be helpful but also disappointing.
For some additional "light reading" of the deeper dive variety, you can find out more about prefetch:
(which may be too old to be relevant)
Trying to fiddle around and get smarter with all of this, I came across this on another forum. I was thinking of something along the same lines, and it looks like it exists…
I have multiple datasets, the largest being my Plex collection, which gets zero benefit from being cached. Things are written to that dataset and rarely ever read, and when they are… they are read at whatever bitrate the file is (typically 4K, but still, that’s easy for a hard drive to spit out). So I figured I should try to optimize my ARC for the datasets I DO actually interact with on my PC.
From serverfault:
That said, you can specify how ARC/L2ARC should be used on a dataset-by-dataset basis by using the primarycache and secondarycache options:
- zfs set primarycache=none <dataset1> ; zfs set secondarycache=none <dataset1> will disable any ARC/L2ARC caching for the dataset. You can also issue zfs set logbias=throughput <dataset1> to privilege throughput rather than latency during write operations;
- zfs set primarycache=metadata <dataset2> will enable metadata-only caching for the second dataset. Please note that L2ARC is fed by the ARC; this means that if ARC is caching metadata only, the same will be true for L2ARC;
- leave ARC/L2ARC default option for the third dataset.
Finally, you can set your ZFS instance to use more than (the default of) 50% of your RAM for ARC (look for zfs_arc_max in the module man page).
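A minimal sketch of the three cases above, assuming a pool named `tank` with hypothetical datasets `media`, `backups`, and `projects` (names are illustrative, not from the quoted answer):

```shell
# Exclude the media dataset from both ARC and L2ARC:
zfs set primarycache=none tank/media
zfs set secondarycache=none tank/media

# Metadata-only caching for a second dataset
# (L2ARC, being fed from ARC, will also see metadata only):
zfs set primarycache=metadata tank/backups

# Leave a third dataset at the default (cache data and metadata);
# "inherit" resets any local override back to the inherited default:
zfs inherit primarycache tank/projects

# Verify what is set on each dataset and where the value came from:
zfs get -r primarycache,secondarycache tank
```

These properties are stored in the pool itself, so `zfs get` will show `local` as the source for anything you set explicitly.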
First question here… any idea what sort of impact this may have? Theoretically, I’d reduce the size of what I care about in ARC from ~14 TB down to ~4.5 TB if I removed the Plex library.
And is the default ARC really only 50% on BSD-based TrueNAS CORE? If so, I’d happily allocate more to ARC, since I run nothing inside of TrueNAS; it is purely a file handler. I run all services externally in other VMs, all under Proxmox. I used to run this system on 16 GB of RAM; now that it has 30… I can’t see why dedicating ~22 GB specifically to ARC (or more?) would be an issue. Any idea what the minimum would be to leave for everything that isn’t ARC? I don’t run iSCSI, I don’t have dedup, and I use basic LZ4 compression. I have 1 NFS mount and some SMB shares… pretty basic setup.
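For what it’s worth, the 50% default is a Linux (OpenZFS) behavior; FreeBSD-based systems like CORE typically default to a much larger fraction of RAM. Either way, the tunable takes a byte value, so a quick sketch of converting a hypothetical 22 GiB cap (the figure is just the one discussed above, not a recommendation):

```shell
# Convert the desired ARC cap (22 GiB, hypothetical) to bytes,
# since the ARC-max tunables are specified in bytes:
arc_bytes=$((22 * 1024 * 1024 * 1024))
echo "$arc_bytes"   # 23622320128

# On TrueNAS CORE (FreeBSD) the knob is the vfs.zfs.arc_max sysctl,
# usually set as a System -> Tunables entry in the GUI:
#   vfs.zfs.arc_max = 23622320128
# On Linux the equivalent is the zfs module parameter zfs_arc_max.
```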
*Edit: doing some more reading, I saw this note on zfs set primarycache: “When these properties are set on existing file systems, only new I/O is cached based on the values of these properties.”
Since I am trying to remove a dataset from ARC:
1: Is that setting persistent across reboots?
2: Do I need to somehow "force" the old ARC data from that dataset out? Or would a simple reboot repopulate the ARC based on usage, without it attempting to fill up with data from the dataset I excluded?
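As far as I know there is no per-dataset "evict now" command; old entries simply age out as other data competes for the cache (and a reboot starts the ARC empty). One way to watch that happen on CORE is via the FreeBSD kstat sysctls; a sketch:

```shell
# Current ARC size versus its configured ceiling:
sysctl kstat.zfs.misc.arcstats.size
sysctl kstat.zfs.misc.arcstats.c_max

# Hit/miss counters, to judge whether the remaining
# working set now fits comfortably in ARC:
sysctl kstat.zfs.misc.arcstats.hits
sysctl kstat.zfs.misc.arcstats.misses
```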