Odd ARC Memory Usage Behavior in 13.0-U4

dak180

Patron
Joined
Nov 22, 2017
Messages
310
@morganL suggested that I make a separate thread for this:
Here is an illustration of what happens with the ZFS ARC in U4 for me; is anyone else seeing anything like this? (The gap is the reboot for the upgrade.)

Screen Shot 2023-03-03 at 12.05.50 AM.png

You can also see that, though there is no memory pressure, the ARC target drops during periods of light activity:
Screen Shot 2023-03-02 at 11.28.40 AM.png

And here is a graph of the last 8 days showing it bouncing around:
Screen Shot 2023-03-04 at 11.02.26 AM.png
 
Joined
Oct 22, 2019
Messages
3,641
but it may be masked by the optane l2arc
I didn't know you had an L2ARC.

That might in fact explain the phenomenon.

Do you notice a correlation between the amount in your secondary cache (L2ARC) and the amount that is removed from the primary cache (ARC)?
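
If it helps to check, the raw counters are easy to watch from the shell; a rough sketch, assuming the stock FreeBSD arcstats sysctl names:
Code:
# sample ARC size, ARC target, and L2ARC size once a minute to see
# whether L2ARC growth lines up with ARC shrinkage
while true; do
    date
    sysctl kstat.zfs.misc.arcstats.size \
           kstat.zfs.misc.arcstats.c \
           kstat.zfs.misc.arcstats.l2_size
    sleep 60
done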
 

dak180

Patron
Joined
Nov 22, 2017
Messages
310

@morganL, @winnielinnie: one thing that I did notice that correlates with U4 is a big change in the MRU-to-MFU ratio for the ARC:
Screen Shot 2023-03-04 at 2.35.03 PM.png
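
For anyone who wants to eyeball the same ratio without a graphing stack, the raw numbers should be available straight from arcstats (a quick sketch; sysctl names assumed from stock OpenZFS on CORE):
Code:
# resident and ghost-list sizes for MRU vs. MFU, in bytes
sysctl kstat.zfs.misc.arcstats.mru_size \
       kstat.zfs.misc.arcstats.mfu_size \
       kstat.zfs.misc.arcstats.mru_ghost_size \
       kstat.zfs.misc.arcstats.mfu_ghost_size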
 

Volts

Patron
Joined
May 3, 2021
Messages
210
Do you notice a correlation between the amount in your secondary cache (L2ARC) and the amount that is removed from the primary cache (ARC)?

ARC data isn't evicted "to" the L2ARC. ARC eviction shouldn't be considering L2ARC at all - right?

But I'm also curious why the ARC suddenly chose a much lower size target. I wonder if something else demanded system memory *momentarily* that wasn't captured in the graphs.

Do you see a reduction in ARC hit rate when the ARC size drops, or does it stay consistent?

How is this system used? File server? Small files or large? Other apps on it?

There were recent MFU/MRU changes. Depending on the file access pattern, I'm very curious if that could be related. https://github.com/openzfs/zfs/issues/14120
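
On the hit-rate question: without graphs, a rough since-boot figure can be computed from the cumulative counters (a sketch assuming the stock arcstats names; sample it twice and diff the counters for an interval rate):
Code:
# overall ARC hit rate since boot, from cumulative hit/miss counters
sysctl -n kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses | \
    awk 'NR==1 { h = $1 } NR==2 { m = $1 } END { printf "%.2f%% hit rate\n", 100 * h / (h + m) }'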
 

dak180

Patron
Joined
Nov 22, 2017
Messages
310
Do you see a reduction in ARC hit rate when the ARC size drops, or does it stay consistent?
Yes, it drops from the 90s to the 80s.

How is this system used? File server? Small files or large? Other apps on it?
It is used as a personal file and backup server that also runs a few jails. At 2am on the time chart it starts replication from a remote server to a backup dataset.

If it will help, I have a Netdata snapshot file for the time period in the images; it is almost 10 MB compressed, so I'm not sure if the forum will let me upload it.
 

Juan Manuel Palacios

Contributor
Joined
May 29, 2017
Messages
146
Hi everyone,

I'm also seeing this behavior since upgrading to 13.0-U4, albeit in somewhat vaguer terms because I haven't done much more research than just watching my free RAM, as reported by the TrueNAS CORE GUI.

In any case, I have 64 GiB of RAM, and since the upgrade from 13.0-U3.1 I have definitely seen my free RAM shoot up from around 1 GiB to around 23 GiB, with all of that no longer in use by the ZFS ARC, even after almost 2 full days since the reboot (and I have 5 jails constantly running with ZFS-backed data mounts, SMB with two Macs doing Time Machine backups to at least two separate ZFS datasets, ZFS snapshot & replication tasks across local pools, a zvol-based VM, and other minor things). Prior to 13.0-U4, a few hours after a reboot would be more than enough for the ARC to claim most of my free RAM for itself, so the behavior is definitely different now.

On the other hand, I can't really say I've seen any degradation in performance, and my ARC hit ratio remains pretty high, hovering around 100% almost all of the time, with minor dips here and there that don't even seem to reach down to 99.5%, which probably says that the ARC, whatever its size, is indeed doing its job.

I don't have any L2ARC.

Needless to say, let me know if you'd like me to focus my attention on any other metric.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
It certainly seems to be a change in behavior within ZFS.

The question is whether it positively or negatively impacts real-world behavior.
 

Juan Manuel Palacios

Contributor
Joined
May 29, 2017
Messages
146
Absolutely a change in ZFS behavior, and so far it doesn't seem to be a bad one, at least (but definitely at the personal cost of having very recently purchased my latest RAM DIMM precisely to increase the ARC size... oops ;) )

What OpenZFS version was TrueNAS CORE 13.0-U3 using? 2.1.7 includes the following two changes:


So perhaps they explain the reduction in ARC size under the current 2.1.9? Or were we already running 2.1.7 under TrueNAS CORE 13.0-U3?
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Absolutely a change in ZFS behavior, and so far it doesn't seem to be a bad one, at least (but definitely at the personal cost of having very recently purchased my latest RAM DIMM precisely to increase the ARC size... oops ;) )

What OpenZFS version was TrueNAS CORE 13.0-U3 using? 2.1.7 includes the following two changes:


So perhaps they explain the reduction in ARC size under the current 2.1.9? Or were we already running 2.1.7 under TrueNAS CORE 13.0-U3?
OpenZFS 2.1.7 was merged in 13.0-U4 and SCALE 22.12.1.
 

Juan Manuel Palacios

Contributor
Joined
May 29, 2017
Messages
146
Indeed! On 13.0-U3.1 we were running 2.1.6:

-> strings /mnt/13.0-U3.1/usr/local/lib/libzfs.so.4.1.0 | grep -i -E 'zfs-[0-9.]+'
zfs-2.1.6-1


So those two changes were definitely not present in its OpenZFS distribution.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Indeed! On 13.0-U3.1 we were running 2.1.6:

-> strings /mnt/13.0-U3.1/usr/local/lib/libzfs.so.4.1.0 | grep -i -E 'zfs-[0-9.]+'
zfs-2.1.6-1


So those two changes were definitely not present in its OpenZFS distribution.

Uh, it is simpler to find the ZFS version than that:
Code:
root@truenas[~]# zfs --version
zfs-2.1.6-1
zfs-kmod-2.1.6-1


The zpool command also works, but is 2 letters longer;
Code:
root@truenas[~]# zpool --version
zfs-2.1.6-1
zfs-kmod-2.1.6-1


Oh, no! I have outed myself as a Unix SysAdmin! Using a command that is 2 characters shorter to save typing! The sin, no one should know that. Now I must hide myself from the crowds of villagers that will want to burn me at the stake!
 

Juan Manuel Palacios

Contributor
Joined
May 29, 2017
Messages
146
Uh, it is simpler to find the ZFS version than that:
Code:
root@truenas[~]# zfs --version
zfs-2.1.6-1
zfs-kmod-2.1.6-1


The zpool command also works, but is 2 letters longer;
Code:
root@truenas[~]# zpool --version
zfs-2.1.6-1
zfs-kmod-2.1.6-1


Oh, no! I have outed myself as a Unix SysAdmin! Using a command that is 2 characters shorter to save typing! The sin, no one should know that. Now I must hide myself from the crowds of villagers that will want to burn me at the stake!
I was booted into my current 13.0-U4 installation, and I didn't care to boot back into 13.0-U3.1 to find its OpenZFS version :tongue: And if I run the command from its temporary mount path, the wrong version will still be printed, as a result of the zfs library that's loaded by the linker:

-> sudo zfs set mountpoint=/mnt/13.0-U3.1 freenas-boot/ROOT/13.0-U3.1
-> sudo zfs mount freenas-boot/ROOT/13.0-U3.1
-> /mnt/13.0-U3.1/usr/local/sbin/zfs --version
zfs-2.1.9-1
zfs-kmod-v2023012500-zfs_9ef0b67f8

-> ldd /mnt/13.0-U3.1/usr/local/sbin/zfs
/mnt/13.0-U3.1/usr/local/sbin/zfs:
libzfs.so.4 => /usr/local/lib/libzfs.so.4 (0x800263000)
(…)
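
In theory the run-time linker could be pointed at the mounted boot environment's own libraries instead, so the old userland version string gets reported; an untested sketch (the zfs-kmod line would presumably still reflect the running kernel module):

-> LD_LIBRARY_PATH=/mnt/13.0-U3.1/usr/local/lib /mnt/13.0-U3.1/usr/local/sbin/zfs --version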
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Ah, then that makes sense.


I also put it in for others reading this thread who may want the ZFS version. At one point, your method or something similar was the way to get the version. Then someone proposed the --version option.
 
Joined
Oct 22, 2019
Messages
3,641
I'm trying to wrap my head around this.

Is the "TL;DR" summary version...
  • Upstream change in ZFS for version 2.1.9
  • TrueNAS Core 13.0-U4 inherits ZFS 2.1.9
  • Different eviction pressure and algorithm for MFU vs MRU
  • This new method is an "improvement", even though at first glance it appears to be a regression?
  • Somehow, now having large amounts of "unused" RAM (rather than used for the ARC) is a good thing?

It's the last part I don't quite understand. MFU vs MRU, what difference does it make? Either of these being held in RAM is surely better than not using your RAM? If the response is "Yeah, but it's better to have unused RAM so that there's always room to fill it with higher hit-rate data." I thought that the ARC always allowed for swift eviction to make room for data that gets more hits? We're apparently not even seeing this unused RAM being filled up immediately. So what "pressured" the existing data to be kicked out in the first place?
 

dak180

Patron
Joined
Nov 22, 2017
Messages
310
It certainly seems to be a change in behavior within ZFS.

The question is whether it positively or negatively impacts real-world behavior.
Not that I have noticed, but my persistent L2ARC might mask things; @Juan Manuel Palacios, not having an L2ARC, would be in a better position to speak to this.

I thought that the ARC always allowed for swift eviction to make room for data that gets more hits? We're apparently not even seeing this unused RAM being filled up immediately. So what "pressured" the existing data to be kicked out in the first place?
Having looked through the changes (very briefly), my impression is that the ARC now goes much more out of its way to maintain a 1-to-1 ratio between MFU and MRU, to the point of (what I would consider) premature evictions from one or the other in order to maintain the balance.

By premature, I mean that the evictions happen even while there is free RAM that is not being used for anything else.
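
A rough way to see that from the shell (a sketch; sysctl names assumed from stock FreeBSD and arcstats) is to compare the ARC target against its ceiling and against free memory:
Code:
# ARC target vs. configured maximum, in bytes
sysctl kstat.zfs.misc.arcstats.c kstat.zfs.misc.arcstats.c_max
# free memory in GiB (free page count * page size)
sysctl -n vm.stats.vm.v_free_count hw.pagesize | \
    awk 'NR==1 { pages = $1 } NR==2 { sz = $1 } END { printf "%.1f GiB free\n", pages * sz / 2^30 }'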
 
Joined
Oct 22, 2019
Messages
3,641
By premature, I mean that the evictions happen even while there is free RAM that is not being used for anything else.
That's exactly my concern. I don't understand how this is an improvement.

I'm familiar with the Laozi proverb "A bowl is most useful when it is empty", but this is ridiculous. :tongue:
 

kspare

Guru
Joined
Feb 19, 2015
Messages
508
Just to add to this…

I run 4 servers with 1 TB of RAM each, all currently running U2. I upgraded one server to U4.

My U2 servers over the weekend, with things being idle, are all only at about 60-75% ARC utilization.
My U4 server's ARC is full, but the VMs have also migrated back onto it.
Running iSCSI.
12x 800 GB SAS SSDs for L2ARC.

What are you using to get that graph? I'd love to contribute to this.
 