Low ARC Hit ratio, time for L2ARC?

Status
Not open for further replies.

vrod

Dabbler
Joined
Mar 14, 2016
Messages
39
Hi all,

I currently have a storage system running in a DC. It's a rather newly built system with 2 pools:

- 12x6TB (2 vdevs of RAIDZ2), a 48TB pool mainly hosting media files, but also diskless-booting systems as well as backups - 2x S3700 as ZIL
- 12x2TB (6 mirrored vdevs), a 12TB pool for VMs - Optane 900p as ZIL

The server has 192GB of memory, yet I am still seeing an ARC hit ratio of under 40% on average. Sometimes it spikes, but mostly it stays under 50%. Would now be a good time to get an L2ARC device? I have a 400GB Intel 750 SSD which I am not using. The server can still hold 64GB more memory, but I would have to buy the DIMMs first. I know people's usual advice is to max out the memory first, but would it be worth a try to just install the NVMe SSD for now?
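In case it helps anyone reading along: the raw ARC counters are exposed via sysctl (the kstat.zfs.misc.arcstats.hits/misses names in the comments are from FreeBSD's ZFS and are an assumption for other builds), and the ratio is just hits over total. A quick sketch with sample numbers standing in for live counters:

```shell
# On the box itself you would read the live counters, e.g.:
#   hits=$(sysctl -n kstat.zfs.misc.arcstats.hits)
#   misses=$(sysctl -n kstat.zfs.misc.arcstats.misses)
# Sample values standing in for live ones:
hits=4200
misses=6300
ratio=$(awk -v h="$hits" -v m="$misses" 'BEGIN { printf "%.1f", 100 * h / (h + m) }')
echo "ARC hit ratio: ${ratio}%"   # -> ARC hit ratio: 40.0%
```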

Thank you all in advance!
 
Joined
Feb 2, 2016
Messages
574
How is your performance? Happy? Unhappy?

RAM first. L2ARC takes away from ARC because its index has to live in RAM. So, in many cases, adding L2ARC will reduce performance, because it reduces the RAM available for caching, and RAM is substantially faster than even the fastest NVMe SSD.
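To put a rough number on that RAM cost (the ~70 bytes-per-record header size is an assumption; it varies by ZFS version, and older builds used considerably more per record):

```shell
# Back-of-envelope: RAM eaten by L2ARC headers for a 400GB device.
# Assumes ~70 bytes of header per cached record and a 16KB average record;
# both figures are rough and workload-dependent.
hdr_gb=$(awk 'BEGIN {
  l2arc = 400 * 1024^3   # L2ARC capacity in bytes
  rec   = 16 * 1024      # average cached record size (assumed)
  hdr   = 70             # header bytes per record (assumed)
  printf "%.1f", (l2arc / rec) * hdr / 1024^3
}')
echo "~${hdr_gb} GB of RAM just to index the L2ARC"
```

With large media records the overhead shrinks, and with small VM-style records it grows, which is exactly why the "RAM first" advice keeps coming up.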

If your ARC hit ratio is low, ask yourself what data you think is used most and may be cache-worthy. Media - videos? - sounds like something that is large - larger than even a 400GB L2ARC would hold - and rarely reread before being dropped from cache. Diskless booting, too, sounds like something that would be read once and then dropped from cache before needing to be read again.

If you have good performance metrics and can conduct a test before and after adding L2ARC, sure, throw it in there and see what happens. But, if you're just relying on theory and have no way to tell if performance is better with the Intel 750 inline, well, that seems like a waste of time.

Cheers,
Matt
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Your ARC statistics should have a figure for misses that would've been hits if you had a bit more ARC. If that's high, you would benefit from more ARC/L2ARC.
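Concretely, those are the ghost-list hits; on FreeBSD they show up in the arcstats sysctls (the counter names below are an assumption based on FreeBSD's ZFS kstats). A sketch that parses sample sysctl output:

```shell
# On a live box you would feed this from:
#   sysctl kstat.zfs.misc.arcstats | grep -E 'ghost|misses'
# Sample output standing in for the real thing:
sample='kstat.zfs.misc.arcstats.mru_ghost_hits: 120000
kstat.zfs.misc.arcstats.mfu_ghost_hits: 80000
kstat.zfs.misc.arcstats.misses: 500000'

result=$(echo "$sample" | awk -F': ' '
  /mru_ghost_hits/ || /mfu_ghost_hits/ { ghost += $2 }
  /arcstats\.misses/                   { miss = $2 }
  END { printf "ghost hits: %d (%.0f%% of misses)", ghost, 100 * ghost / miss }')
echo "$result"
```

A high ghost-hit count relative to misses suggests a bigger ARC (or an L2ARC) would have caught those reads; a low one means the misses were for data that hadn't been cached recently anyway.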
 

John Doe

Guru
Joined
Aug 16, 2011
Messages
635
How long has the system been running since it was set up?
 

PhilipS

Contributor
Joined
May 10, 2016
Messages
179
A 40% hit rate is low?
 



vrod

Dabbler
Joined
Mar 14, 2016
Messages
39
My apologies to everyone for not coming back to this earlier. I've been busy with wedding preparations, so my (now) wife needed some help :D

About the performance, I'm mixed. Sometimes the VMs work pretty fast; other times they can seem relatively slow. I guess the ARC is doing its job when things go fast.

Currently, the system has been running for 37 days. Between now and when I posted this topic, I have only replaced a failed HDD. The ARC ratio hasn't changed. I looked into the hit ratio; I suppose you meant the graph for "ARC Requests (demand_metadata)", yes? In this graph, on average, I see about 1.2K requests but just 238 hits. This is on a weekly basis. Changing it to daily or hourly shows fewer hits...

So, if I'm not completely mistaken, that means the ARC is only being hit about 20% of the time? That isn't too good then, is it? :D
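Double-checking my own arithmetic from the graph numbers:

```shell
# 238 hits out of roughly 1.2K demand_metadata requests (weekly graph).
pct=$(awk 'BEGIN { printf "%.1f", 100 * 238 / 1200 }')
echo "demand_metadata hit ratio: ${pct}%"   # -> demand_metadata hit ratio: 19.8%
```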

Here's a pic of the weekly performance for the ARC, maybe that gives you a better idea of how my system performs: https://cloud.vrod.dk/index.php/apps/gallery/s/AAyNIpfRRaWTdO7
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
I see a 70% or so hit ratio. Top graph.

That's pretty good, isn't it?
 


vrod

Dabbler
Joined
Mar 14, 2016
Messages
39
Here's the freenas version: FreeNAS-11.1-U4

Not much has changed; I still have the 750 SSD lying around and will probably install it next week as an L2ARC.
 

sfcredfox

Patron
Joined
Aug 26, 2014
Messages
340
I had about the same level of performance before I ended up doing two things (that are not directly related):

1. Added a couple of striped SSDs for L2ARC
2. Used autotune to correct a stability problem (FreeNAS was crashing under heavy iSCSI/Fibre Channel load)


After adding L2ARC (first a single SSD, now striped SSDs), I still wasn't terribly impressed with the result; I still had low numbers. At the same time I had a stability issue, which led to using autotune to restrict a few things enough not to crash the system. With some forum help, we supposed that the system was running itself out of memory when I did simultaneous storage motions and the like.


In the end, doing both resulted in better performance (slightly noticeable to the virtual environment) and stability for my environment.

Note: Autotune can also make adjustments that neither help nor hurt your system performance, so maybe just consider researching adding the L2ARC first, and if you still have low hit ratios, research whether you want to enable/disable prefetching or make other slight adjustments. Just be cautious: once you have a few tunables in there, if you have any problems it's hard to tell whether one of them is the cause. You end up not being a standard system anymore, and it's a touch harder to support/troubleshoot another performance issue, since you won't know whether a tweak you made is causing it.

All that said, in my case autotune helped my overall situation, so I'm not complaining; it's something to consider testing. You can always remove the settings it adds and reboot. I ended up removing a lot of the networking stuff it added that I didn't think was needed.


My system's result under normal/light load with about 20 VMs:
[screenshot: upload_2018-5-31_7-30-19.png]


You can see I still face low hits when the workload wants abnormal stuff that isn't cached.
 

Attachments

  • upload_2018-5-31_7-31-39.png (53.6 KB)

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
@vrod You have not listed the uses of your FreeNAS system, which I would think is an important part of considering an L2ARC. Whether you are running multiple VMs, iSCSI, or just a media server is important to know.

As @Ericloewe stated, it depends on what you are doing.

If you really feel like an L2ARC will help you out, then go ahead and add one and see if it makes a difference. You can always remove it later. The problem I see is that you should have some sort of benchmark to compare against. Maybe the benchmark is one particular VM that just takes a long time to open (consistently), which you can measure with a stopwatch (yes, I'm old). Do not use the ARC hit rate alone as a measure of how well the system is working; you really need to know what data isn't being cached and whether that is reasonable. If it really needs to be cached, then maybe you will need more RAM and a large L2ARC. So, post what the system is being used for and maybe you will get some advice specific to your situation.

Cheers!
 
Joined
Feb 2, 2016
Messages
574
My system's result under normal/light load with about 20 VMs:
FreeNAS Platform: (x2) Intel(R) Xeon(R) CPU E5200 @ 2.27GHz

If you're happy with performance, keep doing what you're doing. If not, with 20 VMs hitting 60 drives across four HBAs and two NICs, your bus is doing some heavy lifting and it may be time to upgrade your motherboard and related components.

I'm usually the guy saying "you've got too much CPU, it's wasted on FreeNAS, save your pennies". In this case - holy cow! - that's an ancient (2008), weak CPU. The Xeon E5200's front-side bus (FSB) runs at 800 MHz. A current (or even two-generation-old) Xeon E3 is going to have TEN TIMES the bus speed. Getting data off the drives is only one component of throughput. If your bus is running at a snail's pace, there is only so much your drives can do.

Cheers,
Matt
 

vrod

Dabbler
Joined
Mar 14, 2016
Messages
39
Thank you all for your useful answers. I had a wedding last week, so I did not have time to get back to this before now.

So, first of all, here is what I use the box for. It should be said that this is SOLELY a storage system; I do not run any VMs, media servers, or the like on it. It really only hosts data for my vSphere cluster and Plex server. All connections are 10G.

So, there are two pools.

One pool is made of 12x6TB drives in a RAIDZ2 pool with 2 vdevs. It is used 90% for "slow" file hosting and 10% for VMs (backup): media files, software, documents, and my VDP backup VM. 95% of the sharing for this pool is done over NFS, but for the VDP instance I use iSCSI with sync=always. About a terabyte of data is read every night. I also have a single iSCSI drive for a netbooted server; this server, however, runs all its apps in memory, so no disk load is generated here. The pool is accelerated by 2 mirrored Intel S3700 100GB SSDs.

The second pool consists of 12x2TB 7.2K HDDs in a mirror-stripe pool with 6 mirrored vdevs. It is used 100% for VMs, and the hosts access this share solely over iSCSI. All VMs but the VDP run here. A lot of data is read here every night as well, because of the backup. The pool is accelerated by an Intel Optane 900p 280GB SSD.

My wife and I watch a lot of movies, all of them ripped from my Blu-rays, which usually take about 30GB of space each. We also watch TV shows, which are mostly 3-5GB in size. My suspicion is that ZFS also puts the TV and movie stuff into the ARC and "steals" space from the VM data. I don't recall an ability to turn off the ARC for a specific pool...
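For what it's worth, ZFS can restrict caching per dataset rather than per pool, via the primarycache property (all, metadata, or none); a sketch with placeholder pool/dataset names:

```shell
# Cache only metadata (not file data) for a media dataset, so movie reads
# don't evict VM blocks from ARC. "tank/media" is a placeholder name.
zfs set primarycache=metadata tank/media

# Verify the setting:
zfs get primarycache tank/media

# There is a matching secondarycache property controlling L2ARC eligibility:
zfs set secondarycache=none tank/media
```

Note that the property applies to datasets/zvols, so the media share would need to be its own dataset for this to help.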

I hope this answers your questions. I have yet to put in the L2ARC, but I'm working on it.
 