Interpreting the ARC Reporting Graphs - or - would I benefit from more RAM under these conditions


JamesNisly

Dabbler
Joined
Nov 21, 2017
Messages
37
How does one go about interpreting the ARC graphs on the reporting page when trying to decide if there would be a benefit to adding more ram?

I currently have 48GB of RAM and a raidz2 with seven 8TB drives (more hardware details in signature). I could quite easily and cheaply add another 24GB of RAM since the general consensus seems to be "the more RAM the better" but for all I know, we're barely using what we have already. The graphs below represent a typical afternoon where both of us are in full production mode culling folders with many thousands of Canon RAW files, designing albums, batching JPEG's and other photo post production type activities.

Based on what you see here, would there be any tangible benefit to more RAM or would I just be wasting my time and money and the system is barely breaking a sweat as it is?

[Attached screenshots: ARC graphs from the Reporting page for the afternoon described above]
 

sfcredfox

Patron
Joined
Aug 26, 2014
Messages
340
JamesNisly,

I'm not going to claim to be an expert on this, but here's my two cents:

From what I understand, if heavily used data is getting evicted from ARC to make room for other heavily used data, that suggests your working set (the total of all the heavily used data on the system) is larger than the ARC you have. In theory, if you had more ARC and more of your typical heavy-use data fit into it, you'd see a higher hit percentage. Reading from ARC is usually much faster than reading from the pool, so you'd expect to gain some kind of performance increase (whether noticeable or not, eh?)

This probably also depends on what you're doing with the system: a pool serving random workloads, like datastores for VMware, has different performance characteristics and different ARC needs than a media server streaming a bunch of 10-50 GB movie files.

In either case, most of the posts I've read and my own experience playing with FreeNAS lead me to believe you can sometimes say you need more ARC if your hit ratio is below 80-90%. I tried to find the ARC performance posts on this topic but couldn't; I'd cruise the forum searching for "ARC performance" and see if you can get a few better explanations than mine.
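If you want to put numbers on it outside of the reporting graphs, the counters behind them can be read from the shell. Here's a rough Python sketch; the sysctl names are what I believe FreeBSD/FreeNAS exposes for ZFS, so double-check them on your build:

[CODE]
# Rough sketch: read the cumulative ZFS ARC counters via sysctl and print the
# hit ratio. The kstat names below are assumed FreeBSD/FreeNAS names.
import subprocess

def sysctl(name):
    # Read one sysctl value and return it as an integer
    out = subprocess.check_output(["sysctl", "-n", name])
    return int(out.decode().strip())

hits   = sysctl("kstat.zfs.misc.arcstats.hits")
misses = sysctl("kstat.zfs.misc.arcstats.misses")
size   = sysctl("kstat.zfs.misc.arcstats.size")

ratio = 100.0 * hits / (hits + misses)
print("ARC size: %.1f GiB" % (size / 2.0**30))
print("Cumulative hit ratio: %.1f%% (%d hits, %d misses)" % (ratio, hits, misses))
[/CODE]

Keep in mind those counters are cumulative since boot, so the ratio comes out smoother than the hour-by-hour graphs.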
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Because you are using some of your system swap, I would say that it might be good to increase your RAM.
You have a lot of activity in your ARC, but the system is not able to predict very well what data will be needed next, so your "ARC Hit Ratio" isn't very high. Mine is always up around 75%, but that is because I use the same data repeatedly and the system can predict that it will be needed again, so it keeps it in ARC. Because of the changing nature of your work (a guess on my part), your system will probably still not be very accurate at guessing what to keep in ARC, but with more memory to work with it will keep more things in ARC, and the chance of a match goes up just because of the quantity of data held there.
It is a little bit of a guessing game, literally, but I would say more memory would help you.
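Just to illustrate the "quantity of data held in ARC" point with made-up numbers, here is a toy model only; it assumes accesses are spread evenly over the working set, which yours won't be:

[CODE]
# Toy model: if accesses were spread evenly over the working set, the
# best-case hit ratio is roughly (usable ARC) / (working set size).
# The ARC sizes below assume roughly 80% of RAM ends up as ARC.

def best_case_hit_ratio(arc_gb, working_set_gb):
    return min(float(arc_gb) / working_set_gb, 1.0)

WORKING_SET_GB = 250            # middle of the 200-300 GB client folders mentioned

for ram_gb, arc_gb in [(48, 38), (72, 58)]:
    pct = best_case_hit_ratio(arc_gb, WORKING_SET_GB) * 100
    print("%d GB RAM (~%d GB ARC): ~%.0f%% of the working set can sit in ARC" % (ram_gb, arc_gb, pct))
[/CODE]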
 

JamesNisly

Dabbler
Joined
Nov 21, 2017
Messages
37
Thank you very much. That makes a lot of sense.

Yes, there is often a lot of bouncing around from one client's folder to another's, each of which might be 200-300GB. I've been trying to conceptualize the ARC hit ratio, and this helps. More RAM means a bigger target for ARC hits, and therefore a higher hit ratio.
 

sfcredfox

Patron
Joined
Aug 26, 2014
Messages
340

JamesNisly

Dabbler
Joined
Nov 21, 2017
Messages
37
Thanks a ton, that's exactly the kind of insight I was looking for.

This was an interesting little summary comment on that thread:

"We've been saying this for years.
You want the working set to fit within ARC (or ARC+L2ARC if you must). Well within ARC, ideally.

Performance falls off dramatically if the working set isn't within the ARC (ARC cold after a reboot, unusual traffic flushed stuff out of ARC, etc). Like jumping off a cliff."

If my "working set" is a 200-300GB folder of 30-50MB files, I don't think I could practically install enough RAM to fit all of that into ARC. Is this the time for some L2ARC? I was under the impression that that was typically the domain of much larger systems than mine.

Additionally, how much of this matters if everything is on a 1GbE connection for now? I'm investigating 10GbE, but I think that's a little way down the road.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
When you get a file for editing, are you able to see the transfer speed? Is it maxing out the 1Gb connection?
A dedicated ZIL device (SLOG) is the thing that most people don't need, unless they use block storage.
The thing that keeps us from suggesting L2ARC is that you must have enough RAM to hold the lookup table (not the technical term) for the data on the SSD you use for L2ARC, and the drive must be fast and reliable (endurance), although some of the problems with this have been improved in the last couple of years.
The general rule is to maximize the system RAM first and then consider L2ARC.
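To give a feel for what that lookup table costs, here is a back-of-the-envelope sketch. The per-record header size is an assumption on my part (it varies between ZFS versions), so treat the output as an order-of-magnitude estimate only:

[CODE]
# Very rough estimate of the RAM needed for the in-memory headers that track
# what lives on an L2ARC device. HEADER_BYTES is an assumed figure, not an
# exact one; it differs between OpenZFS versions.

HEADER_BYTES = 88               # assumed per-record overhead in RAM

def l2arc_header_ram_gb(l2arc_gb, avg_record_kb):
    records = (l2arc_gb * 2**30) / (avg_record_kb * 2**10)
    return records * HEADER_BYTES / 2**30

for rec_kb in (16, 128):        # 128K is the default recordsize; 16K shown for contrast
    ram = l2arc_header_ram_gb(500, rec_kb)
    print("500 GB L2ARC with %dK records: ~%.2f GB of RAM for headers" % (rec_kb, ram))
[/CODE]

With the default 128K recordsize and files the size of your RAW images, the overhead stays small; it mostly becomes a problem when the cached records are tiny.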

Sent from my SAMSUNG-SGH-I537 using Tapatalk
 

JamesNisly

Dabbler
Joined
Nov 21, 2017
Messages
37
It's not so much the opening of a single file where we need the speed; it's when we point our image browsing software at a folder of 8,000 files and say "show me thumbnails and previews of all of these" so we can start tagging and labelling for future import into Lightroom. That's the biggest holdup right now and where I'd like to gain the most performance.

In those instances, the Tx on the network interface usually tops out at about 300-500 Mbits/sec. What I'm unsure of is whether that remaining transfer speed is being left on the table because my local CPU is churning through the image data to generate previews, or because the server is waiting on data from the pool.

I'm just looking to eliminate bottlenecks anywhere that I can find them.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
You can look at the local machine's CPU, memory, and disk use to see if it is the choke point.

Sent from my SAMSUNG-SGH-I537 using Tapatalk
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
In those instances, the Tx on the network interface usually tops out at about 300-500 Mbits/sec. What I'm unsure of is whether that remaining transfer speed is being left on the table because my local CPU is churning through the image data to generate previews, or because the server is waiting on data from the pool.
This is what amounts to a 'large random' workload in terms of how the NAS handles all those file accesses. Once the files are read, they may be held in ARC in case they are accessed again, but the first pass is going to be slow because of the small number of vdevs you have. Actually, just one.
To really accelerate this type of work, you would need to organize your storage as mirrored vdevs. I did some quick figures, and this is just an estimate, but you would need 10 drives in mirrors to get the capacity and speed. That would give you 5 vdevs, and the number of vdevs is where the speed comes from: generally, more vdevs means more potential speed for random workloads.
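Here is the rough comparison I had in mind, with assumed round numbers for per-drive random IOPS (mirrors can also read from both members, so they may do a bit better than shown):

[CODE]
# Back-of-the-envelope comparison of the current pool vs the suggested layout.
# Usable capacity ignores ZFS overhead; the IOPS rule of thumb is that one
# vdev delivers roughly one drive's worth of random IOPS.

DRIVE_TB = 8
DRIVE_IOPS = 125                # assumed figure for a single 7200 rpm drive

layouts = {
    "7 x 8TB RAIDZ2 (current)": {"vdevs": 1, "data_drives": 5},
    "10 x 8TB as 5 mirrors":    {"vdevs": 5, "data_drives": 5},
}

for name, l in layouts.items():
    capacity = l["data_drives"] * DRIVE_TB
    iops = l["vdevs"] * DRIVE_IOPS
    print("%s: ~%d TB usable, ~%d random read IOPS" % (name, capacity, iops))
[/CODE]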
 

JamesNisly

Dabbler
Joined
Nov 21, 2017
Messages
37
Ok, I'll definitely keep that in the back pocket for future upgrades. I'll max out RAM for now and see where we are.
 

sfcredfox

Patron
Joined
Aug 26, 2014
Messages
340
Ok, I'll definitely keep that in the back pocket for future upgrades. I'll max out RAM for now and see where we are.
If you think of it, report back on your results after doing so. I'd like to see what tangible effect increasing the ARC size has on your particular workload.

It seems like the two ways this could go are:
1. (Likely) best choice - increase system RAM, allowing more data to fit into cache
- and if that doesn't work, or doesn't work well enough...
2. Increase the disk performance of the pool, so data can be read faster, by adding more vdevs

I'd like to see whether we're all correct...

Good luck!
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
Contrary opinion: If your working set was 300GB, you'd have 300GB on your workstation.

It sounds like the access pattern is batch processing, so there isn't much to be gained from excessive cache. If I am wrong, then the truly smart thing to do would be buy a 500GB SSD for each workstation, copy the relevant data to it, run your processing on the workstation at new amazing speed, then store output back to ZFS.
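Something along these lines is what I have in mind; the paths are made up, Python is only to sketch the flow, and a real version would need to handle deletions and sidecar/metadata files more carefully:

[CODE]
# Sketch of the "work locally, store back" flow: pull a job folder from the
# NAS share to a local SSD, edit there, then push back only files that are
# newer locally. Paths are hypothetical examples.
import shutil
from pathlib import Path

NAS_JOB   = Path("/Volumes/photos/jobs/example_job")      # hypothetical SMB mount
LOCAL_JOB = Path("/Users/editor/scratch/example_job")     # hypothetical local SSD

def pull():
    # Copy the whole job folder down to the local SSD (Python 3.8+ for dirs_exist_ok)
    shutil.copytree(NAS_JOB, LOCAL_JOB, dirs_exist_ok=True)

def push_changes():
    # Copy back anything that is newer on the local SSD than on the NAS
    for src in LOCAL_JOB.rglob("*"):
        if not src.is_file():
            continue
        dst = NAS_JOB / src.relative_to(LOCAL_JOB)
        if not dst.exists() or src.stat().st_mtime > dst.stat().st_mtime:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)        # copy2 preserves timestamps
[/CODE]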
 

sfcredfox

Patron
Joined
Aug 26, 2014
Messages
340
Contrary opinion: If your working set was 300GB, you'd have 300GB on your workstation.

It sounds like the access pattern is batch processing, so there isn't much to be gained from excessive cache. If I am wrong, then the truly smart thing to do would be buy a 500GB SSD for each workstation, copy the relevant data to it, run your processing on the workstation at new amazing speed, then store output back to ZFS.
So you're suggesting some sort of local copy or locally cached copy? Wouldn't that require a sync or replication strategy since they want to write these "tags" or some kind of metadata to the files (assumption)?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
It's not so much the opening of a single file where we need the speed; it's when we point our image browsing software at a folder of 8,000 files and say "show me thumbnails and previews of all of these" so we can start tagging and labelling for future import into Lightroom. That's the biggest holdup right now and where I'd like to gain the most performance.

In those instances, the Tx on the network interface usually tops out at about 300-500 Mbits/sec. What I'm unsure of is whether that remaining transfer speed is being left on the table because my local CPU is churning through the image data to generate previews, or because the server is waiting on data from the pool.

I'm just looking to eliminate bottlenecks anywhere that I can find them.
What are you using to browse and tag?
What kind of computer is it?

Sent from my SAMSUNG-SGH-I537 using Tapatalk
 

JamesNisly

Dabbler
Joined
Nov 21, 2017
Messages
37
We use Adobe Bridge CC 2018 and Camera Bits' Photo Mechanic on a 2013 six-core "trash can" style Mac Pro.
 

JamesNisly

Dabbler
Joined
Nov 21, 2017
Messages
37
The above is what I work on. There are also an iMac, two Mac Minis, and a MacBook Pro that make an appearance on the network once in a while.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I looked into this, checked on the software and on the Mac Pro (not the others), and it looks to me like the two places where you would get the most speed improvement are the storage configuration change I mentioned previously and moving to a 10Gb network. Neither of those is inexpensive or easy. With the Mac Pro, you have Thunderbolt ports to connect things, but you would need a Thunderbolt-to-10Gb-Ethernet adapter to connect to the server. You would also need a 10Gb switch (if you want to share the speed with the office) and a 10Gb network card in the server.
In those instances, the Tx on the network interface usually tops out at about 300-500 Mbits/sec. What I'm unsure of is whether that remaining transfer speed is being left on the table because my local CPU is churning through the image data to generate previews, or because the server is waiting on data from the pool.
Before you could take advantage of any of the improved speed that 10Gb networking would offer, you would need a faster storage pool. The speed you are seeing now is basically as fast as the disks in the pool are able to deliver data, based on some rough calculations I did estimating the mechanical speed of the drives you are using. So the first step would be faster storage; that would give you better performance right away, and even better performance once you were able to go to 10Gb networking. Faster storage requires more drives, because each individual drive is a slow point, and the only way to make the pool faster is to spread the data across a lot of drives so each one handles only a little of it, simultaneously. The way the math works out, each drive does not need to be large: you get capacity by having a lot of drives, and you get speed because each drive only serves a small share of the data. The SAN we have at work is pretty new (we just brought it online last year) and, as an example, it uses 300GB drives, but about 200 of them, because of the speed needed to respond to multiple simultaneous requests for data from a network with about 1000 users.
Because of the 8TB drives, I am guessing that the storage server you have is pretty new, but if the need for speed is serious, you might want to consider the possibility of a replacement. I can put together some hardware suggestions if you are interested in this.
 

JamesNisly

Dabbler
Joined
Nov 21, 2017
Messages
37
To be clear, the network transfer speed I mentioned above was 300-500 Megabits/sec (small b), so about 40-60 MegaBytes/sec, or roughly half of what 1GbE should theoretically be able to handle. Do you think my pool in its current configuration is really topping out at 60 MegaBytes/sec? That seems surprising.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
To be clear, the network transfer speed I mentioned above was 300-500 Megabits/sec (small b), so about 40-60 MegaBytes/sec, or roughly half of what 1GbE should theoretically be able to handle. Do you think my pool in its current configuration is really topping out at 60 MegaBytes/sec? That seems surprising.
Unfortunately, yes, but that is partly due to the large number of relatively small files being accessed and partly due to the mechanical limitations of the drives. Not knowing the exact model of the drives you are using, I can only estimate their mechanical transfer speed, but with that estimate, and knowing you are using RAID-Z2, I figured you should see a maximum of 300 to 400 MB/s from the pool. That would be great for large sequential file transfers, but with the kind of work you are doing, the system is touching every file in a directory to generate a thumbnail. That is relatively slow even on a locally attached SSD; it is the slowest kind of file access. There are things that can be done, like I was talking about above.
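A toy calculation of how that plays out; every number here is an assumption, but it shows how a fixed cost per file drags the effective rate down. Somewhere around half a second of overhead per file is enough to pull a 350 MB/s pool into the 40-60 MB/s range you are seeing on the network graph:

[CODE]
# Toy model: each file costs its streaming time plus a fixed per-file cost
# (seeks, metadata lookups, SMB round trips, preview generation).
# All constants are assumptions, chosen only to illustrate the shape of it.

SEQ_MBPS = 350       # assumed sequential ceiling of the pool, MB/s
FILE_MB  = 40        # typical RAW file in this workload

for per_file_sec in (0.05, 0.2, 0.5):              # assumed fixed cost per file
    seconds = FILE_MB / SEQ_MBPS + per_file_sec    # time to deliver one file
    print("%.2f s/file overhead -> ~%.0f MB/s effective" % (per_file_sec, FILE_MB / seconds))
[/CODE]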
 