Tuning ARC / L2ARC for backup

Status
Not open for further replies.

memhog

Dabbler
Joined
Dec 28, 2013
Messages
11
My goal is to have continuous backups (let's say every 5 to 10 minutes) use the least amount of resources on the sending machine. Ideally, I could keep all of the most recent writes in ARC (or L2ARC).

I am not clear on the definition of LRU when referring to writes.
As a worst case, I have a flood of writes coming from the network that are being flushed to disk. Are writes considered "recently used" with regard to the cache?
If writes are considered "recently used", what tuning should I look at to minimize going back to the HDD pool when the system takes a snapshot and then sends it to the backup server?
Finally, if I know that these writes are streaming (not being read back), is there a way to release that ARC / L2ARC pool of writes after they have been used for the backup process?
I apologize if I am confusing different ZFS mechanisms.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
You aren't going to get what you want with the ARC/L2ARC. In short, data isn't stored in the L2ARC until it's been requested multiple times, and then it isn't removed from the L2ARC until it has reached the point where it's "expired". There's also limited throughput to the L2ARC, so you aren't writing tons of data to the L2ARC that you will never use.

In short, what you want to do isn't going to be possible.
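
To put rough numbers on the throughput limit, here is a back-of-the-envelope sketch. It assumes the L2ARC feed is capped at about 8 MiB/s (the commonly cited default for the `l2arc_write_max` tunable; check the value on your own system) and a hypothetical sustained incoming write rate of 100 MiB/s. Neither figure comes from this thread.

```python
def l2arc_coverage(incoming_mib_per_s: float, feed_cap_mib_per_s: float = 8.0) -> float:
    """Fraction of a sustained write stream that a rate-limited L2ARC feed
    could keep up with. feed_cap_mib_per_s is an assumed default, not a
    measured value."""
    if incoming_mib_per_s <= 0:
        raise ValueError("incoming rate must be positive")
    return min(1.0, feed_cap_mib_per_s / incoming_mib_per_s)


# Hypothetical flood of 100 MiB/s against an assumed 8 MiB/s feed cap.
print(f"{l2arc_coverage(100):.0%} of the stream could reach the L2ARC")
```

Under those assumptions only a small fraction of a write flood could ever land on the cache device, regardless of how large the SSD is.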
 

memhog

Dabbler
Joined
Dec 28, 2013
Messages
11
So does the written data stay in the ARC until space is needed? Or is written data in a different buffer altogether?
I am trying to avoid having to go all the way back to the zpool for data that I know I will be writing again to the backup server in a relatively short period of time.

Do you have other suggestions to optimize for this particular backup process?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
You WILL have to go back to the zpool for data. No recommendations except to get enough hardware and optimizations to deal with the performance penalty (if it even affects the server's ability to do its job). I will warn you that doing 10-minute snapshots is going to be painful to administer. Most people find that 2-hour snapshots are just as effective. In the past, when people have demanded 5-minute snapshots, they have changed their minds as soon as they saw the price tag of achieving such feats. So the organization should determine if they are actually willing to pay for that kind of resolution. Otherwise, just stick with 2-hour or daily or whatever.
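
As a rough illustration of the administrative load, the sketch below counts how many snapshots each interval produces. The 30-day retention period is a hypothetical figure chosen for the example, not something stated in this thread.

```python
def snapshots_per_day(interval_minutes: int) -> int:
    """Number of snapshots taken in 24 hours at a fixed interval."""
    if interval_minutes <= 0:
        raise ValueError("interval must be positive")
    return (24 * 60) // interval_minutes


RETENTION_DAYS = 30  # hypothetical retention window

for interval in (5, 10, 120):
    per_day = snapshots_per_day(interval)
    print(f"every {interval:>3} min: {per_day:>3} per day, "
          f"{per_day * RETENTION_DAYS} kept over {RETENTION_DAYS} days")
```

A 5-minute schedule produces 288 snapshots per day against 12 for a 2-hour schedule, so every listing, pruning, and replication step has 24 times as many objects to handle.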
 

memhog

Dabbler
Joined
Dec 28, 2013
Messages
11
Does "price tag" refer to complexity, or does it mean that I will need much more equipment (bandwidth, storage, etc.)?
I came across this approach for automating the backups:
http://zpool.org/2013/09/06/zfs-snapshots-and-remote-replication

and thought that it would be great if I could somehow tune my system so that it does not have to retrieve the last X minutes from HDD (if necessary, I could accommodate it with additional SSD, either as L2ARC or as a separate zpool, or in the worst case somehow segment RAM for this purpose).
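
For readers unfamiliar with the snapshot-and-send cycle being discussed, here is a minimal sketch of one incremental replication step. It only builds the command lines and does not run them. The dataset names, snapshot names, and host (`tank/data`, `auto-1200`, `auto-1210`, `backuphost`, `backup/data`) are hypothetical, and this is not necessarily the method described in the linked article.

```python
import shlex


def replication_commands(dataset, prev_snap, new_snap, backup_host, backup_dataset):
    """Build the commands for one incremental snapshot-and-send cycle.

    Returns (snapshot_command, send_receive_pipeline) as shell strings.
    Nothing is executed here.
    """
    new_full = f"{dataset}@{new_snap}"
    prev_full = f"{dataset}@{prev_snap}"
    # Take the new snapshot on the sending machine.
    snapshot_cmd = ["zfs", "snapshot", new_full]
    # Send only the blocks that changed between the two snapshots.
    send_cmd = ["zfs", "send", "-i", prev_full, new_full]
    # Receive on the backup server over ssh.
    recv_cmd = ["ssh", backup_host, "zfs", "receive", "-F", backup_dataset]
    pipeline = f"{shlex.join(send_cmd)} | {shlex.join(recv_cmd)}"
    return shlex.join(snapshot_cmd), pipeline


snap, pipe = replication_commands(
    "tank/data", "auto-1200", "auto-1210", "backuphost", "backup/data"
)
print(snap)
print(pipe)
```

Because `zfs send -i` transfers only the difference between the two snapshots, the data read back on the sender is limited to what was written since the previous cycle. Whether those reads are served from ARC or from disk depends on the state of the cache at that moment.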

Grateful for your insights.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
"Price tag" is a combination of whatever is needed based on the performance required, the hardware available, the bottlenecks that may exist, and several other factors. In short, you're going to have to figure this one out on your own, as the issue is far more complex than can be explained away in a few minutes of typing.
 