Datasets and Replication - Keep them from getting too big

Status
Not open for further replies.

dpearcefl

Contributor
Joined
Aug 4, 2015
Messages
145
I believe in sharing my mistakes so that others can learn from them, so here is my latest one...

I have a FreeNAS Mini XL fully stocked. I had a dataset that contained all of the Veeam backups for our VMware environment. I have a second FreeNAS Mini XL that I replicate to.

The dataset eventually grew to 18 TB in size. When the replication worked, it took 3-4 days to transfer. Eventually it stopped working altogether because the replication kept failing.

Networking: Both units are plugged into the same Cisco 2960X switch, so the link should be as fast and stable as possible.

Don't let your datasets grow up to be too big.

I am now in the process of splitting the one big dataset into four child datasets. This should make the replication cycles quicker and more resilient. I'll let you know.
 

pro lamer

Guru
Joined
Feb 16, 2018
Messages
626
Just curious: what is the total usable (not the raw) size of your pool?
 
Joined
Jan 18, 2017
Messages
524
Holy smoke, 3-4 days? Was it transferring the entire 18 TB every time?
 

dpearcefl

Contributor
Joined
Aug 4, 2015
Messages
145
[Attached image: upload_2018-3-22_14-30-12.png]


About 75% of it, because most of the data changed.
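
For a rough sense of scale, the figures in this thread (18 TB dataset, about 75% changed, 3-4 days per replication) imply an average transfer rate that can be worked out as below. This is only a back-of-the-envelope sketch: it assumes decimal terabytes and a steady transfer, and the function name `implied_mbps` is made up for illustration.

```python
def implied_mbps(size_tb: float, days: float) -> float:
    """Average rate in Mbps needed to move size_tb (decimal TB) in `days` days."""
    bits = size_tb * 1e12 * 8      # terabytes -> bits
    seconds = days * 86_400        # days -> seconds
    return bits / seconds / 1e6    # bits per second -> megabits per second


changed_tb = 18 * 0.75  # roughly 75% of the 18 TB dataset changed

for days in (3, 4):
    print(f"{days} days: about {implied_mbps(changed_tb, days):.1f} Mbps on average")
```

If those numbers are accurate, the replication averaged somewhere in the low hundreds of Mbps, so the sheer volume of changed data accounts for most of the multi-day transfer time.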
 

PhilipS

Contributor
Joined
May 10, 2016
Messages
179
I don't know how Veeam backups work, but I can share what I do to handle database backups.

Since every newly written block in ZFS has to be replicated, I didn't want to write my DB backups directly to FreeNAS, because a significant portion of each backup is identical to the previous one (the backups are neither encrypted nor compressed). Instead, I write the backups to a scratch area and then run rsync --inplace over the last backup stored in FreeNAS. This limits the blocks that are modified and greatly reduces my snapshot sizes. I also use gzip compression on the DB backup dataset, since space trumps performance in this case.
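
To illustrate why updating in place helps, here is a minimal Python sketch of the idea. It is not rsync's actual delta algorithm, just a local block-by-block comparison that rewrites only the blocks that differ. Without --inplace, rsync builds a new copy of the file and renames it over the old one, so every block counts as newly written. The function name `update_in_place` and the 128 KiB block size (the ZFS default recordsize) are assumptions for the example.

```python
import os

BLOCK = 128 * 1024  # assumed block size; matches the default ZFS recordsize


def update_in_place(src_path: str, dst_path: str, block: int = BLOCK) -> int:
    """Overwrite only the blocks of dst that differ from src; return blocks written."""
    written = 0
    # Make sure dst exists so it can be opened for reading and writing.
    if not os.path.exists(dst_path):
        open(dst_path, "wb").close()
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        offset = 0
        while True:
            new = src.read(block)
            if not new:
                break
            dst.seek(offset)
            old = dst.read(len(new))
            if old != new:
                # Only changed blocks are rewritten, so only they land in the next snapshot.
                dst.seek(offset)
                dst.write(new)
                written += 1
            offset += len(new)
        # Drop any tail left over from a longer previous backup.
        dst.truncate(offset)
    return written
```

Calling `update_in_place("scratch/db.bak", "/mnt/tank/backups/db.bak")` (both paths hypothetical) returns the number of blocks that were actually rewritten. This only pays off when consecutive backups share most of their bytes at the same offsets.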
 

dpearcefl

Contributor
Joined
Aug 4, 2015
Messages
145
Veeam backups are massive files that are typically compressed and deduped already. In our case, the files change almost completely each time a backup runs, which means a lot of replication traffic.
 

ecunningham

Cadet
Joined
Nov 21, 2017
Messages
5
I'm interested in learning how others handle this as well. We have a similarly sized dataset (~20 TB) that we back up with Veeam to FreeNAS. The daily change rate is tough to calculate, but I estimate that about 1 TB needs to replicate every day to our offsite FreeNAS device (an identical system 15 miles away). Our bandwidth is 200 Mbps with 3 ms latency, and our replication was typically taking 12 hours to complete. I had snapshots running every hour, replicating, and retained for 48 hours. It would sometimes fall behind on weekends when the larger full backups ran, but it had usually caught up by Monday afternoon.

Trouble came when we added more virtual servers to this target through Veeam (we back up less important servers elsewhere locally). The replication fell behind and the snapshots grew (they were retained until replication completed) until they consumed all of the space. Veeam files are huge, and as Veeam deletes old backups to make room for new ones, the old snapshots don't release that space until replication completes and the snapshot is freed.

We're looking to increase our bandwidth to 500 Mbps, possibly add larger disks, and maybe add a WAN accelerator, but I'm wondering if my snapshot/replication schedule can be adjusted to ease some of this pain. Would one snapshot every 24 hours be better? It would retain less, but the replication wouldn't start as quickly. I had hoped to retain snapshots on my remote end, but that doesn't seem feasible considering how large the snapshots get.
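
As a sanity check on those figures, the ideal transfer time for the estimated 1 TB of daily change can be computed at both link speeds. This is a rough sketch that assumes decimal terabytes and a fully used link with no protocol overhead; the function name `transfer_hours` and the `efficiency` parameter are hypothetical.

```python
def transfer_hours(size_tb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to send size_tb (decimal TB) over a link of link_mbps.

    efficiency is the assumed fraction of the link actually usable (1.0 = ideal).
    """
    bits = size_tb * 1e12 * 8
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    return bits / usable_bits_per_second / 3600


for mbps in (200, 500):
    print(f"{mbps} Mbps: {transfer_hours(1.0, mbps):.1f} hours for 1 TB")
```

At 200 Mbps the ideal figure comes out at roughly 11 hours, which is in line with the 12 hours reported above and leaves only about half a day of headroom. That would explain why the larger weekend fulls, or any additional servers, push replication behind. At 500 Mbps the same 1 TB would ideally take between four and five hours.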

Anyway, hoping to see more discussion around what others do with similar setups.
 