Replication task fills tebibytes of space with 2 GiB snapshot

ChubbyBunny
Cadet
Joined: Dec 19, 2018
Messages: 3
Hello, the company I'm working at is having a problem with FreeNAS replication.
The replication task starts, and although the snapshot being sent is small, if I watch the free space on the replication target machine it shrinks and shrinks until there's nothing left, at which point the backup fails. There's more than 2 TiB available on the target, so it shouldn't be running out of space.

[Screenshot: Screen Shot 2018-12-19 at 2.29.31 PM.png]

This is a screenshot of the target machine's tank pool showing the remaining space. I took it while the new replication was running; it should have far more space available, somewhere in the neighbourhood of 2.8 TiB, but the task has been going for a while now.
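In case it helps with diagnosing this, here is roughly what I have been running on the target box to see where the space is going (I'm assuming the dataset layout mirrors the source, so the names below may not be exact):

# Break down the target pool's space usage: USEDSNAP is space held only by
# snapshots, USEDDS is the live data in each dataset
zfs list -o space -r tank

# Overall pool size, allocation, and capacity percentage
zpool list tank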

Here are the emails it sends while it's doing this:

The capacity for the volume 'tank' is currently at 94%, while the recommended value is below 80%.
The capacity for the volume 'tank' is currently at 95%, while the recommended value is below 80%.
The capacity for the volume 'tank' is currently at 96%, while the recommended value is below 80%.

Hello,
The replication failed for the local ZFS tank/data1 while attempting to
send snapshot auto-20181115.0600-2w to 192.168.128.34

I deleted a 900 MiB snapshot that I thought was the issue, but sadly it has gotten hung up again on a different snapshot.
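For reference, this is roughly how I found and removed that snapshot; the snapshot name below is just an example of the auto-... naming scheme, not the exact one I deleted:

# List snapshots of the replicated dataset, sorted by the space they hold
zfs list -t snapshot -o name,used,refer -s used -r tank/data1

# Destroy one specific snapshot (example name; substitute the real one)
zfs destroy tank/data1@auto-20181105.0600-2w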


System Information
Build: FreeNAS-9.10.2-U6 (561f0d7a1)
Platform: Intel(R) Core(TM) i5-3470T CPU @ 2.90GHz
Memory: 16252MB
System Time: Wed Dec 19 13:31:53 PST 2018
Uptime: 1:31PM up 20 days, 22:14, 0 users
Load Average: 1.69, 1.81, 2.23


[Screenshot: Screen Shot 2018-12-19 at 1.27.51 PM.png]



[Screenshot: Screen Shot 2018-12-19 at 1.35.13 PM.png]


I only set up the email system yesterday, and I'm getting emails about a snapshot from November, which means the FreeNAS servers have been eating bandwidth on the local network for days now...

I inherited this FreeNAS system from a previous employee who's no longer here, so I'm new to these issues and systems. Let me know what steps or direction I should take.
 

ChubbyBunny
Cadet
Joined: Dec 19, 2018
Messages: 3
I can't see your reply, Chris, but I did get it via email:

You are sharing your pool for iSCSI. This means you should never fill the pool beyond 50%. So you need to double the size of your pool just to have the minimum capacity, and probably go 3x to give you some room to grow.

Did you talk to anyone here before you built this?


I inherited this from the previous IT guy, who inherited it from the IT guy who came before him, but it is professionally built and has iXsystems branding on the back with the iXsystems phone number.
I found the model online just now by looking up the CPU, and it's this one:
https://www.smallnetbuilder.com/nas/nas-reviews/32162-ixsystems-freenas-mini-plus-reviewed
The FreeNAS Mini Plus from 2013 (it looks exactly the same and has the same CPU).
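For the record, this is roughly how I've been checking how full the pool is against that 50% guideline (standard ZFS tools, which I believe is what FreeNAS uses underneath):

# The CAP column shows how full the pool is as a percentage
zpool list tank

# Per-dataset used and available space
zfs list -o name,used,avail,refer tank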
 

toadman
Guru
Joined: Jun 4, 2013
Messages: 619
Maybe I don't understand well enough (highly likely).

From the snapshot list you showed, the "Refer" size (5th column) is 4.4TB. If that's the case, and the target system for the replication doesn't already have that data on it, then won't the entire 4.4TB transfer over? In other words, the filesystem on the target system has to end up holding all the data in that 4.4TB. Then subsequent snapshots will just carry the deltas.

Is that the case? Is the 4.4TB referred to already present on the target system as part of the 5.4TB listed in your first post? Or is the 5.4TB on the target system different data than what's contained in the snapshots? In that case, wouldn't the source try to add the 4.4TB in the snapshots on top of the 5.4TB already present on the target?

For example: on the source I create 100GB of data. I snapshot it and send it over to the target, so the target now has 100GB used. On the source I delete the original 100GB and the original snapshot, and I create a new and different 100GB. I snapshot the new data and send that new snapshot over to the target. Now the target will have 200GB: the original 100GB from the first snapshot and the new and different 100GB from the 2nd snapshot. (Unless it's configured to remove stale snapshots.)
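To put that in command terms (roughly what the replication is doing under the hood; host and dataset names are just placeholders):

# First replication: a full send, so the target ends up holding everything
# the snapshot refers to
zfs send tank/data1@snap1 | ssh backuphost zfs receive backup/data1

# Subsequent replications: incremental sends carry only the blocks that
# changed between the two snapshots
zfs send -i tank/data1@snap1 tank/data1@snap2 | ssh backuphost zfs receive backup/data1

# If the common snapshot disappears on either side, the next send has to be a
# full send again, and that data is additive on the target unless the stale
# copy there is destroyed first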
 

ChubbyBunny
Cadet
Joined: Dec 19, 2018
Messages: 3
The data up to that point was the same on both of them; out of the blue it seemingly stopped recognizing the data on the target as the data it was used to.
In the end, the solution I used was to destroy the dataset on the replica server, wait for the disk space to clear, and then turn the replication task back on on the primary. It has been working for one day now; we'll see if this happens again.
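For anyone who runs into the same thing, the cleanup amounted to roughly the following on the replica (dataset names approximate); after that the FreeNAS replication task did a fresh full send on its own:

# On the replica: destroy the out-of-sync dataset and all of its snapshots
zfs destroy -r tank/data1

# What the task then effectively does from the primary: a full send of the
# latest snapshot (the snapshot name is just an example of the pattern)
zfs send -R tank/data1@auto-20181219.0600-2w | ssh 192.168.128.34 zfs receive -F tank/data1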
 

Apollo
Wizard
Joined: Jun 13, 2013
Messages: 1,458
@ChubbyBunny, the issue is not on the source pool but on the backup pool.
I think your backup drive is either too small, or it already contained more snapshots than the source.
Read up on snapshots and ZFS blocks and you will understand; if not, ask.
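A quick way to see how much of each pool is pinned by snapshots versus live data (run it on both boxes; I'm assuming the pool is called tank on each):

# Space held only by snapshots vs. space used by the live dataset
zfs get -r usedbysnapshots,usedbydataset tank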

 
