Deduplication or compression for backup

Status
Not open for further replies.

xhoy

Dabbler
Joined
Apr 25, 2014
Messages
39
Hi,

We have a really nicely working FreeNAS server with lots of space and good performance; we use a small SLOG for sync writes. The NAS serves 3 VMware ESXi nodes that mount 3 different datasets via NFS (production, test, staging). On these ESXi nodes there are mostly Windows machines (some Linux).

Now to the question. We recently replaced all our Windows Server 2008 machines with 'new' 2012 servers. That leaves us with some 'old' servers that we don't need any more but would like to keep for future reference.

My thought was to create a new dataset, enable compression and/or deduplication, and copy all the old VMs there as a 'live backup'.

Now there are some things I would like to know:
  1. Is this the correct way of doing this? (Or does one of you have a better suggestion?)
  2. If we choose to do this with a dataset, what would be the best settings to minimize disk space? (A rough sketch of what I mean follows this list.)
    1. If using deduplication: it is said to cost quite some RAM; is that true for 'inactive' blocks as well?
    2. If using compression: what settings would you recommend?
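
To make question 2 concrete, the kind of setup I have in mind looks something like this (pool and dataset names are made up):

    # archive dataset with cheap, always-on compression
    zfs create -o compression=lz4 tank/vm-archive
    # or trade CPU for a (possibly) better ratio
    zfs set compression=gzip-6 tank/vm-archive
    # dedup can be enabled per dataset (the RAM cost is what I'm asking about)
    zfs set dedup=on tank/vm-archive
    # after copying the VMs in, check what it actually saved
    zfs get compressratio tank/vm-archive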

Thanks in advance!

EDIT: Oh, it's an E3-1245 v3, 32GB ECC, 6x 2TB WD enterprise drives in RAID-Z2, total capacity 6.7TB. Total 'backup' size is ~1.8TB, spread across 5 VMs.
 
Last edited:

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Compression can't hurt on modern systems.

Deduplication will easily bite you in the ass. You'd need a crapton of RAM. Probably even several craptons.
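
If you want an actual number instead of craptons, ZFS can simulate the dedup table on your existing data without enabling anything (pool name assumed to be 'tank'):

    # read-only simulation of dedup across the whole pool; can take a while
    zdb -S tank
    # the output ends with a block histogram and an overall "dedup = N" ratio;
    # figure roughly 320 bytes of RAM per unique block to hold the DDT in core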
 

xhoy

Dabbler
Joined
Apr 25, 2014
Messages
39
Compression can't hurt on modern systems.

Deduplication will easily bite you in the ass. You'd need a crapton of RAM. Probably even several craptons.

So you would suggest using gzip (level 6+) instead of lz4?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
So you would suggest using gzip (level 6+) instead of lz4?

You can try, but I'm not sure it's worth it over lz4.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
So you would suggest using gzip (level 6+) instead of lz4?

I would suggest you do your own testing in-house on your actual data and figure out what compression is suitable for your exact situation. ;)
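
Something along these lines gives you real numbers from your own data (pool, dataset, and VM names invented):

    # one throwaway dataset per candidate setting
    zfs create -o compression=lz4 tank/ctest-lz4
    zfs create -o compression=gzip-6 tank/ctest-gzip6
    zfs create -o compression=gzip-9 tank/ctest-gzip9
    # copy the same sample VM into each...
    cp -a /mnt/tank/production/sample-vm /mnt/tank/ctest-lz4/
    cp -a /mnt/tank/production/sample-vm /mnt/tank/ctest-gzip6/
    cp -a /mnt/tank/production/sample-vm /mnt/tank/ctest-gzip9/
    # ...and compare the ratios (and note how long each copy took)
    zfs get compressratio tank/ctest-lz4 tank/ctest-gzip6 tank/ctest-gzip9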
 

xhoy

Dabbler
Joined
Apr 25, 2014
Messages
39
OK, I did some testing, or at least I tried.
Every time I start a copy to the new dataset, everything gets REALLY sluggish and the response times of the box skyrocket (from <10 ms to >250 ms).

Is there a way to prioritize this? I know I could just renice the process, but I would have thought ZFS had a dataset priority feature? I couldn't find anything except renice.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Uh... ZFS is in the kernel. It's already wicked fast. But using compression (or dedup) is CPU intensive. I don't even do gzip 9 on anything because it can bring my Xeon to its knees.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
If you're archiving, gzip -9 is basically a one-time hit but probably offers the best savings.
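
One caveat: the compression property only applies to blocks written after it is set, so set it on the archive dataset before you copy anything in (names assumed):

    zfs set compression=gzip-9 tank/vm-archive
    # existing blocks are not rewritten; only data written from now on is gzip-9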
 

xhoy

Dabbler
Joined
Apr 25, 2014
Messages
39
The problem is not that it isn't fast. It's that a local mv command uses *SO MUCH* I/O that other clients (like my NFS mounts) break.

Since it's only for archiving, I thought it might be possible to just run it in the background or give it a low I/O priority.

-- edit ---
Oh, and it's not the CPU that is the problem; the 8 2TB disks just can't handle 200 MB/s plus some random I/O from 10 VMs... which results in super-high datastore latencies.
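
For now I'm working around it by throttling the copy instead of using a plain mv (the limit is an arbitrary number I picked, and the paths are made up):

    # cap the archive copy at ~50 MB/s (--bwlimit is in KB/s) to leave IOPS headroom
    rsync -a --bwlimit=51200 /mnt/tank/production/old-vm/ /mnt/tank/vm-archive/old-vm/
    # verify the copy, then delete the source, instead of a single mv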
 

gpsguy

Active Member
Joined
Jan 22, 2012
Messages
4,472
... some random I/O from 10 VMs... which results in super-high datastore latencies

Perhaps you should have created a pool of striped mirrors instead of using RAIDZ2.
 

xhoy

Dabbler
Joined
Apr 25, 2014
Messages
39
Perhaps you should have created a pool of striped mirrors instead of using RAIDZ2.
Yes, maybe I could have, but I haven't. :)

So moving the files is just housekeeping, and I would think it shouldn't be possible for one process to create so much load that it blocks all other I/O... or am I missing something?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Yes, maybe I could have, but I haven't. :)

So moving the files is just housekeeping, and I would think it shouldn't be possible for one process to create so much load that it blocks all other I/O... or am I missing something?

Pool load is not determined by the number of processes. One process flooding the pool with massive writes is effectively the same as 50,000 processes flooding the pool with one write per second.

Any pool can become flooded if the amount of traffic exceeds its capacity to cope. When that happens, latency increases, which usually slows the rate of requests down to whatever the pool is actually able to cope with.

This can have the unfortunate side effect of seeming to "block all other i/o", but it isn't actually doing that... it's just increasing latency. Probably dramatically.

As an administrator, you are expected to create a pool that is capable of sustaining your I/O load. RAIDZ is a poor choice for this, because IOPS of a RAIDZ vdev is closely related to the IOPS capacity of a single component device in that RAIDZ.

Additionally, you are expected to live within the IOPS budget that you have created for yourself (by designing the pool in whatever manner you've chosen). If you choose to flood your pool with I/O above and beyond what it is capable of sustaining, what exactly do you expect it to do?
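
If you want to watch it happen, per-vdev statistics make saturation obvious (pool name assumed):

    # per-vdev bandwidth and IOPS, refreshed every 5 seconds
    zpool iostat -v tank 5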
 