ZFS with Deduplication - SSD Cache for Write Speedup?

alekals

Cadet
Joined
Nov 13, 2018
Messages
3
Hello, I'm trying FreeNAS on my spare machine (a workstation) with this hardware:
Xeon, 4 cores @ 3 GHz
16 GB ECC RAM (I can bring it to 32 if necessary)
1 x 120 GB SSD (FreeNAS boot pool installed here)
3 x 4 TB WD Red in RAIDZ1 (data pool)

I enabled deduplication on ZFS and so far I'm happy. I use NFS and copy my backups at about 60-70 MB per second. NOT BAD!

Deduplication is not optional for me: I use it to keep 10 days of vmdk or KVM images of my virtual machines.

Deduplication really works well for my use case.

At the moment this setup leaves about 110 GB of the SSD unpartitioned and unused, which feels like a waste!


That said, my goal would be to reach about 100 MB/s or more when copying.
If I partition that remaining 110 GB and add it as a ZFS cache device, would I see some improvement? Would it also help if 16 GB of RAM turns out not to be enough?

In the ZFS cache the data is only cached, so if that device fails I don't have to worry about data loss, right?

My idea of an SSD write cache is that it would quickly absorb my backup and then drain to the HDDs more slowly in the background, so at the end of the copy some writes would continue until the cache is empty.

I still don't fully understand how these ZFS caches work, sorry about that.
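For reference, a quick way to check the current savings would be something like the following (the pool name "tank" is just a placeholder for mine):

Code:
# Show pool capacity and the DEDUP ratio column ("tank" is a placeholder pool name)
zpool list tank

# Confirm which datasets actually have dedup enabled, and the achieved compression ratio
zfs get dedup,compressratio tank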
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
You'd better do a lot more reading on ZFS and how it works. Deduplication takes a large amount of memory to function correctly and is likely killing the performance of your setup. You are also nowhere close to having enough memory to run an L2ARC (the "cache" you're referring to). Your idea of how ZFS works and how it actually works are two completely different things.
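If you want to see what dedup is actually costing you in memory, something along these lines will show it (the pool name is a placeholder; the sysctl names are the FreeBSD/FreeNAS ones):

Code:
# Dedup table (DDT) histogram and entry counts ("tank" is a placeholder pool name)
zpool status -D tank

# More detailed DDT statistics, including in-core and on-disk entry sizes
zdb -DD tank

# How much of the ARC is currently holding metadata, and the limit (bytes)
sysctl kstat.zfs.misc.arcstats.arc_meta_used kstat.zfs.misc.arcstats.arc_meta_limit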
 

alekals

Cadet
Joined
Nov 13, 2018
Messages
3
I've read a lot about ZFS and deduplication, even though I don't have years of experience with ZFS like I do with other filesystems.

I know that I should put in at least 128 GB of RAM before I worry about an L2ARC.

My question simply came from the fact that in my reading I haven't yet found a place that fully explains how ZFS's ARC and L2ARC work.
However, I have this unused space on the 120 GB SSD; the question is simple: is there a way to use it safely for ZFS writes?
Otherwise it will just sit there; when I run into performance problems my plan is to pull another 32/64 GB of RAM from another PC and put it into the current FreeNAS box.
What do we do with this 120 GB SSD???
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
What do we do with this 120 GB SSD???
If it's a boot drive as you say, nothing. The boot drive is only good for booting. 120 GB SSDs are cheap enough that it still makes sense as a reliable boot device. In ZFS, the cache device (L2ARC) is only for read operations on frequently used data, so it will not help for backups. The write "cache" does not work the way you would expect from classical filesystems: it is only used for synchronous writes and will not make writes faster than your pool can handle. For deduplication, you need more RAM.
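To make that concrete, these are the only two ways a spare SSD partition attaches to a pool, and neither is the buffering write cache described above (pool and device names are placeholders):

Code:
# Add the partition as an L2ARC (cache) device: a read cache for frequently used data only
zpool add tank cache ada1p2

# Add the partition as a SLOG (log) device: used only for synchronous writes; it is not a
# general write buffer that absorbs a whole backup and drains to the HDDs afterwards
zpool add tank log ada1p2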
 

alekals

Cadet
Joined
Nov 13, 2018
Messages
3
I found this interesting reading; it explains a bit about how the ARC and L2ARC work:
https://drive.google.com/file/d/0BzHapVfrocfwblFvMVdvQ2ZqTGM/view
The SSD seemed better to me than a USB stick; it was already inside the spare machine, so I used it for booting and that was that.
Thank you for your time.

Regarding the RAM:
For the deduplication issue, do you think 12 GB of RAM is enough with this setup?

1 dataset of 4 TB, NOT deduplicated
1 dataset of 4 TB, deduplicated

That way I could copy my VMs during the overnight backup onto the 4 non-dedup TB (4 days of capacity), and then move the data older than 4 days onto the deduplicated part at my leisure.
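Roughly what I have in mind, as a sketch (the pool and dataset names are just placeholders; dedup is set per dataset):

Code:
# Landing area for the nightly copies, no dedup ("tank" and the dataset names are placeholders)
zfs create -o dedup=off tank/backup-landing

# Archive area where images older than 4 days get moved, with dedup enabled
zfs create -o dedup=on tank/backup-archive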

Would the RAM be enough for this setup?

If I see a drop in performance, I will add another 16 GB of ECC RAM.
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
Would the RAM be enough for this setup?
I have no idea how well your data will dedup, what the block size is, or how much data you have. Without exact numbers for all of those it's basically impossible to say. I would guess it's still on the low side if you want any performance gain from your ARC.
From CyberJock's slides:
An L2ARC shouldn't be bigger than about 5x your ARC size. Your ARC size cannot exceed 7/8th of your system RAM. So for a system with 32GB of RAM, you shouldn't go any bigger than 120GB. This is why maximum system RAM first is a priority!
This is at best a vague guideline and is not in any way how to actually calculate your maximum reasonable L2ARC. One of the biggest factors is the ZFS block/record size, since each record held in L2ARC needs a header kept in RAM.
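As a rough illustration of why record size matters, here is a back-of-the-envelope sketch assuming ~180 bytes of ARC per L2ARC record (the real figure varies by ZFS version):

Code:
# Headers needed to track 110 GB of L2ARC filled with 128 KB records (~154 MB of RAM)
echo $(( (110 * 1024 * 1024 * 1024 / (128 * 1024)) * 180 / 1024 / 1024 )) MB

# Same 110 GB filled with 8 KB records, e.g. small VM blocks (~2475 MB of RAM)
echo $(( (110 * 1024 * 1024 * 1024 / (8 * 1024)) * 180 / 1024 / 1024 )) MB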
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
The short answer is "if you want faster writes, turn off deduplication", but you should probably turn it off anyway because 12 GB is far too little RAM to use it safely. Even if your data were exclusively 128K records (the default), I believe you'd start to choke your arc_meta at a little over a terabyte of deduplicated data.

I'd like to see the output of zpool status -D; please post it inside of code blocks.
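That is, something like the following, substituting your actual pool name; the comment sketches the arithmetic behind the terabyte estimate above, using commonly quoted and version-dependent figures:

Code:
# Replace "tank" with the actual pool name; -D appends dedup table (DDT) statistics
zpool status -D tank

# Rough basis for the estimate above (assumed figures): ARC max ~ 7/8 of 12 GB ~= 10.5 GB,
# default metadata limit ~ 1/4 of that ~= 2.6 GB, and at ~320 bytes per DDT entry with
# 128 KB records that limit is exhausted at roughly a terabyte of unique data.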
 

Stilez

Guru
Joined
Apr 8, 2016
Messages
529

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Interesting, that one isn't in the man page. How did you find it?
Dark rituals involving sacrificing the data of people who didn't use ECC. ;)

Serious answer: I have no exact idea when or where I learned of this parameter, but it was probably in my Solaris days. But now you know, and knowing is half the battle. (The other half is realizing that deduplication has a very, very limited set of use cases, and is almost never the right answer.)
 