All Flash datacenter storage

remonv76

Hi All,

We have been running four FreeNAS storage servers for many years: 2 primary and 2 backup units (snapshot replication), with SAS drives and SSD caching. Now it's time to replace these units because we need more storage and IO, and we need some advice.

We are planning to use 4x Dell R730xd servers. Because we build all of them the same, I will explain one config.
We need advice on whether we should use L2ARC and on which dedup and metadata config to use. Any other tips are welcome.
Network speed is 40Gbps+N for storage, with a 10Gbps+N connection to each ESXi server (10x). Failover is key, data consistency is important, and the risk of losing data has to be minimised.

The VMware infrastructure consists of 60+ virtual machines and is growing (all CentOS/AlmaLinux servers), mainly for web hosting, databases and application development. We also host, on a small VMware datacenter set, a Microsoft environment of about 12 Windows Server 2016/2019/2022 machines with Microsoft SQL and an Exchange server.

We will be using NFS, so we plan to fill the pool to at most 80% of its total capacity. With iSCSI that would be 50%.

Hardware config
2x 12 Core Xeon E5-2650v4 / 2,2GHz
Memory per storage: 768GB
Main storage: 24x960GB Samsung PM883 SATA with dedup or 1.8TB SAS without dedup
SLOG per storage: 2x RMS-200 8GB.
L2ARC: maybe 2x NVMe Samsung 983 DCT M.2 1920 GB
Dedup/Metadata: 3x NVMe Samsung 983 DCT M.2 960 GB

The PCIe slots are full, so dual Optane cards are not an option. Using two 4x NVMe adapter cards is the better option, because that gives us 8 NVMe slots.

We tested the RMS-200 cards and they perform OK at small block sizes (4k, 8k, 16k). They are not as good as the x4801, but endurance is key for a zpool, so we'll still be adding them to the zpool as a mirror. Lose a card, lose the pool. Maybe we will get the RMS-375, but these are hard to find.

The budget is somewhat constrained, because we need to replace 4 units. Dedup will also help the endurance of the SSDs, since it means fewer writes to the zpool.

Fragmentation will be an issue with SAS, but not so much with SSD. These are the zpool layouts we are considering:
SSD: 2x 11-disk RAIDZ2 + 2 spares
or SAS: 3x 7-disk RAIDZ2 + 3 spares
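
To put rough numbers on the two layouts, here is a minimal sketch that counts data disks per RAIDZ2 vdev and applies the 80% NFS fill target. It is only an approximation: it ignores RAIDZ padding, metadata overhead and the TB/TiB difference, and it assumes 960GB disks for the SSD option and 1.8TB disks for the SAS option. The function name is made up for illustration.

```python
# Rough usable-capacity estimate for the two candidate layouts.
# Ignores RAIDZ padding, metadata overhead and TB-vs-TiB differences.

def raidz_capacity_tb(vdevs, disks_per_vdev, parity, disk_tb, fill_limit):
    """Return (raw data capacity, capacity at the fill limit) in TB."""
    data_disks = vdevs * (disks_per_vdev - parity)
    raw_tb = data_disks * disk_tb
    return raw_tb, raw_tb * fill_limit

NFS_FILL_LIMIT = 0.80  # the 80% NFS target from the post

# SSD option: 2x 11-disk RAIDZ2, 960GB disks (spares excluded)
ssd_raw, ssd_plan = raidz_capacity_tb(2, 11, 2, 0.96, NFS_FILL_LIMIT)
# SAS option: 3x 7-disk RAIDZ2, 1.8TB disks (spares excluded)
sas_raw, sas_plan = raidz_capacity_tb(3, 7, 2, 1.8, NFS_FILL_LIMIT)

print(f"SSD: {ssd_raw:.2f} TB raw, {ssd_plan:.2f} TB at the fill limit")
print(f"SAS: {sas_raw:.2f} TB raw, {sas_plan:.2f} TB at the fill limit")
```

Under these assumptions the SSD option gives roughly 17.3 TB before the fill limit and about 13.8 TB at 80%, while the SAS option gives 27 TB and about 21.6 TB.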

Why dedup? Most of the data (like OS data) is the same, so we gain a lot. We need around 10-12TB per storage unit and foresee growth of 2-3TB per year, per unit. After 5-6 years we will replace the units again.
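
As a sanity check on the growth figures, the sketch below projects the space needed at end of life and the dedup ratio the SSD option would need to hold it. It assumes the 10-12TB and 2-3TB/year figures are logical (pre-dedup) data and linear growth, and it reuses the rough SSD capacity of 2 vdevs x 9 data disks x 0.96TB at an 80% fill limit. The function name is hypothetical.

```python
# Linear growth projection from the figures in the post.

def projected_need_tb(start_tb, growth_tb_per_year, years):
    """Space needed after `years` of linear growth, in TB."""
    return start_tb + growth_tb_per_year * years

low_need = projected_need_tb(10, 2, 5)    # optimistic end of the ranges
high_need = projected_need_tb(12, 3, 6)   # pessimistic end of the ranges

# Rough SSD option capacity: 2x 11-disk RAIDZ2, 960GB disks, 80% fill limit.
ssd_plan_tb = 2 * (11 - 2) * 0.96 * 0.80

# Dedup ratio needed for the projected data to fit in the SSD option.
ratio_low = low_need / ssd_plan_tb
ratio_high = high_need / ssd_plan_tb

print(f"Projected need: {low_need}-{high_need} TB")
print(f"Dedup ratio needed on SSD option: {ratio_low:.2f}x to {ratio_high:.2f}x")
```

Read this way, the SSD option only lasts the full service life if dedup (plus compression, which is not modelled here) delivers somewhere between about 1.4x and 2.2x.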

Questions:
Will an L2ARC help with all-flash storage? We predict that we will have around 500GB of memory for ARC, so that's the first performance gain. The 1.9TB NVMe disks are fast at small block sizes, and we think a striped L2ARC will definitely help over a 22-disk zpool. We sized it at 3x the memory, so we will partition each NVMe disk to 1.2TB.
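
One thing to weigh here is that every record cached in L2ARC keeps a small header in ARC, so a large L2ARC takes RAM away from the ARC. The sketch below estimates that cost for the proposed 2x 1.2TB partitions. The per-record header size of 80 bytes is an assumption for illustration only (the real value depends on the OpenZFS version), and the model assumes the L2ARC is completely filled with uncompressed records of one size.

```python
# Estimate of the ARC memory consumed by L2ARC headers.
# ASSUMED_HEADER_BYTES is an illustrative assumption, not a documented value.

RAM_GB = 768
EXPECTED_ARC_GB = 500
L2ARC_DEVICES = 2
L2ARC_PARTITION_GB = 1200
ASSUMED_HEADER_BYTES = 80

def l2arc_header_ram_gb(l2arc_bytes, record_bytes,
                        header_bytes=ASSUMED_HEADER_BYTES):
    """RAM (GB) needed for headers if L2ARC is full of same-size records."""
    records = l2arc_bytes // record_bytes
    return records * header_bytes / 1e9

l2arc_gb = L2ARC_DEVICES * L2ARC_PARTITION_GB
l2arc_bytes = l2arc_gb * 10**9

print(f"L2ARC size: {l2arc_gb} GB ({l2arc_gb / RAM_GB:.2f}x RAM, "
      f"{l2arc_gb / EXPECTED_ARC_GB:.1f}x expected ARC)")
for record_kib in (4, 16, 128):
    ram = l2arc_header_ram_gb(l2arc_bytes, record_kib * 1024)
    print(f"{record_kib:>3} KiB records: ~{ram:.1f} GB of ARC for headers")
```

With small records the header cost grows quickly, so the record size used for the VM datasets matters as much as the raw L2ARC size.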

Is it a good idea to raise the SLOG tunable from 4GB to 6GB or even the full 8GB? Has anyone seen any performance gain from this?

We have focused on a lot of memory, so that most requested data sits in the memory cache. We want to set a limit of 25% for the dedup table.
We want to use 3x 960GB NVMe, split them into partitions for dedup and metadata special vdevs, and add these to the zpool as three-way mirrors, so that dedup data and metadata are on fast NVMe.
If we add special vdevs, does ZFS still use memory for the dedup table?
Do we gain anything by adding special vdevs to an all-flash pool? For SAS it's a no-brainer: yes, it's faster. For SSD, we just don't know.
The Samsung DCT PM983 960GB has an endurance of 1.36PB or 1.3 DWPD. Is this enough, or is there a better alternative (M.2)?
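
To relate the two endurance figures to the planned service life, here is a small sketch of the usual conversion DWPD = TBW / (capacity x days). The 3-year rating period is an assumption (it is the period at which 1.36PB and 1.3 DWPD line up for a 960GB drive), and the function names are made up for illustration.

```python
# Convert a total-bytes-written rating into drive writes per day (DWPD)
# and into a daily write budget for a given service life.

TBW_TB = 1360          # 1.36 PB endurance rating from the post
CAPACITY_TB = 0.96     # 960GB drive

def dwpd(tbw_tb, capacity_tb, years):
    """Drive writes per day that exhaust the rating in `years`."""
    return tbw_tb / (capacity_tb * years * 365)

def daily_budget_gb(tbw_tb, years):
    """GB per day that can be written to one drive for `years`."""
    return tbw_tb * 1000 / (years * 365)

# Assumed 3-year rating period: should land near the quoted 1.3 DWPD.
print(f"Rated period (3y): {dwpd(TBW_TB, CAPACITY_TB, 3):.2f} DWPD")

# Planned 5-6 year service life from the post.
for years in (5, 6):
    print(f"{years} years: {dwpd(TBW_TB, CAPACITY_TB, years):.2f} DWPD, "
          f"~{daily_budget_gb(TBW_TB, years):.0f} GB/day per drive")
```

Since every member of a mirror sees the same writes, the per-drive daily budget is also the budget for the whole mirrored special vdev.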

We have absolutely no idea how big a dedup table can grow. Is there any real-life data we can look into, instead of the 1-3GB per 1TB that TrueNAS says?
And how big can metadata grow? Are we talking about 10GB, 100GB or more?
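
Until real-world numbers turn up, the 1-3GB per 1TB rule of thumb can at least be checked against the memory plan. This sketch applies it to the 12TB starting point and to a worst-case end-of-life size (12TB plus 3TB/year for 6 years), and compares the result with a 25% limit. It assumes the 25% limit refers to the roughly 500GB expected ARC; the function name is hypothetical.

```python
# Dedup table (DDT) size range from the 1-3 GB per TB rule of thumb.

def ddt_range_gb(data_tb, low_gb_per_tb=1, high_gb_per_tb=3):
    """Return (low, high) DDT size estimates in GB for `data_tb` of data."""
    return data_tb * low_gb_per_tb, data_tb * high_gb_per_tb

ARC_GB = 500                    # expected ARC size from the post
DDT_LIMIT_GB = 0.25 * ARC_GB    # assumption: the 25% limit applies to ARC

today = ddt_range_gb(12)                 # upper end of the 10-12TB need
end_of_life = ddt_range_gb(12 + 3 * 6)   # worst-case growth over 6 years

print(f"DDT today:       {today[0]}-{today[1]} GB")
print(f"DDT end of life: {end_of_life[0]}-{end_of_life[1]} GB")
print(f"Limit: {DDT_LIMIT_GB:.0f} GB, worst case fits: "
      f"{end_of_life[1] <= DDT_LIMIT_GB}")
```

If the rule of thumb holds, even the worst case stays under the limit, but it says nothing about how small blocks inflate the table, which is why measured data would still be welcome.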

How can we upgrade a special vdev to bigger disks? This is a tricky one: we know how to replace a 600GB mirror with a 1TB mirror and expand the partition from 600GB to 1TB, but with a special vdev it is a different story, I think.
 

FrankH

Interesting project, I am planning something similar on a small scale for personal use. In the meantime, have you been able to solve some of the unanswered questions?
 