Will it 100GbE?

Refill2630

Cadet
Joined
Jan 19, 2023
Messages
4
Hello,

I am planning on building an all-SSD NAS and hope to achieve burst speeds of up to 100GbE. My use case is serving 12 10GbE servers (boot drives as well as all the data). The 12 servers will run XCP-ng, and my SSD NAS will be the fast storage.

So far the components look like this:

I understand that 24/7 100GbE performance is not possible on this setup and that SATA is quite a limitation, but being able to handle "burst" loads of 100GbE is quite important for the servers.

The servers will have 10GbE or 25GbE connections.

What are your thoughts? Is there a huge issue or a bottleneck that I do not see?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
My use case is serving 12 10GbE servers (boot drives as well as all the data). The 12 servers will run XCP-ng, and my SSD NAS will be the fast storage.
I assume you're talking about iSCSI here... so you're also saying you're going to run CORE.

You haven't mentioned an SLOG, but since sync writes are part of your design, you'll need one (and a super-fast pool so the writes don't slow down after the SLOG is full... which takes about 5 seconds).

Read this: https://www.truenas.com/community/threads/the-path-to-success-for-block-storage.81165/
That post talks about HDDs as the pool, but the theory is going to apply at the speeds you're hoping for here.
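
As a rough back-of-envelope sketch of why the pool behind the SLOG matters (the 5 second figure is the default ZFS transaction group interval, and the ~11GBytes/s figure comes from the math below; both are assumptions for illustration, not measurements):

```python
# Rough sketch: how much data the main pool must absorb per ZFS transaction
# group if sync writes arrive at the full 100Gb/s target. The 5 s interval
# is the default zfs_txg_timeout; the ingest rate is an assumption based on
# the ~11GBytes/s figure discussed below.

ingest_gbytes_per_s = 11   # ~100Gb/s of incoming writes, in GBytes/s
txg_interval_s = 5         # default ZFS transaction group flush interval

dirty_per_txg_gbytes = ingest_gbytes_per_s * txg_interval_s
print(f"~{dirty_per_txg_gbytes} GBytes of dirty data accumulates per txg")
# The SLOG only logs the sync writes for crash safety; the pool itself still
# has to commit roughly this much data every interval, or writes throttle.
```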

Mellanox MCX416A-CCAT CX416A Dual-Port ConnectX-4 100GbE PCIe Adapter NIC
Potentially a poor choice. Chelsio is the "preferred" option for those who know what they are talking about (and iX). Note the "what card to pick" section here: https://www.truenas.com/community/threads/10-gig-networking-primer.25749/

First you need to check your math... 16 drives means you could do 8 mirrors (and that's what you should do if that's as many drives as you can manage). Maybe allow for a spare or two, or add many more; see below.

100Gbps equates to about 11GBytes/s

With SATA limiting you to ~500MBytes/s per drive, you'll need at least 22 drives to get the data on or off the drives that fast (that's just in a stripe with no parity).
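
A quick sanity check on that estimate (both figures are the rough ones above, not benchmarks):

```python
import math

# Minimum drive count to sustain the network target, using the rough figures
# above: ~11GBytes/s for 100Gb/s after overhead, ~500MBytes/s per SATA SSD.
# This assumes a plain stripe; parity or mirroring pushes the count higher.

target_gbytes_per_s = 11
per_drive_gbytes_per_s = 0.5

drives_needed = math.ceil(target_gbytes_per_s / per_drive_gbytes_per_s)
print(f"At least {drives_needed} drives needed in a plain stripe")  # -> 22
```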


I don't know what magic you think these will do, but depending on your iSCSI workload from the VMs, they may help if there's a very large working set... but more RAM would be much better instead.

I would say as a general comment that you've got a lot more thinking to do about what you really want/need and how much cash you have to match those needs.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
...
...
RAID-0 is a poor choice; I am guessing you meant mirrored. Since you reference a PCIe card for 2 NVMe L2ARC devices, these appear to be your TrueNAS boot devices. You don't really need 500GB for TrueNAS boot devices; 16GB to 64GB is plenty.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
100GbE is a tricky goal, and I don't think I've seen anyone seriously try it without NVMe disks.
The two HBAs theoretically can push a bit over 100Gb/s combined (2 cards × 8 lanes at ~8Gb/s each ≈ 128Gb/s), but that's only if the disks can keep up, and that doesn't sound viable with so few SATA 6Gb/s disks.
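
A minimal sketch of that comparison, assuming PCIe 3.0 x8 HBAs, the 16 SATA SSDs mentioned earlier, and ~500MBytes/s of real throughput per drive (all assumptions, not measurements):

```python
# Rough ceilings: what the two HBAs can move over PCIe versus what the SATA
# SSDs behind them can deliver. Lane rate and per-drive throughput are
# approximations for illustration only.

hba_count = 2
lanes_per_hba = 8
gbits_per_lane = 8              # roughly PCIe 3.0, ignoring encoding overhead
hba_ceiling_gbits = hba_count * lanes_per_hba * gbits_per_lane      # ~128

drive_count = 16                # the drive count discussed above
gbits_per_drive = 0.5 * 8       # ~500MBytes/s per SATA SSD -> ~4Gb/s
drive_ceiling_gbits = drive_count * gbits_per_drive                 # ~64

print(f"HBA ceiling:   ~{hba_ceiling_gbits} Gb/s")
print(f"Drive ceiling: ~{drive_ceiling_gbits} Gb/s, before parity/mirror overhead")
```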
 

Refill2630

Cadet
Joined
Jan 19, 2023
Messages
4
Thank you for your replies!

I assume you're talking about iSCSI here... so you're also saying you're going to run CORE.
After some testing, I am going to use NFS for XCP-ng VM storage. I am not going to use iSCSI, as it is not worth it.

You haven't mentioned an SLOG, but since sync writes are part of your design, you'll need one (and a super-fast pool so the writes don't slow down after the SLOG is full... which takes about 5 seconds).
Absolutely. I am thinking of using 2x Kingston KC3000 1TB NVMe for L2ARC and another pair for SLOG.

Potentially a poor choice. Chelsio is the "preferred" option for those who know what they are talking about
Gatekeeping at its finest. For those who know. Irony aside, a used 2x 25GbE card for ~$150 is unbeatable value right now.
First you need to check your math... 16 drives means you could do 8 mirrors (and that's what you should do if that's as many drives as you can manage)
For a YT video "achieving" 100GbE? Maybe. Doing 8 mirrors in real life? Thank you, but no. You are right that there is a typo in the original post. A reasonable combination (IMHO) is 2x RAIDZ vdevs of 8 with no hot spares, 3x RAIDZ vdevs of 5 with 1 spare, or something in between.

With SATA limiting you to ~500MBytes/s per drive, you'll need at least 22 drives to get the data on or off the drives that fast (that's just in a stripe with no parity).
That is true. That is why I mentioned caching the VM drives in RAM: pool performance is close to 40-60Gbps in theory, with bursts from cache reaching something close to 100Gbps.


I don't know what magic you think these will do, but depending on your iSCSI workload from the VMs, they may help if there's a very large working set... but more RAM would be much better instead.
Absolutely. RAM is always better. I am trying to do a budget build that will serve the hosts and VMs and that looks like "not quite 100Gbps, but much better than just 25Gbps".

RAID-0 is a poor choice, I am guessing you meant Mirrored
Depending on the motherboard setup, sometimes you have to create virtual disks just so TrueNAS can see the drives at all.
You don't really need 500GB for TrueNAS boot devices; 16GB to 64GB is plenty
Agreed, but that is just the cheapest NVMe SSD with reasonable longevity available locally.

100GbE is a tricky goal, and I don't think I've seen anyone seriously try it without NVMe disks.
The two HBAs theoretically can push a bit over 100Gb/s combined (2 cards × 8 lanes at ~8Gb/s each ≈ 128Gb/s), but that's only if the disks can keep up, and that doesn't sound viable with so few SATA 6Gb/s disks.
Agreed. But if I expect something like ~50Gbps sustained from the pool, with reads from RAM bursting toward 100GbE, that should do the trick.


Thanks again for all the comments and concerns.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
After some testing, I am going to use NFS for XCP-ng VM storage. I am not going to use iSCSI, as it is not worth it.
My comments about sync writes will still apply to NFS.

Absolutely. I am thinking of using 2x Kingston KC3000 1TB NVMe for L2ARC and another pair for SLOG.
At an 800TBW rating, the SLOG may last a while, but the L2ARC will burn out much more quickly.

I think it's generally a poor choice and you should spend the cash on more RAM and/or an Optane SLOG.

If you do use those for SLOG (you really only need 1), then look up the thread on NVMe overprovisioning to maximise the lifespan.

3x raidz vdevs of 5 with 1 spare or something in-between.
That will give you the IOPS of maybe 3 drives for a potentially IOPS heavy workload... your choice.

In general, I think you haven't fully understood the ZFS features you're trying to use and that will result in performance that you will find disappointing compared to your expectations.

Best of luck with it though.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
reasonable combination (IMHO) is 2x raidz vdevs of 8 with no hot spares, 3x raidz vdevs of 5 with 1 spare or something in-between.
2x 8 drives in RAIDZ1 is horrible. And I mean from a reliability point of view, since sretalla already mentioned the performance issue.
Please read the following resource.
 

Refill2630

Cadet
Joined
Jan 19, 2023
Messages
4
2x 8 drives in RAIDZ1 is horrible. And I mean from a reliability point of view, since sretalla already mentioned the performance issue.
Thanks! Great read.

So for future reference: the random I/O performance (IOPS) of one vdev is roughly that of a single drive.

N-wide RAIDZ, parity level p:
  • Read IOPS: read IOPS of a single drive
  • Write IOPS: write IOPS of a single drive
  • Streaming read speed: (N - p) × streaming read speed of a single drive
  • Streaming write speed: (N - p) × streaming write speed of a single drive
  • Storage space efficiency: (N - p) / N
  • Fault tolerance: p disks per vdev (1 for Z1, 2 for Z2, 3 for Z3)
I guess I will have to sacrifice IOPS for streaming read speed and go with something like RAIDZ1 across 5 drives.
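
Plugging those rules of thumb into a quick sketch to compare the layouts discussed in this thread (per-drive numbers are placeholder assumptions, and the mirror row reuses the same (N - p) streaming formula, which understates mirror read speed):

```python
# Rough vdev-layout calculator based on the rules of thumb quoted above.
# Per-drive figures are assumptions for a typical SATA SSD, not benchmarks.

PER_DRIVE_IOPS = 50_000          # assumed random IOPS of one SATA SSD
PER_DRIVE_STREAM_MB_S = 500      # assumed streaming speed of one SATA SSD

def pool_estimate(vdevs: int, width: int, parity: int) -> dict:
    """Apply the quoted rules: IOPS scale with vdev count,
    streaming scales with data disks (N - p) per vdev."""
    return {
        "drives": vdevs * width,
        "iops": vdevs * PER_DRIVE_IOPS,
        "stream_MB_s": vdevs * (width - parity) * PER_DRIVE_STREAM_MB_S,
        "space_efficiency": round((width - parity) / width, 2),
    }

layouts = {
    "2x RAIDZ1 of 8": pool_estimate(2, 8, 1),
    "3x RAIDZ1 of 5": pool_estimate(3, 5, 1),
    "5x RAIDZ1 of 3": pool_estimate(5, 3, 1),
    "8x mirrors of 2": pool_estimate(8, 2, 1),  # conservative for mirror reads
}

for name, est in layouts.items():
    print(name, est)
```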
 

Refill2630

Cadet
Joined
Jan 19, 2023
Messages
4
That will give you the IOPS of maybe 3 drives for a potentially IOPS heavy workload... your choice.
You are right and I was wrong. I guess I will start with 2 vdevs of 2 mirrored 2TB drives and grow the pool as the budget allows, eventually reaching 8 vdevs of 2 mirrored SATA SSDs.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
I guess I will have to sacrifice IOPS for streaming read speed and go with something like RAIDZ1 across 5 drives.
Just be aware that streaming speeds refer to large files; small files require way more IOPS.
Ideally, the next best thing performance-wise if you don't want mirrors would be 5 vdevs of 3 disks in RAIDZ1 each, with a hot spare to help reliability.
It's not great for resiliency, but it's better.
 