Recommendation regarding RAID of 24 drives

plissje

Dabbler
Joined
Mar 6, 2017
Messages
22
Hi guys,
So I've finally got our new office lab server connected. The spec on it is:
Dell R730XD
x2 E5-2620 v3
128GB of RAM
x24 Intel DC S3500 1.6TB SSD

I will need it as VM storage for (probably) our XCP-ng servers (as Broadcom will probably butcher the VMware pricing). We will be using NFS, as XCP-ng doesn't support thin provisioning over iSCSI. The storage network (TrueNAS + all servers) is connected via DAC cables to a 10GbE switch.

Now my question is this:
As this is an all-SSD pool, what would be my best option for the disk layout? Obviously mirrored vdevs are the way to go if I only care about performance, but unfortunately I need to balance capacity as well (or maybe I don't; I'd still get 19.2 TB of usable storage, which is pretty good).
Taking that into account, I was thinking about going with 3 vdevs of RAIDZ1 with 8 drives each, or maybe 6 vdevs of 4-drive RAIDZ1.
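For reference, here's the back-of-the-envelope capacity math I'm working from (nominal numbers only; real usable space will be lower after ZFS overhead and TiB-vs-TB reporting):

```python
# Nominal usable capacity for the candidate layouts (1.6 TB drives).
DRIVE_TB = 1.6

layouts = {
    "12 x 2-way mirrors": (12, 2, 1),  # (vdevs, width, redundancy drives per vdev)
    "3 x 8-wide RAIDZ1":  (3, 8, 1),
    "6 x 4-wide RAIDZ1":  (6, 4, 1),
}

for name, (vdevs, width, parity) in layouts.items():
    usable = vdevs * (width - parity) * DRIVE_TB
    print(f"{name}: ~{usable:.1f} TB nominal")

# 12 x 2-way mirrors: ~19.2 TB
# 3 x 8-wide RAIDZ1:  ~33.6 TB
# 6 x 4-wide RAIDZ1:  ~28.8 TB
```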

I would appreciate your insights here, and whether I'm missing something or am crazy for even thinking about anything other than mirrors.

Thanks!
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
vdevs of RAIDZ1
How happy are you with data loss...

While you're resilvering, you'll be without redundancy and can potentially be unable to recover from data corruption.

If that's OK to risk in the name of maximizing capacity, then go ahead.

Otherwise, I would suggest 3x8 in RAIDZ2.

Also know that RAIDZ is specifically bad for block storage, which is your primary use case...

The fact that you're all SSD may mitigate that a bit, but you should still understand the issues you will have in that context.
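To illustrate the space side of the problem, here's a simplified model of how RAIDZ allocates zvol blocks (parity per stripe row plus padding to a multiple of parity+1, per the usual OpenZFS write-ups; exact results depend on ashift and compression):

```python
import math

# Simplified RAIDZ allocation: each logical block gets `parity` sectors
# per stripe row, and the whole allocation is padded up to a multiple
# of (parity + 1) sectors.

def raidz_sectors(data_sectors, width, parity):
    rows = math.ceil(data_sectors / (width - parity))
    total = data_sectors + rows * parity
    pad_unit = parity + 1
    return math.ceil(total / pad_unit) * pad_unit

ASHIFT_BYTES = 4096  # ashift=12
for volblocksize in (4096, 8192, 16384, 65536, 131072):
    d = volblocksize // ASHIFT_BYTES
    alloc = raidz_sectors(d, width=8, parity=2)  # 8-wide RAIDZ2
    print(f"{volblocksize//1024:>4}K block: {d} data sectors -> "
          f"{alloc} allocated, efficiency {d/alloc:.0%}")

# Small volblocksize values land well below the 6/8 = 75% you'd expect
# from an 8-wide RAIDZ2 -- that's the space side; the IOPS side is that
# a RAIDZ vdev serves roughly one block per I/O.
```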
 

blanchet

Guru
Joined
Apr 17, 2018
Messages
516
  • I have a similar system with 24 SSDs.
  • I encountered an issue with ZFS TRIM on a 2x8 RAIDZ2 layout.
  • I switched to a stripe of mirrors several months ago and it now works well.

If you want to leave VMware, try Proxmox VE instead of XCP-ng.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
How happy are you with data loss...

While you're resilvering, you'll be without redundancy and can potentially be unable to recover from data corruption.
Fair point, but SSDs have much lower URE rates than HDDs and resilver faster, so most of the "RAID5/RAIDZ is dead" argument does not apply here.
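To put rough numbers on that (using typical spec-sheet UBER ratings, around 1e-14 for consumer HDDs and 1e-17 for enterprise SSDs like the DC S3500; illustrative, not measured):

```python
import math

# Chance of at least one unrecoverable read error (URE) while reading
# the surviving drives of one degraded 8-wide RAIDZ1 vdev of 1.6 TB
# drives, i.e. 7 x 1.6 TB read to resilver.

def p_ure(bytes_read, uber):
    bits = bytes_read * 8
    # 1 - (1 - uber)**bits, computed stably for tiny uber
    return -math.expm1(bits * math.log1p(-uber))

TB = 1e12
for label, uber in (("HDD @ 1e-14", 1e-14), ("SSD @ 1e-17", 1e-17)):
    print(f"{label}: ~{p_ure(7 * 1.6 * TB, uber):.2%} chance of a URE")

# HDD @ 1e-14: ~59%   -- the classic "RAID5 is dead" number
# SSD @ 1e-17: ~0.09% -- orders of magnitude less scary
```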

Also know that RAIDZ is specifically bad for block storage, which is your primary use case...
This is probably the best argument for mirrors. I'd say 12 2-way mirrors, or 11 mirrors plus one or two hot spares.
But, at 50% occupancy or less, that's only 11*0.8 = 8.8 TB capacity.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Fair point, but SSDs have much lower URE rates than HDDs and resilver faster so most of the "RAID5/RAIDZ is dead" argument does not apply here.
How long does it take you to find out about a drive failure, do the physical replacement, and then begin the resilver... it's that time I'm referring to... you're working without parity during that window and can't recover from corruption that happens in it.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
If you want to leave VMware, try Proxmox VE instead of XCP-ng.
There are pros and cons to each. I use and like Proxmox; others will prefer xcp-ng. Trying both is probably the best way to go. Here's what looks like a pretty good comparison between the two:
 

plissje

Dabbler
Joined
Mar 6, 2017
Messages
22
Hey,
So regarding the Proxmox vs. XCP-ng thing: I know of Lawrence's video, and it was part of my decision to go with XCP-ng. If there are good arguments for why you think Proxmox is the better option, I would love to hear them.

Regarding RAID-Z being bad for block storage, that is a good read; I'll look into it more.

Fair point, but SSDs have much lower URE rates than HDDs and resilver faster so most of the "RAID5/RAIDZ is dead" argument does not apply here.


This is probably the best argument for mirrors. I'd say 12 2-way mirrors, or 11 mirrors plus one or two hot spares.
But, at 50% occupancy or less, that's only 11*0.8 = 8.8 TB capacity.
Why 0.8? My drives are 1.6 TB, so that should be around 19 TB, which is sort of OK. Not bad, not terrible. If the difference between mirrors and anything else is really that big for my use case, then the storage gains are just not worth it if my performance tanks.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
A couple of points for each system:
  • As noted in the video, Proxmox includes the web management console in itself rather than in a separate VM; with xcp-ng, you need to create a VM and install the console in it.
  • Connected with that, Proxmox doesn't put any features behind a paywall. You can build XO from source and get almost all the features, but AFAIK you can't do (e.g.) XOSAN without a paid subscription. Proxmox is completely full-featured with or without a subscription, but restricts access to the "enterprise" (i.e., more stable) update repositories to users with subscriptions. It will show a nag screen when you log in to the web console if you don't have a subscription, but that's easily disabled.
  • Proxmox's support for PCI passthrough doesn't seem to be as mature as xcp-ng's. If you need this feature, that probably favors xcp-ng.
  • I'm finding that LXC is a nice, lightweight way to run applications without requiring a full VM for them--much like jails. I don't believe xcp-ng supports them.
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
Is RAIDZ2 really so bad when SSDs are used? My understanding so far has been that the problem is the number of IOPS compared to mirrors.

Theoretically this would mean that RAIDZ2 is still worse than mirrors in the case of SSDs. But at the end of the day the question, in my view, is not about relative IOPS. If the individual SSDs provide enough IOPS that even a RAIDZ2 (or 3 of them) satisfies my needs (including a buffer for the future, of course), that is the critical part. If mirrors deliver, say, 3 times more IOPS than I need (or can even utilize), what is the benefit?
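To make my point concrete, a rough model (the per-disk IOPS figures are illustrative guesses, not S3500 benchmarks):

```python
# Rough steady-state random-IOPS model behind the mirrors-vs-RAIDZ
# advice: a RAIDZ vdev delivers roughly one disk's worth of IOPS,
# while a 2-way mirror vdev gives ~1 disk of writes and ~2 of reads.
DISK_READ_IOPS = 70_000   # illustrative SATA-SSD figure
DISK_WRITE_IOPS = 10_000  # illustrative sustained 4K write figure

pools = {
    "12 x 2-way mirrors": (12, True),
    "3 x 8-wide RAIDZ2":  (3, False),
}

for name, (vdevs, mirrored) in pools.items():
    reads = vdevs * DISK_READ_IOPS * (2 if mirrored else 1)
    writes = vdevs * DISK_WRITE_IOPS
    print(f"{name}: ~{reads:,} read / ~{writes:,} write IOPS")

# Mirrors win by a wide margin, but if 3 RAIDZ2 vdevs of SSDs already
# exceed what the VMs (and a 10GbE link) can consume, the extra
# headroom may only matter under degraded/resilver conditions.
```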

Is there an error in my thinking?
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Why 0.8? my drives are 1.6 so that should be around 19 TB which is sort of OK. Not bad, not terrible. If the difference between mirrors and anything else regarding my use case is just that big, then the storage gains just not worth it if my performance tanks.
Recommendations for good performance with block storage are 1) mirrors and 2) no more than 50% occupancy.
So 11 mirror vdevs * 1.6 TB * 50% occupancy = 11 * 0.8 TB = 8.8 TB usable before you should consider getting more and/or larger drives. That's 17.6 TB of nominal pool capacity, with slots for two hot spares, and not taking into account ZFS overhead or the possible benefits of compression.
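Or, spelled out as a short calculation:

```python
# Same arithmetic, spelled out:
drives, spares, drive_tb = 24, 2, 1.6
mirrors = (drives - spares) // 2   # 11 two-way mirror vdevs
nominal = mirrors * drive_tb       # nominal pool capacity
at_50 = nominal / 2                # the 50%-occupancy rule of thumb
print(f"{mirrors} mirror vdevs -> {nominal:.1f} TB nominal, {at_50:.1f} TB at 50%")
# 11 mirror vdevs -> 17.6 TB nominal, 8.8 TB at 50%
```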
 