ALL FLASH (NVMe) Hardware Requirements: Quantity + Gen of CPU sufficient for 12x NVMe devices

TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
Apparently, I'm the jackass who sees the entire class waiting to walk in & tries to open the door anyway.

I thought I understood how to read hardware requirements & determine compatibility. I've connected 7 NVMe devices, all x4, to an i7-8700K (which has only 16 PCIe lanes on the CPU). Granted, only 4 of them were on an x16 RAID card (HighPoint SSD7120). Ultimately I rejected a consumer build (an i9-9800X (44 PCIe lanes!) on a Gigabyte X299 Designare motherboard with several x16 slots, integrated 10GBase-T & TB3) because it lacked ECC.

But when I decided on an R730xd (with 80 PCIe lanes across its two E5-2600 v3 CPUs, 40 each), which seemed like a good deal with more than enough lanes ... I figured it'd at least be adequate for ~12x NVMe drives. 80 lanes..?? How could it not be..??


But Dell won't even sell you more than 4x NVMe (at their RIDICULOUSLY over-priced NVMe pricing), and Intel says the same thing: that you can only use 4x NVMe (the attachment below is Intel's statement) ... I'm willing to accept that I'm wrong -- but is that actually the case..? Or is this a matter of preserving access to the remaining HDD slots, or something like that?

[Attachment: Intel NVMe qty per Dell Model.png]


Ironically, after almost a YEAR of trying to sell an R930, I finally sold it only to see it on this list in the exact role I wanted!
If not, why aren't 80 CPU PCIe lanes adequate for 12 NVMe drives..?
I thought I needed ~48 lanes (for a max of 12 NVMe devices, if not fewer).
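For my own sanity, here's the back-of-the-envelope lane math I've been using (a minimal sketch: the x4-per-drive and 40-lanes-per-CPU figures are my assumptions about the parts, and the x8 NIC is just a placeholder):

Python:
    # Minimal lane-budget check. Assumptions (mine, not Dell's): every U.2 NVMe
    # drive uses PCIe 3.0 x4, and each E5-2600 v3 CPU exposes 40 lanes (80 in a 2P box).
    nvme_drives   = 12
    lanes_per_ssd = 4
    nic_lanes     = 8                                    # e.g. a dual-port SFP+ card in an x8 slot

    needed    = nvme_drives * lanes_per_ssd + nic_lanes  # 56
    available = 2 * 40                                   # 80
    print(f"need {needed} of {available} CPU PCIe lanes")

On paper there's plenty of headroom -- which is exactly why the 4-drive limit confused me.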



Task / Role | Quantity + Role | Notes
RAIDZ2 array | 8x NVMe |
Fusion pool | 2x Optane (mirrored) | small files + metadata
SLOG | Persistent NVDIMMs | previously used an AIC -- but to save space...
L2ARC | Add RAM instead | it's everyone's advice anyway (& faster..?)

I intend to upgrade to a dRAID array once TrueNAS supports it, as it rebuilds much faster.
Fusion pool (for small files + metadata): I still need to work out how large my existing zpool's metadata would be.
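As a placeholder until I can measure the real pool, this is the kind of estimate I mean (every number here is a stand-in, and the ~0.3% metadata figure is just the commonly quoted rule of thumb, not something I've verified):

Python:
    # Rough special-vdev (fusion pool) sizing sketch -- all inputs are hypothetical.
    pool_data_tib        = 40       # placeholder used capacity
    metadata_fraction    = 0.003    # ~0.3% rule of thumb for metadata (verify!)
    small_block_data_tib = 0.2      # placeholder small-file spillover (special_small_blocks)

    need_gib = (pool_data_tib * metadata_fraction + small_block_data_tib) * 1024
    print(f"~{need_gib:.0f} GiB needed on EACH Optane in the mirror")   # ~328 GiB here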

If nothing else -- hopefully I at least got a decent deal on the R730xd (still in transit).
Hardware I either have, am buying, or want (as deals come up):


Dell PowerEdge R730XD, 2P
  • Dual Xeons, E5-2600 v3 (upgradeable to v4)
    • INSTALLED: E5-2640 v3 (base 2.60 GHz, turbo 3.40 GHz)
    • Not bad, but if SMB performance hinges on single-core performance, I can at least upgrade to a faster clock, or possibly a v4 with a higher clock...
      • E5-2643v3 | SR204 (6c Base: 3.40 GHz Turbo: 3.70 GHz) for $120 ea
      • E5-2667v3 | SR203 (8c Base: 3.20 GHz Turbo: 3.60 GHz) for $130 ea
        - or -
      • E5-2643v4 | SR2P4 (6c Base: 3.40 GHz Turbo: 3.70 GHz) for $280 ea
      • E5-2667v4 | SR2P5 (8c Base: 3.20 GHz Turbo: 3.60 GHz) for $220 ea
  • RAM (ARC) / SLOG / Fusion Pool -- ANOTHER MISTAKE:
    • To reduce the NVMe device count, I'd hoped to use Optane Persistent Memory DIMMs (the cheapest Optane per GB), but that would require a Xeon Scalable platform (LGA 3647).
    • The next option, more expensive (but ... faster?), would be persistent NVDIMMs (at roughly $5 per GB, I think).
    • Using NVDIMMs as my SLOG would free up to 2 AIC slots (in case a single 8GB Radian RMS-200 is too small, or ultimately slower).
    • But a fusion pool requires mirrored Optanes (to protect the zpool): either 905P, P4800X or 900P, at ~380GB each.
    • It seems like a real accomplishment to pick up any of those models for about $1 per GB, especially in the smaller sizes.
    • For conventional DDR4 ECC, I plan to start with ~64GB (in 8GB modules) for ~$200, which would leave 16 memory slots free.
  • HBA Controller:
    • SuperMicro ReTimer -- AOC-SLG3-4E4T
  • Networking:
    • Chelsio: T520-SO-CR (SFP+)

FINALLY! Would you believe this took a while ??
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Yes, I can believe it.

Intel (currently the top contender on my excrement list, for the CPU security flaws) seems to artificially limit PCIe lanes in its CPUs and chipsets. You generally have to go to a higher-end CPU for more PCIe lanes, and in some cases, dual socket.

AMD at least made reasonable choices. The AM4 socket, with its 24-lane limit (just from the CPU, not the hub / chipset), was intended for normal desktop use, and was designed before NVMe drives became common. (Threadripper and Epyc, on the other hand, went a bit extreme at 128 PCIe lanes...)

My new NAS board was designed as a smaller board, Micro-ATX. Yet with its Epyc CPU, the board has so many PCIe lanes that it's silly.
  • 4 x 16 lane PCIe slots
  • 2 x M.2 4 lane PCIe slots
  • 3 x Slimline U2, each with 8 PCIe lanes
That adds up to 96 PCIe lanes. Here is the board:
 

TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
Yes, I can believe it.
Intel (top of my excrement list for the CPU security flaws) seems to artificially limit PCIe lanes in its CPUs and chipsets.
You generally have to go to a higher-end CPU for more PCIe lanes, and in some cases, dual socket.

But this is an 80-lane platform. What precludes it from supporting 12 NVMe x4 devices (only 48 of those 80 lanes)?
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
None of the Dell 13G systems were built to support NVMe drives; they're simply too old. There's some very limited support, I think, in 14G. The 15Gs bring in native support, but you also have to consider the other factors involved in the engineering. Some of the lanes are dedicated to the card risers, some are dedicated to onboard networking (which is modular from the OEM standpoint, so lanes may get wasted if you don't order 40GbE, etc...), and there are breakout cables for SAS/SATA controllers, PERC, etc... These are NUMA machines, so PCIe lane access is not symmetrical. Just because you have two sockets doesn't mean your backplane gets to hop on the other CPU's lanes, etc...
 

TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
None of the Dell 13G systems were built to support NVMe drives; they're simply too old.

NONE..? Too old..? At least they support 4, I guess..?
Even if I use non-Dell HBA adapters (rather than PERC) in whichever slots actually have the PCIe lanes..?

Some lanes are dedicated to riser cards & others to onboard LAN (modular from the OEM, which are wasted if you don't use 40GbE, etc...)

So it's better to use a modular mezzanine SFP28 card (if I can find one) than to put an SFP28 NIC in one of the finite PCIe slots (& lanes)..?

PCIe lane access is asymmetrical. Two sockets don't guarantee your backplane access to all 80 CPU-lanes...

I'm familiar with the distinction between a PCIe slot's physical size and its electrical lane width, and that slots are tied to a specific CPU.
That said, aside from the slot/riser mapping below for the two configurations (which I'll see soon enough), what other finite resources am I contending with?


Controlling CPU | Slot(s) | PCIe Gen 3.0 Lanes | Power | Form Factor | Slot Location
CPU 1 | 0 / 1* | x16 | 75 W | Full-height | Center riser
CPU 1 | 2 / 3* | x8 | 75 W | Full-height | Left riser
CPU 2 | 1 | x16 | 75 W | Full-height | Center riser
CPU 2 | 3 | x8 | 25 W | Low-profile | 2x left / 1x center



If BOTH [CPU 1] & [CPU 2] have x16 slots, this configuration would provide 10x NVMe (U.2) devices,
...with the remaining x8 slots used for AIC Optanes or products like the Radian (DRAM with capacitors).

SuperMicro Model | Qty Req | SSDs | Lanes
- | 1 | 2 | x8
- | 2 | 8 | x16
Fusion pools (AIC Optane) / SLOG (NVDIMM) | 1 - 2 | 2 - 4 | x4 lanes per AIC


If only [CPU 2] has an x16 slot, this configuration would still provide 10x NVMe (U.2) devices,
...with the remaining x8 slots used for AIC Optanes or products like the Radian (DRAM with capacitors). (Both layouts are tallied in the sketch after the second table.)

SuperMicro Model | Qty Req | SSDs | Lanes
- | 3 | 6 | x8
- | 1 | 4 | x16
Fusion pools (AIC Optane) / SLOG (NVDIMM) | 1 - 2 | 2 - 4 | x4 lanes per AIC
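Just to show the arithmetic behind both tables (assuming the retimer cards really do carry 4 U.2 drives per x16 slot and 2 per x8 slot -- my reading of the AOC-SLG3 family, worth double-checking):

Python:
    # Tally of the two slot layouts above.
    def u2_drives(x16_cards: int, x8_cards: int) -> int:
        return x16_cards * 4 + x8_cards * 2             # 4 drives per x16 card, 2 per x8 card

    print(u2_drives(x16_cards=2, x8_cards=1))           # both CPUs contribute an x16 slot -> 10
    print(u2_drives(x16_cards=1, x8_cards=3))           # only CPU 2 has an x16 slot       -> 10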


SPANNING STORAGE ACROSS THE QPI (QuickPath Interconnect):
I recall a caveat that splitting a pair of persistent DIMMs (in App Direct mode) across 2+ CPUs introduces significant (relatively speaking) latency.
That said, am I mistaken in thinking this is only an issue for devices with near-DRAM performance, not regular NVMe devices?


Assuming the QPI (QuickPath Interconnect) isn't a concern ...
And ensuring I connect HBA devices to slots which actually have real CPU PCIe lanes...
What other finite resources do I need to contend with..?

And regarding QPI ... there are CPUs I'm considering upgrading to which increase its bandwidth by 20% (shown below).

The CPUs I'm considering all have even more QPI bandwidth (9.6 GT/s) and start below $50 ea ...
The slowest of them is an 8-core at 2.6 GHz -- but I'm leaning towards either a:
  • E5-2637 v3 or v4
    - or -
  • E5-2643 v3

All of which have very high base clocks and 9.6 GT/s of QPI throughput.
My use case includes at most 1-2 VMs,
2 simultaneous users (max),
Samba | scrubbing | checksums,
...and the NVMe drives.
Of these (from what I understand), only scrubbing is optimized for multithreading,
and everything else has performance bound to the speed of a single core.
As well, I just read on Calomel.org that they recommend disabling Hyper-Threading!


[Attachment: CPU Choices | cores + prices.png]
 

TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
In CASE the QPI actually is a factor for which legitimate concerns seem warranted ...

QPI link speed | Unidirectional peak bandwidth | Total peak bandwidth
6.4 GT/s | 12.8 GB/s | 25.6 GB/s
8.0 GT/s | 16.0 GB/s | 32.0 GB/s
9.6 GT/s | 19.2 GB/s | 38.4 GB/s

Which ... IF IT WERE a bottleneck for me, would be GLORIOUS. :-D
As that would mean HALF my array were "limited" to 12.8 GB/s :)
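Here's the comparison I'm making, purely as a sketch (the ~3 GB/s per drive is an optimistic assumption for a PCIe 3.0 x4 SSD, and a 10GbE link caps a single client at ~1.2 GB/s anyway):

Python:
    # Would half the pool hanging off the "far" CPU saturate the QPI link?
    drives_on_far_cpu = 6
    per_drive_gbs     = 3.0                                  # assumed sequential throughput per drive
    qpi_unidir_gbs    = {6.4: 12.8, 8.0: 16.0, 9.6: 19.2}    # GT/s -> GB/s, from the table above

    aggregate = drives_on_far_cpu * per_drive_gbs            # 18 GB/s
    for gts, bw in qpi_unidir_gbs.items():
        verdict = "OK" if aggregate <= bw else "QPI-limited"
        print(f"QPI {gts} GT/s ({bw} GB/s unidirectional): {verdict}")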
 

TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
Threadripper and Epyc, on the other hand, went a bit extreme at 128 PCIe lanes...
My new NAS board was designed as a smaller board, Micro-ATX, yet with its Epyc CPU it has so many PCIe lanes that it's silly.
  • 4 x 16 lane PCIe slots
  • 2 x M.2 4 lane PCIe slots
  • 3 x Slimline U2, each with 8 PCIe lanes
Which is 96 PCIe lanes -- the ASRock Rack ROMED6U-2L2T.

I'm feeling like the 'between the lines' consensus is a ... 'plug it in and figure it out for yourself' scenario.
I just figured SOMEONE has tried to use a 2P E5-2600v3 or v4 to connect 10-12 NVMe x4 drives... :)
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Dell Gen 13 systems only have four NVMe-capable drive bays; the rest are SAS/SATA only. That's the immediate concern limiting the number of NVMe drives to four. Only Gen 14 and 15 have all-NVMe options, and even then it's not always a very clear situation. E.g., the R6515 is sold in an "up to 9 NVMe drives" configuration -- but that's nonsense: there's a single 10-bay backplane for all models that supports SATA/SAS plus NVMe separately on all bays; they just won't tell you that.

But let's get back on track: What on Earth are you trying to accomplish? I see a ton of premature optimization here for an unstated objective.

Do you want to connect as many NVMe drives as you can to a Xeon E5 system for fun and profit? You could go to ridiculous lengths with external NVMe chassis and adapters (with options ranging from nearly-passive cards to ludicrously-expensive tri-mode HBAs plus tri-mode SAS expander/PCIe switch combinations).

Do you want to perform a given task? What task is that? Why does it need so many NVMe drives? If it would benefit from them, wouldn't it perhaps benefit from a more modern platform than Xeon E5?

Do you just want a simple solution to use your NVMe drives for no particular reason? Then just get a server that does that and be done with it. Tons of options these days.
 

TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
  • 4 x 16 lane PCIe slots
  • 3 x Slimline U2, each with 8 PCIe lanes

PS -- there's a HighPoint (they call it RAID, but if it's like the SSD7120 it will work as an HBA too) ...

The HighPoint SSD7580 ... with (4) PCIe 4.0 x4 ports (which alone would support 8x NVMe 3.0 x4 SSDs)!
A little pricey ... but PCIe 4.0 HBAs are few and far between.

Of which, these are the only two x16 PCIe 4.0 HBAs I even know of:
RIGHT: QIP4X-PCIE16XB03 (if it exists) ........... and LEFT: the HighPoint SSD7580 (which I believe works as either RAID or HBA)

[Attachments: QIP4X-PCIe16xB03.jpg | HighPoint SSD 7580.jpg]
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
That said, aside from the slot/riser mapping below for the two configurations (which I'll see soon enough), what other finite resources am I contending with?


Controlling CPU | Slot(s) | PCIe Gen 3.0 Lanes | Power | Form Factor | Slot Location
CPU 1 | 0 / 1* | x16 | 75 W | Full-height | Center riser
CPU 1 | 2 / 3* | x8 | 75 W | Full-height | Left riser
CPU 2 | 1 | x16 | 75 W | Full-height | Center riser
CPU 2 | 3 | x8 | 25 W | Low-profile | 2x left / 1x center

The "supported" NVMe solutions from Dell, what they tested and sold would have been tightly controlled. So you're likely looking at modern devices and assuming some translation back to Dell's statement of support. Dell will only "support" their and their partners tested and certified solutions, so you're kind of left adrift once you get to something as old as a 13G R730 and modern devices.

Your limiting factor is likely power. Most of the devices I'm familiar with run around 25 watts; some of the bigger PCIe stunt cards may go higher. The dual-M.2 PCIe board from Dell's 14G era, I think, required a 50-watt slot, so it wouldn't be certified for use in the three LP slots in the table above, etc... Can you use a single third-party 25-watt M.2 PCIe board in those slots? Probably... Will Dell support you? Probably not.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Of which, these are the only two x16 PCIe 4.0 HBAs I even know of:
RIGHT:
That's not an HBA. That's a dumb passthrough card; it doesn't even look like it has redrivers. It would require the slot to support bifurcation into x4/x4/x4/x4 (Dell R630s and R730s do, on at least one slot). Think of it as a cable more than anything else.

LEFT: the HighPoint SSD7580 (which I believe works as either RAID or HBA)
There's also the LSI SAS 95xx stuff, like the SAS 9500-16i. Broadly similar. That said, Dell Gen 13s are strictly PCIe 3.0 only, so I'm not sure what the advantage would be.

Would you recommend using [one parity drive] in a flash-array?



Wow, that's really generous, sure. And of course, that's greatly appreciated.
Are you going to be shipping it or ..?
Let me put it this way: You have spent/are spending money on NVMe drives. There are certainly applications that will take as many as you can throw at them. There are many more applications that simply do not benefit.

If your scenario is "I have a 10GbE network, a Dell R730 with the NVMe-capable backplane and 7 NVMe drives. With the minimum amount of money/effort, how can I get the best performance?", the answer is quite simple: Use four drives, sell or keep the others for the future. And get whatever CPU works for what you're doing (Single client? Fewer, faster cores. Many clients? Many cores.). Note that, if not already present, you need either the Dell Gen 13 NVMe enablement kit or something similar, like the left card in your post plus cables. The drive bays are U.2, which means that PCIe is entirely separate from SAS/SATA, so you can theoretically use anything you want (iDRAC may be more or less happy depending on your choice of card).

Anything else is going to get expensive.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
I'm feeling like the 'between the lines' consensus is a ... 'plug it in and figure it out for yourself' scenario.
I just figured SOMEONE has tried to use a 2P E5-2600v3 or v4 to connect 10-12 NVMe x4 drives... :)
Sorry, my last few jobs as a Unix SysAdmin have not been on the side of the house that researches and buys servers.

Some vendors actually test things and set limits that they can support. So adding more NVMe drives might reduce overall performance IF you were trying to use many or all of the NVMe drives at once. Vendors like hard answers, so they give a reasonable limit that they know will work. I'm not saying this vendor has done the testing, nor whether they set the limits based on testing or engineering design.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
PS -- there's a HighPoint (they call it RAID, but if it's like the SSD7120 it will work as an HBA too) ...

The HighPoint SSD7580 ... with (4) PCIe 4.0 x4 ports (which alone would support 8x NVMe 3.0 x4 SSDs)!
A little pricey ... but PCIe 4.0 HBAs are few and far between.

Of which, these are the only two x16 PCIe 4.0 HBAs I even know of:
RIGHT: QIP4X-PCIE16XB03 (if it exists) ........... and LEFT: the HighPoint SSD7580 (which I believe works as either RAID or HBA)

Thanks for the info. I had planned to add NVMe drives in the future. Glad that more options are coming out.
 

TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
Deleted -- but the sentiments & validity are unchanged.
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
Well... That must have taken a while to type.

Have fun with your project.
 

TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
PS: While I'm sure a condescending and rude person would take credit for a clear explanation they didn't offer...
No matter HOW much anyone knows, if they can't explain it concisely and clearly, it's worthless.


Below is what I figured out for myself:


The R730xd has adequate PCIe lanes and CPU performance to support perhaps 16 NVMe devices plus an SFP+ NIC before sharing lanes. What it LACKS is a BACKPLANE with more than 4x NVMe slots. NO OTHER TECHNICAL REASON EXISTS.

Intel and Dell simply want more money from enterprise customers than it would take to provide such an option.

Unless some bizarre (and infinitesimally likely) exception exists -- like the R7415 or R7425 backplanes being compatible with the R730 --

the only backplanes you can use with an R730 are the ones Dell makes, which top out at 4x NVMe slots...

Which means you'd be limited to 4x U.2 NVMe devices -- which is NOT a limit of 4 NVMe devices overall.
Just 4x U.2 NVMe devices ... because of the BACKPLANE.
This is the rub with enterprise gear: "compatible" is irrelevant if it's not manufactured.

That said, you can still add as many devices via PCIe cards as there are slots to use! So:

  • 4x U.2 NVMe via the backplane
    • It's POSSIBLE that 2 more could fit in the rear bays...? (configuration dependent)
  • Another 4 NVMe devices via the x8 PCIe slots:
    • either M.2 carriers (which would support 2 drives per card)
    • or AIC devices: Optanes, etc.
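Counted out, that plan looks roughly like this (the rear-bay U.2 option and the 2x M.2-per-card figure are assumptions I still have to confirm for this chassis):

Python:
    # Tally of the plan above -- quantities are my assumptions, not Dell specs.
    plan = {
        "front-bay U.2 (NVMe-enabled backplane)": 4,
        "rear-bay U.2 (if the config allows it)": 2,
        "M.2 / AIC in the x8 slots (2 cards, 2 devices each)": 4,
    }
    print(sum(plan.values()), "NVMe devices total")          # 10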
 

TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
Thanks for the info. I had planned to add NVMe drives in the future. Glad that more options are coming out.

Yup. I didn't know whether your motherboard supported bifurcation or not, but I do know there are very few such devices.

I mentioned the HighPoint because I picked some up at good prices on eBay before I learned about the SuperMicro card ... but in general, the more options you have to watch for (provided they're compatible with that ASRock), the better your odds of finding a deal.

Thanks again for the advice ... I really wanted an Epyc, but in an enterprise system -- so I'm really just waiting for an R7415 or R7425 deal to come up; I've seen some that I'd have TOTALLY snagged.

(Though remember, you'll need to justify your usage to an auditor before you're allowed to think you know what's best for your data. ;-) )
 