Slower than expected results on my newer Storage Server

Rob88

Cadet
Joined
May 7, 2023
Messages
4
Hi Everyone,

I have been reading these forums for a long time and have run out of ideas for figuring out my issue. I had an old system that I had been using for a while, and it worked great. Then I got lucky with a Dell server that I could grab cheaply from work. Here are both systems.

Old System:

OS - TrueNAS CORE 13.0-U4
Motherboard - Asus X99-E WS
RAM - 128GB (8x16GB) Samsung LRDIMM 2133MHz
CPU - Xeon E5-2699 v3
Boot Drives - Mirrored 256GB SSDs
Storage (Spinning Rust) - 6 x Seagate 7E10 8TB 3.5" in RAIDZ2
Storage (NVMe, Striped) - 4 x Samsung 970 EVO Plus 1TB
NIC - Mellanox ConnectX-4 Lx
PSU - EVGA G3 1000W

New System:

OS - TrueNAS CORE 13.0-U4
Chassis - Dell R730xd
RAM - 512GB (16x32GB) Samsung LRDIMM 2133MHz
CPU - Dual Xeon E5-2697 v3
Controller - HBA330 (non-RAID passthrough)
Boot Drives - Same as old
Storage (Spinning Rust) - Same as old, plus 4 more 3.5" drives in the midplane as two mirrored vdevs
Storage (NVMe, Striped) - Same as old, plus another 4 x Samsung 970 EVO Plus 1TB for two striped NVMe sets
NIC - Mellanox ConnectX-4 Lx
PSU - Redundant 1100W

With my old system, I was able to saturate my 25GbE network. Synthetic testing with CrystalDiskMark showed 2.9GB/s read and write, and SMB transfers ran at about 2.69GB/s read and write. It was amazing.

My new Dell server is set up the same way, but CrystalDiskMark shows only about 2GB/s read and 1.1GB/s write, and SMB transfers run at about 1.35GB/s read and 1.15GB/s write.

For the life of me, I can't get it to perform as well as my old system, even though it is essentially the same generation of hardware.

So far, I have fully tested my network with other systems, plus iperf between my main computer and the server, and it consistently hits a solid 23.5Gbps, so I believe it's safe to say the NIC and switches are not holding it back. Next was internal drive testing with dd in the shell, which shows about 264MB/s read and write on the spinning rust and about 3GB/s read and 2.8GB/s write on the NVMe drives. Then I tested the pools themselves: the Seagate pool gives about 450MB/s read and 250MB/s write, and the NVMe pool about 12GB/s read and 11GB/s write.
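
For anyone who wants to reproduce the tests, they were along these lines; the IP address, device names, and dataset paths below are placeholders rather than my exact setup:

Code:
# network check from my main computer to the server (IP is a placeholder)
iperf3 -c 192.168.1.50 -P 4

# raw read from a single spinning disk (device name is a placeholder)
dd if=/dev/da2 of=/dev/null bs=1m count=10000

# pool-level write and then read against a test dataset; note that writing
# zeroes into a compressed dataset will inflate the numbers
dd if=/dev/zero of=/mnt/tank/ddtest.bin bs=1m count=20000
dd if=/mnt/tank/ddtest.bin of=/dev/null bs=1m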

Next, I considered whether the problem came from how I originally set up the new server: a fresh install of TrueNAS CORE with my old config reloaded on top. So I did a fresh install and config reload again to see what would happen, and it made no difference; I got about the same performance.

The last thing I verified was what happens when I run read/write tests over SMB from two different machines, each connected at 25GbE, hitting the server at the same time: I end up with 2.83GB/s read and 1.34GB/s write. So I know it can at least hit the theoretical maximum for reads with multiple sources hitting it, but the writes are still disappointing by comparison.

I have not tried reinstalling from scratch and destroying and recreating my NVMe pool to retest without my old config, and I want to avoid that if at all possible. I was also looking into tunables, but I do not understand which ones to add or how to add them in TrueNAS CORE.

If anyone can help me start working toward a solution, or if you have a similar Dell server and have had this issue, I am ready to listen. And if this is simply normal for a Dell server, I will accept that and be happy with what I have.

Thanks for any help you may be able to provide!
 

Attachments

  • Old System.PNG (30.6 KB)
  • New System.PNG (30.6 KB)

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
It's also worth noting that your old system is a uniprocessor (UP) system while your new system is dual-processor (DP). A UP system is talking directly to your hardware in most cases, while a DP system may need to proxy I/O through to the other CPU in order to perform I/O operations. Coupled with the crummy Mellanox card, that may not be helping you out. Your DP system also has an HBA330 in it, which extracts its own (mild) performance tax. DP systems give you more compute, more memory, and more I/O, but they can introduce performance bottlenecks, especially under single-threaded workloads.
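
If you want to see what the OS actually makes of the topology, something along these lines from the shell will show it; the driver name in the grep is just an example for your Mellanox card:

Code:
# how many CPUs and memory domains FreeBSD detects
sysctl hw.ncpu vm.ndomains

# which devices are generating interrupts (and roughly how busy the NIC is)
vmstat -i | grep -i mlx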
 

Rob88

Cadet
Joined
May 7, 2023
Messages
4
Thanks for the response. I also have an H730P that I had in HBA mode; the performance did not change at all when I swapped them. Should I go back to that controller?

For the NIC, what is a good option? I have many other Mellanox cards that all perform well currently. I do have some Broadcom 25GbE cards. Are they any better? I only ask this because where I work, we mainly have Mellanox and Broadcom for NICs and Cisco for switches.

I never considered how I/O having to move through different parts of the system could make single tasks suffer. I have never dealt with single-stream tests in a data center, so I never really thought about it; a noob moment for me. I thought that since the backplane, the NIC, and one of my NVMe cards all run to CPU 1, it would behave more like my old system. My second NVMe card does use CPU 2's lanes.

From what I gather so far, with my specific hardware this is mostly a normal case of single streams being hindered. I just want to figure out why the multistream writes refuse to go faster than 1.34GB/s. I know that for my home lab and light sharing with family this is still really good; this is mostly my obsession and hobby, so my curiosity will not let me rest on this.

To answer your question about tuning: are you referring to the tunables in TrueNAS CORE? If so, I have not touched them, because I have not yet figured out exactly how to load them properly or which ones to use. For any other type of tuning, I have done network tuning to no avail and have tried many options on the server to optimize performance, with nothing working so far.

If there are specific things you think I should try, I am ready to try them out. Thanks again.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
H730P that I had in HBA mode
You need to read this:

Note: RAID card in "HBA mode" does not count as an HBA, it's still a RAID card.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Thanks for the response. I also have an H730P that I had in HBA mode; the performance did not change at all when I swapped them. Should I go back to that controller?

No, absolutely not. These do not have an "HBA mode." They have a "lobotomized RAID controller mode" where you are still using the RAID firmware and the RAID driver, but it is just being used for single-disk access. However, it is some of the finer points that are in question, such as how device errors are handled, how device disconnects present to the system, how a hotswap event notifies the host system, etc. The guys who wrote the RAID device driver had "this is a RAID card" on their brains when they wrote it, and the way you handle errors and present events to the upstream OS is different from how an HBA does it. For example (and I'm not saying that your LSI-based card does this), when a RAID1 controller experiences an HDD error, it typically waits no more than ten seconds for an answer, and upon error, immediately requests the data from the other side of the RAID1. It is expecting the data, and it is architected towards getting that data or passing up a generic error. An HBA instead just passes messaging back and forth to an HDD, without any interpretation or processing. So what you have to ask yourself is this: is it likely that they ACTUALLY created two totally different personalities in the RAID card, adding an "HBA mode" and requiring two different drivers on the host platform?

Evidence suggests that these people have no idea what the difference is. For years, RAID cards like 3Ware supported "JBOD" mode, which is where you have a RAID0 virtual device created from each physical device, using the RAID firmware and driver. You couldn't talk directly to the disks and had to do hacky stuff to get access to the SMART data, for example (the infamous "-d cciss" or "-d 3ware" flags), and often the RAID card would pitch a fit if you unexpectedly detached a drive. That might be fine for a Windows box but it's terrible for a NAS. You then have to manually tell the controller about the new disk and possibly delete a config for the old disk and a bunch of crap. It sucked. Correction, it sucks. All the "HBA mode" implementations I've seen seem to be a lobotomization of the RAID firmware and reuse of the RAID driver. I have not seen any cases where this ends well.
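
To illustrate the SMART access point, the difference looks roughly like this (device names are just examples):

Code:
# behind a real HBA, the disk is just a disk:
smartctl -a /dev/da3

# behind an old 3Ware RAID card, you had to go through the controller:
smartctl -a -d 3ware,0 /dev/twe0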

For the NIC, what is a good option? I have many other Mellanox cards that all perform well currently. I do have some Broadcom 25GbE cards. Are they any better? I only ask this because where I work, we mainly have Mellanox and Broadcom for NICs and Cisco for switches.

Any of the ethernet interfaces recommended in the 10 Gig Networking Primer in the Resources section. This mostly works out to Chelsio and Intel, with Mellanox probably being okay if you don't need a lot of features and you're a bit masochistic. Forget the Broadcom.

I never considered how I/O having to move through different parts of the system could make single tasks suffer. I have never dealt with single-stream tests in a data center, so I never really thought about it; a noob moment for me. I thought that since the backplane, the NIC, and one of my NVMe cards all run to CPU 1, it would behave more like my old system. My second NVMe card does use CPU 2's lanes.

This stuff is complicated. There are many things to consider. We had a fellow who had "slow read speeds" and ended up walking thru all sorts of stuff before finding out his network sucked. Educational and interesting.


From what I gather so far, with my specific hardware this is mostly a normal case of single streams being hindered. I just want to figure out why the multistream writes refuse to go faster than 1.34GB/s.

I'm sure there's an answer. Given time, the bottleneck can probably be located. See the discussion I just linked as an example.

To answer your question about tuning: are you referring to the tunables in TrueNAS CORE? If so, I have not touched them, because I have not yet figured out exactly how to load them properly or which ones to use. For any other type of tuning, I have done network tuning to no avail and have tried many options on the server to optimize performance, with nothing working so far.

If there are specific things you think I should try, I am ready to try them out. Thanks again.

Please refer to


Apologies that it is a bit of a mess right now.
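
In the meantime, the short version for CORE: tunables live under System > Tunables, and each one is either a "sysctl" (applied to the running system) or a "loader" (only applied at boot). A sysctl-type tunable is the same thing as running something like this from the shell; the variable here is just an example, not a recommendation:

Code:
# example only, not a recommendation for your particular system
sysctl kern.ipc.maxsockbuf=16777216

Loader tunables have no live equivalent; they only take effect after a reboot.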
 

Rob88

Cadet
Joined
May 7, 2023
Messages
4
I will dig into these posts and see what I can do. When I figure something out, I will report back here, either for more help or with the solution that worked for me, so that future forum goers can hopefully solve their problems too.

Thanks so much for the help so far. I appreciate it.
 

Rob88

Cadet
Joined
May 7, 2023
Messages
4
I'm back from testing everything I know of.

So, my issue is fixed... sort of.

I tried every tunable I could for ZFS and SMB in TrueNAS CORE. I tried all the options I could find on the resource page and looked into more on other posts. Strangely, this changed almost nothing. I believe dctcp had the only effect: it was negative at first, and then it consistently gave about a 2-3% boost in transfers, both read and write.
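
For anyone who wants to try the dctcp change, on CORE it boils down to this; I tested it live from the shell first, and the persistent version is done as tunables in the GUI:

Code:
# load the DCTCP congestion control module and switch to it
kldload cc_dctcp
sysctl net.inet.tcp.cc.algorithm=dctcp

# to persist across reboots: a loader tunable cc_dctcp_load="YES" plus a
# sysctl tunable net.inet.tcp.cc.algorithm=dctcp under System > Tunables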

Next was SMB. I enabled SMB multichannel and pushed it to 100 streams. This worked to an extent, but it created some crazy intermittent slowdowns that ended up affecting the drives behind my Plex share; the drops in transfers were, I believe, actual dropouts roughly every 15-20 seconds. So although I did see a file transfer at 42Gbit from my NVMe pool to my testing rig with a 100Gbit card through my MikroTik 100G switch, it did not really help with my issue on all my other machines or with stability, and I have disabled it for now.
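
For reference, the multichannel side was just an auxiliary parameter on the SMB service, something like the line below; I'm going from memory, so treat it as a sketch rather than my exact config:

Code:
# auxiliary parameter added to the SMB service
server multi channel support = yes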

This is where my frustration started. I ran more tests with my TrueNAS box connected directly through a DAC to my other machine to rule out any possibility of my networking equipment causing it, even though I was positive it was not. The good news is that it was not; my networking setup is very complex, and tracking down that kind of issue would have been a month-long hunt for a needle in a haystack.

Still frustrated, I then tried SMB between my Windows machines directly again. Sure enough, there was an issue: my maximum read and write speeds were capping out at 1.35GB/s on 25Gbit NICs. So I now had a Windows issue (what a surprise). I got off Windows 11 due to an SMB issue, and it seems more recent updates have caused it to happen in Windows 10 as well. Next, I started up my Linux-based machine for the first time in a few months (I had been stealing parts from it to keep other machines up) and set up an SMB connection between Linux and my TrueNAS server. Sure enough, it was reading at 2.93GB/s. I was happy and upset at the same time: happy because I had found the area I needed to focus on, and mad because it was Windows again. Windows is the bane of anyone's existence.
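
The Linux-side test was nothing fancy, just a CIFS mount and a big sequential read, roughly like this (share name, username, and paths are placeholders):

Code:
# mount the TrueNAS SMB share from Linux (placeholders throughout)
sudo mount -t cifs //truenas/nvme /mnt/nvme -o username=rob,vers=3.1.1

# large sequential read to gauge throughput
dd if=/mnt/nvme/bigfile.bin of=/dev/null bs=1M status=progress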

At the moment, I have played around with Windows and have gotten my reads back to 2GB/s. I still have an ongoing issue with write speeds running at about 1.15GB/s, and I can't totally explain it. To be honest, the single-stream write speed is not much of an issue, because it only affects my NVMe pool, and I am not worried about that most of the time. Reads from my main pool hold a steady 540MB/s on large file transfers, and writes sit at a solid 300MB/s through 50+ GB transfers. I am pleased again, because I do have a few important single-stream reads that are critical, and multistream hits are able to max the connection out with no issues, so I am good there.

I will continue to work on my Windows SMB issues, and if I am able to figure anything out, I will share it. I would like to thank everyone again for the help you provided. Till next time.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I tried every tunable I could for ZFS and SMB in TrueNAS CORE. I tried all the options I could find on the resource page and looked into more on other posts. Strangely, this changed almost nothing. I believe dctcp had the only effect: it was negative at first, and then it consistently gave about a 2-3% boost in transfers, both read and write.

Yeah, it varies. If you have a sufficiently fast machine, it can keep handing stuff off to the card silicon fast enough that improvements are marginal. Tuning may help more under load, however.
 