Can't find all-SSD RAID-Z2 small file SMB performance bottleneck - Please help!

Neville005

Dabbler
Joined
Sep 18, 2021
Messages
10
Ok, let me be honest: I'm at a loss here and could really use some help. As such, I greatly appreciate any and all feedback and/or suggestions!

The problem
I'm looking to improve my small file SMB performance, but can't seem to find the current bottleneck. I'm pretty new to TrueNAS and ZFS in general, so I wouldn't be surprised if I'm just missing a basic setting somewhere or maybe just have wrong expectations of my hardware. I've been trying to rule things out, but haven't been able to find my exact bottleneck.

My hardware
Motherboard: ASUS A320M-K
CPU: AMD Ryzen 3 2200G
Memory: 1x 8GB DDR4-2666 Corsair Vengeance LPX
Boot Drive: 1x 250GB Samsung 980 (NVMe M.2)
Data Drives: 4x 2TB Samsung 870 QVO (SSD) in a RAID-Z2 configuration
Network Interface Controller: EDUP 2.5GBase-T PCIe network adapter with an RTL8125 controller, running in a PCIe 2.0 x1 slot
GPU: Sapphire Pulse Radeon RX570 8GB
PSU: Corsair CX450M (450W)
Operating System: TrueNAS Core 12.0-U5.1

I realize that this isn't exactly server-grade hardware. Most of it is stuff I already had lying around, and it seemed enough for an initial test. I would like to upgrade to a Supermicro A2SDi-H-TF motherboard (with the Intel Atom C3758 processor) and 16GB of ECC memory in the coming weeks, but I'd first like to make sure that such a move would actually help my performance. I'm also aware that the Samsung 870 QVO SSDs are somewhat controversial (mainly due to their QLC memory). However, they should be sufficient for my particular workload: I expect a maximum of 10-20 TBW per drive per year, so they should long outlast their warranty period, and performance should be decent enough with their TurboWrite cache. I'll be using a separate system for backup purposes.

My test scenario
To test the performance of my TrueNAS system, I'm using one of my previous projects. The project consists of a total of 68,249 files, spread over 1,342 folders, with a combined size of 11.5 GB. Most of the files are extremely small, with just a few files responsible for most of the combined size. I've chosen this project because it's a good reflection of my average use case for this system. I'm measuring the performance using the following three metrics:
  1. Upload - The time it takes to copy the project from my main system to the TrueNAS system
  2. Download - The time it takes to copy the project back from the TrueNAS system to my main system
  3. Delete - The amount of files per second that can be deleted from my TrueNAS system

My main system consists of the following (relevant) hardware:
Motherboard: ASUS ROG Strix B250F Gaming
CPU: Intel Core i7-7700
Memory: 2x 8GB DDR4-2400 Corsair Vengeance LPX
Boot Drive: 1x 500GB Samsung 960 EVO (NVMe M.2)
Additional Drives: 1x 4TB Samsung 870 EVO (SSD), 1x 2TB WD Blue (HDD)
Network Interface Controller: ASUS XG-C100C (with one 10GBase-T connection)
Operating System: Windows 10 Pro 21H1

All copies in the test scenario are done to or from the M.2 boot drive of my main system. My main system and TrueNAS system are connected via a QNAP QSW-1105-5T switch and Cat 6a cabling (with a maximum length of 20 meters).

My hope/expectations
As a point of reference, I ran my test scenario against the built-in 4TB Samsung 870 EVO SSD in my main system. This resulted in the following performance figures:
Upload (write): 6 minutes (average of 32 MB/sec)
Download (read): 5.5 minutes (average of 35 MB/sec)
Delete: 1,200 files per second

My hope would be to get within reach of these performance figures with my TrueNAS system. I'm fully aware that small files have always been somewhat of a sore point for SMB file transfers, so I'm not expecting to exceed the local performance figures and/or to get anywhere close to the maximum speed of my 2.5Gbps NIC.

My current performance figures
My test scenario currently results in the following figures when using the onboard 1 gigabit connection of my TrueNAS motherboard:
Upload (write): 14 minutes (average of 14 MB/sec)
Download (read): 20 minutes (average of 10 MB/sec)
Delete: 200 files per second

When I use the 2.5 gigabit connection of my PCIe Network Adapter, I get the following figures:
Upload (write): 12 minutes (average of 16 MB/sec)
Download (read): 18 minutes (average of 11 MB/sec)
Delete: 220 files per second

Without the few large files in the sample project, the average upload and download speeds would probably be closer to 2-5 MB/sec.

As a reference, I also ran the test scenario over a gigabit connection against a WD My Cloud EX2 Ultra NAS with 2x 8TB WD Red HDDs (my previous setup), which resulted in the following figures:
Upload (write): 29 minutes (average of 7 MB/sec)
Download (read): 29 minutes (average of 7 MB/sec)
Delete: 50 files per second

My current performance is definitely better than my previous setup was, but it's still not as close to the performance of a local drive as I'd like it to be.

Potential hardware bottlenecks
I've currently tried to look at the following potential hardware bottlenecks:
  • Network Interface Controller: Switching from a 1 gigabit connection to a 2.5 gigabit connection resulted in a mere 10% performance increase (mainly due to the few large files). I did a separate test with a single large file to confirm that my maximum transfer speed is indeed around 270 MB/sec, so the 2.5 gigabit connection appears to be working properly.
  • SSDs: While I don't have any replacement SSDs for the pool, I did try running the pool as a striped VDEV across all four SSDs instead. This did not lead to any appreciable difference in the performance figures. With the current RAID-Z2 pool, IOPS should theoretically be limited to those of a single SSD, while sequential throughput should actually benefit from the number of drives. These theoretical RAID-Z2 figures are nowhere close to being met.
  • Memory: According to the TrueNAS dashboard, most of the memory is either free or being used as ZFS cache. While 8GB may not be a lot, it appears to be sufficient for this rather small pool (with a total usable storage capacity of around 3.5TB).
  • CPU: I do not have any drop-in replacements for the CPU of my TrueNAS system or the one in my main system. However, my TrueNAS system averages around 15% load during uploads and downloads, which is less than the 25% load I would expect from a continuous maximum load on a single thread. My main system averages around an 8% load for Windows Explorer, which is also less than the potential 12.5% average for a maxed-out single thread. Both systems should theoretically be able to use Receive Side Scaling (RSS) in order to utilize multiple threads, but I haven't seen any indication of it being used on this single connection. The workload is already divided between multiple threads (even with the onboard 1 gigabit connection), but the CPU load doesn't rise above single-thread levels. I'm not sure if the "re" driver for the RTL8125 chip actually supports RSS on TrueNAS, but the chip itself is capable of using this feature (a quick check for this is sketched below).
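For what it's worth, a minimal way to check this from the TrueNAS shell during a transfer is to watch per-core load and per-queue interrupts. The re device name below is just a guess for the Realtek port; adjust it to the interface actually in use:
Code:
# Per-core load and per-thread view (look for a single smbd thread pegging one core)
top -P -S
top -SH
# Interrupt counters per NIC queue; multiple rx/tx queue lines would suggest RSS is active
vmstat -i | grep re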

Potential software bottlenecks
I'm currently using just three minor tweaks to the default TrueNAS settings:
  • I've turned off the compression of my SMB dataset. This seemed to result in a rather small performance improvement (less than 3%).
  • I've disabled the sync option on my SMB dataset. This seemed to result in a rather small performance improvement (less than 3%).
  • I've added the "server multi channel support = yes" auxiliary parameter to my SMB service config. This did not seem to impact performance.
I've tried a bunch of other settings from different forum posts and tutorials, but none resulted in a clear performance improvement. I'm a bit of a TrueNAS and SMB(/Samba) noob, so I'm not really sure where to look for any real improvements.
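For reference, here is a minimal sketch of how these dataset settings could be checked and reverted from the shell; tank/smb is just a placeholder for the actual pool/dataset name:
Code:
# Show the current values on the SMB dataset
zfs get compression,sync,recordsize,atime tank/smb
# Re-enable lz4 compression (the usual recommendation) and restore the default sync behaviour
zfs set compression=lz4 tank/smb
zfs set sync=standard tank/smb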

Please help
As you can see, I've been unable to locate the bottleneck in my current setup. My current performance looks somewhat similar to what one might expect from an HDD. It's as if the TurboWrite cache of the SSDs isn't used at all, even though this feature appears to be enabled on all four SSDs when I check from the CLI.

As I mentioned at the beginning of my post: I'm at a complete loss here and don't know where to look anymore. Your help is greatly appreciated!
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
I would look at the following - but I am not an expert here
1. Crappy NIC - use a proper Intel NIC
2. Not enough memory - TrueNAS likes ARC
3. Crappy SSDs (I really don't like QVOs) - but I suspect they are not your issue here and my dislike is at least a little irrational
4. Why do you have the Radeon card - the 2200G has onboard graphics - use that (unless there is a specific reason for the Radeon). That will release a slot for a decent NIC. You do seem to be PCIe slot challenged
5. Remember that SMB is not a fast multithreaded file protocol. Lots of little files will slow it down a lot
6. Turn compression back on - there is almost no case for turning it off

3rd/2nd/1st Gen AMD Ryzen™ Processors: 1 x PCIe 3.0/2.0 x16 (x16 mode)
2nd and 1st Gen AMD Ryzen™ with Radeon™ Vega Graphics / 7th Generation A-series / Athlon X4 Processors: 1 x PCIe 3.0/2.0 x16 (x8 mode)
AMD Athlon™ with Radeon™ Vega Graphics Processors: 1 x PCIe 3.0/2.0 x16 (x4 mode)
AMD A320 chipset: 2 x PCIe 2.0 x1
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
For loads of small files, you're going to need to look in the direction of metadata acceleration.

You can do one or a combination (some of the following are incompatible/redundant, so doing them all isn't really sensible) of these things:

1. Add more RAM (compatible with everything).

2. Use an SSD for L2ARC configured on your datasets for metadata only (not beneficial to do together with 3)

3. Use a "special VDEV" for metadata on SSD

4. Tune your block/record size (https://openzfs.github.io/openzfs-docs/Performance and Tuning/Workload Tuning.html#dataset-recordsize) - a rough sketch of options 3 and 4 follows this list

5. Turn off the metaslab allocator (https://openzfs.github.io/openzfs-docs/Performance and Tuning/Workload Tuning.html#metaslab-allocator) since your pool is SSDs.

6. Use Mirrors instead (https://openzfs.github.io/openzfs-docs/Performance and Tuning/Workload Tuning.html#pool-geometry)... higher risk of pool failure, but with SSDs that's possibly a reasonable risk to take.
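As a rough illustration of options 3 and 4, the shell commands could look something like this, assuming a pool named tank, a dataset tank/smb and a spare SSD at nvd0 (all placeholder names). A single-disk special VDEV is for testing only (and zpool will want -f because it doesn't match the RAID-Z2 redundancy); mirror it in any real deployment:
Code:
# Option 3: add a metadata special vdev (single disk, testing only - mirror it for real use)
zpool add -f tank special nvd0
# Optionally store small file blocks on the special vdev as well
zfs set special_small_blocks=32K tank/smb
# Option 4: use a smaller recordsize for a small-file workload
zfs set recordsize=64K tank/smb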
 

Neville005

Dabbler
Joined
Sep 18, 2021
Messages
10
I would look at the following - but I am not an expert here
1. Crappy NIC - use a proper Intel NIC
2. Not enough memory - TrueNAS likes ARC
3. Crappy SSDs (I really don't like QVOs) - but I suspect they are not your issue here and my dislike is at least a little irrational
4. Why do you have the Radeon card - the 2200G has onboard graphics - use that (unless there is a specific reason for the Radeon). That will release a slot for a decent NIC. You do seem to be PCIe slot challenged
5. Remember that SMB is not a fast multithreaded file protocol. Lots of little files will slow it down a lot
6. Turn compression back on - there is almost no case for turning it off

Thanks for your suggestions! I tried looking at all of them, with the following results:
  1. Do you happen to have any suggestions as to what would qualify as a proper NIC? I checked both the onboard RTL8111H and the RTL8125 on the PCIe card again, and the RTL8125 seems to have a decent feature set that should be comparable to a number of the Intel NICs. Is there a specific feature or specification that I should be looking for in a NIC? Or are certain Intel NICs just better supported by the built-in drivers in TrueNAS?
  2. I've added another 8GB of RAM for a total of 16GB, but it didn't seem to have any effect on performance, with unchanged results for all three of the metrics in my test scenario.
  3. I ran the following check to verify the performance of my SSD's:
    Code:
    fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --size=4g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1 
    This resulted in an average write speed of 135 MB/sec and an average of 33.0k IOPS in what should be an approximation of a worst-case scenario. More reasonable checks show a sustained write speed of approximately 495 MB/sec over a 60-second run that wrote 29.9 GB (a sketch of such a sequential run follows this list). As such, the SSDs indeed don't seem to be an issue here.
  4. The 2200G does indeed have Vega graphics included. However, when using the onboard graphics, a portion (around 1GB) of the system RAM is reserved for the graphics, limiting the available RAM for TrueNAS even further. The RX570 is definitely overkill, but at least it keeps the full 8/16GB of RAM available for TrueNAS (and it's a graphics card I still had lying around).
  5. I'm afraid I can't do much about this one. The combination of Receive Side Scaling and SMB Multichannel should, as far as I'm aware, be able to make it (somewhat) multithreaded, but I haven't been able to get this working as of yet.
  6. I've tried it both on and off. Turning it off results in a very small performance improvement, but this seems to be within the margin of error on my test scenario.
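For completeness, the "more reasonable check" mentioned in point 3 was a sequential write; a run along the following lines would be representative (the parameters are illustrative, not the exact ones used):
Code:
fio --name=seq-write --ioengine=posixaio --rw=write --bs=1m --size=8g --numjobs=1 --iodepth=16 --runtime=60 --time_based --end_fsync=1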
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Do you happen to have any suggestions as to what would qualify as a proper NIC? I checked both the onboard RTL8111H and the RTL8125 on the PCIe card again, and the RTL8125 seems to have a decent feature set that should be comparable to a number of the Intel NICs. Is there a specific feature or specification that I should be looking for in a NIC? Or are certain Intel NICs just better supported by the built-in drivers in TrueNAS?
The feature set is, at best, massaged to make it look good. In practice, Realtek controllers have always been terrible, especially with drivers not written by Realtek, as is the case here.
 

Neville005

Dabbler
Joined
Sep 18, 2021
Messages
10
For loads of small files, you're going to need to look in the direction of metadata acceleration.

You can do one or a combination (some of the following are incompatible/redundant, so doing them all isn't really sensible) of these things:

1. Add more RAM (compatible with everything).

2. Use an SSD for L2ARC configured on your datasets for metadata only (not beneficial to do together with 3)

3. Use a "special VDEV" for metadata on SSD

4. Tune your block/record size (https://openzfs.github.io/openzfs-docs/Performance and Tuning/Workload Tuning.html#dataset-recordsize)

5. Turn off the metaslab allocator (https://openzfs.github.io/openzfs-docs/Performance and Tuning/Workload Tuning.html#metaslab-allocator) since your pool is SSDs.

6. Use Mirrors instead (https://openzfs.github.io/openzfs-docs/Performance and Tuning/Workload Tuning.html#pool-geometry)... higher risk of pool failure, but with SSDs that's possibly a reasonable risk to take.

Thanks! I really appreciate the direction and learned a lot from the links you included!

I first tried option 1 by adding an additional 8GB, for a total of 16GB. Sadly, this didn't result in a change in performance. Nevertheless, I did keep the additional memory in the system while testing the other options.

Choosing between options 2 and 3, I decided to try option 3 next. I reinstalled TrueNAS on a WD Red 1TB HDD (connected via a USB 3.0-to-SATA adapter), thereby freeing up the 250GB Samsung 980 NVMe M.2 drive for use as a metadata VDEV. At first I hadn't set the "Metadata (Special) Small Block Size" value on my SMB dataset, but setting it to 128KB and later 512KB didn't help either; the special VDEV didn't result in a performance improvement.
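For anyone retracing these steps: one way to confirm that the metadata actually ended up on the special VDEV is to compare per-VDEV allocation before and after copying the project over. A minimal sketch, with tank/smb standing in for the real pool/dataset names:
Code:
# The ALLOC column for the special vdev should grow after the copy
zpool list -v tank
# Confirm the small-block threshold actually applied to the dataset
zfs get special_small_blocks,recordsize tank/smb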

Options 4 (where I tried multiple, mostly smaller, block sizes) and 5 (where I set the SYSCTL "vfs.zfs.metaslab.lba_weighting_enabled" tunable to 0) had no effect on performance either.

I haven't had time to try option 6 yet. However, I did previously try running a striped VDEV with all four SSDs and didn't see any performance improvements. As a mirror setup does have the potential to improve read speeds, I will test that one next. Nevertheless, I don't expect a mirror setup to help with write and delete speeds (especially compared to a striped VDEV), so that probably still leaves me with at least a partial bottleneck (which would be better than the current situation).
 

Neville005

Dabbler
Joined
Sep 18, 2021
Messages
10
The feature set is, at best, massaged to make it look good. In practice, Realtek controllers have always been terrible, especially with drivers not written by Realtek, as is the case here.

Thanks, I appreciate the explanation! I wasn't aware of that and will try to find an Intel NIC. I haven't been able to find a decently priced Intel card with 2.5 Gbps+ support yet (as the Intel I225 controller doesn't seem to be supported in the current version of TrueNAS), but I'll keep looking. I might try an Intel gigabit card as an alternative (perhaps based on the Intel I210 controller?), but any specific recommendations would definitely be appreciated!
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Thanks for your suggestions! I tried looking at all of them, with the following results:
  1. Do you happen to have any suggestions as to what would qualify as a proper NIC? I checked both the onboard RTL8111H and the RTL8125 on the PCIe card again, and the RTL8125 seems to have a decent feature set that should be comparable to a number of the Intel NICs. Is there a specific feature or specification that I should be looking for in a NIC? Or are certain Intel NICs just better supported by the built-in drivers in TrueNAS?

We generally advise against wish-buying hardware like this and then hoping it works well. In practice, the needs of a NAS are rather particular, and the stresses a NAS places on the network card can be quite demanding.

The Realtek cards are powered by a little hamster (maybe two) running in a spinny wheel to generate movement of your bits, and even legitimate Realtek silicon tends to be poorly supported because of the lack of official Realtek driver support for anything other than Windows. This is made more challenging because there is a healthy market for Realtek knockoffs in the back alleys of Shenzhen, most of which do not work as well because they may be early versions of the Realtek part, and these have been known to make it onto commonly available boards. All of this conspires against you and often results in performance problems, random hangs, and various other issues.

It is much better to start off with the FreeNAS Hardware Recommendations guide, and find a way to move forward from there.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Basically Chelsio / Intel server-grade NICs are your best options.
Buy from a system dismantler rather than new.

What country are you based in?

I would still ditch the graphics card and keep the PCIe slot for useful stuff, especially now that you have 16GB. Not that this will directly help your problem - but you will need a decent slot for the NIC.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Have you run an iperf test between the client and server?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
250GB Samsung 980 NVME M.2 drive for use as a metadata VDEV.
Not sure it was obvious, but in doing that you now have a single point of failure for your pool... metadata VDEVs are pool integral and should have the same redundancy as you expect your pool to have (at very least, mirrored).

If your system won't accommodate that, you may be better off going with the metadata-only L2ARC, which is not integral to the pool and so can be removed/lost at any time with no impact on data.

You need to warm the cache, so consider doing that by running a recursive ls or find or something after a startup. https://www.truenas.com/community/threads/impact-of-svdev-on-rsync-vs-l2arc.93371/
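A minimal sketch of that combination (metadata-only L2ARC plus a warm-up pass), with tank, tank/smb and nvd0 as placeholder names:
Code:
# Add the SSD as an L2ARC (cache) device - it can be removed at any time without risk to the pool
zpool add tank cache nvd0
# Only let metadata from this dataset spill into L2ARC
zfs set secondarycache=metadata tank/smb
# Warm the cache after boot by walking the whole share
find /mnt/tank/smb -ls > /dev/null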



this special VDEV didn't result in a performance improvement
Did you do anything to cause the metadata to be re-written (copy/duplicate the dataset and delete the original and rename)? If you didn't, there's no surprise you saw no difference as the metadata remains on the pool disks where it was before.

SYSCTL "vfs.zfs.metaslab.lba_weighting_enabled" tunable to 0) had no effect on performance either.
Not sure if that needs to be set (as a tunable, followed by a reboot) before any data is written to the drive... not sure how much of that you did.

I would suggest re-thinking a couple of those and indeed going down the mirror path when you have the time.
 

Neville005

Dabbler
Joined
Sep 18, 2021
Messages
10
The Realtek cards are powered by a little hamster (maybe two) running in a spinny wheel to generate movement of your bits, and even legitimate Realtek silicon tends to be poorly supported because of the lack of official Realtek driver support for anything other than Windows. This is made more challenging because there is a healthy market for Realtek knockoffs in the back alleys of Shenzhen, most of which do not work as well because they may be early versions of the Realtek part, and these have been known to make it onto commonly available boards. All of this conspires against you and often results in performance problems, random hangs, and various other issues.

Basically Chelsio / Intel server-grade NICs are your best options.
Buy from a system dismantler rather than new.

What country are you based in?

Good to know! I should be able to do another test with an Intel I210 card from HP on Wednesday. It's only a gigabit connection, but I'm currently far from saturating 1 gigabit in bandwidth anyway, so let's hope that this card still leads to an improvement. I live in the Netherlands and have yet to find a good (trustworthy) reseller of used server hardware, so this will have to do as a starting point while I keep looking for other options.

I'm still considering the purchase of the Supermicro A2SDi-H-TF motherboard (with the Intel Atom C3758 processor) as a longer term solution. That motherboard and CPU should be similar to the TrueNAS Mini Series and such a setup is probably better suited for the task at hand.

Have you run an iperf test between the client and server?

I have now ;)

I first ran the following test (with my TrueNAS system as the server and my main system as the client):
Code:
.\iperf.exe -c 192.168.x.x
This test ran for 10 seconds and transferred 651 MB with a bandwidth of 545 Mbps.

I then ran the following test to better push my connection to its limits:
Code:
.\iperf.exe -c 192.168.x.x -P 8 -t 30 -w 32768
This test ran for 30 seconds and transferred a total of 8.19 GB with a bandwidth of 2.34 Gbps. That result closely matches the advertised 2.5 Gbps of the adapter.
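For the reverse direction (the TrueNAS box transmitting, which corresponds to the download/read path), the simplest approach is probably to swap the roles - a sketch, with the same placeholder IP:
Code:
# On the Windows client, run the server side:
.\iperf.exe -s -w 32768
# On the TrueNAS shell, push data towards the client:
iperf -c 192.168.x.x -P 8 -t 30 -w 32768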
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
When I added a special vdev to my main pool, I used a script to copy each dataset to a new dataset, destroy the source dataset and then rename the new dataset to the old name. I used snapshots, and I ran this (several times) to tune the special vdev to where I wanted it (% used).

Just adding the vdev and setting the parameters does nothing - you need to (in some manner) churn the data so it gets rewritten to the pool

Have a look at: https://forum.level1techs.com/t/zfs-metadata-special-device-z/159954
This explains the process.

Sample Script:
Code:
# Snapshot the source dataset and replicate it in full (metadata gets rewritten, so it can land on the special vdev)
zfs snap BigPool/SMB/Archive@migrate
zfs send -R BigPool/SMB/Archive@migrate | zfs recv -F BigPool/SMB/Archive_New
# Second, incremental pass to pick up anything written in the meantime
zfs snap BigPool/SMB/Archive@migrate2
zfs send -i @migrate BigPool/SMB/Archive@migrate2 | zfs recv -F BigPool/SMB/Archive_New
# Remove the original and give the copy its old name
zfs destroy -rf BigPool/SMB/Archive
zfs rename -f BigPool/SMB/Archive_New BigPool/SMB/Archive


The iperf results are as if you are writing, what about the other way around?

I use https://www.bargainhardware.co.uk/ for second-hand kit - though it's UK-based, which is likely to be an issue. You ought to be able to find someone on eBay - just not in China, and ideally someone with lots of older kit for sale as well.
 

Neville005

Dabbler
Joined
Sep 18, 2021
Messages
10
Apologies for my late reply. It's taken a bit longer than I hoped before I was able to do some additional testing again.

Not sure it was obvious, but in doing that you now have a single point of failure for your pool... metadata VDEVs are pool integral and should have the same redundancy as you expect your pool to have (at very least, mirrored).

Yes, thanks! The warnings in TrueNAS are quite clear in that regard as well, so it's actually quite difficult to overlook. I wouldn't consider this setup for actual deployment, but it's fine for my current trials.

Did you do anything to cause the metadata to be re-written (copy/duplicate the dataset and delete the original and rename)? If you didn't, there's no surprise you saw no difference as the metadata remains on the pool disks where it was before.
Just adding the vdev and setting the parameters does nothing - you need to (in some manner) churn the data so it gets rewritten to the pool

I do all my tests on an empty pool and often reinstall the entire operating system after messing around too much with the settings. As such, I always run my test scenario in chronological order, where I start with a write of my sample project followed by a read and delete. This should have ensured that the metadata actually got written to the special VDEV. If, however, there is more that I should do to unlock the potential of such a VDEV, please let me know!

Not sure if that needs to be set (as a tunable, followed by a reboot) before any data is written to the drive... not sure how much of that you did.

I set the tunable after installing a fresh copy of TrueNAS Core. I then rebooted the system prior to setting up the pool and SMB share.

I would suggest re-thinking a couple of those and indeed going down the mirror path when you have the time.

I went ahead and tried two different pool setups:
  • A single VDEV with a mirror configuration consisting of all four SSDs.
  • A dual-VDEV setup, where each VDEV consisted of two SSDs in a mirror configuration.
Both setups resulted in decreased write performance, equal or decreased read performance, and roughly equal delete performance compared to the RAID-Z2 configuration. This further strengthens my suspicion that my drive setup is not the current bottleneck. It feels more as if I'm being limited by a maximum number of file transactions per second, perhaps caused by the use of the SMB protocol. I'm open to trying any viable alternative to SMB, but despite all of its limitations, it still seems to be the go-to standard for file sharing with Windows systems.

After testing the different pool setups, I tried switching my NIC and replaced the RTL8125 with an Intel I210 NIC with a single gigabit connection. This did not result in any performance changes.

After installing the Intel NIC, I decided to delve a bit deeper into Receive Side Scaling. To start with, I ran the following command on my Windows system:
Code:
Get-SmbMultichannelConnection -IncludeNotSelected

This listed the connection via the Intel NIC as not being RSS-capable. As this contradicted the specification sheet, I delved a bit deeper and eventually figured out that I had to advertise the capability explicitly in the Samba configuration. I now have the following Auxiliary Parameters for my SMB service:
Code:
server multi channel support = yes
aio max threads = 100
allocation roundup size = 1048576
interfaces = "192.168.x.x;capability=RSS,speed=1000000000"
read raw = Yes
write raw = Yes
socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=131072 SO_SNDBUF=131072
min receivefile size = 16384
use sendfile = true
aio read size = 1
aio write size = 1

Additionally, I now have the following Auxiliary Parameters set for the SMB share itself:
Code:
aio read size = 1
aio write size = 1


The Get-SmbMultichannelConnection command now reports the connection to the TrueNAS system as RSS-capable. However, the CPU load appears to remain limited to around a maximum of 40-45% of a single thread. The thread that gets used by the TrueNAS system does appear to be more consistent than before, though. Is there anything else that I should set/enable in order to make SMB somewhat more multithreaded over a single connection? I know that multiple physical connections are the more standard use case for SMB Multichannel and Receive Side Scaling, but from what I've read, single connections should normally be able to benefit from it as well.
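One way to see whether the load is actually being spread is to watch the NIC queues and the smbd threads on the TrueNAS side during a copy; a minimal sketch (igb0 is a guess for how the I210 shows up):
Code:
# Several rxq/txq interrupt lines for igb0 would indicate multiple hardware queues in use
vmstat -i | grep igb
# Live per-thread view; check whether more than one smbd/aio thread accumulates CPU time
top -SH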

The iperf results are as if you are writing, what about the other way around?

With the RTL8125, both directions resulted in a bandwidth between 2.30 and 2.35 Gbps, which is in line with the specification of the card. With the Intel I210 NIC, the bandwidth is limited to just shy of 1 Gbps.
 

Neville005

Dabbler
Joined
Sep 18, 2021
Messages
10
One more addition to my previous post: I've removed the RX570 GPU from the system and moved the Intel NIC to the PCIe 3.0 x16 slot that should have a direct connection to the CPU. This does mean that some memory gets reserved for the onboard graphics of the CPU, but that still leaves 14 GB available for TrueNAS and should ensure an unconstrained PCIe connection for my NIC. This move (again) did not have any performance impact.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
@anodos could say definitively, but I believe Samba is limited to a single CPU thread. This sounds like it could be a bad interaction between the vdev ashift and the actual physical block size of the SSDs.
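If you want to rule that out, a quick check along these lines should show both sides (ada0 is a placeholder disk name; the zpool.cache path shown is the usual one on CORE):
Code:
# Logical and physical sector sizes reported by one of the SSDs
diskinfo -v /dev/ada0
# ashift recorded for the pool's vdevs
zdb -U /data/zfs/zpool.cache | grep ashift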
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
@anodos could say definitively, but I believe Samba is limited to a single CPU thread. This sounds like it could be a bad interaction between the vdev ashift and the actual physical block size of the SSDs.
Depends on context and configuration. In TN 12, Samba will use aio(4) so that reads and writes are handled by kernel threads. SCALE uses io_uring (Linux).
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
According to https://redmine.ixsystems.com/issues/24210, use sendfile = no is more performant, but I don't know if that's still the case with TN 12's OpenZFS implementation, as that link was for FN 11.2.
 
anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
According to https://redmine.ixsystems.com/issues/24210, use sendfile = no is more performant, but I don't know if that's still the case with TN 12's OpenZFS implementation, as that link was for FN 11.2.
Sendfile is not recommended, hence it is off by default and there is no supported method to enable it. Generally speaking, auxiliary parameters are an unsupported configuration. Things may work, but results are not guaranteed and we will most likely not troubleshoot bugs resulting from using them.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Well, I have no idea, I'm afraid.
 