File Transfers Maxed at 1GB/s

Mascomatt

Cadet
Joined
Oct 26, 2023
Messages
7
I’m trying to diagnose my bottleneck with SMB file transfers. I’m running TrueNAS 13.0-U5.3 on a Dell R830, 4 x Xeon E5-4660 v4s totalling 64 cores and 128 threads, with 1TB of RAM. For testing purposes I have 15 x 512GB SSDs running in stripe. The pool also has a 2TB metadata vdev, a 2TB L2ARC, and a 256GB SLOG. These are all running on Gen 3 NVMe drives on their own PCIe card.

I’m connecting to it from 2 different PCs and getting the same results on both. The most powerful is running a Threadripper PRO 5995WX, 64 cores, 128 threads, also with 1TB of RAM, running Windows 11 Pro for Workstations. I’ve tested to and from Gen 3 and Gen 4 NVMe drives capable of 7GB/s speeds.

Each device has 2 x dual-port 25Gb Mellanox ConnectX-4 NICs. The logic here was that the Dell server only has PCIe 3.0, so I knew there would be a bottleneck there for 100Gb (12.5GB/s) throughput.

I’ve tried aggregating the 4 x 25Gb connections together to create 100Gb connections on both Windows and TrueNAS, through a 25Gb switch, and also running a single 25Gb connection directly from machine to machine.

Each test has only got me the same 1GB/s write speeds and around 400MB/s read speeds.

Any input would be most welcome.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
How are you testing?
Did you read the following resource?
 
Last edited:

Mascomatt

Cadet
Joined
Oct 26, 2023
Messages
7
My basic test is file transfers, as this is what users will be doing to start. Scrubbing through video files within Premiere Pro will be next, but 10Gb speeds don’t seem right. I’ve asked my network guy to review the “tuning” link you sent, so I appreciate that.

We’ve bumped the MTU on both sides from 1500 to 9216, which has improved file transfer speeds. Are there any negatives to this?
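The usual caveat with jumbo frames is that every hop (both NICs, the switch ports, and the TrueNAS interface) has to carry the larger MTU, otherwise you get fragmentation or silent drops. A minimal sketch of a sanity check, assuming an MTU of at least 9000 on both ends; the NAS address 192.168.1.10 is a placeholder:

# From the Windows client: non-fragmenting ping with a ~9000-byte frame
# (payload = MTU minus 28 bytes of IP/ICMP headers; adjust to your MTU)
ping -f -l 8972 192.168.1.10

# From the TrueNAS (FreeBSD) shell: confirm the interfaces report the large MTU
ifconfig | grep mtu

If the ping reports that the packet needs to be fragmented, something in the path is still at 1500.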
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Which PCIe-to-M.2 adapter are you using? What is the model of the drives used as SLOG? How is the dataset property [in the tested dataset] governing sync writes (the ones committed to the SLOG; the property is called sync) set, is it standard or always?
How big is the dataset's recordsize? How big are the files? How many files are there? Which protocol are you using for those benchmarks, SMB?

I find it really strange that your write speed is higher than your read speed; usually it's the other way around (thanks to ARC). Are we talking about writing to the NAS or reading from the NAS [to the clients]?

Are you actually able to reach the 10Gbps [or whatever your network speed is] from the NAS? Did you try an iperf test?
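For reference, a minimal sketch of such a test; iperf3 should be available from the TrueNAS shell, and the NAS address 192.168.1.10 is a placeholder:

# On TrueNAS, start the server side
iperf3 -s

# On the Windows client, run a 30-second test with 4 parallel streams
iperf3 -c 192.168.1.10 -t 30 -P 4

# Add -R to test the reverse direction (NAS -> client)
iperf3 -c 192.168.1.10 -t 30 -P 4 -R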

I also suggest the following resource.
 

Mascomatt

Cadet
Joined
Oct 26, 2023
Messages
7
Thanks for this.

After adding the vdevs I left everything as standard. I then deleted the pool and recreated it using only the SSDs, again in stripe; this was connected directly to the PC and the results were exactly the same.

Protocol is SMB; the files are movies and TV series, in tests of 4-400 files. Write speeds to the NAS are faster than read speeds. Even transferring the same files back and forth, to make use of the RAM, doesn't appear to change things, suggesting drive speeds, pool/vdev setup, switch etc. are not the bottleneck.
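As a rough check of whether a single SMB stream is the limit, a multi-threaded copy from the Windows side can be compared against a normal Explorer copy, for example with robocopy; the paths below are placeholders:

# Copy the test folder to the NAS share using 16 copy threads
robocopy D:\testfiles \\truenas\share\testfiles /E /MT:16

If that pushes noticeably past the ~1GB/s seen with a drag-and-drop copy, the ceiling is per-stream rather than the pool or the link.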
 

Mascomatt

Cadet
Joined
Oct 26, 2023
Messages
7
IMG_6166.jpeg
IMG_6165.jpeg

Not familiar with iperf, but I just followed a tutorial and ran it between the two PCs (I couldn't figure out how to run it on TrueNAS), which have the exact same networking setup through a 25Gb switch, and these are the results I got. In addition I've attached the results of running SolarWinds WAN Killer (it gave a warning before running stating it wouldn't be able to generate packets that quickly, suggesting it can't produce enough data to fill the 100Gb).
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Can you try with a mirrored VDEV (two or three drives will suffice) to see if the result is different?

It could be worth looking into the ARC with the arc_summary command after a read test.
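As a rough sketch of that check from the TrueNAS shell (the pool name tank is a placeholder):

# ARC size and hit/miss statistics after the read test
arc_summary | less

# Per-vdev throughput while a read test is running
zpool iostat -v tank 1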

We can only do guesswork without information.

EDIT: Just looked over the 10Gbps Network Primer again (link below) because the Mellanox rang a bell: the ConnectX-2 cards are strongly discouraged... I think your networking cards might just not be compatible with TN (CORE at the very least).

EDIT2: grammar corrections
 
Last edited:

Mascomatt

Cadet
Joined
Oct 26, 2023
Messages
7
IMG_6168.jpeg
IMG_6169.jpeg
Not sure why Task Manager is showing 100% at 10Gb, but this correlates with my 1.2GB/s transfers.
 

Mascomatt

Cadet
Joined
Oct 26, 2023
Messages
7
I could, but I imagine a 2-SSD mirror would halve my current 1GB/s write speed.
 

MrGuvernment

Patron
Joined
Jun 15, 2017
Messages
268
Try some FIO tests from your source to your TrueNAS to get real solid numbers?
I used the ones noted here: https://askubuntu.com/a/991311

You can also run those FIO tests locally on TrueNAS to eliminate the network and get a base idea of how fast your pools are. Just SSH into TrueNAS and run fio; I believe iperf3 is also installed by default in TrueNAS.
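As a rough sketch of what those local runs could look like (the dataset path /mnt/tank/fiotest and the sizes are placeholders; with 1TB of RAM the read pass may largely be served from ARC):

# Sequential write, 4 jobs of 1M blocks, fsync at the end
fio --name=seqwrite --directory=/mnt/tank/fiotest --rw=write --bs=1M --size=8g --numjobs=4 --ioengine=posixaio --iodepth=16 --group_reporting --end_fsync=1

# Sequential read over the same files
fio --name=seqread --directory=/mnt/tank/fiotest --rw=read --bs=1M --size=8g --numjobs=4 --ioengine=posixaio --iodepth=16 --group_reporting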

Have you also done your iperf tests to see if you can saturate your links both ways between TrueNAS and your devices?

As well, run multiple instances for each test (vs. a single instance, as that often cannot max out bandwidth or I/O).

Mirrors can increase read speed and increase IOPS at the cost of less storage space.

Lastly, monitor your CPU usage when doing the transfers on TrueNAS and see if any single cores are maxing out at 100%.
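On TrueNAS CORE (FreeBSD) a sketch of that check would be something like:

# Per-CPU usage plus per-thread view; watch for one core or one smbd process pinned near 100%
top -P -S -H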

Also, this should have been posted in the TrueNAS CORE forums, not in the legacy FreeNAS section. @HoneyBadger are we able to move this thread to the right section in the forums?
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Pretty sure the issue is networking in nature, possibly a configuration or hardware incompatibility issue.
Sadly I am no networking expert (don't ask me about subnets, I know nothing!), and these kinds of speeds require even more specialized knowledge than I possess.

@jgreco and other users with real hands-on experience might give you a solution.
 

Mascomatt

Cadet
Joined
Oct 26, 2023
Messages
7
I’ve run multiple instances of WAN Killer and maxed out at 20Gb. If I add more instances it slows down. Other resources aren’t near max either.
IMG_6171.jpeg
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I'm not a networking guru either, but do you have the ability to boot up, say, an Ubuntu Live CD/DVD/image? I would boot up both the NAS and the Windoze computer with Ubuntu. Test throughput with that and see if the network speeds up. If it does, then do the test again with the Windoze computer running Windoze; if that works fine, then you know it's probably a setting in TrueNAS. I don't have those high-speed connections myself, so I'm just thinking of a way to narrow down the bottleneck. Maybe someone who has those NICs can comment on this thread, or you may need to do some searching for the answer.

Best of luck to you.
 

MrGuvernment

Patron
Joined
Jun 15, 2017
Messages
268
Run FIO on your TrueNAS to get the true speeds of your pools and see if there is a bottleneck there.
 

asap2go

Patron
Joined
Jun 11, 2023
Messages
228
Someone here had similar issues:

Maybe you can get some info out of it for your case.
I also believe testing with Ubuntu on both the client and server side should give you a good idea of what is possible. I had the same issue with my Windows 10 machine not being able to max out a measly 10GbE connection, while Ubuntu got better results out of the box.
 