Low performance in large memory systems

OnlyLoveElaina · Joined: Feb 1, 2024 · Messages: 8
Hello community, this is my first post and I apologize if I chose the wrong category or left out any information.

I have an HDD array, a 10Gbit NIC, and 512GB of RAM. I believe most of the RAM should be used as ZFS cache, but it doesn't seem to be working.

If I understand correctly, when I copy a large file of several hundred gigabytes from Windows to the pool via SMB, it should complete in a few minutes at around 1000MB/s, and TrueNAS should then slowly flush the data from RAM to the HDD array.
But in reality, I only get around 1000MB/s for the first 1-2 seconds, then the speed drops to 100-200MB/s. It looks like the data is being written directly to the HDDs without going through the cache. The initial burst also seems to come from the HDDs' own cache, not the ZFS cache.

The MTU on both the TrueNAS box and my Windows machine is set to 9000; it makes no difference compared to 1500.
I'm not sure whether the problem is with TrueNAS, the network, or a limitation of the SMB protocol itself. If anyone can offer perspective, that would be great.
Sorry for my bad English, thanks in advance for any help.
 
OnlyLoveElaina:
[attachment: 1706809770226.png]
HoneyBadger · actually does care · Administrator, Moderator, iXsystems · Joined: Feb 6, 2014 · Messages: 5,112
Hello @OnlyLoveElaina

ZFS will collect a certain amount of pending data into a transaction, but will not hold an unlimited amount in RAM (as it would be lost on power off) so eventually you will end up limited by the speed of your storage pool.
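For reference, on TrueNAS SCALE (Linux/OpenZFS) that in-RAM ceiling is the `zfs_dirty_data_max` module parameter, readable under `/sys/module/zfs/parameters/`. A rough sketch of why 512GB of RAM does not all become write buffer, assuming the documented default of one tenth of physical memory (capped by `zfs_dirty_data_max_max`):

```shell
# zfs_dirty_data_max is readable at /sys/module/zfs/parameters/zfs_dirty_data_max.
# Its documented default is physical RAM / 10 (capped by zfs_dirty_data_max_max),
# so even a 512GB machine only buffers a slice of that as pending writes.
ram_bytes=$(( 512 * 1024 * 1024 * 1024 ))   # the poster's 512GB of RAM
default_dirty_max=$(( ram_bytes / 10 ))
echo "$default_dirty_max"                    # bytes (~51GiB)
```

Once that much dirty data is queued, ZFS throttles incoming writes down to what the pool can actually absorb, which matches the "fast burst, then slow" pattern described above.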

Can you please describe your hardware in detail, specifically your storage controller type, hard drive model, drive count, and pool layout? The Storage Dashboard in TrueNAS SCALE should show most of the latter, e.g.:

[attachment: 1706812287574.png]
 
OnlyLoveElaina:
Hi @HoneyBadger, thank you for the reply.
My TrueNAS runs on an Inspur NF5280M3 server. The controller is the motherboard chipset's onboard SATA (the hard disks are connected directly to the motherboard's SATA ports). All 6 drives are WD HC320 8TB.
[attachment: 1706812901786.png]

I hope this information will be helpful and thanks again for your reply.
 

HoneyBadger:
Your server appears to not use any RAID logic on the integrated controller, which is good - and your drives are not SMR (shingled) so the throughput numbers will not be hampered by that.

The pool layout of 6-wide RAIDZ2 means that you will get the effective "write bandwidth" of four drives, but I would think that you should be able to get more than 100-200MB/s from four HDDs.
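As a back-of-envelope sketch of that claim: in a RAIDZ2 vdev, two drives' worth of space goes to parity, so only (width - 2) drives carry data on a streaming write. The per-disk rate below is an assumed round number, not a measured figure for the HC320:

```shell
# Rough streaming-write estimate for one RAIDZ2 vdev:
# parity costs two drives, so (width - 2) drives carry data.
width=6; parity=2
per_disk=150   # MB/s sustained per HDD -- an assumed round number
echo $(( (width - parity) * per_disk ))   # MB/s -> 600
```

Even with a conservative per-disk figure, the estimate lands well above the 100-200MB/s being observed, which is why something else looks wrong here.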

Can you please run the following command against your HDDs to ensure that the disk write caching is enabled:

for n in {a..h}; do sdparm -g WCE /dev/sd$n; done

This will query drives sda through sdh and ask for the "Write Cache Enabled" value.
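To illustrate what to look for, here is a hypothetical sample line in the style of `sdparm` output (the exact formatting is an assumption; the key point is the current value right after the `WCE` field, where 1 means the cache is on):

```shell
# Hypothetical sample line in the style of `sdparm -g WCE` output (assumption):
line='WCE           1  [cha: y, def:  0, sav:  1]'
# The second whitespace-separated field is the current value: 1 = cache enabled.
wce=$(echo "$line" | awk '{print $2}')
echo "$wce"   # -> 1
```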
 
Last edited:

somethingweird · Contributor · Joined: Jan 27, 2022 · Messages: 183
Just wondering: are you running deduplication?
 

OnlyLoveElaina:
Sorry, I'm not very familiar with FreeBSD. When I run the command I get "Permission denied". I think it's because I'm logged in as admin instead of root (I chose to create an admin account when installing TrueNAS). The root account's password seems to be disabled; do I need to enable it and then run su?
Thanks for your patience.
 

HoneyBadger:
I keep forgetting that because of the CORE -> SCALE upgrade on several of my units I operate as root by default.

Run sudo -s first and then the command.

OnlyLoveElaina wrote: "No, both deduplication and atime are disabled. Maybe the 'Synchronize' option has an effect?"
What is the current "sync" setting on the dataset? It should default to "standard", and will only slow writes if set to "always".
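For anyone following along, the distinction between the sync modes can be sketched like this (the dataset name `tank/share` is a placeholder; you would check it with `zfs get sync tank/share` and change it with `zfs set sync=standard tank/share`):

```shell
# Behavior of the three ZFS "sync" property values, as a quick reference.
sync_mode=standard   # placeholder; read the real value with: zfs get sync <dataset>
case "$sync_mode" in
  (standard) echo "honor only application-requested fsync()" ;;           # default
  (always)   echo "force every write to stable storage first (slow)" ;;   # hurts HDDs
  (disabled) echo "ignore sync requests entirely (unsafe)" ;;
esac
```

An async SMB copy issues few sync requests, so with sync=standard the transfer should not be throttled by synchronous writes; sync=always would explain exactly this kind of slowdown.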
 
OnlyLoveElaina:
Thank you, here are the results, I think caching is enabled.
[attachment: 1706960889034.png]

Regarding sync, I haven't changed it, so it's "standard".
 
OnlyLoveElaina:
The magic happened: I tried reinstalling TrueNAS yesterday, and I suddenly got 500MB/s write speeds over SMB…
I think that's a reasonable speed for 4 data drives, although the large RAM didn't produce the results I was hoping for.
BTW, I added three 1TB NVMe SSDs when I reinstalled TrueNAS, and I plan to put the application datasets somewhere faster. Would you recommend creating a RAIDZ1 from these three SSDs for the application datasets, or using them as cache for the HDD array?
In any case, thanks for your help so far.
HoneyBadger: That does show WCE: 1 (Write Cache Enabled is true), so that part should be sorted. We might need to dig into some fio benchmarks to check write performance.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I need to ask the web admin to add the "thinking emoji" so I can give the proper puzzled reaction to the fact that a reinstall made it faster.

500MB/s is much more along the lines of what I would expect to see from that array and configuration. With regards to the RAM, it is possible to increase the amount of "pending dirty data" that's allowed to queue up in RAM, but that comes with the increased risk of that data being lost in the event of an unexpected power loss or component failure.

If you're planning to make use of applications, I'd suggest putting the 3x 1TB SSDs to use there. With 512GB of RAM you should have sufficient read cache, and since your workload involves SMB copies you would not benefit from a SLOG acting as a "safe write cache".
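For a rough sense of what the suggested SSD app pool would offer: a 3-wide RAIDZ1 spends one drive's worth of space on parity, so the usable capacity (before metadata and slop-space overhead) works out to about two drives' worth:

```shell
# Usable space of a 3-wide RAIDZ1 of 1TB SSDs: one drive's worth goes to parity.
drives=3; size_gb=1000
echo $(( (drives - 1) * size_gb ))   # GB usable before overhead -> 2000
```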
 
OnlyLoveElaina:
Maybe I screwed up something in the previous system? But it was also installed less than a week ago, and I didn't change many settings.
In any case, now that the problem is gone, I'm not going to dig into why.
I'll take your advice and create a dedicated application pool with SSDs, thanks for your help.
 