Performance Troubleshooting Recommendations

Status
Not open for further replies.

jclende

Dabbler
Joined
Apr 24, 2017
Messages
14
tldr; in bold

Hello, I'm running a FreeNAS MiniXL for a small company that works primarily on large image files (up to ~12GB). It's a Mac environment, so everyone is on AFP/Netatalk. There are no more than 12 active connections at a time. The pool (tank) is 33% full.

The config is as follows:

FreeNAS-9.10.2-U5 (561f0d7a1)

32GB RAM

8 x 6TB WD Red – RAID-Z2

2 x Micron M600 SSD – L2ARC

10GbE SFP+

After having it in production for a while, I'm getting performance complaints. Running synthetic loads over AFP with AJA Disk Test, I'm seeing slower-than-expected reads: they saturate only about one 1GbE connection's worth of bandwidth. With the RAM, L2ARC and 10GbE, I was expecting the equivalent of at least three 1GbE links saturated (which I am seeing for writes), at least until it runs out of cache. It's as if it's not hitting the ARC or L2ARC at all. I did see a small difference between having one or two SSDs in the L2ARC. When I run dd speed tests with caching in play, I get astronomically fast speeds, as expected.

The network is all Ubiquiti, and I'm running identical network hardware/config elsewhere with a Linux fileserver, and I'm not having any issues.
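
For what it's worth, a raw iperf run between a client and the server would rule the network in or out independently of AFP; a rough sketch below with placeholder addresses, assuming iperf is available on both ends:
Code:
# on the FreeNAS box (server side)
iperf -s

# on a client (substitute the server's real address); 30 seconds, 4 parallel streams
iperf -c 192.168.1.10 -t 30 -P 4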

I am currently considering:

1. Increasing RAM to 64GB.
2. Changing RAID-Z2 config to striped mirrors at the cost of some capacity.


These involve considerable expense or time respectively, so I'd like to get a recommendation on what to try first (or to do something else entirely).

Any pointers on further testing? I know there are a lot of things in play here.

I am relatively new to FreeBSD and ZFS. I have read up a lot on ZFS, but I only have 6-8 months hands-on experience with it. My background is primarily Linux and Apple.

Thanks in advance for any help. I put this in the Build Discussion because it primarily deals with hardware config, but if there's a more appropriate place for it, please let me know and I will repost.
 

m0nkey_

MVP
Joined
Oct 27, 2015
Messages
2,739

jclende

Dabbler
Joined
Apr 24, 2017
Messages
14
Thanks m0nkey_, that is definitely true. I believe they even upgraded to SMB3 since then, but in my experience it is still very buggy, not only when using Samba but even from a Mac server.

That said, if anyone has a Samba config that they have working solidly with macOS, I'd love to see it. I have experimented some with the vfs_fruit settings, but I was still getting abysmal performance and bugginess. Across the board, the main issue with SMB on macOS has been that random files and folders will simply stop appearing in Finder. I've seen this happen on Windows-, Samba- and macOS-hosted shares ever since SMB2 was introduced in Mavericks. It was still happening a few months ago.

Quick novice question about ARC/L2ARC: does it cache whole files or blocks? Is there a file size limit for what will be allowed into the cache?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
It's as if it's not hitting the ARC or L2ARC at all.
Remember that the ARC and L2ARC cache data that is read frequently, and a synthetic test may not exercise that at all; you would be seeing the real read speed of the drives. And if the users are reading 12GB files all the time and they are not the same file, those files will not get cached in advance. Also, 32GB of RAM is pretty light for running an L2ARC.
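
If you want to confirm whether reads are hitting the cache at all, the ARC and L2ARC counters are exposed through sysctl. A rough sketch below; the exact stat names can vary a bit between versions:
Code:
# ARC hit/miss counters
sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses

# L2ARC hit/miss counters
sysctl kstat.zfs.misc.arcstats.l2_hits kstat.zfs.misc.arcstats.l2_misses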

The solution might be a little more difficult. While you could add more RAM (and I think you should regardless), I'm not positive that would solve your issue. I really like @m0nkey_'s suggestion. I also think you should run a test using SMB to see if the data flows as it should; if it doesn't, then more testing is required to find the bottleneck.
 

jclende

Dabbler
Joined
Apr 24, 2017
Messages
14
Thanks joeschmuck, that explains why it wouldn't hit the cache, and also why performance increased in synthetics immediately after I added the 2nd SSD to the L2ARC. I assume it's more likely to hit the L2ARC if there is a bunch of free space on it.

If doubling the RAM is a recommendation regardless, then I'd rather do that first. Reformatting to striped mirrors and restoring from backup will take most of a weekend, so I was hoping to try that second anyway. It's funny that you say 32GB of RAM is light for L2ARC. It's the maximum memory config for the FreeNAS MiniXL, and they're happy to sell you an L2ARC with it.

If the RAM doesn't fix it, I'll switch to mirrors. That will at least increase the raw array speed.

As for SMB, it's on my list of things to try. I haven't found a way to reliably recreate the file visibility issue in macOS, so troubleshooting is difficult. Sometimes it takes over a week to manifest, sometimes it happens in 5 minutes. Samba also has a lot of configuration options which doesn't help. I have Netatalk working respectably on Linux, so unless there is a FreeBSD-specific issue with it, I'm hesitant to place the blame there.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
My troubleshooting advice is:

1) Get some machine other than a Mac and test the throughput of the system. I'd do this even before adding more RAM.
2) If SMB is as fast as you believe the system should be, then you need to figure out the protocol issue. Adding more hardware will not fix it.
3) If SMB is also slow, then you should run an internal throughput test using dd; just search for something like "freenas dd benchmark" and it should get you pretty close. You need to ensure the drive throughput is fast enough. Knowing this could save you the hassle of rebuilding the pool as mirrors.

You cannot just toss stuff at a server and cross your fingers that it will work. I think a FreeNAS Mini is a bit light on CPU horsepower, but until you perform the tests you will not know if it will do what you desire.

and they're happy to sell you an L2ARC with it.
It's a business for profit, you bet they will. If you had the right use case, then 32GB RAM and an L2ARC would be fine; something like lots of small files that everyone uses frequently, or a database that everyone hits all the time. But for pulling the occasional large file, like a movie, it's a waste of resources.

Also, to test SMB you could set up a new dataset just for testing and share a directory via SMB and do all your testing that way without disturbing the AFP stuff. This would get you some good data.
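
From the shell, creating the test dataset is quick (the dataset name here is just an example; the SMB share itself you would still set up in the GUI), and turning compression off keeps any dd numbers honest:
Code:
# create a scratch dataset for SMB testing and disable compression on it
zfs create tank/smbtest
zfs set compression=off tank/smbtest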

And yes, 32GB is the minimum recommended RAM before using an L2ARC, but best practice is to put as much RAM in your system as possible before adding an L2ARC, because RAM is faster. If this is for an office I would expand the RAM to 64GB if possible, but if 32GB is the limit then that is where you are.

Also, do you leave the system running all the time or reboot it often? My advice is to leave it running all the time as this will maintain the L2ARC, assuming it's helping at all in the first place.

So do the testing and write down your results. Hopefully you will find the solution.

Good Luck!
 

c32767a

Patron
Joined
Dec 13, 2012
Messages
371
The config is as follows:

FreeNAS-9.10.2-U5 (561f0d7a1)

Do you have any spare hardware? You may want to try setting up a machine running a FreeBSD 11 kernel before you spend any money.
What you're describing is something we've experienced and there's a significant improvement in 10Gb network performance with the new FreeBSD 11 based FreeNAS kernels.

In FreeNAS 9.10, our default build, which sports an E5-1620 v2 @ 3.70GHz, could barely get out of its own way when serving AFP, SMB or NFS over Intel X520 and SolarFlare 10G cards. The best single-flow throughput we saw was around 1.5-2Gb/s, and multi-flow throughput would top out around 3Gb/s.
With (the version who shall not be named) and FreeNAS 11, we can drive NFS and AFP at line rate on the 10G interfaces.

I can't tell you what the fix is. I haven't dug into the code to see, but there are a lot of bugfixes in the kernel and ethernet drivers for performance issues, so something seems to have been cleaned up since the kernel FreeNAS 9.10 is based on.

I don't have any 10G-connected Macs, but I'm confident you'll see the same performance with AFP as you do with NFS. An example:

Local on the NAS:
Code:
root@nas1:/mnt/vol1/vm # dd if=ubuntu-16.10-server-amd64.iso of=/dev/null bs=1m
668+0 records in
668+0 records out
700448768 bytes transferred in 0.088618 secs (7904157635 bytes/sec)


and from one of our cluster vms:
Code:
root@vm45:/mnt# dd if=ubuntu-16.10-server-amd64.iso of=/dev/null
1368064+0 records in
1368064+0 records out
700448768 bytes (700 MB, 668 MiB) copied, 0.944598 s, 742 MB/s
root@vm45:/mnt#


Not the most scientific test (the file was probably in RAM), but we're interested in network performance, and a read speed of close to 6Gb/s (742 MB/s) over a 10G link is a lot better than 1.5.

Our 1G Macs can drive the AFP server at line rate just fine. (As an aside, we've experimented with SMB instead of AFP for Macs as well and have found it... lacking.)
 
Last edited:

jclende

Dabbler
Joined
Apr 24, 2017
Messages
14
joeschmuck

dd – I disabled compression and ran the dd tests. Write was 570MB/s. The read was insanely fast as before (2.2GB/s), which I assume was because the file was in the ARC, so I rebooted, re-ran it, and saw 533MB/s. Looks like the raw speeds are good.

rebooting – I don't have a scheduled reboot. It's usually just when I run updates.

smb testing – A default SMB share is giving me 20MB/s on reads and writes in macOS... terrible, as was my previous experience with Samba and Apple. I'm sure I could improve this by messing with the config (SMB3 is probably the main culprit), but I don't have time to do that at the moment.

On the one Windows machine they have, I'm seeing normal speeds over SMB. The problem is that a single client can already saturate one 1GbE line, so I'd need multiple Windows machines to properly test Samba against Netatalk, and unfortunately they only have one in the office.

RAM – I am working on sourcing some 16GB ECC UDIMMs so we can go to 64GB since that has been recommended across the board, even if it doesn't completely solve the read performance issue. The officially supported 16GB ECC UDIMMs for the ASRock board might as well not exist, so I'm going to have to get something third-party.


c32767a

I was planning to upgrade to FreeNAS 11 once U2 comes out, but maybe I'll go for U1.

The SFP+ card was in the back of my mind. I'm still waiting to get an official transceiver support list from IXSystems (I pulled the card out and couldn't even figure out what it is). I'm currently using a passive copper cable I already had which appeared to work fine, but who knows. Since the write speeds are good, I'm hesitant to blame the NIC, but I suppose TX traffic could suffer while RX is unaffected if SFP+ isn't happy. Not an issue I've had before though.

Looking at top, I'm seeing afpd eat about 30% of a single thread during a synthetic test. The processor has 8 cores, and the only other process doing much of anything is python which is using about 12% of 2 threads.
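
(For reference, I was watching it with per-thread and per-CPU display, roughly like this; flags may differ slightly by top version:)
Code:
# -S show system processes, -H show threads, -P per-CPU stats, ordered by CPU usage
top -SHP -o cpu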


The guys over at 45 Drives are hitting insane speeds over AFP on FreeNAS (albeit with jumbo frames turned on), so I have hope for Netatalk.

For now, I am going to get the RAM, a verified SFP+ transceiver and see how that goes. After that, I'll try to track down some Windows machines or spin up some Windows VMs in VMWare Fusion and run a better SMB test. If I still can't make sense of it, I'll move to FreeNAS 11 earlier than planned.


Thanks for your help! That should keep me busy for a bit.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Could you post some of the results from your dd tests? Have any example numbers for what slowness means to you? Are all these clients using gigabit ethernet or are some wireless?

When running dd tests on freenas you need to use a dataset that has compression disabled otherwise the results are invalid.

An example dd test would be
Write:
Code:
dd if=/dev/zero of=/mnt/pool/dataset/100G.dat bs=1M count=100000

Read:
Code:
dd of=/dev/null if=/mnt/pool/dataset/100G.dat bs=1M count=100000
 

jclende

Dabbler
Joined
Apr 24, 2017
Messages
14
Could you post some of the results from your dd tests? Have any example numbers for what slowness means to you? Are all these clients using gigabit ethernet or are some wireless?

When running dd tests on freenas you need to use a dataset that has compression disabled otherwise the results are invalid.

An example dd test would be
Write:
Code:
dd if=/dev/zero of=/mnt/pool/dataset/100G.dat bs=1M count=100000

Read:
Code:
dd of=/dev/null if=/mnt/pool/dataset/100G.dat bs=1M count=100000

Everyone is wired. Cat6, no LAGs or anything like that. I disabled compression for the dd tests. In synthetic incompressible loads over AFP, I'm seeing 150-200MB/s reads while getting almost 400MB/s writes. Tests are performed from multiple 1GbE connections simultaneously.

My dd commands on the FreeNAS server looked like this:
Code:
dd if=/dev/zero of=/path/to/share/ddtest bs=1024k count=16000

Code:
dd if=/path/to/share/ddtest of=/dev/null bs=1024k count=16000


I restarted before running the read test to clear the ARC/L2ARC. (Does the L2ARC clear after a reboot?)

The results were 570MB/s write and 533MB/s read, so the problem doesn't seem to be the raw I/O.
 

c32767a

Patron
Joined
Dec 13, 2012
Messages
371
smb testing – A default SMB share is giving me 20MB/s on reads and writes in macOS... terrible, as was my previous experience with Samba and Apple. I'm sure I could improve this by messing with the config (SMB3 is probably the main culprit), but I don't have time to do that at the moment.

The primary culprit for this performance issue is SMB signing. There are other issues with SMB and Macs. Until Apple defaults to it on their boxes, we are staying away.
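
If you do want to experiment, the client-side signing knob lives in /etc/nsmb.conf on the Mac. A sketch from memory below; double-check it against Apple's current guidance for your macOS version:
Code:
# /etc/nsmb.conf on the macOS client (create it if it doesn't exist)
[default]
signing_required=no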

I was planning to upgrade to FreeNAS 11 once U2 comes out, but maybe I'll go for U1.

The SFP+ card was in the back of my mind. I'm still waiting to get an official transceiver support list from IXSystems (I pulled the card out and couldn't even figure out what it is). I'm currently using a passive copper cable I already had which appeared to work fine, but who knows. Since the write speeds are good, I'm hesitant to blame the NIC, but I suppose TX traffic could suffer while RX is unaffected if SFP+ isn't happy. Not an issue I've had before though.

The interface name should show up in dmesg, that should tell you the manufacturer of the card.
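
If it has already scrolled out of dmesg, pciconf will also identify it; a quick sketch:
Code:
# list PCI devices with vendor/device names and look for the 10G NIC entry
pciconf -lv | grep -B1 -A3 -i network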


If you're experiencing the same issue we saw, then it's not the hardware per se. It's the specific combination of kernel and network driver versions that are in FreeNAS. And yes, we were able to write faster than we were able to read. And the performance numbers we observed were nearly identical to what you posted.
We had it narrowed down to buffer handling in the IP stack, but abandoned our search when FreeBSD 11 cleared the problem.

It nearly drove us to abandon FreeNAS for a newer FreeBSD kernel, until FreeNAS 10/11 incorporated the FreeBSD 11 kernel.

Anyway, it's too bad you don't have a way to test 11. It did fix the problem for us.
 

alpaca

Dabbler
Joined
Jul 24, 2014
Messages
24
I am actually running into this as well, on FreeNAS 11. Totally different hardware setup with a Xeon-D and striped SSDs, running over 10GbE, but a similar AFP issue: Mac clients are saturating 10GbE on the write side but getting roughly 1/4 the performance on the read side, in both cases working with large photo or video files. Windows clients via SMB saturate the network in both directions. Definitely smells like something specific to AFP or a tricky network/driver issue.

For comparison's sake, I installed Ubuntu 16.04 on the same exact server and imported the ZFS pool. Compiled the latest netatalk and am able to saturate 10GbE with both reads and writes.

It seems that FreeNAS 11 has netatalk/AFP 3.1.10, and the tested version on Linux is 3.1.11.
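
(For anyone wanting to compare, the installed version is easy to check on either box, assuming afpd is in the PATH:)
Code:
# print the netatalk/afpd version and build options
afpd -V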
 

c32767a

Patron
Joined
Dec 13, 2012
Messages
371
I am actually running into this as well, on FreeNAS 11. Totally different hardware setup with a Xeon-D and striped SSDs, running over 10GbE, but a similar AFP issue: Mac clients are saturating 10GbE on the write side but getting roughly 1/4 the performance on the read side, in both cases working with large photo or video files. Windows clients via SMB saturate the network in both directions. Definitely smells like something specific to AFP or a tricky network/driver issue.

For comparison's sake, I installed Ubuntu 16.04 on the same exact server and imported the ZFS pool. Compiled the latest netatalk and am able to saturate 10GbE with both reads and writes.

It seems that FreeNAS 11 has netatalk/AFP 3.1.10, and the tested version on Linux is 3.1.11.

Your Macs are connected at 10G as well? I only have 3 Macs in my office lab network, and all 3 can read and write at their (1G) wire speed simultaneously to a 10G-connected FreeNAS. None of our end-user systems have 10G, just the clusters, so I don't have any Macs with 10G cards to test.
 
Last edited:

jclende

Dabbler
Joined
Apr 24, 2017
Messages
14
The primary culprit for this performance issue is SMB signing. There are other issues with SMB and Macs. Until Apple defaults to it on their boxes, we are staying away.



The interface name should show up in dmesg, that should tell you the manufacturer of the card.


If you're experiencing the same issue we saw, then it's not the hardware per se. It's the specific combination of kernel and network driver versions that are in FreeNAS. And yes, we were able to write faster than we were able to read. And the performance numbers we observed were nearly identical to what you posted.
We had it narrowed down to buffer handling in the IP stack, but abandoned our search when FreeBSD 11 cleared the problem.

It nearly drove us to abandon FreeNAS for a newer FreeBSD kernel, until FreeNAS 10/11 incorporated the FreeBSD 11 kernel.

Anyway, it's too bad you don't have a way to test 11. It did fix the problem for us.

How long have you been running 11? Any issues at all? I usually wait for 2 revisions before I upgrade any OS.
 

jclende

Dabbler
Joined
Apr 24, 2017
Messages
14
The primary culprit for this performance issue is SMB signing. There are other issues with SMB and Macs. Until Apple defaults to it on their boxes, we are staying away.



The interface name should show up in dmesg, that should tell you the manufacturer of the card.


If you're experiencing the same issue we saw, then it's not the hardware per se. It's the specific combination of kernel and network driver versions that are in FreeNAS. And yes, we were able to write faster than we were able to read. And the performance numbers we observed were nearly identical to what you posted.
We had it narrowed down to buffer handling in the IP stack, but abandoned our search when FreeBSD 11 cleared the problem.

It nearly drove us to abandon FreeNAS for a newer FreeBSD kernel, until FreeNAS 10/11 incorporated the FreeBSD 11 kernel.

Anyway, it's too bad you don't have a way to test 11. It did fix the problem for us.

The SFP+ adapter is a Chelsio T520-SO. I had looked at dmesg before, but didn't see it because I wasn't familiar with the brand.
 

c32767a

Patron
Joined
Dec 13, 2012
Messages
371
How long have you been running 11? Any issues at all? I usually wait for 2 revisions before I upgrade any OS.

So far there are some cosmetic issues that mostly revolve around collectd and stats reporting. The RCs were somewhat unstable with SolarFlare cards, so we're holding back on upgrading our SolarFlare-based boxes until our lab box runs for a while longer. But all of our Intel boxes are upgraded, and key functionality (NFS, AFP, etc.) has worked fine for us since the release.
 
Last edited:

alpaca

Dabbler
Joined
Jul 24, 2014
Messages
24
Your Macs are connected at 10G as well? I only have 3 Macs in my office lab network, and all 3 can read and write at their (1G) wire speed simultaneously to a 10G-connected FreeNAS. None of our end-user systems have 10G, just the clusters, so I don't have any Macs with 10G cards to test.

Yes, the Macs are connected via 10GbE. A single client copying from local NVMe to FreeNAS maxes out 10GbE on writes and gets about 1/4 of that on reads. Even with 3 clients we're still saturating the network on writes at about 875-950 MB/s combined, while the combined read throughput is only about 300 MB/s, with each client seeing 100-125 MB/s.
 

jclende

Dabbler
Joined
Apr 24, 2017
Messages
14
Yes, the Macs are connected via 10GbE. A single client copying from local NVMe to FreeNAS maxes out 10GbE on writes and gets about 1/4 of that on reads. Even with 3 clients we're still saturating the network on writes at about 875-950 MB/s combined, while the combined read throughput is only about 300 MB/s, with each client seeing 100-125 MB/s.

What NIC are you using on the FreeNAS box? Have you looked into firmware/driver updates from the manufacturer? I'm looking into this on my end tomorrow. I solved a performance/stability issue with a Mellanox card a while back by installing the proprietary driver and updating the firmware.
 

alpaca

Dabbler
Joined
Jul 24, 2014
Messages
24
What NIC are you using on the FreeNAS box? Have you looked into firmware/driver updates from the manufacturer? I'm looking into this on my end tomorrow. I solved a performance/stability issue with a Mellanox card a while back by installing the proprietary driver and updating the firmware.
The Xeon-D onboard NICs, which I believe are the Intel X552/X557; I'm going to explore whether this is a driver issue some more too.
 

jclende

Dabbler
Joined
Apr 24, 2017
Messages
14
Hey @alpaca, @c32767a, quick question: are you using flow control on your 10GbE links? I have a different system (Linux) running Netatalk, and I have a catch-22 where read speeds are bottlenecked at 1Gb/s on the 10GbE link when flow control is on, and when flow control is off, the AppleDouble database becomes corrupted and the system eventually hangs (like it's being DDoS'd).
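
(On the Linux box I'm checking and toggling it with ethtool, roughly like this; the interface name is just an example:)
Code:
# show the current pause/flow-control settings
ethtool -a eth0

# disable RX/TX flow control (use 'on' to enable)
ethtool -A eth0 rx off tx off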

Not sure if that has any bearing on this issue, but just curious about your flow control config...

I am doing work on the FreeNAS Mini XL on Thursday (RAM upgrade, NIC firmware updates, compliant transceiver modules, etc), so I'll post about that progress later this week or early next week.
 
Last edited: