Need NFS Advice

Joined
May 2, 2017
Messages
211
Good day,

Lately, I've been having real trouble with copying files to my TrueNAS shares. I've read all about the usual troubleshooting paths (iperf, network cards), but this seems different.

I have Linux Mint mounting NFS shares created on TrueNAS.

Lately, file copies seem to stall and I don't know if it's TrueNAS or Mint causing the problem. If I watch with System Monitor, I can see network transfers occurring, but the progress stops for sometimes long periods. It seems to go in bursts, sometimes short, sometimes long... I might get 10-15 Mbps for 10 seconds or 30, then it will drop to practically nothing for random periods. Sometimes the inactive transfer can last minutes. Eventually, it completes, but just now a 4GB file took 1 1/2 hours to copy to the shared location. It's not the 10-15 Mbps transfer speeds at issue, it's the endless stopping and going of the transfer that makes it take forever.
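To put a number on that: if "4GB" is read as 4 × 10^9 bytes and the copy took 5400 seconds, the average over the whole copy works out to roughly 6 Mbps, well under the 10-15 Mbps seen during the bursts. A minimal sketch of the arithmetic (the function name is just for illustration):

```python
def effective_mbps(size_bytes: float, seconds: float) -> float:
    """Average throughput in megabits per second over the whole copy."""
    return size_bytes * 8 / seconds / 1_000_000


# Assumption: "4GB" is read as 4 * 10**9 bytes, and 1 1/2 hours is 5400 s.
average = effective_mbps(4 * 10**9, 1.5 * 3600)
print(f"average over the whole copy: {average:.1f} Mbps")
```

The gap between the burst rate and this average is the time lost to the stalls.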

There are no errors being reported by anything... TrueNAS, the pfSense firewall, Linux Mint... But this is maddening, and the usability is unbearable.

Do any of you have an idea what could cause this behavior?

Thanks,
Steve
 

no_connection

Patron
Joined
Dec 15, 2013
Messages
480
Are you using the Realtek NIC? Then that is 99.999% your problem.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
Yeah, that'd be typical of a Realtek NIC.

Using a consumer grade mainboard is a bad idea for FreeNAS, and one of the most significant reasons is because they sometimes come saddled with a subpar ethernet controller that came from some Shenzhen back alley market that made a poor quality knockoff of a crappy Realtek chipset. The better quality consumer boards come with a legitimate crappy Realtek chipset. The problem in both cases is that these chipsets are just good enough to pass off as "wired ethernet" on a feature checklist, but generally these are only good enough to do some very casual web browsing from Windows that isn't super-demanding.

By comparison, your NAS may be expected to pass traffic at peak speeds for hours on end, which means that every packet, every interrupt, every interaction with the chipset has to be perfect, all the time, every time.

If the Linux box (or other client) ALSO has a Realtek, this just magnifies the potential issue.

Fortunately, you do not need to spend any money to test the theory. Load up iperf3 on both machines. Run server mode on one, client on the other, and CHECK.
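A rough sketch of that check, for anyone who wants to script it. Run `iperf3 -s` on one machine and `iperf3 -c <server> -J` on the other; `-J` asks iperf3 for JSON output. The helper below pulls the sender and receiver rates out of that JSON. The field names (`end`, `sum_sent`, `sum_received`, `bits_per_second`) are what a TCP run is expected to produce, so treat them as an assumption and check them against your own output. The sample values are made up purely to exercise the function.

```python
import json


def summarize_iperf3(report_text: str) -> dict:
    """Return sender/receiver throughput in Mbps from `iperf3 -c HOST -J` output.

    Assumes the TCP-test layout: end.sum_sent and end.sum_received, each
    with a bits_per_second field. Verify against your iperf3 version.
    """
    end = json.loads(report_text)["end"]
    return {
        "sent_mbps": end["sum_sent"]["bits_per_second"] / 1e6,
        "received_mbps": end["sum_received"]["bits_per_second"] / 1e6,
    }


# Made-up sample, NOT a measurement from this thread.
SAMPLE = (
    '{"end": {"sum_sent": {"bits_per_second": 941000000.0}, '
    '"sum_received": {"bits_per_second": 939000000.0}}}'
)

print(summarize_iperf3(SAMPLE))
```

If the rate is steady and close to line rate in both directions, the wire is probably not where the stalls come from.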
 
Joined
May 2, 2017
Messages
211
I understand the issue with Realtek cards not being great. I'm not seeing how that can be the issue though.

This NAS is for my home, so from a TrueNAS standpoint, it IS only casually web browsing. Every so often I need to copy some large files to the NAS for Plex, like my DVD rips, and this is where the issue happens. If it's the network card, it wouldn't likely only be an issue with NFS shares mapped on the laptops, etc...

This TrueNAS box with that Realtek NIC routinely does backups directly to offsite storage like BackBlaze, and it can happily pump gigs of data up the pipe for days at a time if I do a full backup. It doesn't drop there, so why would it crash receiving a file from a laptop? The common thread seems to be NFS here. My share mappings have had the same NFS options for years in fstab. They are...

nfs rsize=8192,wsize=8192,timeo=14,intr,tcp

As a test, I read up on NFS mapping options and changed a couple shares to...

nfs hard,noatime,timeo=150,retrans=3,rsize=131072,wsize=131072,tcp

After a remount of those shares, the problem seems to have subsided a bit. I just copied numerous files (totaling 70 GB) over and not one lockup. So I'm clearly leaning toward this being an NFS issue, which raises the question... What are good standard options for mounting NFS shares on TrueNAS? I can tweak this further and see if I get any improvement.
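To make the difference between the two option sets easier to see, here is a small sketch that parses both strings and prints what changed. The helper names are made up for illustration. One thing worth knowing when reading the output: per the nfs(5) man page, `timeo` is in tenths of a second, so 14 means 1.4 s and 150 means 15 s.

```python
def parse_mount_options(options: str) -> dict:
    """Split a comma-separated mount option string into a dict.

    Flags without a value (e.g. "tcp", "hard") map to True.
    """
    parsed = {}
    for item in options.split(","):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


def changed_options(old: str, new: str) -> dict:
    """Map each option that differs to an (old_value, new_value) pair.

    None means the option is absent from that set.
    """
    a, b = parse_mount_options(old), parse_mount_options(new)
    keys = sorted(a.keys() | b.keys())
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


OLD = "rsize=8192,wsize=8192,timeo=14,intr,tcp"
NEW = "hard,noatime,timeo=150,retrans=3,rsize=131072,wsize=131072,tcp"

for name, (before, after) in changed_options(OLD, NEW).items():
    print(f"{name}: {before} -> {after}")
```

The changes that stand out are the 16x larger rsize/wsize and the much longer timeout.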
 

no_connection

Patron
Joined
Dec 15, 2013
Messages
480
I don't have any experience with Realtek NICs, but they seem to be very commonly linked to stuff just not working right.
Or I should probably say the driver; the hardware is probably fine enough, since they are used in the millions, on Windows that is.

I'm not saying it can't be NFS, but a lot of people use NFS without problems, compared with the number of reported Realtek issues.
It might be possible to get NFS to a point where it plays nice with Realtek for you, and that would fix the current problem.

Do you have another NIC you can try, to see if it has the same problem?
 
Joined
May 2, 2017
Messages
211
Not right at the moment. I do plan to move this TrueNAS out of its current case and put it into a rack I've installed. It's on the to-do list, so I'll drop a different NIC in when I do that. Until then, I'm dealing with this issue as best I can.
 

ro55_mo

Dabbler
Joined
Feb 3, 2018
Messages
16
Just my two pence. I would recommend getting an Intel NIC for your NAS and PC as other posters have suggested. They are not a great deal of money.

I have dual bonded Intel NICs on both my TrueNAS and PC (yes, I know this does not mean I get 2 Gb/s speeds). The reason I reply is that I am also using Linux Mint with an NFS share. My TrueNAS also has 32GB of RAM (slower than the OP's). I can copy a 7 GB ISO either way in about 65 seconds by my stopwatch.
2021-05-07_21-34.png


2021-05-07_21-37.png
 
Joined
May 2, 2017
Messages
211
Ordered an Intel NIC today. When I get it, I will switch to it...
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
From a TrueNAS standpoint, "it IS only casually web browsing" is not true. Casual web browsing is where you occasionally send some packets back and forth to a web server, and stuff like retries get soaked up unnoticed in the grand scheme of things.

You are saying that when you are copying large files, that's a problem. That is NOT like casually web browsing. That is stressy network use.

I don't know about your laptop, but the one I'm writing this on is blessed with an Atheros ethernet, which has shown itself to be only capable of some moderate fraction of the gigabit it claims to support. So when you have a crappy ethernet chipset on both ends of the connection, and you are trying to stream packets at full speed, it doesn't seem like it should be shocking that any mishandling at all results in a lost packet, which can really ruin your day, performance-wise.

Complex interactions like this are hard to diagnose without getting a sniffer like Wireshark set up on a mirror port, to analyze what is actually going on over the wire. This will let you study the packet flow, and see what happens right before and during a network traffic "burp," which provides significant clues as to what is going on. Unfortunately, this setup (and the analysis) is beyond the capabilities of most forum posters, so we don't go there as a general suggestion. If you want to "see[] how that can be the issue", that's quite probably what you need to do, and that's probably just a starting point, because then you need to understand why the errant behaviour is happening, which could be a missed interrupt, a dropped frame, interrupt coalescing gone bad, a driver error, kernel mbuf allocation oddity, or any number of other likely and unlikely issues, which could really be happening on either end.

Rather, we know that it's quite well correlated that people showing up with Realteks on their FreeNAS who are experiencing mysterious performance problems quite often find them solved through the substitution of a quality ethernet interface, and the Intel CT's are both inexpensive and among the best gigabit available. If nothing else, the upside of going from ~700-800Mbps, which seems common for Realteks when they are working well, to ~930Mbps+ on a decent Intel card, is a good payoff in my opinion.
 
Joined
May 2, 2017
Messages
211
I don't doubt what you say, so I ordered another Intel card to try. When I get around to taking things apart to install and configure it, I'll let you all know what happens.

Thanks...
 
Joined
May 2, 2017
Messages
211
Just as an update: I've been running the new Intel NIC for a month, and it behaves exactly the same way. And the bug I filed fell into the category of "It works for everyone else, so it must be you".

I just had a 3.8 GB file copy stall for two hours, with the TrueNAS disks thrashing about endlessly the whole time. But hey, it works for the developers, so...

 
Joined
May 2, 2017
Messages
211
So I disabled SYNC on the dataset and the problem vanished...

Disks SyncOFF.png


In the above image, the ~75% disk usage is the disk activity with SYNC ON. This was from copying a 3.8 GB file, which stalled and failed after a great deal of time. In the space that follows, I copied two files of 3.5 and 4.5 GB with SYNC OFF; the disk usage doesn't peak above 20%, and the copies took a few minutes.
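For anyone wondering why the sync setting matters so much: with sync enabled, a write is not acknowledged until it is on stable storage, and NFS clients ask for exactly that. The sketch below is only a local analogy, not a reproduction of what NFS and ZFS do. It writes the same data once with an `fsync` after every chunk and once with a single `fsync` at the end, and times both. The chunk size and count are arbitrary values chosen for illustration. The trade-off is that with sync disabled, writes acknowledged in the last few seconds before a crash or power loss can be lost.

```python
import os
import tempfile
import time


def write_chunks(path: str, chunk: bytes, count: int, sync_each: bool) -> float:
    """Write `count` chunks to `path` and return the elapsed seconds.

    sync_each=True forces every chunk to stable storage before the next
    one is written; sync_each=False flushes once at the end.
    """
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(chunk)
            if sync_each:
                f.flush()
                os.fsync(f.fileno())
        f.flush()
        os.fsync(f.fileno())
    return time.perf_counter() - start


chunk = b"\0" * (128 * 1024)  # 128 KiB per write (arbitrary)
count = 200                   # about 25 MiB in total (arbitrary)

with tempfile.TemporaryDirectory() as tmp:
    t_sync = write_chunks(os.path.join(tmp, "sync.bin"), chunk, count, True)
    t_async = write_chunks(os.path.join(tmp, "async.bin"), chunk, count, False)

print(f"fsync per chunk: {t_sync:.3f} s, single fsync: {t_async:.3f} s")
```

How big the gap is depends entirely on the storage underneath; on spinning disks without a fast log device it tends to be large.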
 