Transfer speed dips to zero and then returns to normal - abnormal speed fluctuations

Love4Storage

Dabbler
Joined
Nov 6, 2020
Messages
35
Hi all,

I have found some information about this, but it hasn't solved my problems.

My FreeNAS mainly uses SMB/CIFS, and I transfer files to and from my workstation over 10GBase-T. Downloading doesn't seem to be a problem; however, when I upload large files to my shares, the transfer fluctuates from 300MB/s down to zero. It stays at zero for 20-30 seconds before returning to normal speed.

Just recently, a network error has started popping up telling me that the network is preventing me from copying the file (ERROR: 0x8007003B). If I ignore it or retry, the upload continues, but the error occurs again after the file is uploaded. I searched for this and the advice is to turn off my firewall. I'd prefer not to do that, and I don't think it's the solution I'm looking for.
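
For reference, the code in that dialog can be unpacked as a standard Windows HRESULT. This is a minimal sketch, assuming the usual layout (top bit = failure, bits 16-26 = facility, low 16 bits = error code); the function name is only illustrative. Facility 7 is FACILITY_WIN32 and code 59 is the Win32 error ERROR_UNEXP_NET_ERR ("An unexpected network error occurred"), so the number by itself only says that something went wrong on the network path, not what.

```python
def decode_hresult(hr):
    """Split a 32-bit HRESULT into its failure flag, facility and error code."""
    return {
        "failure": bool((hr >> 31) & 1),   # top bit set means failure
        "facility": (hr >> 16) & 0x7FF,    # 7 = FACILITY_WIN32
        "code": hr & 0xFFFF,               # underlying Win32 error number
    }

# The value reported in the copy dialog.
print(decode_hresult(0x8007003B))
```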

I have used this same build for quite some time, and this problem only started recently. I changed the NICs on my workstation from X540s to X772s and the problem still exists. I don't think it's a problem with my NICs or the NICs on my FreeNAS box. Has anyone else experienced this issue?
  • Motherboard make and model: Supermicro X10SDV-4CTLN2F
  • CPU make and model: Xeon D-1521
  • RAM quantity: 32GB DDR4 ECC REG
  • Hard drives: 2.5" 4TB Seagate x6 in RAID6 and a USB3 16GB Boot drive
  • Hard disk controllers: Onboard SATA3 ports
  • Network cards: Onboard NICs (I believe these are X550s)
Thanks!
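
To put numbers on the stalls described above, one option is to log throughput once per second during an upload and then look for long runs near zero. This is a rough sketch only: the sample list, the 1 MB/s threshold and the 5-second minimum are all made-up assumptions, not measurements from this system.

```python
def find_stalls(samples_mb_s, threshold=1.0, min_len=5):
    """Return (start_index, length) for each run of samples below threshold
    that lasts at least min_len samples (one sample = one second)."""
    stalls = []
    start = None
    for i, rate in enumerate(samples_mb_s):
        if rate < threshold:
            if start is None:
                start = i                      # a stall begins here
        elif start is not None:
            if i - start >= min_len:
                stalls.append((start, i - start))
            start = None                       # the stall has ended
    # Handle a stall that runs to the end of the log.
    if start is not None and len(samples_mb_s) - start >= min_len:
        stalls.append((start, len(samples_mb_s) - start))
    return stalls

# Hypothetical log: 10 s at 300 MB/s, a 25 s stall, then brief recoveries.
log = [300] * 10 + [0] * 25 + [300] * 5 + [0] * 2 + [300] * 3
print(find_stalls(log))
```

Comparing when the stalls start against the disk and network graphs on the NAS may help show which side stops first.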
 

hescominsoon

Patron
Joined
Jul 27, 2016
Messages
456
2.5 inch 4TB Seagates? Which model are those drives? Most drives in the 2.5" form factor above 1TB are SMR. If you have SMR drives, there's your write problem right there. Without more information, though, it's only a WAG.
 

Love4Storage

Regardless of them being SMR or not, this problem didn't exist before.
After a move, I reinstalled FreeNAS, upgraded my computer to a workstation, and now use a direct connection rather than a 10G switch.
I think it has something to do with the Samba settings, or perhaps not enough RAM?
 

hescominsoon

OK, so you reinstalled FreeNAS. Which version were you running before, and which version are you running now? Did you back up your config and reuse it, or start from ground zero on the FreeNAS box?
 

hescominsoon

Also, you said you changed your client, so let's see the hardware specs and which version of Windows you are running there. You also mentioned you went direct 10G to the NAS itself. You have made a ton of changes all at once. One thing I would try immediately is disabling UAC. When network errors come up with Win10, that's the first thing I usually do.
 

Love4Storage

It's not a Windows thing, and probably not UAC.
I'm replicating across a 10G link and the slowdown occurs there too. ZFS replication uses SSH to transfer the files, right?
Could it be the hardware? I'm getting an error message during replication:

(adaX....) READ_FPDMA_QUEUED. ACB: 60 00 88.... CAM status: Uncorrectable parity/CRC error.

I did a search and found that this occurs when the cable or backplane has a problem. But what doesn't make sense is that transfer speeds fluctuate from full speed to zero because of this. I will take a look at the cabling once my replication is finished.
 

hescominsoon

Your OP says: "My FreeNAS mainly uses SMB/CIFS, and I transfer files to and from my workstation over 10GBase-T. Downloading doesn't seem to be a problem; however, when I upload large files to my shares, the transfer fluctuates from 300MB/s down to zero. It stays at zero for 20-30 seconds before returning to normal speed."

The error you gave in the next paragraph is a Windows network transfer error. That's what we had to go on. :)

With your latest update: a CRC error WILL cause this symptom as well. If it is getting checksum errors, the system is doing extra work to try to ensure the data is written correctly, and that will cause the severe performance issues you are seeing. If you are having those kinds of errors, stop replicating, stop transferring data, shut down the machine and check everything. You are at high risk of corrupting your data on the fly with no recourse.
 

Love4Storage

I've found out something in the reports. Drive ada4 is having some big performance issues.

- Disk Busy
- Latency
- Disk Operations

All three of these are considerably different from the other 5 drives. I believe it's a problem with the drive, the backplane or the sata cables.

I'll stop the replication process and update you soon.
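
The kind of comparison described above (one disk standing out from the rest) can be sketched as a simple check against the median. The latency figures below are invented for illustration and are not taken from the actual reports; the 3x factor is likewise just an assumed cut-off.

```python
from statistics import median

def flag_outliers(latency_ms, factor=3.0):
    """Return the disks whose latency exceeds `factor` times the median latency."""
    typical = median(latency_ms.values())
    return sorted(disk for disk, value in latency_ms.items() if value > factor * typical)

# Hypothetical per-disk latencies in milliseconds.
sample = {
    "ada0": 12.0, "ada1": 11.5, "ada2": 13.1,
    "ada3": 12.4, "ada4": 95.0, "ada5": 11.9,
}
print(flag_outliers(sample))
```

A single disk far above the others points at that disk's path (drive, bay, cable or port) rather than at the pool as a whole.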
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Please post the model numbers of the drives.

With only a few exceptions that usually involve SAS, 15mm height, or both, large capacity drives in 2.5" are expected to be SMR, which will manifest itself in a number of ugly ways.
 

Love4Storage

I don't think the drive is the problem. I think it's the cabling or the backplane. I replaced the HDD with an identical drive and am getting the same issue.

My NAS chassis was purchased from China, and the backplane had problems previously (a rear 4-pin connector broke) and was replaced. It's either this or the SATA cables, which are from Supermicro... or the SATA port on the MB, which I doubt even more.
 

JaimieV

Guru
Joined
Oct 12, 2012
Messages
742
The point of asking the model is to identify whether it's SMR - in which case this issue is (depressingly) expected - or not, in which case it may be worth playing with the chassis/cables/etc.
 

JaimieV

I'm afraid not - it's a hardware issue. Nothing to do apart from replace them with CMR drives, although having a good complain to Seagate about how they don't work in a NAS and will they replace them with a more appropriate CMR model is worth a try.
 

Love4Storage

I replicated my data over to 8 x 1.2TB SAS2 drives (HUC101212CSS600) which are definitely CMR. The problem still exists.
The problem doesn't seem to be because of SMR. I even replaced my chassis and am using an LSI SAS3 HBA. The transfer speed when writing to the drives goes to zero every few seconds.

I was using Seagate Barracudas before, but since the switch to HGST 2.5" SAS drives, my whole room is at least 3 degrees warmer because of the spinning disks. There's something to be said for the power savings and lower heat of large-capacity SMR drives.
 

hescominsoon

Are you still getting the network error? Something is still amiss. Now that you have the SMR problem out of the way... you made several changes all at once. Since you reinstalled FreeNAS, you cannot go back there.
The error you mentioned in your OP IS a Windows thing, and one that's widely known about. If you have not disabled UAC, please do so. Also put the switch back into the loop. Let's try those things.

Yes, the drives you are using now are 10k drives; they do heat up a bit. Instead of large-capacity SMR drives, which are NOT compatible with ZFS, use some large-capacity 2TB 7.2k 2.5" SAS drives. Those will be CMR drives.

Also, is your Windows 10 up to date? If you are not running at least 1903, fully updated, you can hit this error. If you are running A/V, try removing it as a test as well. This error can also be caused by a media mismatch. You mentioned you are going direct to the NAS; put the switch back into the mix. Also, have you replaced your 10G cables? Are they fiber transceivers or copper?

Also, if you want us to help, stop immediately saying "no, it's not that".
 

hescominsoon

Something else: go to File Explorer. If it complains about file and printer sharing/network discovery being off, turn it on. Your machine may have defaulted to treating your internal network as a public network; you will need to change it back to private.
 

Love4Storage

hescominsoon said: "The error you mentioned in your OP IS a Windows thing, and one that's widely known about. If you have not disabled UAC, please do so. Also put the switch back into the loop. Let's try those things."
So I made a completely new dataset on the same pool. The only differences are deduplication and user access. I'm getting the expected speeds for reads and writes with this new dataset. I need some more time to figure out what other differences there might be between these datasets. Will share what I learn shortly.

hescominsoon said: "Also, if you want us to help, stop immediately saying 'no, it's not that'."
You're absolutely right. Sorry about that.
 

hescominsoon

Ah, you didn't mention dedup. If you are using dedup you need no less than 128 gigs of RAM, probably even more IMO. 32 simply isn't enough, and I am not sure 128 gigs would be enough either. So I would export the data from that dataset, delete the deduped dataset, recreate it, and not use dedup. Your user access settings should not have any effect on your transfer speeds.
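
As a rough illustration of why dedup is so RAM-hungry, here is a back-of-the-envelope sketch. It assumes the commonly cited rule of thumb of roughly 320 bytes of RAM per dedup table (DDT) entry and one entry per unique block; the 10 TiB of data and the average block sizes are hypothetical, not figures from this pool, and real usage depends on the actual block-size mix.

```python
def ddt_ram_gib(data_tib, avg_block_kib=128, bytes_per_entry=320):
    """Estimate the in-memory dedup table size in GiB for unique data.

    bytes_per_entry=320 is a rule-of-thumb assumption, not a measured value.
    """
    blocks = data_tib * 2**40 / (avg_block_kib * 2**10)  # number of unique blocks
    return blocks * bytes_per_entry / 2**30

# Hypothetical 10 TiB of unique data at different average block sizes.
for block_kib in (128, 64, 16):
    print(f"{block_kib:>3} KiB blocks: ~{ddt_ram_gib(10, block_kib):.0f} GiB of DDT")
```

The smaller the average block, the more entries the table needs, which is why estimates for the same amount of data can differ by several times. When the table does not fit in RAM, each write can require extra disk reads to look up entries, which would be consistent with writes stalling.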
 

Love4Storage

What if I upgraded to 128GB? Would that speed up the transfers? This would save me time, and I have a lot of RAM left over.
Also, does dedup make the transfer speeds dip like this? It would make sense, because I think the speeds were normal before I turned dedup on.

It may be time to return to my SMR drives and just use them without dedup. They performed adequately, with a rare timeout happening once in a blue moon.
 