Fast Local Writes, Fast Network Reads, Slow Network Writes Regardless of Protocol

isomage

Cadet
Joined
Oct 26, 2019
Messages
3
I'm typically pretty good at locating the source of performance issues, but these slow network writes have had me stumped for a while.

-> Local Writes are Very Fast (so it's unlikely to be the disks, controller, or server I/O).
-> Network WRITES are Very Slow (a consistent 11MB/s), Regardless of Network Protocol (scp, NFS, smb, ftp).
-> Network READS are just fine (>90MB/s).

These three data points are very difficult to reconcile IMO. I should add that Network WRITES via the localhost loopback are just fine.

So, it has to be something to do with the network, but what network issue manifests only in writes versus reads irrespective of the network application used?

I'm not expecting a solution, just a critical evaluation of the above details and some direction on where to look next. The usual suspects all seem to be precluded by one of the above data points, and syslog shows nothing relevant (at least at default verbosity).

Thanks in advance for anyone's time and consideration,
-Michael



Scenario Details

Hardware Details:

FreeNAS-11.2-U6 (Issue also was manifest with same hardware on 9.3)

Supermicro X8STi-F w/Xeon E5504 & 12GB ECC RAM
4x Western Digital 2TB "Blue" drives on the integrated Intel ICH10R SATA controller, in RAIDZ2 with lz4

Again, Very Fast Local Writes seem to preclude the above ^^ from having issues or being the culprit.

Network Details:
Using a single integrated Intel 82574L Gigabit Ethernet port. FreeNAS is electing to use the Intel Pro/1000 driver.

Seems like this ^^ has to be the issue, but why would it care, know, or manifest in WRITES and not READS? I would expect a faulty NIC or network issue to manifest in both types of operations (unless it was related to the network application layer, which I've ruled out).
 

Jessep

Patron
Joined
Aug 19, 2018
Messages
379
Are these WD20EZAZ with 256MB cache? If so, they are SMR.
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
I'm only speculating, but an incorrect receive buffer or something related could cause an issue like this. Also quadruple-check the cable and interface settings: force 1Gb full duplex, etc.
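One reason the interface settings are worth a look: 11MB/s works out to roughly 88 Mbit/s, which is about what a 100 Mbit/s link delivers in practice. Since reads exceeded 90MB/s, this is only a hint and not a diagnosis. A minimal Python sketch of that arithmetic follows; the helper names are hypothetical and protocol overhead is ignored.

```python
# Convert an observed transfer rate (MB/s) to Mbit/s and find the slowest
# nominal Ethernet line rate that could carry it. The 11 and 90 MB/s figures
# come from the original post; everything else here is illustrative.

NOMINAL_LINKS_MBIT = {"10BASE-T": 10, "100BASE-TX": 100, "1000BASE-T": 1000}


def to_mbit(mbytes_per_s):
    """MB/s -> Mbit/s, taking 1 byte = 8 bits and ignoring protocol overhead."""
    return mbytes_per_s * 8


def slowest_link_that_fits(mbytes_per_s):
    """Return (link name, fraction of its line rate used), or (None, None)."""
    mbit = to_mbit(mbytes_per_s)
    for name, rate in sorted(NOMINAL_LINKS_MBIT.items(), key=lambda kv: kv[1]):
        if mbit <= rate:
            return name, mbit / rate
    return None, None


if __name__ == "__main__":
    for label, rate in (("network write", 11), ("network read", 90)):
        link, fraction = slowest_link_that_fits(rate)
        print(f"{label}: {rate} MB/s = {to_mbit(rate)} Mbit/s, "
              f"{fraction:.0%} of {link}")
```

The write figure fits inside a 100 Mbit/s line rate while the read figure needs gigabit, so the two directions are behaving like two different links.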
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Have you tested the cable and connection path using iperf? Let's take the storage out of the picture entirely.

Jessep asked: "Are these WD20EZAZ with 256MB cache? If so, they are SMR."

This is another possibility that is worth investigating if the iperf test shows good results.
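For anyone without iperf handy, here is a minimal sketch of the same idea in Python: push bytes over a plain TCP socket and discard them on the far side, so no disk is involved. It is not a substitute for iperf. The function names and chunk size are my own assumptions, and the timing is sender-side only, so kernel buffering skews small transfers.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # bytes per send/recv call; an arbitrary choice


def sink(listener, result):
    """Accept one connection, discard everything received, record the byte count."""
    conn, _ = listener.accept()
    received = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:  # peer closed the connection
                break
            received += len(data)
    result["bytes"] = received


def send_stream(host, port, total_bytes):
    """Send total_bytes of zeros to host:port; return sender-side MB/s (10^6 bytes)."""
    payload = bytes(CHUNK)
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as s:
        while sent < total_bytes:
            n = min(CHUNK, total_bytes - sent)
            s.sendall(payload[:n])
            sent += n
    elapsed = time.perf_counter() - start
    return sent / elapsed / 1e6


def loopback_demo(total_bytes=8 * 1024 * 1024):
    """Run sink and sender on 127.0.0.1; this only exercises the code, not a cable."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    result = {}
    t = threading.Thread(target=sink, args=(listener, result))
    t.start()
    rate = send_stream("127.0.0.1", port, total_bytes)
    t.join()
    listener.close()
    return result["bytes"], rate


if __name__ == "__main__":
    got, rate = loopback_demo()
    print(f"received {got} bytes, sender-side rate {rate:.1f} MB/s")
```

To test the direction that is slow here (writes to the NAS), the sink would run on the NAS and `send_stream` on the client; swapping the roles tests the read direction. A large gap between the two runs points at the network path rather than the pool.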
 

isomage

Cadet
Joined
Oct 26, 2019
Messages
3
Finally had a moment to get back to this again, and found the issue: a partially bad network cable.

I was so focused on a FreeNAS setting that must be wrong that I was reluctant to consider anything else... that is, until you two (kdragon75 and HoneyBadger) weighed in, so thank you very much for responding.

-Michael

Troubleshooting path after your feedback:

I plumbed a second physical network port (the motherboard has two) and got full-speed writes on it. Then, on a hunch, I replaced the network cable on the original port and began to get full write speed on it again as well. So, bad cable (or cold contact... either way, I tossed the suspect cable). Apparently the issue affected only the receive (RX) pins, not the transmit (TX) pins.

When something works for a long while (in this case, runs at the full 1Gbps) and then partially stops working (in this case, suddenly begins *writing* at just 11MB/s while still reading at 90MB/s or more), the last thing you suspect is a physical layer issue because:

A) It worked fine before
&
B) It isn't a complete failure, as you might expect with broken hardware.
 
