Asymmetric bandwidth usage between FreeNAS servers

Status
Not open for further replies.

MMartinez

Cadet
Joined
Feb 15, 2018
Messages
4
Hi,

We have two Supermicro FreeNAS servers with the same hardware (RAM, CPU, NIC). We use one to share NFS, iSCSI, SMB and WebDAV, and the other just to keep replicas. We call them freenas-prod and freenas-bkp.

They run at 10GbE and we are quite satisfied with the performance and with ZFS replication, but a few days ago I realised that something is wrong when I send data from freenas-bkp to freenas-prod.

Let me explain a little more:

If I run a replication in the usual direction, that is, from freenas-prod to freenas-bkp, I see throughput near 800Mbps, which I consider normal. In fact I usually set a limit of 50K on the replication tasks to cap them at 400Mbps and avoid causing problems for our users.

Now, the strange thing is that if I send a replica from freenas-bkp to freenas-prod, then the throughput is only 250Mbps.

I’ve done many tests, with these results:

A) Copy a test file of 20GB using scp
  1. scp from freenas-prod to freenas-bkp: 118MB/s
  2. scp from freenas-bkp to freenas-prod: 33MB/s
  3. scp from client1(10G) to freenas-prod: 185.2MB/s
  4. scp from client1(10G) to freenas-bkp: 274MB/s
  5. scp from freenas-bkp to client1(10G): 118MB/s
  6. scp from freenas-prod to client1(10G): 122MB/s
These tests were all done with the same test file, with lz4 compression enabled on the destination dataset (to reach better transfer rates).

You can see two problems in this test:
  • in (2) the throughput is roughly four times lower than in (1). It is not symmetric.
  • in (3) the throughput is far lower than in (4).
In both cases freenas-prod is the receiver, so freenas-prod performs worse when receiving data.
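To compare the scp figures with the replication graphs, a quick unit check may help. This is only a sketch: it assumes scp's MB/s are decimal megabytes and that the 50K limit means 50,000 KB/s, and the dictionary labels are just my own shorthand for the tests above.

```python
def mbytes_to_mbits(mb_per_s):
    """Convert MB/s to Mbps, assuming decimal units (1 MB = 8 Mbit)."""
    return mb_per_s * 8


# scp results from test (A), in MB/s as reported by scp.
scp_mb_per_s = {
    "(1) prod -> bkp": 118,
    "(2) bkp -> prod": 33,
    "(3) client1 -> prod": 185.2,
    "(4) client1 -> bkp": 274,
    "(5) bkp -> client1": 118,
    "(6) prod -> client1": 122,
}

for label, rate in scp_mb_per_s.items():
    print(f"{label}: {rate} MB/s = {mbytes_to_mbits(rate):.0f} Mbps")

# How asymmetric are the two directions between the servers?
asymmetry = scp_mb_per_s["(1) prod -> bkp"] / scp_mb_per_s["(2) bkp -> prod"]
print(f"(1) is {asymmetry:.1f}x faster than (2)")

# Assumption: the 50K replication limit is expressed in KB/s.
limit_mbps = 50_000 * 8 / 1000
print(f"Replication limit: {limit_mbps:.0f} Mbps")
```

Under those assumptions, 33 MB/s works out to 264 Mbps, which is close to the 250Mbps I see on replications from freenas-bkp to freenas-prod, so scp and replication seem to hit the same ceiling.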

B) Test disks using: dd if=/dev/zero of=test.img bs=1M count=20000
  • On freenas-prod: 3265301760 bytes/sec
  • On freenas-bkp: 3215623724 bytes/sec
  • On client1: 5.5 GB/s
The values are very similar on freenas-prod and freenas-bkp. On client1 the file was written to /tmp, which is a Linux tmpfs.

Test (B) tells me that the disks/mem are working fine.
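As a rough check of that conclusion, the dd figures can be converted to Gbps and compared with the 10GbE line rate. This is a sketch only: the 10 Gbps constant is the nominal link speed, and the variable names are mine.

```python
LINE_RATE_GBPS = 10  # nominal 10GbE link speed


def bytes_per_s_to_gbps(bytes_per_s):
    """Convert bytes/sec to Gbps using decimal units."""
    return bytes_per_s * 8 / 1e9


# dd results from test (B), in bytes/sec as printed by dd.
dd_bytes_per_s = {
    "freenas-prod": 3265301760,
    "freenas-bkp": 3215623724,
}

for host, rate in dd_bytes_per_s.items():
    gbps = bytes_per_s_to_gbps(rate)
    print(f"{host}: {gbps:.2f} Gbps ({gbps / LINE_RATE_GBPS:.1f}x the link speed)")

# Relative difference between the two servers.
values = list(dd_bytes_per_s.values())
diff_pct = abs(values[0] - values[1]) / max(values) * 100
print(f"Difference between servers: {diff_pct:.1f}%")
```

Both servers write well above what the link can deliver and differ by less than 2%, so local writes do not look like the bottleneck. One caveat: if test.img was written to a dataset with lz4 enabled, the zeros compress away almost entirely, so these numbers may say more about memory and CPU than about the disks.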

C) I was using a LACP bond on each FreeNAS server, and I thought the problem could be in one cable or NIC, so I disconnected one interface at a time to force the network traffic to go over one interface (ix0) or the other (ix1). I did this test with all four NICs, with the same results (33MB/s on scp from freenas-bkp to freenas-prod).

This tells me that it is not the cable or the NIC.

D) I wanted to ensure that the network is fine so I did some tests with netcat (nc) and also with iperf.

On freenas-prod: nc -v -l 2222 > /dev/null
On freenas-bkp: dd if=/dev/zero bs=1024K count=10000 | nc -v freenas-prod 2222

Using both iperf and nc I get close to the maximum of the 10GbE NIC, about 9Gbps. So the network itself works fine.
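For anyone who wants to repeat this kind of raw TCP test without nc, the same idea can be sketched in Python: one side discards everything it receives, the other sends zeros and times it. Everything here is an assumption for illustration (the function names, the 1 MiB chunk size, and running over loopback so the example is self-contained); between two real hosts the sink and the sender would run on different machines.

```python
import socket
import threading
import time

CHUNK = 1024 * 1024  # 1 MiB per send, similar to dd bs=1024K


def sink(server, result):
    """Accept one connection and discard the data, like `nc -l > /dev/null`."""
    conn, _ = server.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:  # sender closed the connection
                break
            total += len(data)
    result["bytes"] = total


def send_zeros(host, port, count):
    """Send `count` chunks of zeros and return the elapsed seconds."""
    payload = bytes(CHUNK)
    start = time.perf_counter()
    with socket.create_connection((host, port)) as s:
        for _ in range(count):
            s.sendall(payload)
    return time.perf_counter() - start


def run_loopback_test(count=64):
    """Run sink and sender on 127.0.0.1; return (bytes received, seconds)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {}
    receiver = threading.Thread(target=sink, args=(server, result))
    receiver.start()
    elapsed = send_zeros("127.0.0.1", port, count)
    receiver.join()
    server.close()
    return result["bytes"], elapsed


received, elapsed = run_loopback_test()
print(f"{received * 8 / elapsed / 1e9:.2f} Gbps over loopback")
```

The timing is taken on the sender side, so it can slightly overestimate the rate. Like nc, this measures only TCP, with no ssh encryption and no disk involved, which is why it is useful for separating the network from the rest.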

E) I was using two stacked switches, so I thought it could be the stack. I then disconnected one 10GbE NIC on each FreeNAS server at the same time, to ensure the communication was using only one switch. I did this test with both switches. Same results (33MB/s).

This tells me that it is not the switch stack.

F) As I’ve said, I was using a LACP bond on freenas-prod and freenas-bkp. I’ve removed the LAGG and now I’m only using one 10GbE interface on freenas-prod (the one I consider is failing). But the results are the same (33MB/s).

So, it is not the LAGG.

G) I’ve moved freenas-prod to another 10G switch, with the same results.

I can’t stop freenas-bkp until next week to remove the LAGG and connect it to the other 10G switch, but I think the problem is not related to the switch or the LAGG.

I will also do a test using jumbo frames. I believe they could improve the results a bit in general, but they shouldn’t be the cause of slow throughput only from freenas-bkp to freenas-prod.

Maybe there is something wrong with my configuration, but I can’t find anything strange in it.

Can you help me? Any idea is welcome!!

Regards,

Manuel Martínez
 

MMartinez

Cadet
Joined
Feb 15, 2018
Messages
4
More information:

I've recently updated freenas-prod to 11.1. Freenas-bkp is still on 9.10.

At the very beginning I thought it was a problem related to 11.1 but, looking at the stats graphs, I saw that 3 months ago I did a replica from freenas-bkp to freenas-prod and it showed the same 250Mbps limit. In fact I documented it at the time as slow performance, but I thought freenas-prod could be very busy and I didn't pay attention to it.

Regards,

Manuel Martínez
 