LAGG with LACP

Status
Not open for further replies.

freebsdsa

Dabbler
Joined
Feb 26, 2014
Messages
29
I am using FreeNAS 9.2.1.2.

I have the Intel x520-DA2 card in the server. I was trying to get LACP working on it. When I set up the lagg interface in FreeNAS, LACP appears to negotiate correctly. The switch side shows that the trunk is active, so the LACP protocol seems to be working, but when I try to pass traffic (with either tagged or untagged VLANs) it will not work. If I remove the interfaces from the lagg and assign the IP to an individual interface it works great (for both tagged and untagged VLANs).
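For reference, the rough CLI equivalent of what the GUI is configuring looks like this (illustration only; FreeNAS builds the interfaces itself from its config database, and the VLAN ID here is just an example):

ifconfig lagg0 create
ifconfig lagg0 laggproto lacp laggport ix0 laggport ix1
ifconfig lagg0 inet 10.7.10.70 netmask 255.255.255.0 up
# example of a tagged VLAN on top of the lagg (VLAN 10 is illustrative)
ifconfig vlan10 create
ifconfig vlan10 vlan 10 vlandev lagg0 up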

Here is the ifconfig for the lagg0 interface:

[root@nas1] ~# ifconfig -v lagg0
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO>
ether 90:e2:ba:58:c0:1c
inet 10.7.10.70 netmask 0xffffff00 broadcast 10.7.10.255
nd6 options=9<PERFORMNUD,IFDISABLED>
media: Ethernet autoselect
status: active
groups: lagg
laggproto lacp lagghash l2,l3,l4
lag id: [(0000,00-00-00-00-00-00,0000,0000,0000),
(0000,00-00-00-00-00-00,0000,0000,0000)]
laggport: ix1 flags=18<COLLECTING,DISTRIBUTING> state=3D
[(8000,90-E2-BA-58-C0-1C,0160,8000,0008),
(0001,CC-4E-24-14-BA-F0,2712,0001,0182)]
laggport: ix0 flags=18<COLLECTING,DISTRIBUTING> state=3D
[(8000,90-E2-BA-58-C0-1C,0160,8000,0007),
(0001,CC-4E-24-14-BA-F0,2712,0001,0082)]



Here is the trunk status from my switch (Brocade):

Ports        1/3/2     2/3/2
Link_Status  active    active
port_state   Forward   Forward
LACP_Status  ready     ready

dmesg data for adapter:
ix0: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 2.5.15> port 0xf020-0xf03f mem 0xfb580000-0xfb5fffff,0xfb604000-0xfb607fff irq 58 at device 0.0 on pci132
ix0: Using MSIX interrupts with 9 vectors
ix0: Ethernet address: 90:e2:ba:58:c0:1c
ix0: PCI Express Bus: Speed 5.0GT/s Width x8


I currently have LAGG in failover mode just so I have connection redundancy, but I would prefer to get LACP working and passing traffic. Thanks,
 

Rand

Guru
Joined
Dec 30, 2013
Messages
906
I had trouble with LAG/LACP on 9.2.1.2 as well (it worked fine before). I assumed it was due to the upgrade, but maybe there is another issue at hand here...
 

freebsdsa

Dabbler
Joined
Feb 26, 2014
Messages
29
I had trouble with LAG/LACP on 9.2.1.2 as well (it worked fine before). I assumed it was due to the upgrade, but maybe there is another issue at hand here...


What version did you upgrade from? Was it still in the 9.2.x versions or older? Just wondering if it might pay to downgrade to get it working for now.
 

Rand

Guru
Joined
Dec 30, 2013
Messages
906
Well, if you don't have a pool of data (pun intended) then go back to 9.2.0, as that seems to be the most reliable atm.
I went with every update from 9.1 and I *think* the LACP issue occurred after 9.2.1; I had some other (possibly) unrelated issues so I am not entirely sure.
 

freebsdsa

Dabbler
Joined
Feb 26, 2014
Messages
29
There are a lot of bug fixes between 9.2.0 and 9.2.1.2 (and at least one new bug maybe :cool: ). I may hold off to see what others have to say before I make that move, but I will keep the option open to downgrade to that version. Thanks.
 

dellpe

Cadet
Joined
Dec 17, 2013
Messages
4
I also have this problem.
FreeNAS 8.0.4
- Intel Gigabit ET Dual Port NIC
- lagg0 configured as LACP
- Onboard Marvel gigabit nic for admin

Server - Windows
- Intel Gigabit ET Dual Port NIC
- Teamed and configured for LACP
- Onboard Intel gigabit nic for Admin

Switch
- Dedicated Huawei S5700-28C-EI
- Two pairs currently configured as LACP ports
- One port vlaned for admin, rest vlaned for dedicated network for iSCSI

and my lagg0 is not active. Have you already solved this problem?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
LOL.. 8.0.4. I'd upgrade to 9.2+ before even engaging in a conversation in the forums. Much of what is offered as solutions isn't going to work on a 2-year-old build of FreeNAS that isn't even based on the same FreeBSD version.
 

freebsdsa

Dabbler
Joined
Feb 26, 2014
Messages
29
I still have not got this resolved. As a matter of fact, using iperf, scp, and NFS I am seeing low transfer numbers. The throughput is abysmal. I checked the pool performance and it is solid, just network is bad.
 

freebsdsa

Dabbler
Joined
Feb 26, 2014
Messages
29
I still have not got this resolved. As a matter of fact, using iperf, scp, and NFS I am seeing low transfer numbers. The throughput is abysmal. I checked the pool performance and it is solid, just network is bad.


To illustrate my network performance on the 10G link, I set up an MFS mount on my two identical FreeNAS servers and copied an ISO server to server. From MFS mount to MFS mount I get 67.8MB/s:

PCBSD10.0-RELEASE-x64-DVD-USB-latest.iso 100% 3659MB 67.8MB/s 00:54

Here is how I set up the MFS on both servers:

cd /mnt
mkdir test
mdmfs -s 5g md /mnt/test/
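
The copy itself was a straight scp of the ISO between the MFS mounts, roughly like this (the destination hostname here is illustrative):

scp /mnt/test/PCBSD10.0-RELEASE-x64-DVD-USB-latest.iso root@nas2:/mnt/test/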

Not seeing errors so I know it is a clean link:

Name   Mtu   Network    Address            Ipkts    Ierrs Idrop Opkts    Oerrs Coll
ix0    1500  <Link#7>   90:e2:ba:58:bf:90  8740548  0     0     8715399  0     0
ix1    1500  <Link#8>   90:e2:ba:58:bf:90  28119    0     0     1        0     0
lagg0  1500  <Link#11>  90:e2:ba:58:bf:90  8740567  0     0     8715407  0     0
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I just noticed you've never included your hardware specs.. care to share?
 

freebsdsa

Dabbler
Joined
Feb 26, 2014
Messages
29
Good point cyberjock-

I got a pair of servers from iXsystems (36-bay hot-swap SAS/SATA drive bay servers):

CPU: Dual Socket E5-2620v2 (2.10GHz)
RAM: 128GB (8x16GB DDR3 1600MHz ECC/REG)
FreeNAS: 9.2.1.2
Disk Controller: LSI 9207-8I
Spinning Iron: 14 x Seagate Constellation ES.3 2TB Disks SAS (ST2000NM0023)
SSDs: 2 x Seagate Pro 600 240GB (ST240FP0021)
Boot disk: InnoLite 32GB SATA drive
Network: Intel x520-DA2 10Gb using twin-axial cables
Switches: Pair of Brocade icx6610's stacked

Pool:

pool: storage
state: ONLINE
scan: scrub repaired 0 in 0h4m with 0 errors on Sun Mar 23 00:04:07 2014
config:

NAME                                            STATE     READ WRITE CKSUM
storage                                         ONLINE       0     0     0
  mirror-0                                      ONLINE       0     0     0
    gptid/774213f4-ab0d-11e3-afc5-002590e325aa  ONLINE       0     0     0
    gptid/7792e245-ab0d-11e3-afc5-002590e325aa  ONLINE       0     0     0
  mirror-1                                      ONLINE       0     0     0
    gptid/77e805b2-ab0d-11e3-afc5-002590e325aa  ONLINE       0     0     0
    gptid/7837dcb3-ab0d-11e3-afc5-002590e325aa  ONLINE       0     0     0
  mirror-2                                      ONLINE       0     0     0
    gptid/788b6464-ab0d-11e3-afc5-002590e325aa  ONLINE       0     0     0
    gptid/78dd078e-ab0d-11e3-afc5-002590e325aa  ONLINE       0     0     0
  mirror-3                                      ONLINE       0     0     0
    gptid/79373a52-ab0d-11e3-afc5-002590e325aa  ONLINE       0     0     0
    gptid/7992ea56-ab0d-11e3-afc5-002590e325aa  ONLINE       0     0     0
  mirror-4                                      ONLINE       0     0     0
    gptid/79e80e64-ab0d-11e3-afc5-002590e325aa  ONLINE       0     0     0
    gptid/7a3885a5-ab0d-11e3-afc5-002590e325aa  ONLINE       0     0     0
  mirror-5                                      ONLINE       0     0     0
    gptid/7a8ef322-ab0d-11e3-afc5-002590e325aa  ONLINE       0     0     0
    gptid/7ae3211f-ab0d-11e3-afc5-002590e325aa  ONLINE       0     0     0
  mirror-7                                      ONLINE       0     0     0
    gptid/9552dd28-ae16-11e3-afc5-002590e325aa  ONLINE       0     0     0
    gptid/95d3f560-ae16-11e3-afc5-002590e325aa  ONLINE       0     0     0
logs
  mirror-6                                      ONLINE       0     0     0
    gpt/slog0                                   ONLINE       0     0     0
    gpt/slog1                                   ONLINE       0     0     0
cache
  gpt/cache0                                    ONLINE       0     0     0
  gpt/cache1                                    ONLINE       0     0     0


I partitioned the SSDs to allow both SLOG and CACHE on them, plus left some unprovisioned space for wear leveling.

Here is how I set up the SSDs:

gpart create -s gpt da28
gpart create -s gpt da29
gpart add -t freebsd-zfs -b 2048 -a 4k -l slog0 -s 20G da28
gpart add -t freebsd-zfs -b 2048 -a 4k -l slog1 -s 20G da29
gpart add -t freebsd-zfs -a 4k -l cache0 -s 180G da28
gpart add -t freebsd-zfs -a 4k -l cache1 -s 180G da29
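
For completeness, attaching those partitions to the pool takes the general form below (not pasted from my shell history, just the shape of the commands that produce the logs/cache layout shown above):

zpool add storage log mirror gpt/slog0 gpt/slog1
zpool add storage cache gpt/cache0 gpt/cache1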
 

freebsdsa

Dabbler
Joined
Feb 26, 2014
Messages
29
Disk performance testing:

dd RandomWrite: dd if=/dev/random bs=4k count=2m of=./test.bin (36,603,560 bytes/sec)
dd Read: dd if=./test.bin bs=4k count=2m of=/dev/null (830,981,308 bytes/sec)
dd ReadWrite: dd if=./test.bin bs=4k count=2m of=./test2.bin (194,108,830 bytes/sec)
dd ZeroWrite: dd if=/dev/zero bs=4k count=2m of=./test3.bin (249,941,389 bytes/sec)
dd RandomWrite: dd if=/dev/random bs=64k count=128k of=./test.bin (40,440,159 bytes/sec)
dd Read: dd if=./test.bin bs=64k count=128k of=/dev/null (4,075,562,531 bytes/sec)
dd ReadWrite: dd if=./test.bin bs=64k count=128k of=./test2.bin (339,165,605 bytes/sec)
dd ZeroWrite: dd if=/dev/zero bs=64k count=128k of=./test3.bin (617,997,785 bytes/sec)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Well, doing the SLOG and L2ARC on the same SSD makes the SSD suck at both jobs. The L2ARC will only push data to a drive when the drive has been idle for so long... and your SLOG is going to make sure the drive is almost never idle. So I'd definitely bail on that design. I kind of wondered what kind of weird setup you had going, because FreeNAS doesn't give drives names like gpt/slog and gpt/cache. There's a reason why you aren't offered the option to split an SSD into an SLOG and an L2ARC: it's not a very smart idea. I have no idea why you chose a block size of 2048 bytes for the SLOG, but 4k is the smallest block you can do, so you might be really screwing yourself there. I'd get rid of the L2ARC and SLOG and do separate devices (and set them up from the WebGUI and not the CLI).

Of course, that doesn't solve your networking problem. I'll say this about networking problems with LAGG/LACP though: 99% of the time it's an issue that isn't with FreeNAS but with the user. Often the hardware isn't actually compatible with LAGG/LACP, the settings aren't set up to match between FreeNAS and the switch, the switch firmware hasn't been updated, etc.

I have no idea what the intention of your server is, but if you are planning to run VM storage on it you are better off with iSCSI and MPIO (which means you can ditch LACP/LAGG and avoid this whole problem).

I would have expected your last test to be faster. I do about 1GB/sec with a much less ideal design than yours. You also invalidated many of your tests because your test size wasn't big enough. It has to be bigger than your system RAM or the system will cache the whole thing, giving absurdly high numbers... dd read at 4GB/sec gives that one away. Also, using /dev/random is pointless because it's only able to generate about 80MB/sec or so best case (which you aren't getting), so you aren't testing your pool, just the speed of random data creation (big deal). Small block sizes also don't really reflect reality, so 4k tests are pointless. For a system with 128GB of RAM I wouldn't be doing any write or read test that wasn't at least 250GB. And I'm not sure what you are hoping to prove with the dd tests, but raw throughput isn't what mirrors are best at. It can get very high, but it isn't necessarily their strong point.
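
Something along these lines would at least get you past the ARC (sizes and path are just an example for a 128GB RAM box, assuming the pool is mounted at /mnt/storage and compression is off so the zeros aren't compressed away):

# write and then read back ~256GB so ARC caching can't hide the pool speed
dd if=/dev/zero of=/mnt/storage/test.bin bs=1M count=262144
dd if=/mnt/storage/test.bin of=/dev/null bs=1M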
 

freebsdsa

Dabbler
Joined
Feb 26, 2014
Messages
29
Cyberjock, thanks for your points on all the other items. I shared my setup as that seems to be the expected norm, but I am going to stick to the topic of the post. Once I resolve the network performance bits, if I have issues related to the rest of the setup, I will use your post as a good reference point to start looking at it.

For LACP, the config is pretty simple on the switch side, and this is not the only LACP setup I have on it. If you go back to my original post, you will notice I pointed out that the LACP part was working correctly. The switch reported that the links were active and working correctly for LACP. If you look at the lagg0 output on FreeNAS, it also appears to me that it is reporting LACP as working correctly (thinking back, I do not recall if I have ever done LAGG on FreeBSD; I have only done it with Solaris and Linux, so I leave it to others to judge whether I interpreted the ifconfig output correctly on the server side). When I tried to pass traffic on the interface, either tagged or untagged, it did not work. If I dropped the LACP config on both sides but kept the tagged or untagged settings the same, it worked. If the best advice I am going to get is that it is not FreeNAS, then I guess it is time to just install vanilla FreeBSD and go down that path where I have better control of the network config and settings and get away from the web GUI, so I know how it is all set up and can better play with it.

To add one other note, someone else I have been talking to mentioned that I might be having issues because I am using twin-axial connections with cables provided by the switch vendor. In theory it should not be an issue, as the cables meet the specs quoted by Intel. I am in the process of acquiring an Intel SFP+ for the network adapter and an SFP+ from the switch vendor, and will try a fiber connection instead. This was more in reference to speed vs. traffic actually passing over the aggregated link. I hope to find out later this week if the speed part is resolved.

Thanks,
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
For LACP, the config is pretty simple on the switch side, and this is not the only LACP setup I have on it. If you go back to my original post, you will notice I pointed out that the LACP part was working correctly. The switch reported that the links were active and working correctly for LACP.

I'm sorry, but simple doesn't make it idiot-proof. Plenty of people have forgotten a checkbox on some other obscure page of their network switch and that was the root of their problems. Plenty of people have claimed it was "working correctly" before, and my answer is "if it was working correctly you wouldn't be here... so stop that". I also have no clue what the requirements are for the switch to report that things are active and correct. Some switches are so bad at identifying "correct" that a link to any NIC (even from another computer!) was enough for some hardware to report everything as active and correct. So I'm sorry, but I'm still a bit hesitant to jump to the conclusion that your settings or hardware aren't to blame.

This may sound rough, but LACP and LAGG aren't for people who are newbs to networking. I'm not saying you are or aren't in that category (I don't know). But statistically speaking, network configuration errors are far more common than people realize. Also, a lot of people are used to other OSes that would sometimes do what you want rather than what the settings actually say. Plenty of people put 2 NICs on the same LAN, give them independent IPs, and expect the behavior they would get with Windows. FreeBSD will let you do things that are completely idiotic (whereas Windows will often reconfigure itself to do what it thinks you are wanting to do).
 

freebsdsa

Dabbler
Joined
Feb 26, 2014
Messages
29
When I say it is simple, I mean it is simple. A quick Google search of "brocade icx6610 lacp" and I find this as the third result. I am telling you I had the config set up correctly. You are right, you do not know my background at all. I have been a Network Engineer for over 13 years and I have been working on Brocade/Foundry hardware for the past 7 years. I know how to set up LACP. It is working between my Cisco 6509 and my Brocade with no problem. As I mentioned before, I have other LACP groups set up that work, and the switch reports the same operational status for them. So I can confidently say that LACP was set up and working correctly from the switch's point of view.

So, to add to my testing: I booted a Live ISO of CentOS 6 and ran iperf between it and another server (no LAGG). It reported an average of about 4Gbps across multiple tests. When I use plain vanilla FreeBSD v10 (I did not have a 9.2 ISO at the time), the best I get is 3Gbps (no LAGG), so that is not that far off. With FreeNAS I was only able to get 1.6Gbps (LAGG in failover mode), but I had to disable LRO on the interfaces; with LRO enabled the best I get is 1.19Gbps. I know FreeBSD v10 != FreeNAS 9.2.1.2, so I am downloading an ISO of FreeBSD 9.2 and will install tomorrow to get a more apples-to-apples test. I ran out of time today to set up LAGG in either failover or LACP mode on v10, but will just wait to do it once 9.2 is installed instead.
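
For reference, the test itself was nothing fancy, roughly this (the IP is the one from my original post, and a 20-second run):

ifconfig ix0 -lro
iperf -s                    # on the FreeNAS/FreeBSD box
iperf -c 10.7.10.70 -t 20   # from the other server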

Looking back at the config history for my switch, this is what the settings looked like when I was trying to get LACP working (just for your references).

interface ethernet 1/3/2
load-interval 30
speed-duplex 10G-full
link-aggregate configure key 10001
link-aggregate active
!
interface ethernet 2/3/2
load-interval 30
speed-duplex 10G-full
link-aggregate configure key 10001
link-aggregate active

load-interval 30: used to compute load statistics, including the input rate in bits and packets per second, the output rate in bits and packets per second, the load, and reliability. I set it to 30 seconds.
speed-duplex 10G-full: well that speaks for itself.
link-aggregate configure key 10001: this sets the group membership of the port.
link-aggregate active: this sets LACP to be in active mode vs. passive or off.

As I said, it is not that hard to set up (the great thing about enterprise network switches).

I will follow up tomorrow after I get 9.2 installed in place of v10.

Don't take my replies as flippant or as a sign that I am ignoring you. I respect the answers you are giving me, as they make the most sense most of the time. I am not a "noob" to networking, as I have made clear. I am rusty on the FreeBSD side, as I have spent more recent time working with CentOS 5/6 and Solaris 10. However, my first experience with "unix" was with FreeBSD 2.2.0 and BSDi. At my first job as a network guy we managed and used BSDi for our DNS servers, network proxy, and a TFTP server with all our router/switch configs and code. Most of the engineers used FreeBSD as their desktop OS, so I did the same thing. I have been using it off and on since then (just mostly off more recently).

Thanks,
 

freebsdsa

Dabbler
Joined
Feb 26, 2014
Messages
29
So I have vanilla FreeBSD 9.2 installed on the server right now. I set up LAGG and LACP. I attached a screenshot of my rc.conf with the LACP setup, since I had no other way to access the box but via IPMI. I talked to my sales engineer from Brocade and he confirmed that the switch side is set up correctly. He also confirmed that the switch sees the LACP as up and working and that it should be passing traffic, but I cannot ping off the box. I ran out of free time today to really dig into any other troubleshooting as to what could be happening, but it is not working as expected. I may give up on the LACP bits and stick to failover mode.
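
For anyone who can't open the attachment, a plain FreeBSD 9.2 rc.conf for an LACP lagg looks roughly like the lines below (the address is the one from my lagg0 output earlier; this is the general form, not an exact copy of the screenshot):

ifconfig_ix0="up"
ifconfig_ix1="up"
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport ix0 laggport ix1 10.7.10.70 netmask 255.255.255.0"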

I still have slow network performance. This is the best I can get with iperf server to server, same hardware on the same switch with one box running FreeNAS 9.2.1.2 and the other running FreeBSD 9.2.

[ 4] 0.0-20.0 sec 8.22 GBytes 3.53 Gbits/sec
[ 5] 0.0-20.0 sec 7.29 GBytes 3.13 Gbits/sec
[ 4] 0.0-20.0 sec 8.86 GBytes 3.80 Gbits/sec

[ 5] 0.0-20.0 sec 8.00 GBytes 3.43 Gbits/sec

I know FreeBSD should get better results than this; I am just at a loss on where to go from here.
 

Attachments

  • config.png (40.7 KB)

Rand

Guru
Joined
Dec 30, 2013
Messages
906
What's your speed without LACP?
4Gbit is slow even for a single 10G line... I'd try to get a single connection up first and then tackle LAGG in a second phase.
 

freebsdsa

Dabbler
Joined
Feb 26, 2014
Messages
29
Oh no, that speed is my performance on a single interface. LACP does not work; I have tested both a single interface and LAGG set up in failover mode, and they both offer the same network performance. I just ran 12 scp's from one server to the other, pushing a DVD image to /dev/null on the other server, and it maxed out at 4.2Gbps. I also fired up 12 iperf servers running on different ports, then fired off 12 clients, each to a different server port, and it maxed out at 3.5Gbps combined. I have done some tuning of the sysctl network options, based on what I know and this site here. This is what I have set up:

kern.ipc.maxsockbuf=16777216
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.mssdflt=1448
net.inet.tcp.experimental.initcwnd10=1
net.inet.tcp.sendbuf_inc=262144
net.inet.tcp.recvbuf_inc=262144

But the results seem to have been pretty much the same with and without the tuning.
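
In case it matters, the parallel iperf run was just a loop over ports, something like this (run under /bin/sh; the port range and target IP are illustrative):

# on the receiving box: one iperf server per port
for p in $(seq 5001 5012); do iperf -s -p $p & done
# on the sending box: one client per port
for p in $(seq 5001 5012); do iperf -c 10.7.10.70 -p $p -t 20 & done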
 

Rand

Guru
Joined
Dec 30, 2013
Messages
906
Your card isn't located in a PCIe 1.0 x4 slot, is it? Just to be sure ;)
You need PCIe 1.0 x8 or 2.0 x4 or higher to get ~10Gbit of transfer.
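
Rough numbers, in case it helps (per-lane rates are approximate, after encoding overhead):

PCIe 1.0: ~250 MB/s per lane -> x8 gives ~2 GB/s (~16 Gbit/s)
PCIe 2.0: ~500 MB/s per lane -> x4 gives ~2 GB/s (~16 Gbit/s)
10GbE needs ~1.25 GB/s, so either of those is enough; PCIe 1.0 x4 (~1 GB/s) is not.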
 