nVidia MCP55 Network Cards

Status
Not open for further replies.

fordero

Cadet
Joined
Dec 20, 2017
Messages
1
Hi - I'm really hoping someone may be able to shed some light on a network issue I'm facing with my FreeNAS-9.10.2-U2 (e1497f2) server. It is a Dell server with two nVidia MCP55 network cards, and it is the datastore for my ESXi infrastructure.

The cards are configured as follows:

nfe0 - 192.168.0.21 - Management Network
nfe1 - 192.168.10.21 - Data Network (NFS)

The issue I am facing is that during periods of large data transfers the nfe1 network card stops responding. The server can still ping its own address, 192.168.10.21 (I assume that is handled internally), but it cannot reach any other machine on the network. Likewise, no other machine on the .10.x network can ping 192.168.10.21.

If I restart the netif service, nfe1 comes back to life. Interestingly, running tcpdump -i nfe1 during the drop still shows incoming traffic such as ARP requests and NFS requests, yet to the ESXi hosts, vCenter and every other machine on the network the interface appears to be down.
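
As a stopgap, the manual restart described above could be automated. The following is only a rough sketch and is untested on FreeNAS 9.10: it assumes there is some always-on peer on the data network to ping (its address is passed as an argument and is not taken from this thread), and that `service netif restart` is the restart meant here. The function names `peer_reachable` and `check_link` are made up for illustration.

```shell
#!/bin/sh
# Watchdog sketch: restart networking when a peer on the data network stops answering.
# Hypothetical helper names; the peer address is passed as the first argument.

# Succeeds (exit 0) if the peer answers at least one of three pings.
peer_reachable() {
    ping -c 3 "$1" >/dev/null 2>&1
}

# Prints "ok" or "restart". $1 is the probe command (so it can be swapped out), $2 the peer.
check_link() {
    if "$1" "$2"; then
        echo ok
    else
        echo restart
    fi
}

# Only act when a peer address is given on the command line.
if [ "$#" -ge 1 ]; then
    if [ "$(check_link peer_reachable "$1")" = restart ]; then
        logger -t nfe-watchdog "peer $1 unreachable, restarting netif"
        service netif restart
    fi
fi
```

Run from cron every minute or so with the peer address as the only argument. This papers over the hang rather than fixing it, so NFS clients would still see a stall each time it fires.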

At this point I can only assume that the NIC / Driver is crashing. The net result is that I've lost VMs on some occasions.

Has anyone seen this issue before, or know of a way to limit the traffic on the card to prevent the crash? I've also read in multiple places that these cards should be using the forcedeth driver. Is that right?

This is the dmesg output:

nfe0: <NVIDIA nForce MCP55 Networking Adapter> port 0x3088-0x308f mem 0xc8045000-0xc8045fff,0xc8041800-0xc80418ff,0xc8041400-0xc804140f at device 8.0 on pci0
miibus0: <MII bus> on nfe0
e1000phy0: <Marvell 88E1116 Gigabit PHY> PHY 1 on miibus0
e1000phy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
nfe0: Using defaults for TSO: 65518/35/2048
nfe0: Ethernet address: 6c:f0:49:4e:43:62
nfe1: <NVIDIA nForce MCP55 Networking Adapter> port 0x3090-0x3097 mem 0xc8047000-0xc8047fff,0xc8046000-0xc80460ff,0xc8041c00-0xc8041c0f at device 9.0 on pci0
miibus1: <MII bus> on nfe1
e1000phy1: <Marvell 88E1116 Gigabit PHY> PHY 2 on miibus1
e1000phy1: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
nfe1: Using defaults for TSO: 65518/35/2048
nfe1: Ethernet address: 6c:f0:49:4e:43:63
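
The "Using defaults for TSO" lines above show that TCP segmentation offload is in play on both ports. Turning offloads off is a common way to rule out offload-related driver hangs, so it may be worth a try. This is a sketch, not a known fix for this problem: it assumes the nfe driver honours the standard `ifconfig` flags `-tso`, `-txcsum` and `-rxcsum` on this chip, and `disable_offload` and the `IFCONFIG` override are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch: turn off TSO and checksum offload on one interface to see if the hangs stop.
# IFCONFIG can be overridden (hypothetical knob, handy for a dry run with "echo").
IFCONFIG="${IFCONFIG:-ifconfig}"

# $1 = interface name, e.g. nfe1
disable_offload() {
    "$IFCONFIG" "$1" -tso -txcsum -rxcsum
}

# Only act when an interface name is given on the command line.
if [ "$#" -ge 1 ]; then
    disable_offload "$1"
fi
```

Running it with `IFCONFIG=echo` and `nfe1` as the argument prints the command instead of executing it. The change does not survive a reboot or a netif restart, so it would need to be reapplied or made part of the interface configuration.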

Really appreciate any assistance, as this is causing a lot of grief each time the interface *appears* to crash or drop from the network.

Any ideas?

Thanks for your time :)
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Has anyone seen this issue before, or know of a way to limit the traffic on the card to prevent the crash?
Literally anyone with a passing familiarity with Nvidia NICs. Or really anything besides their graphics cards. I'm not sure what kind of moron at Dell had that brilliant idea, but you're the one who ends up suffering.

Get an Intel NIC and don't look back.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
What model Dell system is that? I want to be sure I never get one.

 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
It has to be a Core 2 or older, since that was the last generation for which Intel allowed third-party chipsets.
 