X520 10GbE NIC won't connect in specific PCIe slot

dealy663

Dabbler
Joined
Dec 4, 2021
Messages
32
Hi

I've been running this X520 10GbE NIC for about a year now. It was originally in PCIe slot 1 (the highest-speed slot, PCIe 4.0 x16). Recently I installed a higher-end GPU and put that into slot 1. While experimenting with PCIe passthrough for the GPU I had to move the 10GbE NIC down to PCIe slot 3, which on this mobo only allows it to link at 5 Gb/s. This was fine, but now that everything is sorted out I had planned to move the NIC to PCIe slot 2, which should give it enough bandwidth for 10 Gb/s again.

However, TrueNAS always seems to bring the NIC down when it is in this slot. I can't figure out what the problem is. It assigns an IP address and the console says that the web UI is at the IP address assigned to the NIC, but there is no network available, and nothing shows up for it with ip link. Here I've captured the output of dmesg | grep ixgbe when the NIC is working properly in slot 3:

root@TrueNAS[~]# dmesg | grep ixgbe
[ 8.242251] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver
[ 8.248427] ixgbe: Copyright (c) 1999-2016 Intel Corporation.
[ 8.254280] ixgbe 0000:04:00.0: enabling device (0000 -> 0002)
[ 8.473352] ixgbe 0000:04:00.0: Multiqueue Enabled: Rx Queue count = 16, Tx Queue count = 16 XDP Queue count = 0
[ 8.486830] ixgbe 0000:04:00.0: 8.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x2 link at 0000:03:00.0 (capable of 32.000 Gb/s with 5.0 GT/s PCIe x8 link)
[ 8.510095] ixgbe 0000:04:00.0: MAC: 2, PHY: 14, SFP+: 3, PBA No: FFFFFF-0FF
[ 8.524346] ixgbe 0000:04:00.0: 80:61:5f:0c:de:59
[ 8.535208] ixgbe 0000:04:00.0: Intel(R) 10 Gigabit Network Connection
[ 8.559777] ixgbe 0000:04:00.0 enp4s0: renamed from eth0
[ 38.822805] ixgbe 0000:04:00.0: registered PHC device on enp4s0
[ 39.007253] ixgbe 0000:04:00.0 enp4s0: detected SFP+: 3
[ 39.147307] ixgbe 0000:04:00.0 enp4s0: NIC Link is Up 10 Gbps, Flow Control: RX/TX


And here is the same output from when the NIC is in slot 2 and TrueNAS brings it down:
[ 7.044422] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver
[ 7.050551] ixgbe: Copyright (c) 1999-2016 Intel Corporation.
[ 7.064593] ixgbe 0000:0d:00.0: enabling device (0000 -> 0002)
[ 7.251532] ixgbe 0000:0d:00.0: Multiqueue Enabled: Rx Queue count = 16, Tx Queue count = 16 XDP Queue count = 0
[ 7.251835] ixgbe 0000:0d:00.0: 32.000 Gb/s available PCIe bandwidth (5.0 GT/s PCIe x8 link)
[ 7.251916] ixgbe 0000:0d:00.0: MAC: 2, PHY: 14, SFP+: 3, PBA No: FFFFFF-0FF
[ 7.251917] ixgbe 0000:0d:00.0: 80:61:5f:0c:de:59
[ 7.253003] ixgbe 0000:0d:00.0: Intel(R) 10 Gigabit Network Connection
[ 7.304200] ixgbe 0000:0d:00.0 enp13s0: renamed from eth1
[ 37.789591] ixgbe 0000:0d:00.0: registered PHC device on enp13s0
[ 37.974168] ixgbe 0000:0d:00.0 enp13s0: detected SFP+: 3
[ 38.118202] ixgbe 0000:0d:00.0 enp13s0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[ 65.251730] ixgbe 0000:0d:00.0: removed PHC on enp13s0
[ 65.362349] ixgbe 0000:0d:00.0: complete

In the second log sample you can see the higher PCIe bandwidth available in slot 2, and everything looks good until it says: removed PHC on enp13s0. Everything looks fine from the console network configuration screens.
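In case it's useful to anyone else hitting this, the kernel messages above can also be watched live while the interface drops, with something like the following (interface name taken from this boot, so adjust as needed):

root@TrueNAS[~]# dmesg -wT | grep -Ei 'ixgbe|enp13s0'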

Any ideas or suggestions on how to further troubleshoot this?

Thanks, Derek
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
TrueNAS always seems to bring the NIC down when it is in this slot.
I have seen that some folks needed to add "up" into the Options of the NIC (if you can use ifconfig enp13s0 up to get it to come up manually, that should also work to have it do so on startup).
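For example, a rough manual test would be something like this (assuming the interface name enp13s0 from your dmesg output; substitute whatever name your system actually uses):

ip link set enp13s0 up
ip -br addr show enp13s0

ip link set ... up is the iproute2 equivalent of ifconfig ... up on a Linux-based TrueNAS (SCALE) install, so either should tell you whether the interface can be brought up by hand at all.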
 

dealy663

Dabbler
Joined
Dec 4, 2021
Messages
32
After the system has finished booting, the interface for enp13s0 doesn't seem to be available to ifconfig. It just responds with "error while getting interface flags: No such device".

The device is obviously physically there given the output from dmesg, and lspci shows it too.
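For reference, the sort of checks I'm running (the 0d:00.0 address is taken from the dmesg output in my first post, so treat it as illustrative):

root@TrueNAS[~]# ifconfig enp13s0
root@TrueNAS[~]# ip -br link
root@TrueNAS[~]# lspci -nnk -s 0d:00.0

The last one shows the card and which kernel driver is bound to it even when the network interface itself is missing.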
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Using slot numbering per ATX specification (7 at top, closest to back panel):
Slot #7 N/A
Slot #6 (PCIE1) Gen4 x16/x8 CPU
Slot #5 N/A
Slot #4 (PCIE2) Gen3 x1 PCH
Slot #3 (PCIE3) Gen4 x0/x8 CPU
Slot #2 (PCIE4) Gen3 x1 PCH
Slot #1 (PCIE5) Gen3 x4/x2 PCH

If I understand your description and nomenclature correctly, the NIC works correctly in the topmost Slot #6 or in x1 Slot #4 but not in Slot #3.
What's in Slot #6 when the NIC is in Slot #3? If Slot #6 is free, I suspect that automatic bifurcation is not working properly; you may need to set bifurcation manually in the BIOS or somehow populate Slot #6.
 

dealy663

Dabbler
Joined
Dec 4, 2021
Messages
32
I didn't name the slots properly in my original description; I skipped over the short slots in positions 2 & 4. There are 5 on this mobo.
The book labels them:
  1. PCIe1 Gen4 x16/8 full width (closest to back panel) - original location of NIC, now holds GPU
  2. PCIe2 Gen3 x1 partial width currently empty (original location of Serial I/O card)
  3. PCIe3 Gen4 x16/8 full width, new location of NIC, but it is deactivated after booting TrueNAS
  4. PCIe4 Gen3 x1 partial width holds serial I/O card
  5. PCIe5 Gen3 x2 (knocked down because of the I/O card in slot 2 or 4); when the NIC is in this position it operates at ~5 Gbps
I haven't heard of bifurcation settings in the BIOS; I'll have to reconfigure and see if I can find a setting for that.

What is PCH? I haven't seen any mention of that in the manual for the mobo. Is it typically associated with only some slots? I notice in dmesg it is removed from the NIC, and at that point the NIC seems to be unavailable any more. Google says it is something like a platform controller hub, but I don't see anything in the mobo description that details which slots are GPU and which are PCH. The manual says PCIe3 supports a graphics card, so I would assume that it is a CPU port.
 

dealy663

Dabbler
Joined
Dec 4, 2021
Messages
32
Well, oddly enough, something made me decide to try disabling the onboard ethernet on the mobo, and once I did this I was able to see the 10GbE NIC after bootup. I am surprised that there can be so many issues related to which PCIe slots are in use. I may have to move my GPU out and swap it with the NIC, trying to get exactly the right alignment of the PCIe devices and their groups.
 

asap2go

Patron
Joined
Jun 11, 2023
Messages
228
I haven't heard of bifurcation settings in the BIOS; I'll have to reconfigure and see if I can find a setting for that.

What is PCH? I haven't seen any mention of that in the manual for the mobo. Is it typically associated with only some slots?
Bifurcation is the setting that changes the PCIe lane allocation of the slots.
While every slot is electrically wired as you listed above, that does not mean you can use all of those connections at the same time.
They are usually oversubscribed because users almost never populate all the slots.

Usually, if you populate both the first and second x16 slots, they each drop to x8.
But that might be different for your motherboard.
The setting to adjust the lane allocation on an ASRock board is in the attachments.
 

Attachments

  • proxy-image.jpg (203.3 KB)

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
"PCH" (Platform Controller Hub) is the chipset. Sorry for the unnecessary spilling of technical jargon.

If there's a GPU in the other x16 slot, the symptoms look like the GPU retains all 16 lanes for itself. Look for bifurcation settings (as exemplified by @asap2go) and set it to x8x8.
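If it helps, one way to check what the slots actually negotiated after changing the setting (the 0d:00.0 address is taken from your dmesg output, so adjust as needed):

root@TrueNAS[~]# lspci -tv
root@TrueNAS[~]# lspci -vv -s 0d:00.0 | grep -E 'LnkCap|LnkSta'

The tree view shows which root port (CPU or PCH) each device hangs off, and LnkCap/LnkSta compare the link width the card is capable of with what it actually got.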
 

dealy663

Dabbler
Joined
Dec 4, 2021
Messages
32
Yeah, I tried setting the PCIe lane allocation to x8x8, but that didn't help. The only thing that has made a difference so far is disabling the onboard ethernet.

On a similar note, I have another question. After getting my system up with the 10GbE NIC in the desired slot, I could no longer do passthrough of the GPU to my VM. It was complaining about the IOMMU group not being viable. I had made sure to pass through both the video and audio portions of the 3060 Ti GPU. In my previous configuration this was all that was necessary.

But now, with the NIC moved to a high-speed slot, things have changed. Is there any way to influence which PCI devices are assigned to which IOMMU groups?
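For reference, a common way to dump the current grouping is a small loop over what the kernel exposes in sysfs (this only lists the groups; it won't change them, since they mostly follow how the slots hang off the CPU/PCH and whether ACS is supported):

for g in /sys/kernel/iommu_groups/*; do
    echo "IOMMU group ${g##*/}:"
    for d in "$g"/devices/*; do
        echo "    $(lspci -nns "${d##*/}")"
    done
done

If the GPU shares a group with other devices (a PCIe bridge, the NIC, etc.), the whole group has to be passed through together, which is what the "not viable" error is about.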
 