Chelsio T420 reports 'fw init error' during boot, works fine outside FreeBSD?

Status
Not open for further replies.

Stilez

Guru
Joined
Apr 8, 2016
Messages
529
I'm setting up a temporary 2nd server to troubleshoot my primary NAS :p :rolleyes: . To do this I've pulled the system SSD and VGA card out of a spare Hyper-V server (it's the only other Supermicro/Xeon/ECC platform that I can easily free up for a few days). The server is ideal for FreeNAS - Supermicro X10, Xeon v4, NVMe, Chelsio T420, and importantly it's all known good and working. I added a clean SSD and installed 11.1-U1, but FreeBSD reports an error when it tries to initialise the card during boot: the card is recognised, but it isn't presented as a NIC.

dmesg output:
Code:
# dmesg -a

FreeBSD 11.1-STABLE #0 r321665+4bd3ee42941(freenas/11.1-stable): Thu Jan 18 15:45:01 UTC 2018
root@gauntlet:/freenas-11-releng/freenas/_BE/objs/freenas-11-releng/freenas/_BE/os/sys/FreeNAS.amd64 amd64
CPU: Intel(R) Xeon(R) CPU E5-1680 v4 @ 3.40GHz (3400.07-MHz K8-class CPU)
real memory  = 139586437120 (133120 MB)
FreeBSD/SMP: Multiprocessor System Detected: 16 CPUs
		   --- lines skipped ---
acpi_syscontainer0: <System Container> on acpi0
pcib1: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pcib1: _OSC returned error 0x10
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> irq 26 at device 1.0 on pci1
pci2: <ACPI PCI bus> on pcib2
t4nex0: <Chelsio T420-CR> mem 0xfb100000-0xfb13ffff,0xfa800000-0xfaffffff,0xfb244000-0xfb245fff irq 26 at device 0.4 on pci2
t4nex0: fw init failed: 5.
t4nex0: error during attach, adapter is now in recovery mode.
pci2:	  --- boot continues as normal with other PCIe devices ---
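
The trailing `5` in `fw init failed: 5` is probably an errno-style return code from the driver. That is an assumption about how the driver reports the failure, not something the log itself states. If it holds, a short Python sketch (the helper name `decode_fw_init_error` is hypothetical) can turn the number into a symbolic name:

```python
import errno
import os
import re

# Sample line copied from the dmesg output above.
LINE = "t4nex0: fw init failed: 5."

def decode_fw_init_error(line):
    """Return (code, symbolic name, description) for a 'fw init failed: N.' line.

    Assumes N is a standard errno value, which the log does not confirm.
    """
    match = re.search(r"fw init failed: (\d+)", line)
    if match is None:
        return None
    code = int(match.group(1))
    # errno.errorcode maps numeric codes to names such as 'EIO'.
    name = errno.errorcode.get(code, "UNKNOWN")
    return code, name, os.strerror(code)

result = decode_fw_init_error(LINE)
print(result)
```

On FreeBSD and Linux, errno 5 is `EIO`, a generic I/O error, so the code on its own does not say which step of firmware initialisation failed.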

Things tried:

  • PCIe command line tools + check for kld related to card - collapsed as they look reasonable?
    # pciconf -lv

    t4iov0@pci0:1:0:0: class=0x020000 card=0x00001425 chip=0x40011425 rev=0x00 hdr=0x00
    vendor = 'Chelsio Communications Inc'
    device = 'T420-CR Unified Wire Ethernet Controller'
    class = network
    subclass = ethernet
    t4iov1@pci0:1:0:1: class=0x020000 card=0x00001425 chip=0x40011425 rev=0x00 hdr=0x00
    vendor = 'Chelsio Communications Inc'
    device = 'T420-CR Unified Wire Ethernet Controller'
    class = network
    subclass = ethernet
    t4iov2@pci0:1:0:2: class=0x020000 card=0x00001425 chip=0x40011425 rev=0x00 hdr=0x00
    vendor = 'Chelsio Communications Inc'
    device = 'T420-CR Unified Wire Ethernet Controller'
    class = network
    subclass = ethernet
    t4iov3@pci0:1:0:3: class=0x020000 card=0x00001425 chip=0x40011425 rev=0x00 hdr=0x00
    vendor = 'Chelsio Communications Inc'
    device = 'T420-CR Unified Wire Ethernet Controller'
    class = network
    subclass = ethernet
    t4nex0@pci0:1:0:4: class=0x020000 card=0x00001425 chip=0x44011425 rev=0x00 hdr=0x00
    vendor = 'Chelsio Communications Inc'
    device = 'T420-CR Unified Wire Ethernet Controller'
    class = network
    subclass = ethernet
    none74@pci0:1:0:5: class=0x010000 card=0x00001425 chip=0x45011425 rev=0x00 hdr=0x00
    vendor = 'Chelsio Communications Inc'
    device = 'T420-CR Unified Wire Storage Controller'
    class = mass storage
    subclass = SCSI
    none75@pci0:1:0:6: class=0x0c0400 card=0x00001425 chip=0x46011425 rev=0x00 hdr=0x00
    vendor = 'Chelsio Communications Inc'
    device = 'T420-CR Unified Wire Storage Controller'
    class = serial bus
    subclass = Fibre Channel
    none76@pci0:1:0:7: class=0x020000 card=0x00001425 chip=0x00001425 rev=0x00 hdr=0x00
    vendor = 'Chelsio Communications Inc'
    class = network
    subclass = ethernet

    # lspci -v

    01:00.0 Ethernet controller: Chelsio Communications Inc T420-CR Unified Wire Ethernet Controller
    Subsystem: Chelsio Communications Inc Device 0000
    Flags: bus master, fast devsel, latency 0, IRQ 26
    Memory at fb200000 (64-bit, non-prefetchable)
    Memory at fb2cc000 (64-bit, non-prefetchable)
    Expansion ROM at fb000000 [disabled]
    Capabilities: [40] Power Management version 3
    Capabilities: [48] MSI: Enable- Count=1/8 Maskable+ 64bit+
    Capabilities: [60] MSI-X: Enable- Count=8 Masked-
    Capabilities: [6c] Express Endpoint, MSI 00
    Capabilities: [a8] Vital Product Data

    01:00.1 Ethernet controller: Chelsio Communications Inc T420-CR Unified Wire Ethernet Controller
    Subsystem: Chelsio Communications Inc Device 0000
    Flags: bus master, fast devsel, latency 0, IRQ 28
    Memory at fb1c0000 (64-bit, non-prefetchable)
    Memory at fb2ca000 (64-bit, non-prefetchable)
    Capabilities: [40] Power Management version 3
    Capabilities: [48] MSI: Enable- Count=1/8 Maskable+ 64bit+
    Capabilities: [60] MSI-X: Enable- Count=8 Masked-
    Capabilities: [6c] Express Endpoint, MSI 00
    Capabilities: [a8] Vital Product Data

    01:00.2 Ethernet controller: Chelsio Communications Inc T420-CR Unified Wire Ethernet Controller
    Subsystem: Chelsio Communications Inc Device 0000
    Flags: bus master, fast devsel, latency 0, IRQ 29
    Memory at fb180000 (64-bit, non-prefetchable)
    Memory at fb2c8000 (64-bit, non-prefetchable)
    Capabilities: [40] Power Management version 3
    Capabilities: [48] MSI: Enable- Count=1/8 Maskable+ 64bit+
    Capabilities: [60] MSI-X: Enable- Count=8 Masked-
    Capabilities: [6c] Express Endpoint, MSI 00
    Capabilities: [a8] Vital Product Data

    01:00.3 Ethernet controller: Chelsio Communications Inc T420-CR Unified Wire Ethernet Controller
    Subsystem: Chelsio Communications Inc Device 0000
    Flags: bus master, fast devsel, latency 0, IRQ 30
    Memory at fb140000 (64-bit, non-prefetchable)
    Memory at fb2c6000 (64-bit, non-prefetchable)
    Capabilities: [40] Power Management version 3
    Capabilities: [48] MSI: Enable- Count=1/8 Maskable+ 64bit+
    Capabilities: [60] MSI-X: Enable- Count=8 Masked-
    Capabilities: [6c] Express Endpoint, MSI 00
    Capabilities: [a8] Vital Product Data

    01:00.4 Ethernet controller: Chelsio Communications Inc T420-CR Unified Wire Ethernet Controller
    Subsystem: Chelsio Communications Inc Device 0000
    Flags: bus master, fast devsel, latency 0, IRQ 26
    Memory at fb100000 (64-bit, non-prefetchable)
    Memory at fa800000 (64-bit, non-prefetchable)
    Memory at fb2c4000 (64-bit, non-prefetchable)
    Capabilities: [40] Power Management version 3
    Capabilities: [48] MSI: Enable- Count=1/32 Maskable+ 64bit+
    Capabilities: [60] MSI-X: Enable- Count=128 Masked-
    Capabilities: [6c] Express Endpoint, MSI 00
    Capabilities: [a8] Vital Product Data

    01:00.5 SCSI storage controller: Chelsio Communications Inc T420-CR Unified Wire Storage Controller
    Subsystem: Chelsio Communications Inc Device 0000
    Flags: bus master, fast devsel, latency 0, IRQ 28
    Memory at fb0c0000 (64-bit, non-prefetchable)
    Memory at fb2c2000 (64-bit, non-prefetchable)
    Capabilities: [40] Power Management version 3
    Capabilities: [48] MSI: Enable- Count=1/32 Maskable+ 64bit+
    Capabilities: [60] MSI-X: Enable- Count=40 Masked-
    Capabilities: [6c] Express Endpoint, MSI 00
    Capabilities: [a8] Vital Product Data

    01:00.6 Fibre Channel: Chelsio Communications Inc T420-CR Unified Wire Storage Controller
    Subsystem: Chelsio Communications Inc Device 0000
    Flags: bus master, fast devsel, latency 0, IRQ 29
    Memory at fb080000 (64-bit, non-prefetchable)
    Memory at fb2c0000 (64-bit, non-prefetchable)
    Capabilities: [40] Power Management version 3
    Capabilities: [48] MSI: Enable- Count=1/32 Maskable+ 64bit+
    Capabilities: [60] MSI-X: Enable- Count=40 Masked-
    Capabilities: [6c] Express Endpoint, MSI 00
    Capabilities: [a8] Vital Product Data

    01:00.7 Ethernet controller: Chelsio Communications Inc Device 0000
    Subsystem: Chelsio Communications Inc Device 0000
    Flags: bus master, fast devsel, latency 0, IRQ 255
    Capabilities: [40] Power Management version 3
    Capabilities: [48] MSI: Enable- Count=1/32 Maskable+ 64bit+
    Capabilities: [60] MSI-X: Enable- Count=32 Masked-
    Capabilities: [6c] Express Endpoint, MSI 00
    Capabilities: [a8] Vital Product Data


    # kldstat

    Id Refs Address Size Name
    1 0xffffffff82f8c000 d002 t3_tom.ko
    2 0xffffffff82f9a000 461a toecore.ko
    1 0xffffffff82f9f000 15e47 t4_tom.ko
    1 0xffffffff82fb5000 35a6 ums.ko
  • Moved card to other PCIe sockets (swapped with my HBA card which works fine, to eliminate PCIe bus/socket issues)
  • Reset Supermicro firmware (usual boot -> setup -> "reset optimal defaults")
  • Checked card's orom firmware is accessible - it is, card is listed as "enabled" and there aren't many other settings to do much else with.
  • Although all my Chelsios were erased and reflashed to the latest firmware last year, I tried reflashing the firmware + config files anyway (Chelsio website: download DOS Option Rom zip package v1.0.0.90)
  • Tried with and without the BIOS options for 4GB PCIe decode, SR-IOV and ASPM (made no difference)
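
For reading the `pciconf -lv` listing in the first item above: the `chip=` field packs the PCI device ID into the upper 16 bits and the vendor ID into the lower 16 bits. A minimal Python sketch (the helper name `split_pci_id` is hypothetical) shows the split for the values in this thread:

```python
def split_pci_id(value):
    """Split a pciconf chip= value into (device_id, vendor_id)."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF

# chip= values copied from the pciconf output, keyed by device name.
# t4iov1..t4iov3 carry the same chip value as t4iov0 and are omitted.
CHIPS = {
    "t4iov0": 0x40011425,
    "t4nex0": 0x44011425,
    "none74": 0x45011425,
    "none75": 0x46011425,
    "none76": 0x00001425,
}

for name, chip in CHIPS.items():
    device_id, vendor_id = split_pci_id(chip)
    print(f"{name}: vendor=0x{vendor_id:04x} device=0x{device_id:04x}")
```

Every function reports vendor 0x1425 (Chelsio). Function 7 (`none76`) has device ID 0x0000, which matches the `Device 0000` entry that `lspci` shows for 01:00.7.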

The only change from any of these was that, after reflashing, dmesg now also reports that the firmware on the card is older than the version bundled with the driver:
Code:
t4nex0: <Chelsio T420-CR> mem 0xfb100000-0xfb13ffff,0xfa800000-0xfaffffff,0xfb244000-0xfb245fff irq 26 at device 0.4 on pci2
t4nex0: firmware on card (1.15.37.0) is older than the version bundled with this driver, installing firmware 1.16.45.0 on card.
t4nex0: fw init failed: 5.
t4nex0: error during attach, adapter is now in recovery mode.
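
The driver message above names two dotted firmware versions. A small Python sketch pulls both out of the log line and compares them, assuming the fields compare left to right as integers (an assumption; the driver's own comparison logic is not shown in this thread, and `parse_versions` is a hypothetical helper):

```python
import re

# Line copied from the dmesg output after the reflash.
LINE = ("t4nex0: firmware on card (1.15.37.0) is older than the version "
        "bundled with this driver, installing firmware 1.16.45.0 on card.")

def parse_versions(line):
    """Return (card_version, bundled_version) as tuples of ints, or None."""
    match = re.search(
        r"firmware on card \(([\d.]+)\).*installing firmware ([\d.]+) on card",
        line,
    )
    if match is None:
        return None
    # Turn '1.15.37.0' into (1, 15, 37, 0) so tuples compare field by field.
    return tuple(
        tuple(int(part) for part in version.split("."))
        for version in match.groups()
    )

card, bundled = parse_versions(LINE)
print(card, bundled, card < bundled)
```

Note that the driver only announces the upgrade attempt here. The log still ends with `fw init failed: 5`, so it does not show whether 1.16.45.0 was actually written to the card.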



Any ideas what the issue is and how to fix it, or should I ask Chelsio for support?
 

dlavigne

Guest
It wouldn't hurt to create a report at bugs.freenas.org in case changes are needed for the FreeBSD driver. If you do, post the issue number here.
 

Stilez

dlavigne said:
It wouldn't hurt to create a report at bugs.freenas.org in case changes are needed for the FreeBSD driver. If you do, post the issue number here.
The issues keep piling up. I'm in the middle of a heavy work cycle right now. What I might do is put this, and the ZeroWindow/bandwidth issues on my primary server, on the back burner for a while (could be as much as a few weeks, it's a really heavy workload and I'm doing 16-hour days for a while), and come back to them both when I can take the time to do more troubleshooting, such as swapping Chelsio cards between them. Right now it's manageable and I need it to stay that way until the busy period is over :)

Hoping that would be OK with you? I was kind of hoping for a "quick fix" so I could test the main server. If there are any useful commands that might expose what the fw init issue means, I'm happy to try them. But I don't feel right putting in a bug report on issues until I've eliminated all I can here, and I can't finish doing that until I can crack open any other server with a Chelsio card and not have a problem with downtime if it goes wrong :)
 