FreeNAS MCA Errors?

Status
Not open for further replies.
Joined
Sep 25, 2015
Messages
3
Hi All

We have two large storage servers in production, both with DAS/SAN connected, and we have been getting these errors. I have run memtest and the internal Dell memory tests on the CPU/memory, and nothing shows up.

SVRSTRGDCL02.local kernel log messages:

sonewconn: pcb 0xfffff80176571000: Listen queue overflow: 193 already in queue awaiting acceptance (6 occurrences)
sonewconn: pcb 0xfffff80176571000: Listen queue overflow: 193 already in queue awaiting acceptance (114 occurrences)
MCA: Bank 8, Status 0x8c0000400001009f
MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
MCA: Vendor "GenuineIntel", ID 0x206c2, APIC ID 32
MCA: CPU 0 COR (1) RD channel ?? memory error
MCA: Address 0x196883ee40
MCA: Misc 0xe83a802000046343
MCA: Bank 8, Status 0x8c0000400001009f
MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
MCA: Vendor "GenuineIntel", ID 0x206c2, APIC ID 32
MCA: CPU 0 COR (1) RD channel ?? memory error
MCA: Address 0x196885ee40
MCA: Misc 0xe83a802000040180
 
Joined
Sep 25, 2015
Messages
3
And this is from our 2nd unit....

SVRSTRGDCL01.local kernel log messages:

sonewconn: pcb 0xfffff801d058f740: Listen queue overflow: 193 already in queue awaiting acceptance (50 occurrences)

-- End of security output --
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
This looks like a network adapter problem.
What kind of NIC and is it a card or built-in?

 
Joined
Sep 25, 2015
Messages
3
It is using the inbuilt NIC on an R510 (Intel adapters), set up as a LAGG interface. The other worrying error is the memory error, though.
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
In theory, a memory tester utility or even the built-in firmware should identify the problem RAM. Otherwise, the old binary-search approach can be used: remove half the RAM. If the problem recurs, it is in the memory still in the machine. If the problem goes away, it is in the memory that was removed. Keep testing, halving the suspect set each time, until the problem is narrowed down to a single component. One potential problem is that the change itself can hide things. Less memory means less power drawn, so a failing power supply that is the real problem could be masked. Or just removing the DIMMs and reinstalling them could clear up an intermittent contact.
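
The halving procedure above can be sketched in Python. This is only an illustration and assumes exactly one faulty DIMM and a fault that reproduces reliably. The names `isolate_faulty_dimm` and `errors_with` are hypothetical: `errors_with` stands in for the manual step of booting with only the given DIMMs installed and watching the kernel log for MCA errors. Real servers also have memory population rules that may limit which halves can actually be tested.

```python
from typing import Callable, List, Sequence


def isolate_faulty_dimm(
    dimms: Sequence[str],
    errors_with: Callable[[List[str]], bool],
) -> str:
    """Binary-search a set of DIMM labels for the single faulty one.

    errors_with(installed) must return True if errors are seen while
    only the DIMMs in `installed` are fitted (a manual test in practice).
    """
    if not dimms:
        raise ValueError("need at least one DIMM to test")

    suspects = list(dimms)
    while len(suspects) > 1:
        half = len(suspects) // 2
        installed, removed = suspects[:half], suspects[half:]
        # Errors recur: the fault is in what is still installed.
        # Errors stop: the fault is in what was removed.
        suspects = installed if errors_with(installed) else removed
    return suspects[0]


# Simulated run: slot labels and the faulty slot are made up for illustration.
slots = ["A1", "A2", "A3", "A4", "B1", "B2", "B3", "B4"]
found = isolate_faulty_dimm(slots, lambda installed: "B2" in installed)
print(found)
```

With eight DIMMs this takes three test rounds. As noted above, a clean result after reseating does not prove the removed DIMMs were bad, so it is worth retesting the final suspect on its own.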
 
Joined
Jul 3, 2015
Messages
926
I've had the sonewconn errors too on three different systems (albeit very similar builds) but NOT the MCA errors you're seeing. They started in May and the most recent was on 21st Nov, but I don't get them all the time, just quite randomly. All my systems run the Chelsio T520-CR with both ports bonded in LACP mode.

examples:

sonewconn: pcb 0xfffff8026de95690: Listen queue overflow: 8 already in queue awaiting acceptance (1 occurrences)
sonewconn: pcb 0xfffff8027082c5a0: Listen queue overflow: 8 already in queue awaiting acceptance (1 occurrences)
sonewconn: pcb 0xfffff802727e32d0: Listen queue overflow: 8 already in queue awaiting acceptance (1 occurrences)
sonewconn: pcb 0xfffff802727e32d0: Listen queue overflow: 8 already in queue awaiting acceptance (4205 occurrences)
sonewconn: pcb 0xfffff802148645a0: Listen queue overflow: 8 already in queue awaiting acceptance (4803 occurrences)
sonewconn: pcb 0xfffff802148645a0: Listen queue overflow: 8 already in queue awaiting acceptance (3664 occurrences)
sonewconn: pcb 0xfffff802148645a0: Listen queue overflow: 8 already in queue awaiting acceptance (1893 occurrences)
sonewconn: pcb 0xfffff802148645a0: Listen queue overflow: 8 already in queue awaiting acceptance (3623 occurrences)
sonewconn: pcb 0xfffff802148645a0: Listen queue overflow: 8 already in queue awaiting acceptance (980 occurrences)
sonewconn: pcb 0xfffff802148645a0: Listen queue overflow: 8 already in queue awaiting acceptance (3614 occurrences)
sonewconn: pcb 0xfffff802148645a0: Listen queue overflow: 8 already in queue awaiting acceptance (1061 occurrences)
sonewconn: pcb 0xfffff802148645a0: Listen queue overflow: 8 already in queue awaiting acceptance (3539 occurrences)
sonewconn: pcb 0xfffff802148645a0: Listen queue overflow: 8 already in queue awaiting acceptance (1678 occurrences)
sonewconn: pcb 0xfffff802727e32d0: Listen queue overflow: 8 already in queue awaiting acceptance (498 occurrences)
sonewconn: pcb 0xfffff802727e32d0: Listen queue overflow: 8 already in queue awaiting acceptance (18 occurrences)
sonewconn: pcb 0xfffff802727e32d0: Listen queue overflow: 8 already in queue awaiting acceptance (27 occurrences)
sonewconn: pcb 0xfffff80184b9b0f0: Listen queue overflow: 8 already in queue awaiting acceptance (1 occurrences)
sonewconn: pcb 0xfffff80184b9b0f0: Listen queue overflow: 8 already in queue awaiting acceptance (3173 occurrences)
sonewconn: pcb 0xfffff80184b9b0f0: Listen queue overflow: 8 already in queue awaiting acceptance (3137 occurrences)
sonewconn: pcb 0xfffff82856c72870: Listen queue overflow: 8 already in queue awaiting acceptance (1141 occurrences)
sonewconn: pcb 0xfffff82856c72870: Listen queue overflow: 8 already in queue awaiting acceptance (3883 occurrences)
sonewconn: pcb 0xfffff82856c72870: Listen queue overflow: 8 already in queue awaiting acceptance (2589 occurrences)
sonewconn: pcb 0xfffff82856c72870: Listen queue overflow: 8 already in queue awaiting acceptance (2072 occurrences)
sonewconn: pcb 0xfffff82856c72870: Listen queue overflow: 8 already in queue awaiting acceptance (2321 occurrences)
sonewconn: pcb 0xfffff802727e32d0: Listen queue overflow: 8 already in queue awaiting acceptance (9 occurrences)
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
"The other worrying error is the memory error, though."
The specs say this unit should have 8 memory slots. Is that correct for your system? How many are populated?
 
Joined
Jul 3, 2015
Messages
926
Out of interest, do you use DFS to connect to these servers?
 

toadman

Guru
Joined
Jun 4, 2013
Messages
619
You can look up the MCA code and it will tell you what bank caused the error. You should be able to use that to isolate the DIMM in question.
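
As a rough illustration of looking up the code, the sketch below decodes the `Status` value from the log above (0x8c0000400001009f). The bit positions follow my reading of the IA32_MCi_STATUS layout in the Intel manuals and the way the FreeBSD kernel prints memory controller errors. The function and field names are made up for this example, and the corrected-error count is assumed to be meaningful on this CPU because the kernel printed "COR (1)". It only decodes the register: mapping the result or the logged address to a physical DIMM slot is platform-specific and not shown.

```python
from dataclasses import dataclass
from typing import Optional

# Memory transaction types (MMM field) for memory controller errors.
MEM_TRANSACTION = {0: "GEN", 1: "RD", 2: "WR", 3: "AC", 4: "MS"}


@dataclass
class McaStatus:
    valid: bool            # bit 63: register holds a valid error
    overflow: bool         # bit 62: an earlier error was overwritten
    uncorrected: bool      # bit 61: error was not corrected by hardware
    misc_valid: bool       # bit 59: the Misc register is valid
    addr_valid: bool       # bit 58: the Address register is valid
    corrected_count: int   # bits 52:38: corrected error count
    error_code: int        # bits 15:0: MCA error code
    memory_error: bool     # code matches the memory controller pattern
    transaction: Optional[str]
    channel: Optional[int]  # None when the channel is not specified


def decode_mca_status(status: int) -> McaStatus:
    code = status & 0xFFFF
    # Memory controller errors look like 000F 0000 1MMM CCCC (F is ignored).
    memory_error = (code & 0xEF80) == 0x0080
    transaction = None
    channel = None
    if memory_error:
        transaction = MEM_TRANSACTION.get((code >> 4) & 0x7, "???")
        raw_channel = code & 0xF
        channel = None if raw_channel == 0xF else raw_channel
    return McaStatus(
        valid=bool(status >> 63 & 1),
        overflow=bool(status >> 62 & 1),
        uncorrected=bool(status >> 61 & 1),
        misc_valid=bool(status >> 59 & 1),
        addr_valid=bool(status >> 58 & 1),
        corrected_count=(status >> 38) & 0x7FFF,
        error_code=code,
        memory_error=memory_error,
        transaction=transaction,
        channel=channel,
    )


print(decode_mca_status(0x8C0000400001009F))
```

For the value in the log this reports a valid, corrected memory read error with a count of 1 and no channel specified, which matches the kernel's own "COR (1) RD channel ?? memory error" line.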
 