[HELP] TrueNAS SCALE Kernel Panic After Second Processor Install Xeon 3.1GHz 20MB 8-Core

alexmadsen1

Cadet
Joined
Feb 17, 2024
Messages
3
Subject: [HELP] TrueNAS SCALE Kernel Panic After Second Processor Install

Hi everyone,

I'm facing a challenging issue with my TrueNAS SCALE setup where it crashes with a kernel panic during boot, but only after installing a second processor. This setup worked perfectly when running TrueNAS CORE, but the problem started immediately after upgrading to SCALE.

**System Specifications:**
- Motherboard: SuperMicro X9DRD-EF Dual Socket ATX Server
- Processors: 2x Intel Xeon E5-2687W SR0KG 3.1GHz 20MB 8-Core LGA2011
- Previous OS: TrueNAS CORE 13.0-U6.1 (no issues)
- Current OS: TrueNAS SCALE 22.12.4.2 and TrueNAS-SCALE-23.10.1.3 (kernel panic on boot)

**Symptoms:**
- Boots fine with one processor.
- Kernel panic when the second processor is installed.
- Error during boot: "Kernel panic - not syncing: Attempted to kill the idle task!"

**Troubleshooting Steps Taken:**
- Confirmed both processors are functional individually.
- Rechecked all connections and ensured hardware compatibility.
- Removed and swapped all the RAM, reducing to only 2 modules per processor, and then swapped it with known good memory.

**Error Message Snippet:**
[ 1.748850] Modules linked in:
[ 1.748924] ---[ end trace 0000000000000000 ]---
[ 2.153205] tsc: Refined TSC clocksource calibration: 3099.999 MHz
[ 2.153293] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x2caf46e03c8, max_idle_ns: 440795329092 ns
[ 2.215723] RIP: 0010:switch_mm_irqs_off+0x431/0x480
[ 2.215804] Code: 48 83 c1 10 66 83 f8 06 75 de 65 c6 05 0b a7 7a 57 00 e9 c9 fc ff ff 48 8b 05 93 49 37 01 b9 49 00 00 00 48 89 c2 48 c1 ea 20 0f 30 e9 58 fc ff ff 0f 0b e9 a8 fc ff ff 65 48 c7 05 c5 a6 7a 57
[ 2.215930] RSP: 0018:ffffa67ac8257e50 EFLAGS: 00010046
[ 2.216003] RAX: 0000000000000001 RBX: ffff8a6fd0070000 RCX: 0000000000000049
[ 2.216082] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffffa9b4eb11
[ 2.216159] RBP: 000000000000000f R08: 0000000000000000 R09: 0000000000000000
[ 2.216235] R10: 000000000000000f R11: 0000000000000001 R12: ffffffffaa35c620
[ 2.216312] R13: ffff8a6fd0070000 R14: 0000000043490000 R15: ffff8a6fd3ac1980
[ 2.216388] FS: 0000000000000000(0000) GS:ffff8a96efdc0000(0000) knlGS:0000000000000000
[ 2.216481] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2.216555] CR2: 0000000000000000 CR3: 0000000256281001 CR4: 00000000000606e0
[ 2.216633] Kernel panic - not syncing: Attempted to kill the idle task!
[ 2.216728] Kernel Offset: 0x27800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 2.317764] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
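In case it helps anyone comparing traces, here is a minimal Python sketch that pulls the faulting symbol and the panic message out of a saved console log like the one above. The function name `summarize_panic` is just something made up for this post, and it assumes the usual `[ seconds.micros]` timestamp prefix on each line.

```python
import re

# Matches the bracketed dmesg timestamp, e.g. "[ 2.215723] ".
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*")


def summarize_panic(log_text):
    """Return the first RIP symbol and panic message found in a kernel log."""
    summary = {"rip_symbol": None, "panic": None}
    for raw in log_text.splitlines():
        # Drop the timestamp so the patterns below can anchor at line start.
        line = TIMESTAMP.sub("", raw.strip())
        # "RIP: 0010:symbol+0xoff/0xlen" names the function that faulted.
        m = re.match(r"RIP:\s*[0-9a-fA-F]+:(\S+?)\+0x", line)
        if m and summary["rip_symbol"] is None:
            summary["rip_symbol"] = m.group(1)
        # The closing "---[ end Kernel panic ... ]---" line does not start
        # with "Kernel panic", so only the real panic line is captured.
        m = re.match(r"Kernel panic - not syncing:\s*(.+)", line)
        if m and summary["panic"] is None:
            summary["panic"] = m.group(1)
    return summary
```

Calling `summarize_panic(open("console.log").read())` (the file name is hypothetical) returns a small dict that is easier to paste into a search or a bug report than the full register dump.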


I'm at a loss since the hardware configuration hasn't changed from when I was running CORE. Does anyone have insights on why SCALE might be reacting differently, or tips on how to troubleshoot this kernel panic issue?

Thanks so much for any help you can provide!

alex
 

Attachments

  • Capture.JPG

alexmadsen1

Cadet
Joined
Feb 17, 2024
Messages
3
Progress: I got it to boot by editing the advanced setup (thank you GPT for the recommendation). I set "amd_iommu=off" and added "nomodeset" and "maxcpus=1".
1708200738934.png
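To double-check that edited boot parameters actually reached the running kernel, something like the following could be run from a shell on the booted system. It is only a sketch: the helper names `parse_cmdline` and `read_cmdline` are made up here, and it assumes a Linux system where `/proc/cmdline` is readable.

```python
from pathlib import Path


def parse_cmdline(text):
    """Split a kernel command line into {name: value}; bare flags map to True.

    Simplification: values quoted with spaces inside them are not handled.
    """
    params = {}
    for token in text.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else True
    return params


def read_cmdline(path="/proc/cmdline"):
    """Parse the command line the running kernel was booted with (Linux only)."""
    return parse_cmdline(Path(path).read_text())
```

For example, `read_cmdline().get("maxcpus")` gives the value as a string if the parameter was passed, or `None` if it never made it onto the command line.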
 

alexmadsen1

Cadet
Joined
Feb 17, 2024
Messages
3
Disabling NUMA (Non-Uniform Memory Access) does not help.
It seems to be a multi-processor thing. I can run with up to "maxcpus=8", but as soon as I go to "maxcpus=9" I get the kernel panic again.
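One way to test the multi-processor theory is to see which socket each online CPU sits on while booted with "maxcpus=8". A rough Python sketch, assuming the x86 `/proc/cpuinfo` layout with `processor` and `physical id` fields (the function name `cpus_by_socket` is made up for this post):

```python
def cpus_by_socket(cpuinfo_text):
    """Map each 'physical id' (socket) to the logical CPU numbers on it."""
    sockets = {}
    current_cpu = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU entries
        key, value = key.strip(), value.strip()
        if key == "processor":
            # Start of a new logical CPU entry.
            current_cpu = int(value)
        elif key == "physical id" and current_cpu is not None:
            sockets.setdefault(int(value), []).append(current_cpu)
    return sockets
```

Running `cpus_by_socket(open("/proc/cpuinfo").read())` lists only the CPUs that are online. If all eight turn out to be on socket 0, that would fit the idea that the ninth CPU is the first one brought up on the second processor, although the enumeration order is an assumption here and not something I have confirmed.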
 