TrueNAS crashes and reboots every few hours

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
If you were using RAID in the strict sense, that could be the cause of your problems. That is why we insist on proper terminology here. Best of luck finding and fixing your problem. I have nothing to add to the comments of the other regulars, unfortunately.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
You are correct, yes. But I can't believe this. Because I never had defective hardware before. And the hardware in this server is new. But anyway, I will check this.
That means nothing. I never had a CPU fail on me in 20 years of building computers, but last month the Ryzen 5600X in my gaming rig rebooted during a gaming session, and when it came back up, the motherboard told me one of the cores had gotten too hot (I run only stock settings). Now it refuses to boot any OS I throw at it: it either hangs, gets stuck in a boot loop, or ends in a blue screen/kernel panic.

BTW, that screenshot does not look good. I mean, large red letters spelling out "FAIL" is generally not a good sign...
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
But I think the RAM is the problem. Can someone confirm?
Indeed, that failed memtest confirms it. You have bad RAM in your system.

Now, the problem may come from the motherboard, from the memory bank, from the memory chips, ...

What you can do is remove all but 1 or 2 of your RAM modules and re-test. Do they pass? If not, try moving them to another memory bank. Does the test succeed there?

If you buy new RAM when it is your memory bus or bank that is bad, you will just have wasted time and money.

Defective hardware like yours requires a complete and precise test of each and every component.
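
To keep track of that kind of swap-and-retest, it can help to write down every run and let a few lines of code point at the likely culprit. This is only a rough sketch under my own assumptions: the module and slot labels are made up, each run is one module in one slot, and "suspect" simply means "failed every time it was involved in at least two runs". The function name `diagnose` is hypothetical, not part of memtest or TrueNAS.

```python
from collections import defaultdict


def diagnose(results):
    """Flag modules and slots that failed in every run they took part in.

    results: list of (module, slot, passed) tuples, one per single-module
    memtest run. Labels are whatever you wrote on the sticks and slots.
    """
    module_runs = defaultdict(list)
    slot_runs = defaultdict(list)
    for module, slot, passed in results:
        module_runs[module].append(passed)
        slot_runs[slot].append(passed)

    # Require at least two runs so a single failure is not over-interpreted.
    bad_modules = sorted(
        m for m, runs in module_runs.items() if len(runs) >= 2 and not any(runs)
    )
    bad_slots = sorted(
        s for s, runs in slot_runs.items() if len(runs) >= 2 and not any(runs)
    )
    return bad_modules, bad_slots


# Hypothetical log: stick A fails in both slots, stick B passes in both.
runs = [
    ("A", "DIMM1", False),
    ("A", "DIMM2", False),
    ("B", "DIMM1", True),
    ("B", "DIMM2", True),
]
print(diagnose(runs))
```

If both lists come back non-empty (everything fails everywhere), the log cannot separate stick from slot, and the motherboard or memory bus becomes the next thing to look at.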
 

Kienaba

Explorer
Joined
May 24, 2022
Messages
52
If you were using RAID in the strict sense, that could be the cause of your problems. That is why we insist on proper terminology here. Best of luck finding and fixing your problem. I have nothing to add to the comments of the other regulars, unfortunately.
But I already posted the RAM test results, so isn't it clear that this is the reason? I also have no logs, so a hardware error could well be the cause.
 