System Keeps crashing, rebooting

DasGoG

Dabbler
Joined
Feb 17, 2021
Messages
27
Hello!

So I am new to troubleshooting hard drives etc. through TrueNAS.

Motherboard - ASRock Rack X470D4U
Memory - 2 x 8 GB Crucial non-ECC
(I have 1 x 16 GB ECC, but the MB won't even try to boot with it; another story)
HD - 5 x 4 TB Seagate IronWolf NAS, 5400 RPM
Onboard SATA usage
750 Watt PSU
Ryzen 5 3600 CPU
Onboard Video

It happens at random: four days fine, then bam, it reboots several times in a row. Then maybe it runs a day before the next one. It's all random.
I've had to reinstall and reformat because of the damage it caused, and I lost the data. I temporarily disabled SMART and it now stays up 100% with no issues and never a flaw in any of the data. Not saying it's the fix, just saying I haven't lost data since.
My question is, I just don't understand how to look at the errors, what commands bring up the errors, or how to troubleshoot.
Is there a good step-by-step troubleshooting procedure, starting from step 1?
 

ThreeDee

Guru
Joined
Jun 13, 2013
Messages
700
Have you popped the CMOS battery out to clear/reset BIOS?
What BIOS and BMC versions are you running? I'd recommend updating to the 3.50 BIOS and 2.20 BMC if you are not already. You can update BIOS via IPMI.
If you are not up to date, after updating .. then try your ECC RAM in slot A1 (2nd slot up from CPU per the manual)
What speed is your memory running at?
How are your temps? Any jails installed and if so what are they?

I had an initial RAM detection issue on my motherboard .. I purchased two identical sticks and one didn't work, so I purchased another kit. While waiting for the new kit, I thought I'd at least get TrueNAS installed with the one "good" 16GB stick .. updated the BIOS to 3.50 and IPMI to 2.20 and installed TrueNAS. On a whim, I tried my "dead" stick and it automagically worked. This was after many reseats and trying different slots, with the system refusing to boot whenever that particular stick was installed, whether alongside the other stick or by itself ... I was flabbergasted.

When my new kit arrived .. I installed it too and it all worked! Soooo ... decided to keep everything and cancel the RMA .. Now I'm running 4 x 16GB of mismatched ECC RAM without issue.

Long thread about other users' experiences with the X470D4U:

 

DasGoG

Dabbler
Joined
Feb 17, 2021
Messages
27
Morning, and thank you for that information. I am unable to do so at this moment. I did update the BIOS etc. before any of this, but I have not removed/reseated the battery yet.
As far as memory goes, my memory is not on Crucial's list of supported memory for this MB, so that may be the issue. Also, I did have it in A1 and B1 (tried both).
As far as jails, currently only qbt for torrenting. When monitoring the system I see no obvious issues with heat. But then clearly I don't know how to read errors anyway.
Memory timing.. Hmmm, whatever the MB auto-selects. The ECC memory is
Crucial Server Memory 16GB DDR4 DIMM 288-pin - 2666 MHz / PC4-21300 - CL19 - 1.2 V - unbuffered - ECC (CT16G4WFD8266)
 

TheNicNet

Cadet
Joined
Sep 28, 2015
Messages
2
Did you ever manage to resolve this? I know it's an old thread, but I've got a build with the same MB and I just had two back-to-back random reboots.
 

ThreeDee

Guru
Joined
Jun 13, 2013
Messages
700
TheNicNet said: "Did you ever manage to resolve this? I know it's an old thread, but I've got a build with the same MB and I just had 2 back to back random reboots."
I used to run a 3700X on my X470D4U and my system would randomly become unresponsive with PBO and CPB enabled .. with those disabled, the system was 100% stable running 24/7. I then upgraded to a 3900X. Now I can enable PBO and CPB, but the system becomes unresponsive with C-States enabled, unlike the 3700X where I left C-States enabled .. with C-States disabled, the setup is 100% stable running 24/7 again.

I had to manually reboot to recover from the hangs mentioned above .. I never got random reboots, though. Random reboots like yours could be indicative of a power supply issue.
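
For anyone trying to tell an abrupt reset apart from an orderly reboot, a rough sketch like the one below might help. It assumes you have copied the system log (for example /var/log/messages on a FreeBSD-based TrueNAS install) to a text file and read it in as lines. The two marker patterns are assumptions, not confirmed strings for this board or release, so check what your own log prints at boot and at shutdown and adjust them. The name classify_boots is just a made-up helper.

```python
import re

# ASSUMED marker patterns: verify against your own log and adjust as needed.
BOOT_MARKER = re.compile(r"Copyright \(c\) .* The FreeBSD Project")
SHUTDOWN_MARKER = re.compile(r"syslogd: exiting on signal")


def classify_boots(lines, boot=BOOT_MARKER, shutdown=SHUTDOWN_MARKER):
    """Label every boot after the first one found in the log.

    A boot is "clean" if a shutdown marker was logged since the previous
    boot, and "unclean" if the log jumps straight from normal messages to
    a new boot banner. The first boot is skipped because the log does not
    show what happened before it.
    """
    results = []
    seen_boot = False
    saw_shutdown = False
    for line in lines:
        if boot.search(line):
            if seen_boot:
                results.append("clean" if saw_shutdown else "unclean")
            seen_boot = True
            saw_shutdown = False
        elif shutdown.search(line):
            saw_shutdown = True
    return results


# Hypothetical sample lines, only to show the idea.
sample = [
    "Copyright (c) 1992-2021 The FreeBSD Project.",
    "kernel: some message",
    "syslogd: exiting on signal 15",
    "Copyright (c) 1992-2021 The FreeBSD Project.",
    "kernel: some message",
    "Copyright (c) 1992-2021 The FreeBSD Project.",
]
print(classify_boots(sample))
```

A boot reported as "unclean" only means no shutdown message was logged before it. That fits a power loss, a hardware reset, or a panic rather than a commanded reboot, but it does not say which one.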
 