spurious reboot and POST beeps (FN 11.1 U6)

UK_Dave

Dabbler
Joined
Aug 24, 2015
Messages
20
Hi All,

Yesterday morning my FreeNAS machine was beeping (5 short beeps, 1 long; Supermicro board) and stalled at boot (PEI--Intel Reference Code Execution... 02). Online sources suggest a memory error, yet the IPMI was still happy to tell me both RAM modules were present and what they were.

After work yesterday I took one RAM module out and rebooted: same issue on boot. I swapped the module to the other slot and it booted up, but it sounded like it restarted at some point (I was in my loft with no console, so I was going off the sound of the drives spinning down etc). I logged in via IPMI and saw it was booting up and seemed to be working fine.

I shut it down and put both modules back in, and it rebooted OK again; however, just after it imported the volume it restarted (which I suspect it also did before, I just wasn't in front of a monitor to catch it). After that it was OK.

I've got some screenshots of the BIOS screen when the beeps were going (memory beeps.png) and what the console showed when it triggered the restart (server restart.png). There's nothing particularly helpful though...
Code:
ZFS volume imports complete
generating grub configuration file...
done
Shutdown NOW!


I ran memtest overnight and it passed (see memtest pass.png).

I'm now at a bit of a loss as to where to go next. I have many, many PCs around but only the one that takes DDR4 ECC modules, so testing them in another machine here will be tricky, as will trying other RAM. I might be able to test them in a machine at work, but they may not like the idea of me putting potentially suspect RAM into a machine! These are Samsung modules and annoyingly seem to have only a 12-month warranty (although a slightly different part number with the exact same specs has a 36-month warranty), and they were bought over a year ago.

Machine specs:

Supermicro X11SSH-CTF-O
Intel Xeon E3-1245v6
2x Samsung 16GB DDR4-2400 ECC (M391A2K43BB1-CRC00)
550 W Seasonic Focus Gold PSU
6x WD RED 6TB in RAIDZ2
120GB Corsair Force MP500 NVMe M.2 boot drive
APC BackUPS (1500VA I think)

FreeNAS 11.1 U6

Other info:

I'm currently renovating, and just before Christmas I had to clear my entire ground floor, including my NAS and equipment, which I've stored in the loft since Dec 23rd. I powered it back up on Monday night, having now had the sockets made live, and literally 2 days later this issue arose, so it could well be related (the house is full of brick and plaster dust too, although I've done my best to keep the machine sheltered from this). Thinking about it while writing this, I'd not be surprised if the humidity was higher in the loft, which might not be great for it. The loft was never intended to be its long-term home; I just needed some files off it, so I set it up there temporarily.

Thanks for any advice.
Dave
 

Attachments

  • memory beeps.png (65.2 KB)
  • memtest pass.png (43.8 KB)
  • server restart.png (145.6 KB)

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,079
I started working as a computer technician in 1996, after getting out of the Army, and in the years since then, I can't begin to count the number of times that a fault has been corrected by removing and reinstalling the suspect device. It is often the second step in troubleshooting hardware.
That system could run for another ten years and have no further problem.
I would suggest that you keep an eye on it, but don't worry too much.
 

UK_Dave

Dabbler
Joined
Aug 24, 2015
Messages
20
Thanks for the reply and advice. Yes, I guess I'm maybe being overly cautious, but equally, if it's an early precursor to something going wrong, then by dealing with it sooner I'm more likely to have a chance with warranty etc. I also live in a semi-detached house and am away a fair bit lately, so I don't want to annoy the neighbours too much if it starts alarming while I'm away all weekend!

As you say, I'll just have to keep an eye on it, and fingers crossed reseating the RAM was all it needed.

Cheers,
Dave
 