CompuGlobalHyperMegaNet
Hardware details for my main "X10" server are in my sig.
I've just tried to access an SMB share via Windows and found that it was inaccessible. I went through the usual Windows malarkey to diagnose such problems but when I tried logging into the FreeNAS WebGUI, I found that it too was inaccessible. Sure enough, the FreeNAS server was powered off.
I logged in to the IPMI interface via a web browser and checked the event log but found nothing (although I have a suspicion that the reason I found zilch may have been my fault). I then rebooted the server and logged in via the WebGUI. There were no flashing red icons/alerts, so I then checked /var/log/messages. There wasn't anything that seemed particularly important (but I'll quote the entries below nonetheless) other than giving me an idea of the approximate time the server went down.
Code:
Oct 17 11:00:12 freenas-x10 autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot -r "tank/Youtube_Projects@auto-20181017$
Oct 17 11:00:12 freenas-x10 autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs destroy -r -d "tank/Youtube_Projects@auto-201810$
Oct 17 11:34:19 freenas-x10 alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Oct 17 11:34:19 freenas-x10 alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Oct 17 12:34:28 freenas-x10 alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Oct 17 12:34:28 freenas-x10 alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Oct 17 12:48:12 freenas-x10 autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot -r "tank@auto-20181017.1248-2m"
Oct 17 12:48:13 freenas-x10 autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs destroy -r -d "tank@auto-20180818.1248-2m"
Oct 17 13:00:12 freenas-x10 autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot -r "tank/Youtube_Projects@auto-20181017$
Oct 17 13:00:12 freenas-x10 autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs destroy -r -d "tank/Youtube_Projects@auto-201810$
Oct 17 13:04:10 freenas-x10 devd: notify_clients: dropping unresponsive client: Broken pipe
Oct 17 13:04:11 freenas-x10 kernel: epair0a: link state changed to DOWN
Oct 17 13:04:11 freenas-x10 kernel: epair0a: link state changed to DOWN
Oct 17 13:04:11 freenas-x10 kernel: epair0b: link state changed to DOWN
Oct 17 13:04:11 freenas-x10 kernel: epair0b: link state changed to DOWN
Oct 17 17:28:53 freenas-x10 syslog-ng[1688]: syslog-ng starting up; version='3.6.4'
Oct 17 17:28:53 freenas-x10 Copyright (c) 1992-2016 The FreeBSD Project.
Oct 17 17:28:53 freenas-x10 Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Oct 17 17:28:53 freenas-x10 The Regents of the University of California. All rights reserved.
Oct 17 17:28:53 freenas-x10 FreeBSD is a registered trademark of The FreeBSD Foundation.
Oct 17 17:28:53 freenas-x10 FreeBSD 10.3-STABLE #0 r295946+21897e6695f(HEAD): Thu Jun 8 20:02:12 UTC 2017
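For anyone wanting to pin down the outage window from a longer /var/log/messages, here is a minimal Python sketch that finds the longest silence between consecutive entries. The function names and the hard-coded year are assumptions for illustration (syslog lines like the ones above carry no year), and the sample lines are copied from the excerpt above.

```python
from datetime import datetime

def parse_syslog_time(line, year=2018):
    """Parse the leading 'Mon DD HH:MM:SS' stamp; the year is an assumption."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")

def largest_gap(lines, year=2018):
    """Return (last_before, first_after) around the longest silence in the log."""
    times = [parse_syslog_time(line, year) for line in lines if line.strip()]
    return max(zip(times, times[1:]), key=lambda pair: pair[1] - pair[0])

# Sample entries taken from the log excerpt above.
sample = [
    "Oct 17 12:48:13 freenas-x10 autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs destroy -r -d",
    "Oct 17 13:04:10 freenas-x10 devd: notify_clients: dropping unresponsive client: Broken pipe",
    "Oct 17 13:04:11 freenas-x10 kernel: epair0b: link state changed to DOWN",
    "Oct 17 17:28:53 freenas-x10 syslog-ng[1688]: syslog-ng starting up; version='3.6.4'",
]

before, after = largest_gap(sample)
print(f"Last entry before the gap: {before}")
print(f"First entry after the gap: {after}")
print(f"Silent for: {after - before}")
```

On this excerpt the gap runs from the last entry at 13:04:11 to syslog-ng starting up at 17:28:53, so the server most likely went down shortly after 13:04.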
My next step was to check the Event Log using IPMIview... that's where things got interesting.
Code:
502,System Event,05/01/2018 03:21:53 Tue,Fan,FANA,De-assertion: Lower Critical - going low
503,System Event,05/01/2018 04:18:45 Tue,Fan,FANA,Assertion: Lower Critical - going low
504,System Event,05/01/2018 04:18:48 Tue,Fan,FANA,De-assertion: Lower Critical - going low
505,System Event,05/01/2018 05:05:22 Tue,Fan,FANA,Assertion: Lower Critical - going low
506,System Event,05/01/2018 05:05:25 Tue,Fan,FANA,De-assertion: Lower Critical - going low
507,System Event,05/01/2018 05:28:14 Tue,Fan,FANA,Assertion: Lower Critical - going low
508,System Event,05/01/2018 05:28:17 Tue,Fan,FANA,De-assertion: Lower Critical - going low
509,System Event,05/01/2018 06:12:22 Tue,Fan,FANA,Assertion: Lower Critical - going low
510,System Event,05/01/2018 06:12:25 Tue,Fan,FANA,De-assertion: Lower Critical - going low
511,System Event,10/17/2018 12:05:01 Wed,Watchdog 2,,Assertion: Watchdog 2| Event = Timer interrupt
512,System Event,10/17/2018 12:05:02 Wed,Watchdog 2,,Assertion: Watchdog 2| Event = Hard Reset
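Since the fan entries drown out everything else, a rough Python sketch for filtering an exported event log down to the non-fan rows might look like this. The column layout (ID, type, timestamp, sensor type, sensor name, description) is inferred from the export above rather than from any IPMIview documentation, and the function name is made up for illustration.

```python
import csv
from datetime import datetime

def non_fan_events(lines):
    """Yield (event_id, timestamp, sensor_type, description), skipping Fan rows."""
    for row in csv.reader(lines):
        if len(row) < 6 or row[3] == "Fan":
            continue
        # Timestamp looks like '10/17/2018 12:05:01 Wed'; drop the weekday.
        stamp = row[2].rsplit(" ", 1)[0]
        when = datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S")
        yield int(row[0]), when, row[3], row[5]

# Sample rows copied from the event log above.
sample = [
    "510,System Event,05/01/2018 06:12:25 Tue,Fan,FANA,De-assertion: Lower Critical - going low",
    "511,System Event,10/17/2018 12:05:01 Wed,Watchdog 2,,Assertion: Watchdog 2| Event = Timer interrupt",
    "512,System Event,10/17/2018 12:05:02 Wed,Watchdog 2,,Assertion: Watchdog 2| Event = Hard Reset",
]

events = list(non_fan_events(sample))
for event_id, when, sensor, text in events:
    print(event_id, when, sensor, text)
```

Note that the watchdog entries are stamped 12:05, roughly an hour before the last /var/log/messages entry at 13:04. One possible explanation, which is only an assumption here, is that the BMC clock runs on UTC while the OS logs in UK local time (BST in mid-October), which would put the two events about a minute apart.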
Ignore the fan-related events. I'm using the fans that came with my case, which the X10SL7-F doesn't play nice with, so apparently the fans ramp up and down semi-regularly, even though I can't actually hear it... but I must stress that the server is in a converted attic and we're heading into autumn in the UK, so heat is not an issue (even in the height of summer, for that matter).
What caught my eye was the last two entries concerning the watchdog. I did have the motherboard jumper set to "reset" (pins 1 and 2) but I'm 99.9% sure I had the watchdog disabled in the BIOS. Either way, it's disabled in the BIOS now and the jumper has been removed in order to disable the watchdog.
So... where do I go from here? Are there any other logs I should check?
I do have a PSU tester... just a basic thing that tests whether the voltages and power good signal are within tolerances. Does anyone think it's worth testing the PSU?