nickt
Contributor
- Joined
- Feb 27, 2015
- Messages
- 129
*** EDIT 24 October 2016 ***
My problem is resolved now: the board was toast, suffering from what I now understand to be a well-known problem where patting the watchdog causes the BMC flash memory to wear out, preventing the board from powering up. If you've got this board, and it's still operational, take a look at my "how to" guide on disabling the watchdog to avoid your board dying too. This is obviously a firmware bug on the board; at this point, ASRock has not acknowledged the issue. I'll update my post if and when a firmware fix is issued.
*** EDIT 25 October 2016 ***
William from ASRock tells me that the BMC team at their global HQ is aware of this issue and is working on a firmware fix. No ETA for the new firmware at this point.
*** EDIT 19 February 2017 ***
And ASRock has finally provided a firmware fix. You should be able to use the watchdog as you please with BMC firmware 00.30.00 (and above). Thanks to Dale for calling it out and verifying.
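If you want to script the "00.30.00 and above" check, here is a rough sketch in Python. It assumes the BMC version is always three dot-separated numeric fields, as in the version strings quoted in this post; the function names are my own invention and not part of any ASRock or IPMI tool.

```python
def parse_bmc_version(text):
    """Turn a dotted version string such as '00.23.00' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# First BMC release reported in this thread to contain the watchdog fix.
FIXED_BMC_VERSION = parse_bmc_version("00.30.00")


def watchdog_fix_present(bmc_version):
    """Return True if the given BMC version is 00.30.00 or newer."""
    # Tuples compare field by field, so (0, 30, 0) >= (0, 30, 0) is True
    # and (0, 23, 0) >= (0, 30, 0) is False.
    return parse_bmc_version(bmc_version) >= FIXED_BMC_VERSION


if __name__ == "__main__":
    for version in ("00.23.00", "00.30.00"):
        status = "has the fix" if watchdog_fix_present(version) else "predates the fix"
        print(f"BMC {version}: {status}")
```

Comparing the parsed tuples rather than the raw strings avoids surprises if a field ever drops its leading zero.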
*** /EDIT ***
Hi everyone,
I'm fairly sure that the ASRock C2750D4I motherboard in my FreeNAS box has died without reason, after 15 months of flawless service, but it's hard to be sure as the diagnostics / IPMI aren't giving much away. I'm hoping someone might be able to help me decide for sure.
I'm also very disappointed in ASRock Rack's seemingly non-existent customer service. I've tried a few times now to contact technical support (via http://event.asrockrack.com/tsd.asp and the sales email address) and, two weeks later, I've had absolutely no response other than a robot confirming that I sent a support request. Is this a common experience? I had expected somewhat better...
So... almost without warning (see comment on CPU temperature alarms), my FreeNAS server stopped, and won't power on. After pressing the power button, the power supply starts up, but then cuts out after ~3.5 seconds. There's no output to the VGA and no POST. Subsequent presses of the power button do nothing: I have to remove the power cable from the power supply before attempting again. IPMI works fine, but provides very little information on what might be wrong - there's nothing useful in the logs at all.
I have tried the following:
* Substituted power supply for another known working unit
* Confirmed that my server power supply is able to power another PC without issue
* Progressively removed all USB devices and SATA drives
* Progressively removed all case connections (fans, LEDs etc), with the exception of the power button
* Removed RAM and tried one at a time in the A1 slot
* Attempted with no RAM inserted at all
* Reset CMOS by removing power and battery and holding the power button for more than 30 seconds
In all cases, symptoms are identical: power supply fires up for about 3.5 seconds then cuts out. No VGA output, no POST.
IPMI shows very little; the only clue I have is a bunch of CPU temperature alarms issued just before the server failed. I’ve never had CPU temperature alarms before; there was no load on the CPU (or IO) at the time. The server always runs very cool and is in a cool environment.
I'd really appreciate any suggestions for further diagnosis I could try. I'm pretty sure the motherboard is toast, but I can't completely rule out the RAM as the problem, as I don't have another ECC-capable motherboard.
My server configuration is as follows:
Motherboard: ASRock C2750D4I
RAM: 2x 8 GB Crucial DDR3L 1600 (PC3L 12800) ECC Unbuffered CT2KIT102472BD160B
HDDs: 6x 3TB Western Digital Green WD30EZRX
Case: Fractal Design Node 304
Power supply: A1-3000 420W PSU
BIOS version 2.80
BMC version 00.23.00