Mini XL with Alzheimers

Status
Not open for further replies.

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
On a related note, does the RAM I’m using look right to you? The Asrock QVL list for my board is here. As best as I can tell my memory is listed as the third from the bottom.
Yes, I looked at it when we first discussed it and the numbers x-checked as far as I could see.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
I was starting to wonder whether I had some sort of hex-induced dyslexia.... :confused: thank you!

The good news, though, is that if the memory makes it through the stress test a couple of times, the culprit might end up being something else, like the PSU, which was about to become obsolete.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
The CPU stress test will help identify a PS problem as well. And yes, leave your hard drives attached. If you like, you could disconnect the SATA cables but leave power applied.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Almost 10 hours later, pass #2 is 92% complete with zero errors. More once the whole suite has been executed successfully.

As for the PS, while the hard drives draw some power idling, I suspect that a scrub would stress the PS far more because both the CPU and the drives would be working. As a next step, would it make sense to reboot into FreeNAS and execute a scrub if the memory checks out?
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
I would make sure any valued data was backed up. Doing a scrub while memory is having problems is not something that is likely to turn out well.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Got it. Double backup then scrub. Just the backup alone might stress the server enough to make some of these problems re-appear.

However, I’d also like to think that after a day of continuously hammering literally every bit in the memory banks that we can likely exclude memory from the usual suspects for now.
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
Memory would still be the first on my suspect list. First, the old memory worked okay. Second, this was from ebay at something like a third of the going rate.

Probably the thing to do is shut the system down, put in the old memory, then do the backups. Then try a scrub. If that works, then replace the memory and try the scrub again.
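For comparing the scrub with the old memory against the scrub with the new memory, a minimal Python sketch along these lines could summarize what `zpool status` reports after each run. It assumes the usual `pool:`, `state:`, `scan:` and `errors:` header lines and the "No known data errors" message; the function names are made up for illustration, and the text would come from running `zpool status` on the pool by hand.

```python
def summarize_zpool_status(text):
    """Pick the header fields out of `zpool status` text (assumed layout)."""
    summary = {}
    for line in text.splitlines():
        # Split on the first colon only, so timestamps in the scan line survive.
        key, sep, value = line.strip().partition(":")
        if sep and key in ("pool", "state", "scan", "errors"):
            summary[key] = value.strip()
    return summary


def scrub_looks_clean(summary):
    """True when the pool is ONLINE and no data errors are reported."""
    return (
        summary.get("state") == "ONLINE"
        and summary.get("errors") == "No known data errors"
    )
```

Saving the summary from each scrub makes it easier to see whether errors only show up with one set of DIMMs installed.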
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Yup. Great suggestion to minimize the variables in play. Besides, I had to perform multiple backups anyway, given that my new case will allow an 11-drive z3 and I’ll have to nuke the existing pool to do that.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
As for the PS, while the hard drives draw some power idling, I suspect that a scrub would stress the PS far more because both the CPU and the drives would be working. As a next step, would it make sense to reboot into FreeNAS and execute a scrub if the memory checks out?
I doubt a scrub uses as much power as a CPU stress test. Is the CPU running at 100% while you do your scrub? Hard drives do not consume much more power going from idle to active unless they were not spinning. If you feel like it, run both: the CPU stress test first and then the scrub. Just make sure the RAM has passed for a few days. Watch the CPU temp during your testing. If you also have a high CPU temp, then you have another issue that needs to be addressed.
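One way to watch the CPU temperature during the stress test is to poll `sysctl` from a shell session. The sketch below is only an illustration: it assumes a FreeBSD-based system with the `coretemp` (or `amdtemp`) driver loaded, so that `sysctl dev.cpu` prints lines such as `dev.cpu.0.temperature: 38.0C`. The 70 °C limit is an arbitrary placeholder, not a vendor figure.

```python
import re
import subprocess

# Assumed line format: "dev.cpu.<n>.temperature: <value>C"
TEMP_LINE = re.compile(r"^dev\.cpu\.(\d+)\.temperature:\s*([0-9.]+)C\s*$")


def parse_cpu_temps(sysctl_output):
    """Return {core_index: temperature_in_celsius} from sysctl text."""
    temps = {}
    for line in sysctl_output.splitlines():
        match = TEMP_LINE.match(line.strip())
        if match:
            temps[int(match.group(1))] = float(match.group(2))
    return temps


def cores_over_limit(temps, limit_c=70.0):
    """Return sorted core indexes hotter than limit_c (placeholder threshold)."""
    return sorted(core for core, temp in temps.items() if temp > limit_c)


def read_cpu_temps():
    """Query sysctl on the local machine and parse the temperature lines."""
    result = subprocess.run(
        ["sysctl", "dev.cpu"], capture_output=True, text=True, check=True
    )
    return parse_cpu_temps(result.stdout)
```

Calling `cores_over_limit(read_cpu_temps())` every minute or so while the test runs would show whether any core creeps past the chosen limit.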
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
The nightly emails are kind of like a scrub. One or two of the tests run a find on the filesystems, looking for setuid binaries and ...maybe something else, I forget. Anyway, the drives are going to be doing lots of random seeks.

For that matter, you could run those directly with periodic daily. That will be exactly the same situation as when it had problems before.
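If it helps to repeat that run a few times and note how long each one takes, here is a small Python sketch that wraps the `periodic daily` command. The `run_periodic` name and the injectable `runner` parameter are only for illustration; on the server itself it would need to be run as root.

```python
import subprocess
import time


def run_periodic(kind="daily", runner=subprocess.run):
    """Run `periodic <kind>` and return (exit_code, elapsed_seconds).

    `runner` can be swapped out so the function can be tried without
    root access or a FreeBSD host.
    """
    started = time.monotonic()
    completed = runner(["periodic", kind], check=False)
    return completed.returncode, time.monotonic() - started
```

For example, `code, seconds = run_periodic()` followed by printing both values gives a simple record of whether the system survived the same load that preceded the earlier problems.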
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Well, the unit passed all memory tests, all was well, zero errors across the board. Shut it down, swapped the memory for the OEM stuff, rebooted.

System init took longer and fan behavior was weird: the rear fan would occasionally spin then stop, etc., before coming on at 80% as per the BIOS setting. The first time, the unit made it to "syncer" on the console screen before it rebooted itself. After some more back and forth, the unit now claims to be running, but the web server for FreeNAS is down (yet the IPMI server is working just fine). Voltages look OK.

The CPU temperature is below 40 °C so that's not it either.

So I guess my next step will be to nuke the LAN interfaces again and see if that fixes the problem.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Well, whatever it is, the web server claims to be up yet neither interface (Chelsio or on board) replies to pings, while the IPMI interface does.

Is a factory reset the only way out?
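To repeat the ping comparison between the data interfaces and IPMI after each change, a rough Python sketch like the following could be run from another machine on the same network. The addresses in `HOSTS` are hypothetical placeholders and the function names are made up for illustration.

```python
import platform
import subprocess


def ping_command(host, count=3):
    """Build a ping command; the count flag is -n on Windows, -c elsewhere."""
    flag = "-n" if platform.system() == "Windows" else "-c"
    return ["ping", flag, str(count), host]


def check_hosts(hosts, runner=subprocess.run):
    """Return {label: True/False} for whether each address answered ping."""
    results = {}
    for label, address in hosts.items():
        completed = runner(
            ping_command(address),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        results[label] = completed.returncode == 0
    return results


# Hypothetical addresses; substitute the ones configured on your network.
HOSTS = {
    "onboard": "192.168.1.10",
    "chelsio": "192.168.1.11",
    "ipmi": "192.168.1.12",
}
```

Running `check_hosts(HOSTS)` before and after resetting the LAN configuration shows at a glance which interfaces came back.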
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
Wait... that sounds like one of the motherboard issues. Which could also explain the other problems. Has the motherboard in this system been replaced?
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
No replacements here yet. It’s the original motherboard, purchased along with the Mini XL it sits in directly from iXsystems via Amazon on Sept. 23, 2016.

Is it time to give iXsystems a call?
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
Hmmm.... just about 2 years. Fits the failure pattern. I would indeed give iXsystems a call.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
OK, I'll call iXsystems and see what they say.

EDIT: Can't say enough good things about iXsystems support. The registration was painless and my ticket was answered in less than 15 minutes!

Now working my way through an advanced RMA process, the only thing left (I think!) is communicating my credit card number to them. I feel fortunate to have purchased this server from iXsystems!
 