Stress Testing new FreeNAS 9.3

Status
Not open for further replies.

wreedps

Patron
Joined
Jul 22, 2015
Messages
225
Supermicro X9SRL with Xeon 2670 Running 9.3 Latest build
128GB ECC DDR
8x 900GB 10K SAS in mirrored vdevs. 128GB Extreme Pro cache SSD.
2x dual-port Chelsio 10GbE, directly connected to 2 ESXi 6 hosts.


What are some stress tests I could run on this box before placing it into production?
I have done some simple testing. I can clone a Windows Server 2008 R2 template in 36 seconds to 1 minute on it. Pretty impressive.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
Run Memtest86 for 3 days straight.
Run a CPU stress test for 30 minutes, but not longer or you risk CPU damage.

I can't think of anything else if you can already move data around.

EDIT: How about some long-term data transfers that last, say, 1 day, to verify the components can handle continuous stress? This would test the LAN ports and associated parts.
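
A minimal sketch of what such a long-running transfer test could look like, assuming the FreeNAS share is mounted on a client and Python is available there. The function name `soak_transfer`, the file name `soak_test.bin`, and the chunk size are made up for illustration. It repeatedly writes random data, reads it back, and compares checksums. Note that the read-back may be served from a cache rather than from the disks, so treat it as a rough exercise of the network path, not a proof of data integrity.

```python
import hashlib
import os
import tempfile
import time


def soak_transfer(target_dir, duration_s, chunk_mib=64):
    """Write random chunks into target_dir, read them back, compare SHA-256.

    Returns (rounds_completed, mismatches). All names are illustrative.
    """
    rounds = 0
    mismatches = 0
    path = os.path.join(target_dir, "soak_test.bin")  # hypothetical file name
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        data = os.urandom(chunk_mib * 1024 * 1024)
        expected = hashlib.sha256(data).hexdigest()
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data out
        with open(path, "rb") as f:
            actual = hashlib.sha256(f.read()).hexdigest()
        if actual != expected:
            mismatches += 1
        rounds += 1
    if os.path.exists(path):
        os.remove(path)
    return rounds, mismatches


if __name__ == "__main__":
    # Short local demo; for a real soak, point target_dir at the mounted
    # share and use something like duration_s=24 * 3600.
    with tempfile.TemporaryDirectory() as demo_dir:
        print(soak_transfer(demo_dir, duration_s=1, chunk_mib=1))
```

Any non-zero mismatch count, or a run that stalls partway through, would be worth investigating before going to production.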
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
Why would the CPU be damaged?
If you run, say, Prime95, you are pushing the CPU to 100% load, and even with an outstanding thermal design the extreme heat is going to shorten the life of the CPU. I can't say by how much, and I've never seen any papers published on it; I only know that 30 minutes is more than enough time to build up your maximum heat and to see whether the CPU throws an error. I have also read that your Northbridge chip can overheat as well. I can recall that in the older days you could run a CPU stress test, but back then CPUs either didn't have a heatsink or the heatsink was passive, simply because there wasn't a lot of heat generated. Today's CPUs and chipsets are very different.

Anyway, I hope that answers it. I wish I had something more factual, like a published paper, to point to, but I don't.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
If something fails because you've run a stress test for more than half an hour, then there's a problem in the design. It should be capable of running at 100% for years without any problems.

Of course higher temp = lower lifetime, but half an hour or a few hours (or even days) won't make a difference :)
 

MrToddsFriends

Documentation Browser
Joined
Jan 12, 2015
Messages
1,338
I usually run Prime95 for many hours on all of my home builds after hardware assembly. I did that with my client PCs as well as with my NAS. Yes, it should run for years if the cooling and the underlying hardware don't have problems.
 

wreedps

Patron
Joined
Jul 22, 2015
Messages
225
I have Memtest running right now. Server is on a UPS. I will let it run for a few days. Will it stop? Do I need to check on it?
I will look at it in the AM.
 

rsquared

Explorer
Joined
Nov 17, 2015
Messages
81
It will run until you stop it. I typically check it after several hours to see if it completed the first pass and whether any errors were detected. After I've seen that the first pass was good, I'll check it once a day or so just to see if any errors popped up.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
I'm more than happy to agree to disagree about how long to run a CPU stress test and its effects on CPU longevity, but my opinion is that 30 minutes should be sufficient.

wreedps said:
I have Memtest running right now. Server is on a UPS. I will let it run for a few days. Will it stop? Do I need to check on it?
I will look at it in the AM.

You should periodically check the screen to ensure there is activity and the system hasn't crashed. Of course any errors would not be a good thing, and you would need to work out whether tweaking the BIOS settings fixes them or whether you just have a bad or incompatible RAM module. With any luck it just keeps cycling through RAM and no errors pop up.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Yep, even if you can do it for days, 30 min should be enough (for air-cooling) ;)
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Actually, I hate liquid cooling, period. I don't even believe in it for gaming machines because of the higher risk of water leaks, the additional maintenance, etc. :P
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
If 100% load on your CPU for thirty lousy minutes - especially on a server - is enough to damage it, I'd have a whole lot of dead machines laying around.

Modern processors will throttle their performance if they reach a certain internal temperature, and that's what you need to be looking for. If you can't sustain the minimum guaranteed clock speed, your cooling setup is not up to the task and you need to fix it, whether that's with new thermal interface material, improved or redesigned airflow, or environmental changes.
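
As a rough illustration of that check, here is a small sketch that flags throttling in a list of clock readings logged during a load test. The function name, the readings, the 2600 MHz base clock, and the 50 MHz tolerance are all made-up values for the example; use the base clock from your own CPU's spec sheet. How you collect the readings depends on the platform (on FreeBSD something like `sysctl dev.cpu.0.freq` may be available, but that is an assumption about your setup).

```python
def find_throttling(samples_mhz, base_clock_mhz, tolerance_mhz=50):
    """Return the indices of samples that fell below the base clock.

    A small tolerance absorbs measurement jitter around the base clock.
    """
    floor = base_clock_mhz - tolerance_mhz
    return [i for i, mhz in enumerate(samples_mhz) if mhz < floor]


# Hypothetical readings taken once a minute during a stress test.
readings = [2600, 2600, 2595, 2400, 2200, 2600]
throttled = find_throttling(readings, base_clock_mhz=2600)
print(throttled)  # [3, 4]
```

An empty result over the whole run suggests the cooling kept up; repeated hits point at the thermal setup rather than at the CPU itself.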
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
HoneyBadger said:
If 100% load on your CPU for thirty lousy minutes - especially on a server - is enough to damage it, I'd have a whole lot of dead machines laying around.

Modern processors will throttle their performance if they reach a certain internal temperature, and that's what you need to be looking for. If you can't sustain the minimum guaranteed clock speed, your cooling setup is not up to the task and you need to fix it, whether that's with new thermal interface material, improved or redesigned airflow, or environmental changes.

This. Saved me from tapatalking a long reply.

We routinely beat the crap out of silicon for a decade. CPU or RAM failing after the first 90 days is very unusual, at least compared to HDD, fan, and maybe PSU failure.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
There's one big advantage to liquid cooling: The good stuff is much easier to install than your average air cooler. I particularly hate the Intel stock coolers' lack of proper tactile feedback during installation - I'm never quite sure if it's sitting right.

That said, everything but my workstation is air-cooled around here.

HoneyBadger said:
If 100% load on your CPU for thirty lousy minutes - especially on a server - is enough to damage it, I'd have a whole lot of dead machines laying around.

Modern processors will throttle their performance if they reach a certain internal temperature, and that's what you need to be looking for. If you can't sustain the minimum guaranteed clock speed, your cooling setup is not up to the task and you need to fix it, whether that's with new thermal interface material, improved or redesigned airflow, or environmental changes.

Very true.

On Intel platforms, this is handled by the black magic mode of the processor (also known as System Management Mode) and is an almost entirely separate system. The interactions are quite hard to follow these days, but on a server board it's going to involve the Management Engine, the BMC, the system firmware (UEFI) and possibly others.
 

wreedps

Patron
Joined
Jul 22, 2015
Messages
225
Long enough????
 

Attachments

  • IMG_1267.JPG (237.9 KB)

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
wreedps said:
Long enough????

Seventy hours and three passes? Not really, at least not here. Part of the burn-in process is to see what problems appear, so running memtest for a *month* is quite reasonable. Happy to see that you had all CPUs active though.
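
For a sense of scale, a back-of-the-envelope sketch using the numbers above. It assumes every pass takes about the same time, which may not hold exactly, and it treats a month as 30 days; the function name is made up for the example.

```python
def passes_in(total_hours, observed_hours=70, observed_passes=3):
    """Estimate how many passes fit in total_hours at the observed rate."""
    hours_per_pass = observed_hours / observed_passes  # about 23.3 hours
    return total_hours / hours_per_pass


month_hours = 30 * 24  # 720 hours, assuming a 30-day month
print(round(passes_in(month_hours), 1))  # 30.9
```

So at roughly a day per pass, a month of burn-in works out to around 31 passes on this much RAM.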
 

wreedps

Patron
Joined
Jul 22, 2015
Messages
225
Well I am booting it back up. I have customers that need space now. Appreciate the quick response though!
 