Cooling for a Dell PERC H310

Status: Not open for further replies.
Joined Apr 9, 2015 · Messages 1,258
My board has the SAS controller on board. With the chipset right next to it, I just mounted an 80mm fan over both of them to keep them constantly cooled.

@Black Ninja, is that a battery backup running to an actual RAID card I see? I can see the RAM chips under the labels on the left-hand side of that card, and with SAS cables hooked up I really hope you are not running ZFS on that.
 
Joined Nov 11, 2014 · Messages 1,174
My board has the SAS controller on board. With the chipset right next to it, I just mounted an 80mm fan over both of them to keep them constantly cooled.

@Black Ninja, is that a battery backup running to an actual RAID card I see? I can see the RAM chips under the labels on the left-hand side of that card, and with SAS cables hooked up I really hope you are not running ZFS on that.

It's a super-cap, but it could've been a battery too.

What does "I hope you are not running ZFS" mean? I could put in an HBA, remove the battery, and run ZFS, and it would run even colder than anything you can imagine. If I can cool a 9271 this way, I can freeze a 9211. The point is not about the file system, it's about how to cool the card (RAID/HBA) the best way possible. I can't think of a better way in a 1U chassis. No extra fans were added and the chip is at 46 °C.
 

Evertb1 · Guru · Joined May 31, 2016 · Messages 700

jgreco · Resident Grinch · Joined May 29, 2011 · Messages 18,680
Thank you for all the information in this thread!

But thinking about this, wouldn't it be better for a small home user to use (consumer) Marvell cards instead of LSI? I'm sure the LSI cards are much faster than normal consumer stuff, but don't they also use much more energy and generate much more heat than consumer controllers?

The problem with a NAS is that the interface to the HDDs has to be as close to 100% perfect as possible, and to date there are really only two things that usually fall into the "acceptable" category: one is the built-in SATA on Intel PCHs, the other is an LSI HBA with all its quirky version and firmware caveats.

In the early years of FreeNAS 8, there were lots of people running dodgy boards and questionable setups with awful ethernet chipsets, and a lot of time and effort was spent in the forums trying to debug hardware issues, which led me to rage-write "So you want some hardware suggestions." That represented a turning point here in the forums, and we nearly stopped seeing the awful parade of AMD APU systems with 4GB of RAM and Realtek ethernets, and the problems that came with them.

You are trusting all of your data to a cooperating set of hard disks. This is a complex and in some ways fragile arrangement, where you really don't want drives dropping offline, bad data being spewed onto your disks, or driver issues causing the NAS to hang for seconds (or minutes) at a time. I can absolutely guarantee you that there are other SATA chipsets that are perfectly usable with FreeNAS, but the consumer PC market has a massive problem with quality control, with multiple versions being sold under the same part number, and with knock-offs, so the "Marvell" or "Realtek" controller you get might not be legit and might only barely work with the normal Windows drivers. Actually finding an add-in card that is *going* to be reliable is relatively challenging, and next month's shipment of cards may be based on a different chip made on a different dirt-cheap Shenzhen assembly line.

One of the big things the LSI HBAs have going for them is that FreeNAS users have many millions of cumulative hours of run-time validating that the silicon, firmware, and driver combination is stable and reliable. If you go down to the corner shop, get a two-port SATA controller based on the Mystikal X1120231, and plug it in, you are really starting off at hour zero, and you need to do extensive testing and validation, probably for months, to see if any problems show up. FreeNAS will happily push the silicon to the breaking point during activities such as scrubs or resilvers, so you really have to do a ton of maximal-load testing to gain appropriate confidence in the hardware.

Most users do not have the skills to do that, nor do they want to, which is why the LSI HBA is such a popular item. It just works, reliably, correctly, in a variety of ways, including direct and expander based SATA and SAS.

The watt burn sucks. The cooling issues in ATX chassis are infuriating. But it's the easier path, in many ways.
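
To make the "maximal load testing" mentioned above a bit more concrete, here is a minimal sketch of one way to go about it (an assumed example, not an official procedure): keep every disk behind the new controller busy with parallel sequential reads while periodically kicking off a scrub on top. The device names and the pool name ("tank") are placeholders; adjust them for your own system.

```python
#!/usr/bin/env python3
# Minimal burn-in sketch (assumed example): hammer every disk behind a new
# controller with parallel sequential reads while periodically starting a
# scrub, then watch the console, logs, and SMART data for errors.
# The device list and pool name are placeholders -- adjust for your system.
import subprocess
import time

DISKS = ["/dev/da0", "/dev/da1", "/dev/da2", "/dev/da3"]  # disks on the controller under test
POOL = "tank"                                             # pool to scrub
HOURS = 24                                                # total run time

# Start one sequential reader per disk (each exits when it reaches the end of
# the disk; for multi-day runs you would restart them in a loop).
readers = [
    subprocess.Popen(
        ["dd", f"if={disk}", "of=/dev/null", "bs=1m"],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    )
    for disk in DISKS
]

end = time.time() + HOURS * 3600
try:
    while time.time() < end:
        subprocess.run(["zpool", "scrub", POOL])  # layer scrub load on top of the reads
        time.sleep(6 * 3600)                      # re-issue a scrub every 6 hours
finally:
    for proc in readers:
        proc.terminate()
```

The point of a harness like this is just to keep the controller saturated for long stretches; any dropped drives, resets, or checksum errors during the run are the signal you were looking for.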
 

jgreco · Resident Grinch · Joined May 29, 2011 · Messages 18,680
This post really got me thinking about airflow improvements I could make: improving by channeling the airflow instead of adding an extra fan. I am so pleased with the results that I feel I should share them with the community.

The chassis is a 1U WIO with an LSI 9271-4i. The idle temperature before was 62 °C, and after the modification it dropped to 46 °C. That is a 16 °C difference just from adding my own air shroud! I can "torture" my RAID card any way I want and I am sure it won't go above 55 °C at any given load.

The air shroud for the CPU and RAM section (first 4 fans) is the original part that comes with the chassis. The one I added channels the air from the last fan to the RAID card. I hope the pictures show it well.

Nice job. The 2208 cards usually come with those crappy small-hole vented backplates, and this is a significant problem especially in a WIO platform.

For reference, do a Google image search for BKT-0066L.

This has been remediated on newer products with the use of square-hole brackets, which is a Supermicro hack, but unfortunately the 0066L DOES NOT FIT the older 2208 controllers, which are basically still my all-time favorite (the 3108's sometimes have strange driver issues).

So the point of this message:

We've been rebuilding a few dozen X9DBU/CSE113 systems here in the shop for a nonprofit, and because we're swapping in inexpensive donated Dell H700i controllers, we came to an unusual arrangement where we use a square-hole blank plate (high airflow) on the bottom slot, and put the H700i in the bottom slot. Because the H700i's are cable top-exit, and we're using inexpensive straight cables, there's an immediate right-angle bend in the cables that serves to hold the controller in place. We place a straight-slot blank (low airflow) in the top slot, so there is low airflow across the top of the card but high across the bottom. If you want a picture I might be able to dig one up on the next build.

In any case, there's a possibility you could remove the backplate and replace it with a square blank plate. There's some risk there in that there will no longer be anything physically holding the card other than friction, so you might swap it with that other card and work out some spacers or something. I'm too lazy^Wcheap^Wbusy to go have a metal fab shop make some actual brackets up for the 2208 cards, but this whole thread is an excellent example of why I should probably start designing cooling parts again.
 

dak180 · Patron · Joined Nov 22, 2017 · Messages 310
The RAID chip itself often runs much hotter, as reported by its onboard sensor, and you'll see lots of people freaking out because theirs is reporting 70°C-85°C, but this isn't really outside the realm of what seems to be reported in a lot of servers. A 9270-8i in one of our 2U's here reports at 61°C, which seems fine. The chips are fairly hardy, and seem to be able to run at high temperatures. However, many of the LSI's are known to become problematic once they're reporting as 90+°C, despite the spec sheet for some of them reporting a 115°C max temp. At those temperatures, they appear to start corrupting data. RAID controllers report problems with the member drives (probably related to that corruption), HBA's start spewing incoherent bits, appearing to read and write sectors with some corrupt bits.
@jgreco, you seem to imply here that there is an onboard sensor on most, if not all, of the RAID chips. I have a Dell PERC H310 flashed to LSI IT firmware; I have not been able to discover any way of reading temperature data off this card. Is there a tool that you are aware of to do this (preferably from within FreeNAS)?
 

jgreco · Resident Grinch · Joined May 29, 2011 · Messages 18,680
@jgreco, you seem to imply here that there is an onboard sensor on most, if not all, of the RAID chips. I have a Dell PERC H310 flashed to LSI IT firmware; I have not been able to discover any way of reading temperature data off this card. Is there a tool that you are aware of to do this (preferably from within FreeNAS)?

The older PCIe 2 parts such as the 2008 (your H310) and 2108 do not have a sensor. Newer high end parts such as the 2208 do, though as these lack the HBA IT capabilities, you would not use them for a FreeNAS HBA. I don't recall offhand about the 2308 (PCIe 3 HBA).

In the context of the discussion, the point was that the RoC silicon might run very warm, even hot, but some people read the spec sheet and panic because they see the ambient temp spec. The SAS2208 has an ambient temp spec of 45°C (w/ BBU/supercap) or 60°C (w/o BBU/supercap), but that's *ambient*, not the number you get from the RoC sensor. The 2208 RoC is rated for a junction temperature of 115°C, though I skeptically think it might cook its little brains out by that point.

You basically have three temperatures that I'd deem "interesting", which are the environmental temperature (external air), the temperature at the heatsink, and the temperature at the silicon. If your environ is warm, then you will have trouble with overall cooling. The silicon temp needs to be kept reasonable. The heatsink temp is an indicator of all that. If you have hot silicon and a hot heatsink then you either have a hot environ or bad airflow. If you have hot silicon but a cool heatsink then you probably have a cool environ with good airflow. The RoC temperature sensor is nice on one hand but it is really only the equivalent of a car's "overheat" dummy light in that it doesn't tell you as much as would be useful.
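
For the parts that do expose a sensor (2208/3108-class MegaRAID cards), something like the sketch below can pull the reading from the FreeNAS shell. It assumes storcli64 is installed and the controller shows up as /c0; the exact wording of the output can vary by firmware, so treat the parsing as an assumption (MegaCli's -AdpAllInfo output reportedly carries a similar "ROC temperature" line). A 2008-based H310 in IT mode will simply have nothing to report.

```python
#!/usr/bin/env python3
# Rough sketch: read the RoC temperature from a MegaRAID card that has a sensor.
# Assumptions: storcli64 is installed and the controller is /c0; the output
# wording ("ROC temperature...") may differ by firmware, so the regex is loose.
# A 2008-based card such as the H310 (IT-flashed or not) reports no sensor.
import re
import subprocess

def roc_temperature(controller="/c0"):
    """Return the RoC temperature in Celsius, or None if no sensor is reported."""
    out = subprocess.run(
        ["storcli64", controller, "show", "temperature"],
        capture_output=True, text=True, check=True,
    ).stdout
    match = re.search(r"ROC temperature\D*(\d+)", out, re.IGNORECASE)
    return int(match.group(1)) if match else None

if __name__ == "__main__":
    temp = roc_temperature()
    print(f"RoC temperature: {temp} C" if temp is not None else "No RoC sensor reported.")
```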
 
Joined Nov 11, 2014 · Messages 1,174
On a side note: it seems I am not getting these email alerts all the time (it's been over a month); the best way is to log in and check manually. :(


Nice job. The 2208 cards usually come with those crappy small-hole vented backplates, and this is a significant problem especially in a WIO platform.

For reference, do a Google image search for BKT-0066L.

I googled it, and it seems I already have some of these. I didn't know they had anything to do with Supermicro; I found them on eBay from a seller in Germany and liked them so much that I ordered 10 half brackets and 10 full ones. In the same 1U WIO chassis in the picture, I used a BKT-0066L type bracket to replace the original, which was almost solid. I am talking about the 3rd expansion slot that sits behind the CPU (half length). I have nothing there, so I replaced it to help the CPU cooling, and it really works great (a few °C less in CPU temp, and the fans don't need to spin as fast to push the air out in a stress test).


older 2208 controllers, which are basically still my all-time favorite (the 3108's sometimes have strange driver issues).

You probably saved me a lot of headaches and money by saying this, without even knowing it. I've never had a 3108 or any SAS3 (12Gb) controller and was considering one for my next build. I've been eyeballing it since you mentioned it over a year ago in your 1U X10SRW/E5-1650v3/LSI 9361 build. So if there are issues, I'd rather stay with the legacy parts. I only wanted it because of SAS3, so I could get 2 SSDs in a mirror and get 1 GB/s for my datastore, instead of doing 4 SSDs in RAID10 to get the same speed.
Anyway,
the solution I am thinking about (since you can't fit this BKT-0066L bracket alongside an LSI card) is this: perhaps you can order a bracket for the newer LSI 9361 and put it on the older 2208. I am hoping it will fit. The reasoning is that the new honeycomb-style bracket used on the LSI 9361 has much better airflow compared to the original that comes with the LSI 9271, right? That could be almost as good as putting a BKT-0066L on the controller, so we wouldn't have to make our own ghetto brackets with a power drill. And I am thinking this honeycomb bracket from the newer LSI controllers would be good for 10Gb network cards and other hot cards as well.

We place a straight-slot blank (low airflow) in the top slot, so there is low airflow across the top of the card but high across the bottom. If you want a picture I might be able to dig one up on the next build.

That would be very helpful indeed.


Coming across this post really got me thinking, made me abandon the "extra fan on the heatsink" idea, and helped unleash my creativity. Now that I've mastered the power of air shrouds, I am unstoppable. The picture I shared of my 1U WIO has 5 fans, instead of the 6 that the X10SRW allows, and it still cools the RAID card down to 46-47 °C. Also, this was carefully designed and tweaked so that if fan 5 fails (which mostly cools the expansion slots), the RAID card will still get some airflow from fans 1-4 through a small gap I purposely left, so the RAID card will rise by not much more than 10 °C; we are looking at 56-57 °C, still perfectly fine within the operating range (tested).
 
Joined Nov 11, 2014 · Messages 1,174
The RoC temperature sensor is nice on one hand but it is really only the equivalent of a car's "overheat" dummy light in that it doesn't tell you as much as would be useful.

Not sure why you made the comparison with the car light, which has only an on or off state. The RoC sensor will give an exact temperature. So you can change brackets, change fans, change CPU load, tinker with anything to get better cooling, and the RoC sensor will tell you whether you made any difference and how much of a difference, really.
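
To put numbers on that kind of tinkering, here is a small sketch along the lines of the polling example above (same assumptions: storcli64 installed, controller at /c0): sample the reported RoC temperature at a fixed interval during a load run and print a summary, so a bracket, fan, or shroud change can be compared run against run.

```python
#!/usr/bin/env python3
# Small sketch (assumes storcli64 and controller /c0, as in the earlier example):
# sample the RoC temperature once a minute during a load run and summarize,
# so two runs (before/after a bracket or shroud change) can be compared.
import re
import subprocess
import time

SAMPLES = 30    # number of readings to take
INTERVAL = 60   # seconds between readings

def read_temp():
    out = subprocess.run(
        ["storcli64", "/c0", "show", "temperature"],
        capture_output=True, text=True, check=True,
    ).stdout
    match = re.search(r"ROC temperature\D*(\d+)", out, re.IGNORECASE)
    return int(match.group(1)) if match else None

temps = []
for _ in range(SAMPLES):
    temp = read_temp()
    if temp is not None:
        temps.append(temp)
        print(f"{time.strftime('%H:%M:%S')}  {temp} C")
    time.sleep(INTERVAL)

if temps:
    print(f"min {min(temps)} C  max {max(temps)} C  avg {sum(temps) / len(temps):.1f} C")
```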
 

jgreco · Resident Grinch · Joined May 29, 2011 · Messages 18,680
Yeah I occasionally get strange hang issues with storcli on the 3108's. I do OOB monitoring on RAID storage because I find it to be more consistent and reliable, and for some reason I see the 3108's wedge on storcli sometimes, so my "fix" was to switch back to 2208's.

Not sure why you made the comparison with the car light, which has only an on or off state. The RoC sensor will give an exact temperature. So you can change brackets, change fans, change CPU load, tinker with anything to get better cooling, and the RoC sensor will tell you whether you made any difference and how much of a difference, really.

It doesn't tell you the things you actually need to know, like what's actually going on. If your overheat light goes on, are you low on coolant? Have a leak? Low oil? Blocked airflow? Etc?

The positioning of the RoC sensor puts it in a similar class of usefulness. It can tell you that something's wrong, but it leaves you guessing what.
 
Joined Nov 11, 2014 · Messages 1,174
I see what you mean.

Yes, it won't tell you that there is a blanket over your server, but what sensor would do that anyway, right? :smile:

At least you'll know whether you made things better or worse when tinkering with the airflow, by observing the temp gauge. I'll take that, compared to having no RoC sensor at all, like the LSI 9261 and LSI 9211.
 