Norco RPC-4220 Fan suggestions

Status
Not open for further replies.

azathot

Dabbler
Joined
Jan 13, 2015
Messages
10
Howdy folks,

I have a Norco RPC-4220 with 20 drives (10 HGST Deskstar NAS 3.5-Inch 4TB 7200RPM and 10 Western Digital WD40EFRX 4 TB WD Red) and I am encountering some heat issues.

There are four 80mm fans directly to the rear of the drive backplane and two 80mm fans on the back of the case. These are all stock.

Do you have any suggestions for improving the airflow? Right now, after a few days, the pool goes into a degraded state from errors. However, if I shut the NAS box off for 10 minutes or so, these errors go away.

I'm not sure what the current suggestion is. Do I swap out the backplane for 120mm fans? Do I upgrade the 80mm fans with something more formidable? I'm open to any suggestions.

Thanks!
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
May I ask sir, what is the precise thermal situation? Exactly how hot do the drives get before they start throwing errors? If drives are getting hot enough that they start throwing errors, then the situation is usually quite out of hand. If you look at smartctl -x for each of the drives, you should see a number of statistics that show lifetime and current power cycle temperature ranges. I'd like to see those out of curiosity, and so that they are written down here for the next guy with the same situation.
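For anyone who wants to write those figures down for several drives at once, here is a minimal sketch of pulling the temperature lines out of smartctl -x output. It assumes the drive reports an SCT status section in the usual "Current Temperature" and "Min/Max Temperature" layout. The sample text and the function name parse_sct_temperatures are made up for illustration, and some drives print these fields differently or not at all.

```python
import re

# Illustrative excerpt only: invented numbers in the layout that the SCT
# status section of `smartctl -x` commonly uses. Real output varies by
# drive model and smartmontools version.
SAMPLE_OUTPUT = """\
Current Temperature:                    41 Celsius
Power Cycle Min/Max Temperature:     29/47 Celsius
Lifetime    Min/Max Temperature:     18/58 Celsius
"""


def parse_sct_temperatures(text):
    """Return a dict of the temperature figures found in the given text.

    Keys that cannot be found are simply left out of the result.
    """
    temps = {}

    current = re.search(r"^Current Temperature:\s+(\d+) Celsius", text, re.M)
    if current:
        temps["current"] = int(current.group(1))

    # Both range lines share one shape: "<label> Min/Max Temperature: lo/hi Celsius"
    for label, key in (("Power Cycle", "power_cycle"), ("Lifetime", "lifetime")):
        pattern = rf"^{label}\s+Min/Max Temperature:\s+(\d+)/(\d+) Celsius"
        match = re.search(pattern, text, re.M)
        if match:
            temps[key + "_min"] = int(match.group(1))
            temps[key + "_max"] = int(match.group(2))

    return temps


if __name__ == "__main__":
    print(parse_sct_temperatures(SAMPLE_OUTPUT))
```

To run it against a real drive, the text could come from something like subprocess.run(["smartctl", "-x", "/dev/ada0"], capture_output=True, text=True).stdout, where /dev/ada0 is only a placeholder device name.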
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Just for fun, I bet on 55 C...
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
Any drive that got to 55C, I would throw out.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Yep, if you're getting errors because of the temperature, then the drive is pretty much cooked at this point, because the temperature needs to be really high to cause errors; 45 °C would not do this. It should still be usable long enough to do a backup, though.
 

SirMaster

Patron
Joined
Mar 19, 2014
Messages
241
DrKK said: "Any drive that got to 55C, I would throw out."

That's silly... Why would WD rate their Red disks for 70C then?

So you know more about HDD performance and acceptable operating specs than the engineers at WD?

I had a cooling failure in one of my servers and my disks got to ~60C for about an hour. This was well over a year ago, and none of the 12 disks in said system have shown any errors or issues because of this. They are still running today.

So I guess if you like to waste thousands of dollars you can just throw them out...
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
I don't know why they rate them so high, but I can guarantee you that the drive will not last long at 70 °C. I think they use accelerated aging and extrapolate the lifetime from that, but it's not the same as a proper life test (of course, to do that you need to wait a few years for the drive to fail, which is why they don't do it...).

1 hour isn't that long, but a few days or more at this kind of temperature is really pushing the drive...

I think what I would do is replace them (leaving one or two in place if I use RAID-Z2 or RAID-Z3) and keep the removed drives as spares.
 

SirMaster

Patron
Joined
Mar 19, 2014
Messages
241
Yes, high temperatures like this are obviously not doing the drive any favors and should be avoided in all cases. However, I think it would be silly to just throw away a drive that happened to get this hot briefly, especially when it's well within operating specifications.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
That's all well and good sir. But I have a few points, which everyone is free to take or leave as they see fit:

  • If I had any high density platter drive get over 60C, I would throw it away, or relegate it to emergency cold spare status only.
  • If I had any high density platter drive get well over 50C, I would probably replace it, take the old drive out of service, and relegate it to spare status.
  • But none of these would ever happen, because my FreeNAS is properly configured, and would instantly email me LOOOOOOOOOOOOOOONG before either of these things would occur, in which case I would either respond on site, or shut down the server pending my arrival on site.
 

SirMaster

Patron
Joined
Mar 19, 2014
Messages
241
DrKK said: "But none of these would ever happen, because my FreeNAS is properly configured, and would instantly email me LOOOOOOOOOOOOOOONG before either of these things would occur, in which case I would either respond on site, or shut down the server pending my arrival on site."

You are clearly very foolish if you think that being "properly configured" and having some email notification set up is enough to prevent things like this from "ever happening".

Of course I have email notifications set up as well as other forms of monitoring. I've been running servers and storing lots of data for more than a decade. And I must be doing something right since I've never lost a single bit of data in my life yet.

You can have all the external notifications you want set up, but that doesn't help you when your Internet connection goes out and your cooling system fails while the Internet is out, which is what happened in my case. I even have external watchdogs set up to notify me when my server loses Internet connection so I actually knew the Internet was out, but if I am away there is no way for me to do anything about it like shut down the server or view any of its statuses with no Internet connection to it. I had no reason to believe that it was overheating. It hadn't overheated ever in its entire life before then.

Since then I have added a process that will shut down the server locally if disks get too hot, and that is clearly an improvement, but still nothing is going to be perfect and things could still go wrong. Anyone who has never run into unexpected software bugs has simply not used computer software enough, or is too naive if they think their system is perfect and nothing will ever go wrong.
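The post does not show how that local shutdown process works, so what follows is only a rough sketch of the general idea, not the actual setup. It assumes per-disk temperatures have already been collected into a dict. The 50 C limit, the function names, and the FreeBSD-style shutdown -p now command are all assumptions chosen for illustration.

```python
import subprocess

LIMIT_C = 50  # hypothetical threshold; choose one that suits your drives


def hottest(temps_by_disk):
    """Return (disk, temperature) for the warmest disk, or None if empty."""
    if not temps_by_disk:
        return None
    disk = max(temps_by_disk, key=temps_by_disk.get)
    return disk, temps_by_disk[disk]


def should_shut_down(temps_by_disk, limit_c=LIMIT_C):
    """True when at least one disk is at or above the limit."""
    worst = hottest(temps_by_disk)
    return worst is not None and worst[1] >= limit_c


def check_and_act(temps_by_disk, limit_c=LIMIT_C, dry_run=True):
    """Return the shutdown command if the limit is reached, else None.

    With dry_run=True (the default) the command is only returned, never
    executed, so the logic can be tried without powering anything off.
    """
    if not should_shut_down(temps_by_disk, limit_c):
        return None
    command = ["shutdown", "-p", "now"]  # assumed FreeBSD power-off command
    if not dry_run:
        subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    # Made-up readings: ada1 is over the limit, so the command is reported.
    print(check_and_act({"ada0": 38, "ada1": 52}))
```

Because the check runs on the server itself, it does not depend on the Internet connection being up, which is the gap described above. Run from cron or a similar scheduler, it would only react as often as it is scheduled.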

I'm not about to replace over $2000 in HDDs in my home file server just because they got a little warm for an hour or so. Clearly it was not necessary as I have not seen a single issue with any of the disks in over a year since the incident. I think I'm going to trust WD's engineers on this one.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
SirMaster said: "but that doesn't help you when your Internet connection goes out and your cooling system fails while the Internet is out"

That's why I designed a hardware fan controller that shuts down the server if a fan fails. I wonder why this kind of thing isn't already integrated into the motherboard and/or chassis (yes, I know there's IPMI, but it's software). I think a few dollars of electronics is money well spent if it can save hundreds or even thousands of dollars of drives (and the data...) :)
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
You win the Murphy Award of the year. You just gave me an incentive to go forward with my dedicated fan controller system. It seems FreeNAS 10 will have infrastructure to communicate regarding unusual events such as this, so I'm hoping for some neat integration.
 