Proactive hardware replacement?

Status
Not open for further replies.

patrick sullivan

Contributor
Joined
Jun 8, 2014
Messages
117
This has been asked before, but only in older threads (at least from what I have found). Is it worthwhile to replace any hardware before it starts to break down, e.g. power supply, hard drives? My system has been running 24/7 for 3 years.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
The simple answer is No.

The more informative answer is: If you treat this as a high value data server then you will have the proper RAIDZ level and have critical data backed up. If the data needs to be available 24/7 then you would also have a spare hard drive on hand to resilver when a hard drive starts to show errors, and you would replace the drive as soon as you can. Ensure you are running SMART Long tests frequently to catch those hard drive issues early. It's better to replace a failing hard drive on your schedule than after it dies hard.

With a properly designed system, good power, and good cooling, there is no reason to think it wouldn't last a decade or longer. The exceptions are the hard drives, of course; plan for those to fail.
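
As a rough illustration of catching drive issues early, here is a minimal Python sketch that reads the attribute table printed by `smartctl -A` and flags nonzero error counters. The function names, the list of watched attributes, and the sample table are assumptions made for illustration (only the 40350-hour figure comes from this thread); real output varies by drive and vendor.

```python
import re

# Attributes often treated as early-warning signs (an assumption, not from the thread).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")

# Made-up sample in the layout of a `smartctl -A` attribute table.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   045   045   000    Old_age   Always       -       40350
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
"""

def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from smartctl-style table text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            match = re.match(r"\d+", fields[9])  # leading integer of RAW_VALUE
            if match:
                attrs[fields[1]] = int(match.group())
    return attrs

def warning_signs(attrs):
    """List the watched attributes whose raw value is nonzero."""
    return [name for name in WATCHED if attrs.get(name, 0) > 0]

attrs = parse_smart_attributes(SAMPLE)
print("Power-on hours:", attrs["Power_On_Hours"])
print("Warning signs:", warning_signs(attrs))
```

A nonzero count here is only a prompt to look closer and get the spare ready, not proof that the drive is about to die.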
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
So while @joeschmuck is certainly correct from the intellectual/truth standpoint, I myself sometimes find it necessary to think with the heart, in addition to thinking with the brain.

I, personally, must feel like I can "trust" all of my equipment. Of course, I have the right RAID levels, the right equipment, God knows the right level of skill to properly maintain it. I even have cold spare new virgin drives on hand, should one of them fail. But what I expect is for the system to WORK; I don't EXPECT to ever need to replace anything. I have never in my life, actually, needed to bring down one of my cold spares, knock on wood. When the components are aged to a point where I am now starting to expect a failure "any day now", then that's no longer where I want to be.

When I have drives getting into 30000 hours of rust-spinning time, then it's simply time. And I replace the whole NAS. This is what I've always done, at least since I got my first bigboy job after grad school. I buy tech for today (not tomorrow), and plan to use important things (like a NAS) for 3-4 years, and unimportant things until they break or become too dated to use. I don't even attempt to "future-proof" a system (to me, that's a mark of an idiot, justifying a very pedestrian and immature "E-{genitals}" as if it were wisdom). And then when the equipment is getting elderly, I replace it, or demote it to some kind of emergency backup system or whatever, or give it to a grad student who can't afford the computers I take out of petty cash (I was once in that spot myself, and would have liked to have the local DrKK giving me his "old" computers). I certainly don't wait for things to break, these days.

So there are at least two schools of thought on this. Most sensible people, like @joeschmuck, simply say: "Know what you're doing, buy proper equipment, and be ready to replace failing equipment, and your stuff can last a decade or more!" Less sensible people like me say: "Proactively replace your platforms and devices as the equipment starts getting into elderliness---after all, it's really not that expensive, and I hate surprises, like waking up in the morning---in a hotel room in a different continent, because Murphy's Law---with failing hard drives".
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
First of all, I would not classify @DrKK as less sensible; a different perspective, sure.

With that said, I hope this was a typo or maybe this is some kind of IT point of view:
"When I have drives getting into 30000 hours of rust-spinning time, then it's simply time. And I replace the whole NAS."
My drives have over 40,350 hours on them and the system is not that old. While I do expect the drives to make it to the 5 year point, one never really knows when they will die. I would expect to replace my drives one more time and after 10 years is up, likely replace the entire system with one which is more efficient. By then a 6TB SSD might cost only $300 (It's a pipe dream but it's my pipe dream).

My personal plan is that once I have a single failing hard drive, I will replace it with a same-size spare I have on hand and then start purchasing replacement hard drives for the entire pool. I have well exceeded the warranty period of my drives and I do expect them to fail, and when one fails, I expect the others to fall in line and fail soon after. They could last another year, but that is playing the odds game and, as @DrKK said, Murphy's Law applies: I too would be traveling when something bad happens. Not much you can do about it when you are not in physical proximity to the machine, except maybe shut it down and fix it once you return.
 

Evi Vanoost

Explorer
Joined
Aug 4, 2016
Messages
91
In a production setting with well over 150 drives spinning, I have learned that you neither put brand new hardware into mission critical production nor let very old hardware run it. Once the majority of things have 'burnt in' for a month or so you can start using them, and once the majority of warranties have expired (usually 3-5 years down the road) it's well time to move the system into a backup role.

By then, new hardware cost, speed, size, and technologies will have changed enough to justify the expense. But our now-tertiary backup is nearly a decade old and still running. The hard drives have been replaced because of capacity issues, but most drives will last 3-5 years before they're too small. I still have workstations and smaller servers with 10-year-old drives, although they don't store any critical information and are properly backed up.

As for power supplies and the like, I've had all sorts of things fail, but those failures are hard to predict; most solid-state components and even fans will last for well over a decade. I have to keep an Apple Xserve G5 and XRAID alive for convoluted reasons, and I even have a Sun SPARCStation from the 90s that compiles stuff for an equally old MRI system. Never any failures, not even disk failures.

You can never trust a drive, though; it will fail, sooner or later. I had purchased 12 of the Seagate 3TB SATA models at one point, a big mistake, as they all failed in a matter of months. Luckily it was only a backup system, because I had weekends where 5 drives failed simultaneously. It was eventually found that the firmware was to blame: a drive would lock up if a SMART check ran at the same time as a read/write. In the meantime, I had to take them all out of production again.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
"With that said, I hope this was a typo or maybe this is some kind of IT point of view:"
Not a typo. I consider somewhere in the 30k-40k range to be the AARP card for normal, consumer, hard drives. After all, why do you think the RMA window closes at about that point? hmmm?? :) I freely admit that this is a very hard core, minority view, and it's not for everyone. Just as lots of people continue to be very productive and useful and can do some of their best work after they qualify for AARP membership, so too can hard drives. You just decide what your money-to-inconvenience ratio is, and you decide what your risk-to-serenity ratio is. Mine are, respectively, very high, and very low. As a grad student making $14400 per year in 1999, though, I would have sung a different song.
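
For reference, the hour counts in this thread can be converted to years of continuous running. This is only a sketch of the arithmetic: it assumes 24/7 operation and a 365-day year, and the function name is made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # assumes 24/7 operation, ignoring leap days

def hours_to_years(power_on_hours):
    """Convert power-on hours to years of continuous running."""
    return power_on_hours / HOURS_PER_YEAR

# Figures mentioned in the thread: the 30k and 40k marks, and 40,350 hours.
for hours in (30000, 40000, 40350):
    print(f"{hours} hours is about {hours_to_years(hours):.1f} years")
```

Under those assumptions, 30k-40k hours works out to roughly 3.4 to 4.6 years, which falls inside the 3-5 year warranty periods mentioned earlier in the thread.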
 

patrick sullivan

Contributor
Joined
Jun 8, 2014
Messages
117
Excellent discussion. Thank you all for your input!

Cheers!
 