I have a 3-year-old FreeNAS, never had a drive fail, should I be getting nervous?

AllanB

Dabbler
Joined
Feb 6, 2017
Messages
11
It's a 12 disk RAIDZ3; the disks are WD Red 8TB, model WD80EFRX. Obviously these drives can't last forever. RAIDZ3 can survive a lot of damage, but if, for example, two drives failed in quick succession and another two failed while resilvering that would do it in. Back when I had an MDADM RAID I had two drives fail within a couple days of each other. This system gets a monthly scrub, which is good for peace of mind but must wear the drives.

So, I'm wondering if others have systems for rotating in newer disks even when failures are not happening, etc. It's a nice problem to have - I don't really like switching out disks. But losing the whole thing would be a pain. Any thoughts? Thanks.

(I see this has been discussed before, but most recent seems to be 2017 - https://www.ixsystems.com/community/threads/proactive-hardware-replacement.55665/ )
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
That's what burned-in cold spares are for. I wouldn't be too concerned at 3 years as long as you have regular SMART tests scheduled.
 

Herr_Merlin

Patron
Joined
Oct 25, 2019
Messages
200
I have been running several NAS, SAN, and hardware RAID1 disks constantly, 24/7, for more than 5 years, except for one datacenter move. None of them has failed.
The most disk failures I've seen were on an old EMC CX-3 with 60 spinning FC disks after close to a decade of runtime, but even those were negligible given the age, redundancy, cold spares, hot spares, runtime, and how often it had been moved.
 

AllanB

Dabbler
Joined
Feb 6, 2017
Messages
11
SMART tests run once a month, but I have no idea where the results are logged. I assume they'd get emailed to me if something comes up bad? I do get daily emails of load summaries..
 

hescominsoon

Patron
Joined
Jul 27, 2016
Messages
456
If there's an error, you will also see an alert in the upper right when you log into the web UI. :)
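On where the results are logged: SMART self-test results are kept in the drive's own self-test log, which `smartctl -l selftest` prints. A minimal sketch of reading it from Python follows; the device name `/dev/ada0` and both helper function names are assumptions for illustration, so substitute whatever your system actually reports.

```python
import shutil
import subprocess


def selftest_log_command(device):
    """Build the smartctl call that prints a drive's self-test log."""
    return ["smartctl", "-l", "selftest", device]


def read_selftest_log(device):
    """Return the self-test log text, or None if smartctl is not installed.

    Typically needs root privileges to query the drive.
    """
    if shutil.which("smartctl") is None:
        return None
    result = subprocess.run(
        selftest_log_command(device),
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes to flag drive problems
    )
    return result.stdout


if __name__ == "__main__":
    # "/dev/ada0" is a hypothetical device name; adjust it for your system.
    print(read_selftest_log("/dev/ada0"))
```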
 

pschatz100

Guru
Joined
Mar 30, 2014
Messages
1,184
In my home NAS, I had one drive fail after five years. By then, my storage was at 80% capacity so I decided to upgrade the entire pool. I replaced the bad disk with a larger disk, then rotated in larger disks one at a time until the entire pool was upgraded.

As @Jailer commented, I would not be too worried after three years, but definitely keep an eye on it. And you do have your data backed up, right? RAIDZ is not a substitute for a proper backup strategy.
 

AllanB

Dabbler
Joined
Feb 6, 2017
Messages
11
The data is backed up, and the most important files also have offsite copies. The local backups are just NTFS drives, though, so they don't have redundancy or the ability to detect silent errors like ZFS does.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
RAID is like car insurance.... with a pit crew. You may not need it very often, but you can't afford to lose your data. Disk failure rates are getting close to 1% per year. So with 12 drives, zero or one failure in 5 years would be pretty normal. Beyond 5 years, the numbers get worse.
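To put rough numbers on that, here is a small sketch of the arithmetic. It assumes drives fail independently at a constant 1% annual rate, which is a simplification (real drives in one chassis share age, batch, and environment); the function name is made up for illustration.

```python
from math import comb


def prob_failures(n_drives, annual_rate, years, k):
    """P(exactly k of n_drives fail within `years`).

    Assumes independent drives and a constant annual failure rate.
    """
    # Chance that a single drive fails at some point in the period.
    p_fail = 1 - (1 - annual_rate) ** years
    # Binomial probability of exactly k failures among n_drives.
    return comb(n_drives, k) * p_fail**k * (1 - p_fail) ** (n_drives - k)


p0 = prob_failures(12, 0.01, 5, 0)
p1 = prob_failures(12, 0.01, 5, 1)
print(f"no failures: {p0:.3f}, one failure: {p1:.3f}, zero or one: {p0 + p1:.3f}")
```

Under those assumptions, about 55% of 12-drive pools see no failure in 5 years and about 34% see exactly one, so zero or one failure covers close to 89% of cases.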
 
Joined
Jan 4, 2014
Messages
1,644
Personally, rotating disks is like rotating tyres on a car for me. It's a strategy that might yield some benefits. However, there's still a single point of failure...the car! Think of your pool as the car. Have you considered building a second server and setting up replication? It can accommodate all your older disks and doesn't need to be as powerful as your primary server. Have a triple disk failure on one server and you lose the pool on that server. It's highly unlikely you lose the pool on the second server at the same time. With this arrangement and an effective 3-2-1 strategy, you can pretty much forget the pain-in-the-butt local backups that you're doing.

Though frowned upon in some circles, I run RAIDZ1 and use mirrored USB boot sticks, and have done so for years. I sleep easy at night because every dataset is replicated onsite, and key datasets replicated offsite as well. Disks on secondary servers are starting to bump into 7 years of service running 24/7. No biggie. They're on secondary servers.
 