Do I need ZFS to protect against bit rot?

Status
Not open for further replies.

Zorac

Dabbler
Joined
Nov 12, 2011
Messages
14
Total n00b question. The more I read, the more confused I get. I don't need a NAS, and life would be simpler without one; my network is already too complicated for just my home! But if I do need ZFS, FreeNAS is the best way to go. The question is, do I really need ZFS for protection against bit rot?

Today's hard drives with SMART are apparently pretty good at catching and correcting issues. Depending on what you read, the general consensus seems to be that this is sufficient for home users, and the chance of losing data is very, very low.

For what it's worth, all my data resides on a single computer and is only shared to my HTPC. It's on a single NTFS drive and is backed up once a week with robocopy to an external drive. The external drive is stored in a fire/waterproof safe. I used to be a wedding photographer, so between those photos, my photos of my kids and family, and documents from high school through university to my current taxes/financials, I have a pile of important data (around 0.5TB worth). But if I have a better chance of winning the lottery than losing data to bit rot on a hard drive, I can live with my setup.

I've been thinking of maybe doing static backups to Blu-ray discs periodically to catch the really important stuff, which may be a better solution.

Thanks,
 

William Grzybowski

Wizard
iXsystems
Joined
May 27, 2011
Messages
1,754
Yes, you probably need something, not necessarily ZFS.

Data can become corrupted, even silently, while it is stored, due to bad media. What if that happens before you back things up? You will be backing up corrupted data...
If you really care about your data, you should at least store it in a redundant way (RAID).
 

Zorac

Dabbler
Joined
Nov 12, 2011
Messages
14
If I went that way, I would probably be using something like Intel's Matrix RAID with RAID 1. I thought that only protected against hardware failure? Unless you're using a file system with built-in checksumming, the bit rot just gets duplicated through the array. I'm not worried about hardware failure; my backups have me covered on that.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
RAID5 does not provide data protection of the sort you might be thinking. A RAID5 array is potentially capable of detecting that there is inconsistent data, but it does not do so as a matter of practice upon every read, because doing so involves reading all the disks and parity for the data in question - a performance-destroying proposition. Historically, filesystems and CPU-based resiliency strategies also did not calculate checksums automatically due to the CPU-intensive nature of those operations, so there is no middle layer offering protection to your file data, and the protection offered by the lower level is somewhat questionable.
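To make the parity point concrete, here is a minimal toy sketch in Python. It is not how any real RAID controller is implemented; the `stripe` dictionary and the function names are made up for illustration. It only shows why a normal read returns corrupted data without complaint, while a consistency check has to touch every block in the stripe.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID5-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def normal_read(stripe, index):
    """Fast path: return one data block without reading the other disks or parity."""
    return stripe["data"][index]

def verified_read(stripe, index):
    """Slow path: read every data block plus parity and compare before returning."""
    if xor_blocks(stripe["data"]) != stripe["parity"]:
        # Parity shows that something is wrong, but not which block is wrong.
        raise IOError("stripe is inconsistent")
    return stripe["data"][index]

# Hypothetical three-disk stripe plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = {"data": list(data), "parity": xor_blocks(data)}

# Simulate silent corruption: one bit flips in the second block.
stripe["data"][1] = b"BBBC"

print(normal_read(stripe, 1))  # the corrupted block comes back without any error
try:
    verified_read(stripe, 1)
except IOError as err:
    print("verified read failed:", err)
```

Note that even the slow path only knows the stripe is inconsistent. Without a separate checksum it cannot tell which block went bad.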

ZFS is a game-changer in that regard because it was designed from the start to provide checksum and data protection services at the filesystem level. By allowing ZFS to manage your devices directly, rather than adding an abstraction layer like hardware RAID5 in between you and the devices (which masks from ZFS the actual situation on the disks), it becomes possible for the filesystem layer to be more intelligent about how data is stored. This may include services such as storing multiple copies of file data, automatically and transparently, and performing on-the-fly checksums of files.
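As a rough illustration of why checksums plus multiple copies matter, here is a small Python sketch. It is a conceptual toy under my own assumptions, not ZFS's on-disk format or algorithm; `mirror_read` and `checksummed_read` are hypothetical names. A plain mirror hands back whichever copy it reads first, while a checksummed read can pick the copy that still matches and repair the other.

```python
import hashlib

def checksum(block: bytes) -> str:
    """Checksum stored separately from the data it protects."""
    return hashlib.sha256(block).hexdigest()

def mirror_read(copies):
    """Plain mirror: return the first copy, with no way to know whether it is good."""
    return copies[0]

def checksummed_read(copies, expected):
    """Return a copy that matches the stored checksum and overwrite bad copies with it."""
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        raise IOError("every copy fails its checksum; restore from backup")
    for i in range(len(copies)):
        copies[i] = good  # "self-heal" the damaged copies
    return good

original = b"wedding photo bytes"
stored_checksum = checksum(original)

# Two mirrored copies; the first one has silently rotted.
copies = [b"wedding phoTo bytes", original]

print(mirror_read(copies))                        # returns the rotted copy
print(checksummed_read(copies, stored_checksum))  # returns the intact copy
print(copies[0] == original)                      # the bad copy has been repaired
```

The design point is that the checksum lives apart from the block it describes, so a damaged block cannot vouch for itself.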

Now, quite frankly, hardware has gotten lots better over the years. Even 20 years ago, it was a little unusual for quality hardware to produce an error without detecting it in some manner. Nowadays, everything is "smart" and has a CPU, complicated protocols, and error detection. However, there's also a drive towards cheaper hardware. Going back that same 20 years, a quality hard drive might have cost $1500... and even a cheap hard drive was hundreds of dollars.

Just a few months ago, we were seeing 3TB HDDs for $109. Think about it.

For those of us who have habitually spent extra cash to make sure we're getting good hardware and taking extra steps to protect our data, I think the value-add of ZFS checksums is almost zero. Errors in data handling in a cheap system of the kind that cause BSODs under Windows, bit errors in non-ECC memory, cheap knock-off controllers made in China that are built to "just barely works" standards: yes, these things are potentially destructive to your data. But I haven't seen a disk corruption on any of our systems in many years that wasn't readily identifiable as a bad disk block by the hardware... and we do have a fair bit of stuff where bit rot would be detected and would hurt if it happened. So if you're concerned about bit rot, my suggestion is to buy quality hardware. Then run ZFS on top of it. It's very belt-and-suspenders. If your data is important to you, ZFS can be part of a resilient storage system. Sun designed it with the future in mind, and I think they did a grand job. I wouldn't trust any irreplaceable data even to ZFS on a server-grade platform, though... you need backups (or replication) too.

Now, for the original poster's question. In this age, you can go in two directions. One is to store copies of data on hard drives, which is relatively cheap and effective. If it's important, buying a few 750GB hard drives, making multiple copies, and running md5 on all the files to verify the file contents is very comforting. For smaller amounts of data, I strongly prefer SD cards and Taiyo Yuden DVDs (4GB/ea); Taiyo Yuden makes some well-regarded ones that should keep your data readable for the foreseeable future. Making archival dumps onto the DVDs several times a year means multiple copies of older data accumulate without any extra effort.

http://www.imagescience.com.au/kb/q...on+to+Taiyo+Yuden+Master+Grade+Archival+Media

As hard drive capacities continue to explode, the practicality of DVD-based backups will continue to diminish, of course, but it's still a good option for documents and stuff like that as the technology should be good for at least another decade.
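For the "run md5 on all the files" step, here is a minimal Python sketch using only the standard library. The function names and the manifest layout (relative path mapped to hex digest) are my own choices for illustration, not a standard tool or format. MD5 is adequate for spotting accidental corruption; swap in `hashlib.sha256` if you prefer a stronger hash.

```python
import hashlib
from pathlib import Path

def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its MD5 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): md5_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def verify(root, manifest):
    """Return the relative paths that are missing or whose contents no longer match."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or md5_of(target) != expected:
            bad.append(rel)
    return bad
```

A typical use would be to call `build_manifest` on the master copy, save the result alongside the data, and then run `verify` against each backup drive or disc after copying and again every so often. Anything it returns is a file to re-copy from a copy that still verifies.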
 

Zorac

Dabbler
Joined
Nov 12, 2011
Messages
14
Thank you, that's the best explanation I've read yet! Reading white papers from Oracle doesn't really give it to you in user-friendly terms.

Come to think of it, the only BSODs I've had in recent years were always due to video driver issues. Although I very rarely come across a corrupt file that could be bit rot, there usually is something else at play (i.e. a network transfer), and when you're paying $100/yr in power to run a server, you have to factor that in too. WAF is always a consideration too! (Apparently I already have too many computers?!?!?)

I used to be pretty good at burning all my important stuff, but haven't done that since I started running an external HD for backup. It might be a good idea to start burning backups again and forgo the NAS for now. The capacity of Blu-ray discs makes it easy to do. Even the cheapest media lasts 5 years under less than ideal storage conditions, and by then I'll probably be looking at a different solution. Burning the really important stuff on an annual basis would ensure changes are captured and avoid any issues with bit rot on the Blu-ray media. At the end of the day, it's about mitigating the risks as best you can and trying to keep it as simple as possible.
 