FreeNAS without ECC RAM

tomsk

Cadet
Joined
Feb 8, 2019
Messages
8
Hello,

I know that FreeNAS uses ZFS and that ZFS is said to require ECC RAM, because of checksums and file validation.

But I read here https://arstechnica.com/civis/viewtopic.php?f=2&t=1235679&p=26303271#p26303271 that Matthew Ahrens, a ZFS creator, said:
There's nothing special about ZFS that requires/encourages the use of ECC RAM more so than any other filesystem.

He also said that we can use the ZFS_DEBUG_MODIFY flag (zfs_flags=0x10), so that ZFS checks data in memory before writing it to disk. Does that mean it could work like Btrfs? As far as I know, Btrfs has a self-healing feature like ZFS does, but it doesn't keep data in RAM, so if you use Btrfs you don't need ECC RAM. There is no Btrfs for BSD systems, though, so I wonder if I can live without ECC RAM in FreeNAS.

Thank you
 

ascl

Dabbler
Joined
Jan 30, 2019
Messages
26
Ultimately, any system that handles your data without ECC RAM can corrupt your data. This is true of btrfs as well as any other process... if data gets corrupted in memory and is destined for a disk, it will get written in corrupted form. Software generally assumes that RAM is stable; it more or less has to, given that the software also lives in RAM.
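
To make that concrete, here is a minimal sketch of a toy write path. It is not ZFS code: `write_block`, `verify_block`, and `flip_bit` are hypothetical helpers, and SHA-256 merely stands in for whatever checksum the filesystem uses. It contrasts a bit flip that happens in RAM before the checksum is computed with one that happens on disk afterwards.

```python
import hashlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with a single bit inverted (simulated bit flip)."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def write_block(data: bytes):
    """Toy write path: checksum whatever is currently in memory, store both."""
    return data, hashlib.sha256(data).digest()


def verify_block(stored) -> bool:
    """Toy read path: recompute the checksum and compare with the stored one."""
    data, checksum = stored
    return hashlib.sha256(data).digest() == checksum


original = b"music file ripped ten years ago"

# Case 1: the bit flips in RAM *before* the checksum is computed.
# The filesystem faithfully checksums the already-corrupted buffer.
stored_after_ram_flip = write_block(flip_bit(original, 3))
ram_flip_passes_verification = verify_block(stored_after_ram_flip)
ram_flip_data_is_wrong = stored_after_ram_flip[0] != original

# Case 2: the data is checksummed correctly, then a bit flips on disk.
data, checksum = write_block(original)
stored_after_disk_flip = (flip_bit(data, 3), checksum)
disk_flip_detected = not verify_block(stored_after_disk_flip)

print("RAM flip passes verification:", ram_flip_passes_verification)
print("RAM flip data is wrong:      ", ram_flip_data_is_wrong)
print("Disk flip detected:          ", disk_flip_detected)
```

In the first case the block verifies cleanly even though its contents are wrong, which is why a checksumming filesystem cannot stand in for ECC; in the second case the mismatch is caught, which is the situation checksums are designed for.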

So, yes, of course you can live without ECC RAM... most of us probably have desktops without ECC RAM, and they generally work. That said, I chose to spend the extra money for ECC RAM on my NAS, because among the music files I have had stored for years on my QNAP with RAID 6, I found a few that have been corrupted. Some of these I first ripped more than 10 years ago, and they have moved across a few storage solutions. Was this the result of a lack of ECC RAM? I don't know, but to make it worse, my offline backup also has these files in corrupted form. So for me, ECC (and ZFS, yes I'm late to the party) was worthwhile. I value (at least some of) the data I store. Some of it is either very difficult or impossible to replace.

Also, this seems to be a loaded question around here.
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
Sometimes people miss the bigger picture.

ZFS has no reasonable tools or capabilities to recover from corruption of metadata that has a valid checksum. I've seen people here lose pools because of power outages. What kind of lameness is that?! (I'm smart enough to know why, don't waste your time lecturing me. The proof of the pudding is in the tasting.) ZFS does have higher demands than other file systems, unless you consider 100% pool loss a reasonable equivalent to CHKDSK. ZFS may have been designed to handle unreliable disks, but it was targeted towards very reliable Sun computing hardware.

Once your ZFS metadata corrupts, you are S-O-L. Run ZFS on ECC, or keep very good backups. (Of course, everybody has perfect backups!)
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
ZFS requires ECC RAM
I know @ascl covered this above, but I want to reiterate that this notion is completely false. ZFS most certainly does not require ECC. All "checksumming" of data blocks and other things is done, of course, in software.

There are some edge cases where people have claimed that having non-ECC RAM can make a bad situation worse in certain ZFS instances, but Matt Ahrens, and basically everyone else on the development side of ZFS, has gone on record saying they think that is the feces of the bull.

Bottom line: most of us use ECC RAM, simply because FreeNAS is a solution for those who want to do it right. And any kind of server, particularly a file server, should be done right. We bought our hardware specifically to make a FreeNAS. When you do that, it only costs a little more to get the ECC stuff. It's certainly not REQUIRED.
 

millst

Contributor
Joined
Feb 2, 2015
Messages
141
https://research.cs.wisc.edu/wind/Publications/zfs-corruption-fast10.pdf

This is the most scientific information I've seen on the subject. Basically, ZFS offers no special protections against corruption in memory (just like other filesystems).

I'd kind of echo what DrKK said. Presumably, you chose FreeNAS/ZFS because you care about data integrity. If that's true, why wouldn't you spend the extra few bucks to maintain that integrity in memory?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
There's nothing special about ZFS that requires/encourages the use of ECC RAM more so than any other filesystem.

I think the interesting thing about this statement that most people miss is the last six words, or else they don't interpret those words correctly.

When you skip the ECC, whether you're running NTFS, UFS, EXT3, or ZFS, your platform loses the ability to detect corruption in-core. This is bad-ish, but really only if you have an undetected bit flip. It's like riding in a car without a seat belt. It's fine right up until you get into an accident. You don't notice otherwise.

Now, unlike NTFS, UFS, or EXT3, a significant issue with ZFS is that it lacks any sort of filesystem recovery tool to correct corrupted metadata. The available remediations basically boil down to "live with it" or "rebuild the pool," both of which are kind of miserable in various ways. It's definitely possible a corruption won't have an operational impact. But the big thing is that you are likely to not even realize it's there, because ZFS can't detect it.

So, from the point of view of someone who wants Mr. Ahrens to be blessing their non-ECC system, they hear "oh yeah it's fine, go ahead, no problem."

A more pragmatic review suggests "it's fine, the error rates are equivalent to other filesystems." That's strictly true, but somewhat disingenuous, because ZFS lacks filesystem repair tools. That means that if something goes wrong you might need to rebuild the pool, and that may mean evacuating the data, and if you've got a 96TB pool (a mere 12x8TB HDD), finding a place to store all that data while you rebuild the pool is a real PITA.
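
As a rough back-of-the-envelope sketch of what evacuating such a pool involves, the snippet below estimates one-way copy time. The throughput figures are assumptions picked for illustration (roughly what 1 GbE and 10 GbE links sustain at best), the pool is treated as completely full at its raw 96TB, and `transfer_days` is a hypothetical helper, not part of any tool.

```python
def transfer_days(size_tb: float, throughput_mb_s: float) -> float:
    """Days needed to copy size_tb terabytes at a sustained throughput in MB/s."""
    size_bytes = size_tb * 1e12                    # decimal terabytes, as drives are sold
    seconds = size_bytes / (throughput_mb_s * 1e6)
    return seconds / 86400                         # 86400 seconds per day


pool_tb = 12 * 8  # 12 x 8TB HDD, raw capacity

# Assumed sustained rates; real-world numbers are often lower.
assumed_links = [("1 GbE, ~110 MB/s", 110), ("10 GbE, ~1000 MB/s", 1000)]

for label, rate in assumed_links:
    one_way = transfer_days(pool_tb, rate)
    # Evacuating and then restoring means moving the data twice.
    print(f"{label}: {one_way:.1f} days one way, {2 * one_way:.1f} days round trip")
```

Under those assumptions the copy alone takes on the order of days to weeks, before counting the cost of the temporary storage itself.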

That's an operational situation I never want to have to face, and running ECC RAM is a modest expense in comparison. Besides, older DDR3 gear is dirt cheap and I can source used ECC RDIMM and CPU at a price that makes it stupid to buy new non-ECC stuff.

If you don't care about your data, non-ECC is fine. If you really need data integrity and you have a massive pool, ECC is really a great idea. Everything else is shades in between.
 

ascl

Dabbler
Joined
Jan 30, 2019
Messages
26
Now, unlike NTFS, UFS, or EXT3, a significant issue with ZFS is that it lacks any sort of filesystem recovery tool to correct corrupted metadata. The available remediations basically boil down to "live with it" or "rebuild the pool," both of which are kind of miserable in various ways. It's definitely possible a corruption won't have an operational impact. But the big thing is that you are likely to not even realize it's there, because ZFS can't detect it.

This is a really good point, and while I had seen this stated, it never really occurred to me what the ramifications of this are. How many times have you (any of us) run fsck or chkdsk or whatever? Not often, but far more often than I'd want to restore from backup!

Also, the linked whitepaper is pretty interesting, and concludes with:
In summary, so far we have studied two extremes: ZFS, a complex filesystem with many techniques to maintain on-disk data integrity, and ext2, a simpler filesystem with few mechanisms to provide extra reliability. Both are vulnerable to memory corruptions. It seems that regardless of the complexity of the file system and the amount of machinery used to protect against disk corruptions, memory corruptions are still a problem

They do go on to describe some programming approaches that could reduce the probability of failure due to bit flips, which I found quite interesting.
 