New to FreeNAS - quick question regarding RAM

Status
Not open for further replies.

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I think a very simple way to decide whether ECC is "needed" is to look at your use case.

1. If you plan to use it with data that will not be backed up, and you wouldn't care if your server suddenly became unable to read any of that data with no warning, then go with non-ECC.
2. If #1 doesn't apply to you 100%, then you should go with ECC.

The problem I regularly see is people who say they are in the first camp, buy non-ECC RAM, and then a few months later decide to store their very important data on the server. Some time later they have problems that point to RAM, they have no data, and their backups were trashed by the same non-ECC RAM problems.

If you are the kind of person who can commit to never storing important information on your server and can handle the limitations above, then go for non-ECC. But from what we've seen in the forums, everyone decides they'll store "just a little bit of important information" on the server, and later regrets it.

I'd also like to mention that Supermicro's server-grade stuff generally has better compatibility than a random name-brand desktop motherboard. So I personally prefer to stick with the server stuff, just because it generally won't cause some of the headaches that desktop hardware can give you. I used Gigabyte and ASUS desktop motherboards when I was doing my first testing with FreeNAS. They had little quirks that were a pain in the butt. I'm glad that when I bought my Supermicro motherboards, all of those problems just went away. ;)
 
Joined
Aug 12, 2017
Messages
4
I think there are so many choices in building a FreeNAS system that it can get overwhelming. Here are some things to think about: ECC support spans both the CPU and the motherboard, so to convert a non-ECC system to an ECC system you are replacing both. There are some inexpensive motherboards and CPUs that support ECC.

Memory is key - 8GB is okay; 16GB is way better for a small system (and I hope someday to have the test results to prove that).
A way to make a FreeNAS system cheaper is to skip RAIDZ2. You could go with no RAID at all, or a mirror.

Finally don't forget you'll need a backup for your media storage - so you'll want a large USB drive.

Whatever you build - I think you will find it a lot of fun. I certainly did.

Thanks,

David
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
16GB is way better for a small system (and I hope someday to have the test results to prove that).
Proving 16GB RAM is better than 8GB RAM is so dependent on use case that I just don't see how you can prove anything, unless you took a barebones system, installed only FreeNAS with a minimal configuration, and tested it first with 8GB RAM and then with 16GB RAM. But what type of benchmarks would you run for that testing? If you load up the system with software, there is nothing to prove; we all know more RAM is better. If you run iSCSI, you should have 32GB RAM for proper performance. These things are known. So I'm not sure what you would like to prove.

But when it comes to a basic system, we recommend 8GB RAM as it will get the job done. We recommend more RAM based on the use-case. The goal is to ensure the swap space never gets used.
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
Why the 50GB boot SSD? Doesn't FreeNAS only boot from that drive, then run from memory?
No, it has not worked that way for several years. The boot device is also the operating system device. Not much is stored there, but reliability is still important. People seem to go out of their way to buy small SSDs, but I think that's a mistake unless they are much less expensive than a consumer-grade 120GB SSD with no wear on it.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I didn't find that link to have drawn any conclusion other than trying to guide people to the dark side. I thought it was a bit (pun intended) funny that the author used "evil RAM" an awful lot. The fact of the matter is, if you want to reduce the chance of data corruption, you should use ECC RAM over non-ECC RAM. Nowadays I do not force the issue if someone wants non-ECC RAM; I just point out the possible pitfalls and let it go. It's not my data, and I will not browbeat a person into using ECC RAM.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
That article is specifically addressing a scenario in which non-ECC RAM with ZFS is said to be more dangerous than non-ECC RAM with other filesystems, because a scrub with bad RAM could destroy good data on the pool. More specifically, the scenario laid out by @cyberjock here. And, as the article points out, that scenario just isn't going to happen. For non-ECC RAM to result in the corruption of previously-good data on the pool during a scrub, several things need to happen:
  1. The system incorrectly reads good data from the disk that doesn't match its hash ("checksum"), due to defective RAM damaging either the hash or the data (or both) (plausible).
  2. In an attempt to fix the apparently-bad data read in step 1, the system reads parity/redundancy data, which is also read incorrectly due to the same bad RAM (plausible).
  3. When reading the parity/redundancy, the system also reads its hash incorrectly (plausible).
  4. The incorrectly-read data from step 2 matches the incorrectly-read hash from step 3 (astronomically unlikely).
  5. The system then overwrites the apparently-bad data it initially read with the corrupted data it thinks it read at step 2.
ZFS is not going to overwrite bad data with more bad data. For the "scrub of death" to actually happen, the system needs to read both parity and its hash incorrectly, and in such a way that the bad parity and the bad hash match. That just isn't going to happen (since ZFS uses a 256-bit hash, the odds of this happening are 1:2^256, or 1 in 1.16 x 10^77).
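The arithmetic in that last parenthetical can be sanity-checked in a couple of lines of Python (this checks only the stated 1-in-2^256 figure, and assumes, as the post does, a 256-bit checksum):

```python
# Odds that corrupted parity happens to match a corrupted 256-bit hash:
# one chance in 2**256.
collision_space = 2 ** 256
print(format(collision_space, ".3e"))  # 1.158e+77, i.e. ~1 in 1.16 x 10^77
```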

So, is ECC good? Of course it is. Should we be using it if we care about our data? Absolutely. But is ZFS without ECC more dangerous than any other filesystem without ECC? No, it isn't, and we shouldn't be trying to convince users otherwise.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
This is incorrect; let me describe what has actually happened to people.

  1. The system correctly reads good data from the disk, but it doesn't match its hash ("checksum") because defective RAM damages either the hash or the data (or both) in memory (plausible).
  2. In an attempt to fix the apparently-bad data read in step 1, the system reads parity/redundancy data to fix the data blocks.
  3. The parity data is used to "regenerate" the data that is allegedly corrupt, and the regenerated data is itself corrupt because the RAM is failing.
  4. The system then overwrites the apparently-bad data it initially read with the corrupted data.
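The sequence above can be illustrated with a toy model in Python. This is not ZFS's actual repair path, just a sketch of the claimed failure mode, with flip_bit standing in for a stuck RAM bit that corrupts everything passing through memory:

```python
import hashlib

def flip_bit(block: bytes, bit: int) -> bytes:
    """Simulate a stuck RAM bit corrupting data as it passes through memory."""
    b = bytearray(block)
    b[bit // 8] ^= 1 << (bit % 8)
    return bytes(b)

# On-disk state: a good block and its checksum (both intact on disk).
disk_block = b"important data"
disk_sum = hashlib.sha256(disk_block).digest()

# Step 1: the block is read through bad RAM and no longer matches its hash.
in_ram = flip_bit(disk_block, 3)
assert hashlib.sha256(in_ram).digest() != disk_sum

# Steps 2-3: the "repair" copy (e.g. from the mirror) passes through the
# same bad RAM, so the regenerated block is corrupt too.
repaired = flip_bit(disk_block, 3)

# Step 4: the apparently-bad block is overwritten with the corrupt copy;
# the good on-disk data is gone.
disk_block = repaired
assert disk_block != b"important data"
```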

BTW.. I have a test system with a bad stick of RAM (donated by a fellow FreeNAS forum user who had one fail on him and was willing to give it to me for testing), and I can trivially reproduce the "scrub of death" with zero effort. So if you'd like a demonstration, I can definitely make that happen. :)

My system has 2x1TB drives in a mirrored configuration with 800GB of data from /dev/random, and you can scrub it all day long with the 16GB of good RAM the system has. Add in the failed stick, scrub the zpool, and all hell breaks loose.

So yeah... argue all you want that it doesn't exist, but I've confirmed it.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Can you be more specific about "all hell breaks loose"?

I would think the system would crash before bad data is written to disk. Do you get checksum errors? Can you validate that the data on the disks is incorrect with something like sha1?
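A validation pass like the one suggested here could look something like this (a hypothetical sketch; the choice of sha1 and the manifest format are illustrative, not anything FreeNAS ships):

```python
import hashlib

def file_digest(path, algo="sha1", chunk_size=1 << 20):
    """Hash a file in chunks so large media files don't fill RAM."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def verify(manifest):
    """manifest maps path -> expected hex digest; returns paths that mismatch."""
    return [p for p, expected in manifest.items()
            if file_digest(p) != expected]
```

Record digests into the manifest before the scrub, re-run verify() afterward, and any path it returns was altered on disk.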
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I can validate that the data is incorrect with checksums, and yes, I get ZFS checksum errors per zpool status. I do want to pass this along to the ZFS guys so they can attempt to code around it, but the reality of the situation is that software can never be expected to resolve a hardware failure with 100% certainty.

So even if the devs can attempt to keep the problem from getting out of control, you still shouldn't trust software to resolve a hardware problem. :/

As for being more specific, what would you like to discuss?
 