So this non ECC memory usage.

Status
Not open for further replies.

Ianm_ozzy

Dabbler
Joined
Mar 2, 2020
Messages
43
So I have a machine with 32GB of RAM and an i7-3770, with five 4TB drives in RAIDZ1.
I have been using it for a few years. It had FreeNAS, then TrueNAS.
It now runs Proxmox with TrueNAS as a VM. I pass the onboard Intel SATA controller through to TrueNAS.
I have not lost data. I have had drives fail, and replaced them.
Any memory issues have been due to a lack of it, apparently not due to using non-ECC.
I left XMP turned off so the memory runs cooler and hopefully has fewer errors.
Everything important is regularly backed up to another machine, a Linux one used for playing media. It has a few old hard drives in RAIDZ1 with Samba shares.
Oh, and I back up weekly to external drives, 4 of them.

So I have an MSI B450 Tomahawk and a 3600X CPU not being used. It seems tempting to get some slow, cheap memory (64GB), maybe tomorrow as it is Black Friday.
The same system will go on there. VMs will be allocated more memory and cores, with slight changes to the IOMMU module setup. With a BIOS update it can apparently boot headless. No ECC memory support, apparently.
So if I lose data on TrueNAS, it will be annoying. It will cost me little except time. It is not tied to my livelihood, mainly my entertainment. My main machine has the important files on it, and it is regularly backed up to TrueNAS and the media machine share.

So why would I care about ECC? If I wanted it, it would be much more expensive, and I am not keen to spend much at all.

I am curious to hear from others who are using truenas without ECC, who have maybe had catastrophic issues.
I will need to hear by tomorrow of course.

I did find these videos useful.

Useful info appreciated.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222

Those are old but should show the effect of non ECC RAM in a catastrophic event, as you asked. I do use ECC, and treat it as an assurance that my RAM won't cause me issues.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680

Those are old but should show the effect of non ECC RAM in a catastrophic event, as you asked. I do use ECC, and treat it as an assurance that my RAM won't cause me issues.

It should be pointed out that the big argument for ECC is that ZFS contains no "fsck" or "chkdsk"; ZFS is entirely reliant on the correctness of the code and the ability of the pool to faithfully store and retrieve data. Once bad data is injected into a pool, hopefully it can be deleted if it is within a file, but if it is metadata then there is a good chance that there is no way to delete it, and the error is permanently introduced into the pool. The only way available to normal end users to eliminate such errors is to reload the pool from backup.

I particularly enjoyed the title of this video; "Why Scrubbing ZFS Without ECC RAM Probably Won't Corrupt Everything". When building a ZFS system, some reasonable assumptions are that you value your data and you may have quite a bit of it to store. Certainly it is true that scrubbing ZFS without ECC RAM probably won't corrupt everything. But it is also true that storing all your data on FFS, EXT3, NTFS, or BTRFS probably won't corrupt everything, yet you've selected ZFS to store your data... why, exactly?

There are four weasel words there: "probably won't" and "corrupt everything". It is unlikely for a RAM error to "corrupt everything" in a ZFS pool; there are theoretically ways for stuck bits or random flips in RAM to corrupt data, but these are only going to corrupt some subset of data. You can do damage to your data in lots of ways. For example, using an overheating HBA is a known way to encourage random data errors to be sprayed into your pool in a catastrophic and pool-destructive manner. By way of comparison, most RAM errors are mere bit flips and likely to be much less destructive. And if your bank said that they "probably wouldn't" lose your money, you'd be happy with that, right?

The big problem with ZFS is the ARC. Some drama queens have attempted to escalate the non-ECC argument for puzzling reasons. ECC used to be a fairly standard feature on computers, and it was really the introduction of the low cost PC market that did ECC in. The ARC is hazardous in that you have this cache for storage. If requested data is present in ARC, it will not be pulled from pool. The checksums are checked when data is pulled from pool and placed in the ARC. There is no further protection once the data is in ARC. If a block in ARC experiences a bitflip or corruption, the system has no way to know not to trust it. And if that block is then written back to the pool, the error becomes permanent.
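
To make the ARC concern concrete, here is a minimal toy sketch in Python. It is not ZFS code: `ToyPool`, `ToyCache`, and `flip_bit` are hypothetical names invented for illustration, and the model only follows the behavior described in this post (checksum verified when a block is read from the pool, nothing re-verified once it sits in the cache).

```python
import hashlib


def checksum(data: bytes) -> str:
    """Content hash standing in for a block checksum."""
    return hashlib.sha256(data).hexdigest()


class ToyPool:
    """Stands in for on-disk storage: each block is stored with its checksum."""

    def __init__(self):
        self.blocks = {}

    def write(self, key, data):
        # The checksum is computed over whatever bytes are handed in.
        self.blocks[key] = (bytes(data), checksum(data))

    def read(self, key):
        data, stored_sum = self.blocks[key]
        if checksum(data) != stored_sum:
            raise IOError(f"checksum mismatch on block {key!r}")
        return data


class ToyCache:
    """Stands in for the ARC: verified once on the way in, trusted afterwards."""

    def __init__(self, pool):
        self.pool = pool
        self.cached = {}

    def get(self, key):
        if key not in self.cached:
            # Only this path goes through the pool's checksum verification.
            self.cached[key] = bytearray(self.pool.read(key))
        return self.cached[key]


def flip_bit(buf: bytearray, bit: int) -> None:
    """Simulate a single-bit RAM error in a cached buffer."""
    buf[bit // 8] ^= 1 << (bit % 8)


pool = ToyPool()
pool.write("a", b"important data")
cache = ToyCache(pool)

original = bytes(cache.get("a"))   # first read: verified against the pool
flip_bit(cache.get("a"), 3)        # simulated bit flip while cached in RAM
served = bytes(cache.get("a"))     # cache hit: served without any check

pool.write("a", served)            # written back with a fresh, matching checksum
reread = pool.read("a")            # passes verification despite being wrong
```

In this toy model the final read succeeds because the checksum was recomputed over the already-damaged bytes, which is the "error becomes permanent" case. ECC addresses the step in the middle: the flip is corrected or reported by the memory hardware before anything is served or written back.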

It is puzzling why people are so resistant to buying ECC memory. They don't mind spending hundreds or thousands of dollars on HDD's, including raw space for mirroring or parity protection for their valuable data.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Since the OP brought references "against" ECC, I assumed he had read the resource about ECC.
 

Ianm_ozzy

Dabbler
Joined
Mar 2, 2020
Messages
43
If it fails, it will probably lose little data. It will be annoying. My data is also stored elsewhere. I used FreeNAS initially as it supported virtual machines, if poorly. Also, other storage/NAS methods are trouble when it comes to iSCSI, unless you want to mess around with very complex Linux command-line stuff. I already had most of the disks. I bought one, then eventually another, for the storage and at least some redundancy.
Second-hand motherboard/CPU/ECC memory combinations are hard to come by. A new setup would be very expensive.
I would prefer to use ECC, but it is hard to justify the cost of a new setup.
From time to time I do a full sync scan between the backups and the NAS. It would pick up bit flips. So far so good. FreeFileSync is my choice.
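
For readers who want to see what a content-level comparison between the NAS and a backup involves, here is a rough Python sketch. It is not how FreeFileSync works internally; `file_digest`, `tree_digests`, and `compare_trees` are hypothetical helper names, and the demo at the end builds two throwaway directories rather than touching real data.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root: Path) -> dict:
    """Map each file's path (relative to root) to its content hash."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(primary: Path, backup: Path) -> dict:
    """Report files missing on either side and files whose contents differ."""
    a, b = tree_digests(primary), tree_digests(backup)
    return {
        "only_in_primary": sorted(a.keys() - b.keys()),
        "only_in_backup": sorted(b.keys() - a.keys()),
        "differ": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }


# Demo on temporary directories: two identical trees, then one flipped bit.
with tempfile.TemporaryDirectory() as tmp:
    nas, backup = Path(tmp, "nas"), Path(tmp, "backup")
    for root in (nas, backup):
        root.mkdir()
        (root / "photo.raw").write_bytes(b"\x00" * 1024)
        (root / "notes.txt").write_bytes(b"hello")

    # Simulate a single flipped bit in the backup copy of one file.
    data = bytearray((backup / "photo.raw").read_bytes())
    data[100] ^= 0x01
    (backup / "photo.raw").write_bytes(bytes(data))

    report = compare_trees(nas, backup)
```

A comparison like this flags that two copies disagree, but it cannot say which copy is the good one, and a file that was legitimately edited since the last backup shows up the same way as a flipped bit.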

So I still see little reason to use ECC in a home setup, especially when all I need to buy is some memory.

I am not trying to argue anything. My experience using non-ECC has been fine with components I already had.

No disaster stories have been posted yet - or nothing I cannot recover from.
 

Ianm_ozzy

Dabbler
Joined
Mar 2, 2020
Messages
43
Also:

:eek:

:eek::eek:

You are a brave soul.

It has been fine. Issues have arisen from lack of memory and disk space. Proxmox was corrupted and/or unresponsive, or a drive was full. TrueNAS was fine.
Proxmox was reinstalled, the VMs were restored from backup (including the TrueNAS boot disk), and everything was fine. I need a machine that can handle more than 32GB of RAM.
So I will be going from 32GB DDR3 non-ECC to 64GB DDR4 non-ECC. Also some more CPU power.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
If it fails, it will probably lose little data.
I honestly don't get this. Data is either valuable and thus to be preserved "at all costs", or it is not and should be deleted. There is no middle ground.
My data is also stored elsewhere.
How would you know that data is correct?
From time to time I do a full sync scan between the backups and the NAS. It would pick up bit flips. So far so good. FreeFileSync is my choice.
Sounds painful, to be honest. Like rsync, but more painful.
So second hand motherboard/cpu/ecc memory are hard to come by. A new setup would be very expensive.
I literally bought an X10SLM+-LN4F with Xeon CPU and 4x4GB ECC DIMMs and a Supermicro passive cooler (for use with suitable Supermicro chassis) for just over 100 bucks delivered just earlier this week. In Europe. That's insane, we never had the robust market of refurbished server stuff the US has had for a while now. I realize things are a bit trickier down under, even relative to Europe, but my point is that keeping your eyes open can be rather profitable, to paraphrase the 7th Rule of Acquisition.
No disaster stories have been posted yet.
Some of those up in post #2 might qualify. If things were bad enough for them to be commonplace, client PCs would not be running without ECC.
 

Ianm_ozzy

Dabbler
Joined
Mar 2, 2020
Messages
43
The file sync tool FreeFileSync is of use to me. Plug into a VM (running in Proxmox), compare overnight. It will let me know if there are bit flips. Not very painful.
If it goes down, most data is elsewhere, unless it changed between backups. My main machine maybe, or, if the NAS is 'broken', the media machine with a NAS share and ZFS.
Still no convincing arguments to spend much for an ECC setup.
What you bought is much harder to find here in Australia. I want it all in one machine: NAS plus other VMs. So if you could point out a 64GB+ ECC setup for little money, it would be appreciated.
There are some old socket 2011 ones available with dubious Chinese-made 'refurbished' motherboards, but I am not touching them.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
In the end if you don't care about losing data there is no argument in the world that would convince you to spend money on a data protection system.

The way I personally (and generally, not only to your case) see it is: if you use TrueNAS you care about your data, so you should use ECC; if you don't care about your data I don't understand why you use TrueNAS.

In this forum you will find a lot of users who care about their data, and as such use ECC in their systems (because data loss happened more than a few times thanks to lack of ECC); as such, if you are looking for people who don't use ECC, statistically you will probably have better luck searching for them in other places (like reddit or ltt, I suppose).
We have a few such people here as well, but they are not that common.

Please don't read my message in an aggressive or accusing way.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
All comes down to how valuable the data is to the admin.

A new file-server-oriented X10SDV-2C-7TP4F costs relatively little (relative to the storage capacity it can address). That platform offers a future-proof 10GbE connection, lots of SATA ports, ECC RAM, two PCIe 3.0 x8 expansion slots (that can be bifurcated), etc. For SMB/AFP file-server SOHO use, this rig is pretty unbeatable on price/performance/efficiency.

As long as energy efficiency doesn't top your list of needs, entire pro-grade SuperMicro servers can be had used in the US for less than $500. So I find it a bit silly if folk on the one hand want to store stuff for the long term but on the other hand do not want to invest in the infrastructure to make it happen. Ditto off-site backups that are regularly checked for bit rot and rotated. Apps like Carbon Copy Cloner make this relatively easy.

Relying on the cloud also strikes me personally as dangerous since you effectively lose all control of the data.
 

AlexGG

Contributor
Joined
Dec 13, 2018
Messages
171
I honestly don't get this. Data is either valuable and thus to be preserved "at all costs", or it is not and should be deleted. There is no middle ground.

If you have one big single storage system, then you preserve it at all costs.
However, if you have proper backup, meaning dissimilar hardware/software in a different location, you don't care very much about the one system. You do care, but just not as much. The more valuable the data, the more backup copies there are, and the less you care about each individual copy.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I did not mean that in a single system, but overall. Backups in different locations should really go without saying.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Still no convincing arguments to spend much for an ECC setup.
Then don't. But make that decision with the full understanding that you're exposing your data to an additional risk that isn't present when you're using ECC RAM. Is it a big risk? Not usually. But it's not non-existent either.

And maybe it's because you're in .au and I'm in .us, but DDR3 RDIMMs are dirt cheap here. So are the server-grade motherboards that use them--generally around 5-7 years old with lots of life still in them.
 

AlexGG

Contributor
Joined
Dec 13, 2018
Messages
171
I did not mean that in a single system, but overall. Backups in different locations should really go without saying.

If you have a backup in a different location, and your primary system dies because of not having ECC RAM, you replace the faulty non-ECC RAM and restore from the backup, no?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
and your primary system dies because of not having ECC RAM
"The system dies" isn't the primary concern of non-ECC RAM. The much bigger concern is that it lives and silently corrupts your data. The window to do this is somewhat limited--the so-called "scrub of death" is astronomically unlikely, and if you're really paranoid, you can set your pool/dataset to use one of the crypto-grade hashing algorithms to further reduce the likelihood by several orders of magnitude--but certainly there's a time when the data's only in the server's RAM. And if that RAM is bad, the system's going to write bad data (along with its corresponding checksum) to disk, and you'll never know it (and ZFS won't tell you) until you access the data again in the future and find that it's corrupt.
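
The write-path case described above can be sketched in a few lines of Python. This is a toy model, not ZFS internals: `write_block` and `verify_block` are hypothetical names, SHA-256 stands in for whichever checksum the dataset uses, and the sample payload is arbitrary.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Content hash standing in for a block checksum."""
    return hashlib.sha256(data).hexdigest()


def write_block(buffer: bytes) -> tuple:
    """Toy write path: checksum whatever is in RAM at write time, store both."""
    return bytes(buffer), checksum(buffer)


def verify_block(stored: tuple) -> bool:
    """Toy scrub/read check: recompute the checksum and compare."""
    data, stored_sum = stored
    return checksum(data) == stored_sum


intended = b"family photos"

# The data sits in RAM before it reaches the disk; simulate one flipped bit.
in_ram = bytearray(intended)
in_ram[0] ^= 0x04

stored = write_block(bytes(in_ram))

scrub_ok = verify_block(stored)   # the block matches its own checksum
data_ok = stored[0] == intended   # but it is not what the user meant to save
```

Here the scrub check passes while the content is wrong, because the checksum was computed after the damage. No choice of hash algorithm helps with this particular window, since the checksum faithfully describes the bad bytes.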
 

cap

Contributor
Joined
Mar 17, 2016
Messages
122
Of course, it is better to have ECC-Ram. But it is the same with other file systems.

Will ZFS and non-ECC RAM kill your data?

=> https://jrs-s.net/2015/02/03/will-zfs-and-non-ecc-ram-kill-your-data/

I don’t care about your logic! I wish to appeal to authority!

OK. “Authority” in this case doesn’t get much better than Matthew Ahrens, one of the cofounders of ZFS at Sun Microsystems and current ZFS developer at Delphix. In the comments to one of my filesystem articles on Ars Technica, Matthew said “There’s nothing special about ZFS that requires/encourages the use of ECC RAM more so than any other filesystem.”
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Exactly. The bit rot can thus propagate throughout your backup sets and you won’t know until you try to access the data in question. Some files deal with single bit-flips fairly benignly, for other formats, the problem can be catastrophic.

I learned early on that while RAID5, etc sound good in principle, the reality is a lot messier. The beauty of TrueNAS is bringing data-farm features down to a much smaller scale, ie allowing you to store data with reasonable certainty from bit rot on desk-top-sized systems. That is a major achievement given the limitations imposed by budget, power, etc. vs. running a giant server farm w/dedicated motherboards, controlled atmosphere, and so on.

It’s also why I go through the trouble of setting up a TrueNAS system vs. simply hanging a RAID5 SATA array like the Oyen Digital Mobius 5 from a spare Mac mini. Setting up a Mac file server is still child’s play compared to a TrueNAS… BUT it will never detect bit rot until your data is down the drain.

So how important is your data to you? No judgement, this is for the admin to answer. If it’s not important, if you don’t care if you lose it, then cutting corners via the use of non-ECC RAM, non-server boards, shingled storage, no UPS, etc. might be for you. But I also don’t quite understand the use of TrueNAS under those use conditions.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Of course, it is better to have ECC-Ram. But it is the same with other file systems.

Will ZFS and non-ECC RAM kill your data?

=> https://jrs-s.net/2015/02/03/will-zfs-and-non-ecc-ram-kill-your-data/
IIRC, the original ZFS implementation also had flags for different operation under non-ECC vs ECC RAM conditions. Presumably those were implemented for a reason, i.e. ZFS put those flags to use to check the data more thoroughly for bit flips in RAM than it did in an ECC system.

Are those flags incorporated / used by TrueNAS today?
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222