ECC vs non-ECC RAM and ZFS

Status
Not open for further replies.

vegaman

Explorer
Joined
Sep 25, 2013
Messages
58
yes, that I agree with... problem is with electrical fault, it's usually a whole row that will be faulty, and you can't recover from that. It's extremely rare to have a single bit permanently "stuck".
If that's the case; you can't predict the behaviour of your system: it will likely crash at some point
With non-ECC RAM yes. But with ECC you don't have to wait until there is, by chance, some critical data to the system in the faulty area of RAM. It will be detected anyway - even if it can't be fixed.

The only way I can make sense of what you're saying is if I assume you are saying that ECC itself is faulty in this 'bad RAM'. I don't know enough to say conclusively, but from what I've learned I don't believe this is possible (bar the ridiculously small chance previously mentioned by cyberjock).
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Here's my attempt to explain how it works:

Remember the Pythagorean theorem from grade school/high school? A^2 + B^2 = C^2

So think of it like this. A & B are your RAM locations, and C is your ECC portion of RAM. Using the actual data stored, you calculate a check value for C. So if A or B is wrong, then you can recalculate it. It's simple algebra to solve for any missing letter.

Inside the computer all that happens is that when RAM is read it goes through the memory controller(for ECC RAM you get 72 bits instead of 64-bits despite 64 bits always being requested). Even if you request only 1 single bit, you will still end up retrieving 72-bits from RAM as the ECC check is part of the pipeline to the CPU. It compares A, B, and C. If all is well, then A and B go through to the CPU for processing/execution. C isn't important for your program to run.

Now let's say that A is wrong. Then it calculates the correct A, writes that to RAM, and then sends the correct A & B to your CPU (normally it goes to L2 cache). I've been a little foggy on this and have not gotten what I consider 100% proof of it, but the memory controller then reads back the location it wrote to, to verify it was written correctly. If it wasn't, you get your BIOS error log entry that RAM is bad. This prevents you from getting error messages from radiation, etc. After all, since all RAM has random errors from radiation you don't want a log filling up with those kinds of errors since you can't really fix them. You just want errors from things you can control like stuck bits so you can RMA that memory stick, right?

Now what if A & B are both wrong? Now you have a problem. If you remember from math class, you can only solve an equation with 1 unknown variable. You now have 2: A & B. So you get a system halt and an error message in your system log.
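The analogy above can be sketched in a few lines of Python. This is only an illustration of the analogy, not of how a memory controller works: the function names `make_check` and `recover` are made up for this sketch, and it assumes the bad value is already known to be bad (marked as `None`), which is a simplification.

```python
import math

def make_check(a, b):
    """Compute the 'ECC' value C from data values A and B, using C^2 = A^2 + B^2."""
    return math.hypot(a, b)

def recover(a, b, c):
    """Recover at most one missing value (passed as None) from the other two.

    With two or more unknowns the equation cannot be solved, so we 'halt'.
    """
    unknowns = sum(v is None for v in (a, b, c))
    if unknowns > 1:
        raise RuntimeError("more than one unknown: cannot solve, system halt")
    if a is None:
        a = math.sqrt(c * c - b * b)
    elif b is None:
        b = math.sqrt(c * c - a * a)
    elif c is None:
        c = math.hypot(a, b)
    return a, b, c
```

Calling `recover(None, 4, 5)` solves for the missing A, while `recover(None, None, 5)` raises, which mirrors the "system halt" case.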

So as you can see, that ECC stuff is actually pretty cool and very helpful. You're protected from trashing any data on any disks because the corrections are made as the system runs through its normal routine. And if the memory controller hits a situation it can't get out of, the system halts.

It's not the greatest example, but it has the virtue of being extremely simple to understand. So easy a high schooler will get it.

Now, to see how the real ECC RAM works, it's a bit more complicated. It doesn't do the Pythagorean theorem because 1/3 of your information is not actual data. You only get to store 2 pieces of data out of 3. That's kind of a bad return on investment for RAM. Good RAM is not that expensive. We also need to do this on a scale that protects large amounts of RAM while not requiring large amounts of RAM to correct the errors. par/par2s use something called Reed-Solomon error correction (R/S). An alternative to R/S is XOR. The link explains it well enough that I won't explain it here. (If you check out that link, you'll see parity is mentioned. Anyone remember "parity RAM" from way back in the day?)

Anyway, that's how ECC does its calculation. With given inputs you will get a given output. With a certain amount of bad input you can identify and fix them, but beyond that you cannot fix them but you can identify that "something is wrong"(this is what I tried to explain above). It sounds pretty familiar to Par2s.
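To make "fix a small amount of bad input, detect but not fix a larger amount" concrete, here is a toy sketch. ECC DIMMs are usually described as using a Hamming-style SECDED code (single error correct, double error detect) over 64 data bits plus 8 check bits; the sketch below is the same idea scaled down to 4 data bits plus 4 check bits. The function names and the bit layout are assumptions chosen for illustration, not any controller's actual implementation.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit SECDED codeword.

    code[1..7] is a Hamming(7,4) codeword (check bits at positions 1, 2, 4);
    code[0] is an extra overall parity bit that enables double-error detection.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2
    return code

def decode(code):
    """Return (nibble, status). Raises RuntimeError on an uncorrectable error."""
    code = list(code)
    # The syndrome is the XOR of the positions of all set bits; 0 for a valid word.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    overall_bad = sum(code) % 2 == 1
    if syndrome == 0 and not overall_bad:
        status = "ok"
    elif overall_bad:
        # Odd number of flips, assumed to be one: the syndrome is its position
        # (0 means the overall parity bit itself was hit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome but even overall parity: two bits flipped.
        raise RuntimeError("uncorrectable multi-bit error: system halt")
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status
```

Flipping any one of the 8 bits of `encode(n)`, including a check bit, still decodes to `n` with status `"corrected"`; flipping any two raises instead of returning wrong data.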

Reed-Solomon error correction is pretty awesome. It's been used in tons of devices and formats: CD/DVD/Blu-ray checksums, some transmission protocols, SSDs (for checksums on memory pages), and probably every RAID controller you have ever owned that did RAID5 and RAID6 (bet you wondered how it came up with that "parity" data, huh? Now you know...). It's used in tons of places to protect information because it is so robust.
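For the simplest flavour of "parity" data, single-parity schemes such as RAID5 are commonly described as plain XOR across the data blocks, with Reed-Solomon-style math typically coming in for a second parity block as in RAID6. A minimal sketch of the XOR case, with the helper names `xor_parity` and `rebuild` invented for this example and equal-length blocks assumed:

```python
from functools import reduce

def xor_parity(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

def rebuild(blocks, parity):
    """Rebuild the single missing block (given as None) from the rest plus parity."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one missing block")
    survivors = [block for block in blocks if block is not None]
    return xor_parity(survivors + [parity])
```

With parity computed over three blocks, `rebuild([first, None, third], parity)` returns the second block. Losing two blocks at once is the "two unknowns" situation again, which is why single parity cannot recover from it.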

So what's the difference between ECC and non-ECC RAM physically/electrically? Literally, you have 8 more bits. Typically, RAM has 2, 4, or 8 chips on it. In the case of 2 or 4 there will be a smaller extra chip that handles the extra 8 bits. But for 8 chip RAM sticks, some are special. Some manufacturers make only ECC PCBs. This helps with production costs as they have to make only 1 PCB to cover their entire line. Then, when they decide to make a RAM stick they decide to either go with 8 chips (non-ECC RAM) or 9 chips (ECC RAM). All of the chips are identical in 8 bit increments (8 bits per chip * 8 chips = 64 bits; 8 bits per chip * 9 chips = 72 bits). That's it! (Don't confuse this with registered RAM that has to mitigate capacitance from high density RAM). Here's a picture comparing ECC and non-ECC RAM. Notice the number of chips.
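The chip arithmetic above is easy to check in code. This is a rough sketch only: the function names are made up, and it assumes a single rank of identical data chips (registers and other support chips are not counted, and multi-rank sticks would need the count per rank).

```python
def bus_width(chips, bits_per_chip=8):
    """Total width in bits provided by identical memory chips."""
    return chips * bits_per_chip

def looks_like_ecc(chips, bits_per_chip=8):
    """72 bits wide means 64 data + 8 check bits; 64 bits wide means no check bits."""
    width = bus_width(chips, bits_per_chip)
    if width == 72:
        return True
    if width == 64:
        return False
    raise ValueError(f"unexpected width of {width} bits; count the chips again")
```

So 8 chips of 8 bits give 64 bits (non-ECC) and 9 chips give 72 bits (ECC), matching the numbers in the paragraph above.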

Check out the below picture:

RAM.jpg


See the white square? That's where the 9th memory chip would go. So I know without looking up this RAM model it is definitely non-ECC RAM. ECC RAM will always have 3, 5 or 9 ram chips(basically odd number of chips). But don't confuse the ram chips with the registered RAM as those have extra chips to deal with capacitance from high density DIMMs.


So now you are wondering why ECC RAM is so much more expensive than non-ECC RAM. After all, they're only adding 1/8th of the cost (minus the practically nil cost of the PCB). Simple answer: price gouging. They know you'll pay for it because they know you want it. It's as simple as that. The reality is that ECC RAM, if it weren't for price gouging, shouldn't cost more than 1/8th more than non-ECC RAM. (I tried to find a picture, but I couldn't find one). Of course, some manufacturers do heat stress testing and heat accelerated aging on ECC RAM to get past the early-failure part of the bathtub curve.

Now I'll mention registered RAM only briefly because I mentioned it above. RAM in very high densities causes a lot of capacitance. You've seen capacitance if you've ever touched a doorknob and gotten shocked. And just like how it hurts you, it can kill RAM. Since RAM really is nothing more than a bunch of microscopic capacitors, the more you have in parallel the more it hurts. And to prevent one stick of RAM from damaging the other stick, they are electrically isolated. They usually have a chip (or chips) that looks different from the chips that actually store data. Here's a picture:

registered.jpg


Notice the 9 chips(so this is ECC RAM) but see the 2 smaller chips in the middle. That's your giveaway that this is Registered RAM.

So let's apply this newfound knowledge... Go to this Amazon page and look at that. It's listed as ECC RAM, but has only 8 chips. The seller is selling it as ECC Registered RAM. Guess what? You and I both know that the picture is not of ECC RAM nor is it registered.

Check out this Kingston RAM. One picture shows what is definitely ECC + Registered RAM, but the other picture is clearly non-ECC unregistered RAM. Again, our secret...

Here's a test. Look at this stick...is it ECC and is it registered? Look up the model number on Kingston's website for the answer...
106160.jpg



So now, at a glance, you can look at a stick of RAM and without even looking up the model you should be able to identify ECC from non-ECC and registered from non-registered.
 

Attachments

  • the_more_you_know.png

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
yes, that I agree with... problem is with electrical fault, it's usually a whole row that will be faulty, and you can't recover from that. It's extremely rare to have a single bit permanently "stuck".
If that's the case; you can't predict the behaviour of your system: it will likely crash at some point

Yep, and you know what happens when a whole row fails? Your system halts. ZFS pool is protected! That's all that matters, right?

that would cause a "soft" error, it's not permanent. this is not faulty, it's within operating errors and that can be recovered / detected.

Having said that, wikipedia mentions that with modern RAM, cosmic radiation has virtually no effect.

Correct, but I don't deal with cosmic radiation... (worked at a nuclear power plant). ;) That's why I know so much about this junk.

It can be corrected, but only in certain circumstances. If you read up on soft errors here you will see that alpha particles can cause a bit flip. That can't be corrected because the system cannot identify the flipped bit. So no, not all errors can be corrected. And again, flipped bits don't bother me as much as pool killing stuck bits. But both stuck bits and flipped bits can be corrected with ECC RAM assuming you don't have multiple. Soft RAM errors such as those from interference from adjacent components causing the signal to be lost(or something such as a voltage spike that exceeds the RAM voltage so the controller knows the signal should be ignored) from RAM to the memory controller resulting in the RAM location being queried again are self-repairing because it was obvious that the signal didn't make it to destination. But, that's not a RAM failure per se. It could be. But again, I'm not worried about a flipped bit here and there as much as I am about stuck bits killing pools. Those are just horrible to whomever it happens to. It's like a stick of vitriol in your computer just waiting. If nothing happens, then great. You got lucky and nothing bad happened. If something happens, you won't know until the fun is over and the crying has begun.
 

jyavenard

Patron
Joined
Oct 16, 2013
Messages
361
The only way I can make sense of what you're saying is if I assume you are saying that ECC itself is faulty in this 'bad RAM'. I don't know enough to say conclusively, but from what I've learned I don't believe this is possible (bar the ridiculously small chance previously mentioned by cyberjock).


This is exactly what I've been trying to say!!! Yes: bad ECC meaning "the ECC module is faulty" (module as in the M in DIMM; it seems I have to argue every word used here). Anything can have a manufacturing fault... ECC RAM is not exempt from it.

Instead I get a lesson about what ECC is, like I'm some kind of teenager, which has ZERO to do with the point I tried to raise...

Sounds more like gloating than anything else... In French there's a common saying: "Knowledge is like jam, the less you know, the more you spread it"
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Again.. there is no ECC module.. so I have no clue what you are trying to say or what point you are trying to make.

If you are trying to say that the 9th chip has a bitflip, that is detected and corrected. If you are trying to say that the 9th chip has a stuck bit, that will also be detected and corrected. Just like your par2s, if your repair blocks are bad, you know it. On the other hand, if you have a multi-bit error anywhere(including the 9th chip), that will result in a system halt.

If you aren't trying to say any of that you've lost me. But there is no "ECC module". All that there is on RAM is either registers or RAM data chips.
 

jyavenard

Patron
Joined
Oct 16, 2013
Messages
361
Again.. there is no ECC module.. so I have no clue what you are trying to say or what point you are trying to make.


I know I shouldn't bite... but can't help it

you'll have to say that to all the articles, manufacturer and vendor description on how to best describe a memory

http://lmgtfy.com/?q=ECC+Module

to list a few of articles with no clue of what they are saying
https://en.wikipedia.org/wiki/Registered_memory
http://www.crucial.com/kb/answer.aspx?qid=3692
I found that one humourous, as it's the RAM you recommended in your powerpoint presentation:
http://h30094.www3.hp.com/product/sku/10443682/mfg_partno/D2G72K111
"Kingston 16GB 1600MHz Reg ECC Module"
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I know I shouldn't bite... but can't help it

you'll have to say that to all the articles, manufacturer and vendor description on how to best describe a memory

http://lmgtfy.com/?q=ECC Module

Notice that the search for "ECC module" gave you a first result of ECC memory. There is no secret ECC module on a stick of RAM. Either the RAM has the extra bits for ECC support or it doesn't. Unless you are trying to refer to the RAM as an ECC module. I'm just not getting where you are trying to go.


to list a few of articles with no clue of what they are saying
https://en.wikipedia.org/wiki/Registered_memory
http://www.crucial.com/kb/answer.aspx?qid=3692
I found that one humourous, as it's the RAM you recommended in your powerpoint presentation:
http://h30094.www3.hp.com/product/sku/10443682/mfg_partno/D2G72K111
"Kingston 16GB 1600MHz Reg ECC Module"

As for the RAM I believe I did. I just looked and the only RAM stick I recommended was the KVR16E11k4. That's what I bought, and I can tell you it is not registered. And I'd probably not link to HP RAM as HP has had some really weird RAM that wouldn't work in non-HP systems. So laugh all you want... I think the joke's on you.
 

jyavenard

Patron
Joined
Oct 16, 2013
Messages
361
There is no secret ECC module on a stick of RAM. Either the RAM has the extra bits for ECC support or it doesn't. Unless you are trying to refer to the RAM as an ECC module. I'm just not getting where you are trying to go.

Are you being obtuse on purpose or what???

the module, the DIMM, the stick,

the board with memory chips on it. also called a module.
ECC DIMM, what do you think the last M stands for?

Oh.. you linked to registered memory, which has nothing to do with ECC.... how cute!

oh dear.... I was only linking to articles with the expression "ECC module" if you bothered to check

from the first wiki I linked to:
" Although most memory modules are both ECC and registered, there are also both registered non-ECC modules and non-registered ECC modules."

ECC module in that context of course

Kingston .... ECC MODULE <-- see the last two words?

http://www.crucial.com/kb/answer.aspx?qid=3692
" the enhanced features of the ECC modules will no longer be active in your computer. "

http://www.memorysuppliers.com/eccwhatisita.html
"Why do ECC modules cost more than modules without ECC?"

But this is so out of context here... that's it from me...
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Ah! Ok. I get what you were saying. I just never hear anyone actually refer to it as a memory module. They say things like "do you have a stick of DDR3?" and not "Do you have any DDR3 modules?"
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,970
Gotta tell you that this last page of reading has been funny. But to be honest, jyavenard, you were a bit confusing but I'm sure you were clear in your mind. I too thought you were implying there was a special module on the RAM stick. I'm glad this is sorted out. And I thought cyberjock put quite a bit of work into posting a good explanation of ECC RAM. Keep in mind that he doesn't know your knowledge level and he, like myself, will explain it like the person receiving the message knows nothing. It just makes things easier than making assumptions and spinning our wheels.
 

jyavenard

Patron
Joined
Oct 16, 2013
Messages
361
Using ECC is better; doesn't mean your system will crash and burn if you aren't....

You're as unlikely to lose data when using non-ECC RAM with ZFS as with any other filesystem (including UFS).

FreeNAS's own iXsystems "mini plus" solution uses an Intel i5-3470T. That processor doesn't support ECC RAM. If using non-ECC RAM was that bad, those guys wouldn't do it: you would hope they know what they are doing!

You don't have to use ECC RAM.
How much do you value your data is up to you to decide.
 

Technoid

Cadet
Joined
Oct 26, 2013
Messages
5
Well, I just upgraded my FreeNAS to use ECC. The difference in cost was about $60 (compared to non-ECC RAM), well worth it. (The cost is not the issue really; it's getting the parts that is.)

I was supremely lucky in that the reason I upgraded was failed RAM and a motherboard that was about to fail (curse all manufacturers of shoddy capacitors), and I didn't lose anything important.

I found out I had problems when I discovered that I had managed to disconnect the power cable to the server, and after the boot a disk was missing from the pool. That was "just" a disconnected SATA cable, which has happened to me before (yeah, I should have got locking SATA cables), but this time it started to repair "errors" on scrub, which for some reason prompted me to run memtest and discover a RAM stick with one single bit error. Upon changing that stick, the next scrub actually seemed to correct the earlier induced "repairs".

Now, how do I know that the RAM errors and subsequent scrubbings didn't just compound the damage? Well, since I have only been running the FreeNAS for about a year, I used to have all my "important" data on a bunch of single drives on my then NLE computer (that drifted over to being a file server). That environment is prone to bit rot, and backups are worthless unless you can detect it in some way, so I had it all checksummed.

And while memory errors could change the checksum file in such way that it would not detect a corrupt file, I doubt it...


As for ECC, if you are getting new stuff for a FreeNAS, it's well worth it. (And cyberjock is making an ironclad argument in favour of it.)

If you are converting old hardware to a franken-NAS and are coming from "a bunch of disks on a Windows XP computer" ™, your data is still safer on a FreeNAS running ZFS in raidz on the franken-NAS than it was on the Windows PC, even without ECC...

And really, we shouldn't have to have this discussion since ECC should be the norm. (Funnily enough, almost all AMD CPUs support it and have done so since the Athlon 64, but try and find a motherboard that supports it...)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I found out I had problems when I discovered that I had managed to disconnect the power cable to the server, and after the boot a disk was missing from the pool. That was "just" a disconnected SATA cable, which has happened to me before (yeah, I should have got locking SATA cables), but this time it started to repair "errors" on scrub, which for some reason prompted me to run memtest and discover a RAM stick with one single bit error. Upon changing that stick, the next scrub actually seemed to correct the earlier induced "repairs".

Now, how do I know that the RAM errors and subsequent scrubbings didn't just compound the damage? Well, since I have only been running the FreeNAS for about a year, I used to have all my "important" data on a bunch of single drives on my then NLE computer (that drifted over to being a file server). That environment is prone to bit rot, and backups are worthless unless you can detect it in some way, so I had it all checksummed.

If you have a backup that wasn't sourced from the original pool you could do a checksum compare(md5, sha256, etc.) and see if the files are the same or not. The reality is that for about 90% of forum users, they have no backup(sigh...) so all they can do is scrub and hope for the best. I'd do 2 scrubs, one to "fix" the errors, then one to verify it. If the error keeps showing up you probably have some other metadata corruption or something else causing wonky things in ZFS. The only solution at that point is to destroy and recreate the pool. Or, if it is actually reporting it as "corrupt" then -v might fix it.
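A checksum compare like the one described above could look roughly like this. It is a sketch, not a tool recommendation: `sha256_of` and `compare_trees` are made-up helper names, the two directory arguments are whatever paths your pool and your independently sourced backup are mounted at, and it only tells you the copies differ, not which one is right.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def compare_trees(pool_root, backup_root):
    """Return relative paths from the backup that are missing or different on the pool."""
    pool_root, backup_root = Path(pool_root), Path(backup_root)
    mismatched = []
    for backup_file in sorted(backup_root.rglob("*")):
        if not backup_file.is_file():
            continue
        relative = backup_file.relative_to(backup_root)
        pool_file = pool_root / relative
        if not pool_file.is_file() or sha256_of(pool_file) != sha256_of(backup_file):
            mismatched.append(relative.as_posix())
    return mismatched
```

An empty list from `compare_trees` means every file in the backup has a byte-identical twin on the pool. This is only meaningful if the backup did not originate from the pool in the first place.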

Keep in mind that file data that was corrupted in RAM, then checksummed/paritied and stored on the pool that way is unable to be detected or corrected. It was bad in RAM before it was written, then stored on the pool bad along with the parity/checksums themselves expecting the actual data to be good(but it was actually bad).
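The point above is easy to demonstrate with a toy model. In this sketch a "block" is just a dict holding data plus a SHA-256 of that data; `flip_bit`, `write_block`, and `verify_block` are hypothetical helpers for illustration and say nothing about how ZFS lays out blocks internally.

```python
import hashlib

def flip_bit(data, bit_index):
    """Return a copy of data with one bit flipped, like a bad RAM cell would."""
    buffer = bytearray(data)
    buffer[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buffer)

def write_block(data):
    """Pretend to write a block: store the data together with its checksum."""
    return {"data": data, "checksum": hashlib.sha256(data).hexdigest()}

def verify_block(block):
    """What a scrub can check: does the stored data still match its checksum?"""
    return hashlib.sha256(block["data"]).hexdigest() == block["checksum"]

original = b"important file contents"

# Case 1: the bit flips in RAM BEFORE the checksum is computed.
# The checksum is calculated over the already-bad data, so it verifies fine.
corrupted_in_ram = write_block(flip_bit(original, 5))

# Case 2: the bit flips on disk AFTER a good write.
# The stored checksum no longer matches, so the damage is detectable.
corrupted_on_disk = write_block(original)
corrupted_on_disk["data"] = flip_bit(corrupted_on_disk["data"], 5)
```

Here `verify_block(corrupted_in_ram)` is true even though the data differs from `original`, while `verify_block(corrupted_on_disk)` is false. The first case is the one no scrub can catch.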

If you have had scrubs performed with the bad RAM the damage can range from very light to extremely bad. It's totally luck of the draw as to how bad it was for you. Virtually all users who have had bad RAM have had unmountable pools (guaranteed 100% loss of data) or pools that are so corrupted that scrubbing results in millions of errors. Trying to open files results in widespread corruption ranging from files that won't open in MS Word to videos with corruption every second or two.

And for everyone that used rsync and/or zfs replication with a regular schedule, the backup pools always ended up equally trashed. :(

And while memory errors could change the checksum file in such way that it would not detect a corrupt file, I doubt it...

You are correct. While collisions are totally possible, the likelihood of corruption that happens to leave the checksum correct is so low it isn't even worth considering. Heat death of the universe is probably more likely.

In your case, with a single bit error you have a chance for either the data itself or the parity/checksum to be bad. Not both. If the data itself was corrupted your parity/checksums will also be bad and no repair can happen. I'd say you probably wouldn't even be able to prove that any corruption has taken place. But if it's the parity/checksum then there's a chance your data might be reparitied properly.

But without doing some scrubs you might not have the redundancy that you expect. I'd make scrubbing a high priority. I'd also make sure that no errors occur for a full scrub before you assume the pool is "safe".

As for ECC, if you are getting new stuff for a FreeNAS, it's well worth it. (And cyberjock is making an ironclad argument in favour of it.)

If you are converting old hardware to a franken-NAS and are coming from "a bunch of disks on a Windows XP computer" ™, your data is still safer on a FreeNAS running ZFS in raidz on the franken-NAS than it was on the Windows PC, even without ECC...

And really, we shouldn't have to have this discussion since ECC should be the norm. (Funnily enough, almost all AMD CPUs support it and have done so since the Athlon 64, but try and find a motherboard that supports it...)

Yeah, I'm an Intel user myself, and I'm plenty disappointed that Intel has made using ECC so difficult to use. It does pad their profits though... :(

AMD does get some credit for this as they seem to have better ECC support. But they also stab their prospective users by having motherboards from every walk of life with many weird branded SATA/USB/etc controllers. So it's a trade off of sorts. For example, far more AMD users have problems with compatibility of SATA controllers and USB controllers on their motherboards because AMD doesn't bundle a SATA controller by default. Intel does and the Intel SATA chipsets are VERY well supported by Intel with regards to FreeBSD. With AMD there's literally about 100 possible SATA controllers you could end up with, and those can range from well supported to not supported at all under FreeBSD.
 

Technoid

Cadet
Joined
Oct 26, 2013
Messages
5
You are correct. While collisions are totally possible, the likelihood of corruption that happens to leave the checksum correct is so low it isn't even worth considering. Heat death of the universe is probably more likely.

In your case, with a single bit error you have a chance for either the data itself or the parity/checksum to be bad. Not both. If the data itself was corrupted your parity/checksums will also be bad and no repair can happen. I'd say you probably wouldn't even be able to prove that any corruption has taken place. But if it's the parity/checksum then there's a chance your data might be reparitied properly.

But without doing some scrubs you might not have the redundancy that you expect. I'd make scrubbing a high priority. I'd also make sure that no errors occur for a full scrub before you assume the pool is "safe".

Made several scrubs after changing the faulty RAM (and after the scrubs stopped reporting errors), and also after the new mobo, CPU, and RAM.

The data was checked against md5 checksum files on another computer (and against backups on an entirely different server).


Yeah, I'm an Intel user myself, and I'm plenty disappointed that Intel has made using ECC so difficult to use. It does pad their profits though... :(

AMD does get some credit for this as they seem to have better ECC support. But they also stab their prospective users by having motherboards from every walk of life with many weird branded SATA/USB/etc controllers. So it's a trade off of sorts. For example, far more AMD users have problems with compatibility of SATA controllers and USB controllers on their motherboards because AMD doesn't bundle a SATA controller by default. Intel does and the Intel SATA chipsets are VERY well supported by Intel with regards to FreeBSD. With AMD there's literally about 100 possible SATA controllers you could end up with, and those can range from well supported to not supported at all under FreeBSD.

AMD's policy of not building their own Athlon chipsets (apart from what could be considered engineering samples) and relying on VIA (the only VIA I even remotely trust is poured in a washing machine) to make mobo chipsets for their CPUs was INSANE! And unfortunately, now that they do build them, they are still insane in how they make 'em ;)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Did those files on the other computer and backup server come from your FreeNAS server originally? The source is the key factor. If the file is trashed on your primary server, then backed up the backup will naturally match the primary since they're both corrupted the same. So if you got those copies from the FreeNAS server they aren't a good way to verify the files are good. This gets most people because generally you don't save your files to the FreeNAS server, then to a second server. You save it to the FreeNAS server and let FreeNAS make the backup for you. In those cases you have no way to prove that corruption did or didn't happen except to use the file and check it. For videos though, you'd have to watch it all the way through. LOL.
 

Technoid

Cadet
Joined
Oct 26, 2013
Messages
5
Actually, most of the files in question were checksummed with a checksumming app several years ago and are duplicated on several different storage spaces (my computer, DVDs, remote FreeNAS servers). The most recent stuff originated on my main computer and was first checksummed there, then copied to my FreeNAS and FTP'd to another.

So any corruption that isn't detected by using the checksumming app and the checksum files was introduced before going to the freenas ;)

And as I see it, a FreeNAS server is for having an enormous "drive" that is easy to share on your local LAN and has reasonable safety against bit rot (if you have ECC :p) and, depending on how many redundancy HDDs you have, also reasonable safety against HDD failure. But it isn't magic, and it doesn't replace backups or independent checksumming of files (so that you don't back up corrupted data).
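Independent checksumming in the sense described above, where checksums are made on the machine the files originate from and travel with them, could be sketched like this. The manifest layout (digest, two spaces, relative path) and the function names are assumptions for illustration; the checksumming app mentioned in the post is not named, so this does not claim to reproduce it.

```python
import hashlib
from pathlib import Path

def file_digest(path, algorithm="sha256"):
    """Hash one file in chunks with the named hashlib algorithm."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def write_manifest(root, manifest_path):
    """Record a digest for every file under root. Run this on the ORIGINATING machine."""
    root, manifest_path = Path(root), Path(manifest_path)
    lines = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.resolve() != manifest_path.resolve():
            lines.append(f"{file_digest(path)}  {path.relative_to(root).as_posix()}")
    manifest_path.write_text("\n".join(lines) + "\n", encoding="utf-8")

def verify_manifest(root, manifest_path):
    """Return relative paths under root that are missing or no longer match the manifest."""
    root = Path(root)
    failed = []
    for line in Path(manifest_path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        digest, relative = line.split("  ", 1)
        target = root / relative
        if not target.is_file() or file_digest(target) != digest:
            failed.append(relative)
    return failed
```

Running `verify_manifest` against each copy (NAS, DVD, remote server) with the same manifest shows whether a copy has drifted from what existed when the manifest was made. Anything that was already wrong at that moment is, as noted above, outside what the checksums can catch.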
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
How were you keeping the data up to date? were you using Rsync? Cause rsync can actually trash your original files if you tried to rsync from FreeNAS to your backup locations.

Other than that, you might be the first person to be lucky enough to not lose their data. If so, go buy a lottery ticket NOW!
 

Technoid

Cadet
Joined
Oct 26, 2013
Messages
5
How were you keeping the data up to date? were you using Rsync? Cause rsync will actually trash your original files if you tried to rsync from FreeNAS to your backup locations. ;)
You have now scared me from ever using rsync.... :p
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Rsync is very trustworthy. But with bad RAM no program is safe from doing things wrong. Rsync is no exception.
 

patmuk

Cadet
Joined
Nov 3, 2013
Messages
1
Hi,

having read the whole thread I would like to summarize how ECC works based on what cyberjock so excellently explained (and jyavenard got wrong):
You need three components for it to work:
- ECC RAM, which is standard RAM with one more (standard) chip, storing the parity bits (64+8=72 bits)
- ECC capable CPU, which calculates the checksum by reading all 72 bits and checking the to-be-used 64 bits against the additional 8 bits, repairing a single bit error or stopping the system immediately,
- ECC capable motherboard, which logs if an unrepairable error occurred.

I might be wrong by which component does what, but that is what I read from the explanations and what sounds logical to me.

So a faulty ECC RAM can only be faulty in the storage of bits, as it does nothing else. If the checksum calculations (done off the ECC RAM) were faulty, it would be a faulty CPU. And if the CPU were somehow faulty, the system would hang very soon. With faulty RAM, the error only occurs when the bad RAM is accessed, and it leads to a crash only if the wrong bit leaves the system in an inconsistent state. A wrong calculation in the CPU is wrong on every calculation step, and will lead to an inconsistent state, crashing the machine, much earlier.

Nevertheless, I came to the thread to see if I can build a cheap replacement for my drobos without expensive ECC RAM (and Mobo + CPU), but I understood how important that is, and why RAID boxes are that expensive :)

And why I need a backup as well ... so for now I might stick with my drobos (I have two identical ones to recover from a hardware failure) and go for a ZFS based build once one of them dies.
 