Techno-ramble on ECC:
This is a fairly accessible paper on ECC with only mild excursions into hard math:
http://www.hackersdelight.org/ecc.pdf
ECC as a theoretical discipline is an entire field of specialized coding theory. As such, there are many ways ECC is done, and many different ECC "code sets".
Each variant of ECC substitutes a "word" (i.e. a specific set of bits) from the code set of valid ECC words for the non-ECC word being protected. In this way, ECC can be viewed as a form of encryption - one plain-text word is represented by a different word in the ECC code set. It's not designed to hide the original bits, as the ECC word generally contains the original bits with the ECC bits tacked onto the end or interspersed among them, but the idea is that one plain-text word maps to one and only one ECC'd word.
The words in the ECC code set have more bits than the original words, so a 64-bit "plaintext" word is represented by an ECC word with more than 64 bits. The whole point of adding those bits is that you can do logic operations on the plaintext and extra bits and determine whether the stored word is correct, and possibly even figure out what the original plain-text word was, thereby "correcting" the error by reporting the original plaintext bits that the extra ECC bits let you compute. ECC codes are, of course, constructed so that this computation of whether there was an error and what the original plaintext was can be done VERY quickly, generally with a few hard-logic gates, not software.
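To make the "extra bits plus a little logic" idea concrete, here's a toy Hamming(7,4) sketch in Python - a hypothetical miniature for illustration only, not what RAM controllers actually run (real ECC RAM uses wider codes, typically 72 stored bits per 64 data bits, and does this in hardware):

```python
# Toy Hamming(7,4): 4 data bits become a 7-bit codeword with 3 check bits.
# The recomputed checks form a "syndrome" that names the flipped bit position.

def hamming74_encode(d):
    """d: list of 4 data bits -> 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4          # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(cw):
    """Recompute the checks; a nonzero syndrome is the 1-based error position."""
    p1, p2, d1, p3, d2, d3, d4 = cw
    syndrome = (p1 ^ d1 ^ d2 ^ d4) * 1 \
             + (p2 ^ d1 ^ d3 ^ d4) * 2 \
             + (p3 ^ d2 ^ d3 ^ d4) * 4
    if syndrome:                         # 0 means "looks clean"
        cw = cw[:]
        cw[syndrome - 1] ^= 1            # flip the bit the syndrome names
    return [cw[2], cw[4], cw[5], cw[6]]  # extract the 4 data bits

data = [1, 0, 1, 1]
cw = hamming74_encode(data)
cw[5] ^= 1                               # single-bit error in "storage"
assert hamming74_correct(cw) == data     # corrected back to the original
```

Note the shape of the trick: each check bit covers a different overlapping subset of positions, so the pattern of failed checks uniquely identifies a single bad bit - and that pattern falls straight out of a handful of XOR gates.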
One result of the design of the ECC code set is that each code set and its checking logic has (potentially, at least) different abilities to detect and correct errors, and different "blindnesses" - errors it can't see at all. Parity is a simple single-error-detection code: it can "see" all single-bit errors, but cannot correct any errors. Double-bit errors cannot even be seen by parity, because a two-bit error transforms one valid ECC word (the plain-text word plus the parity bit) into another valid ECC word in the code set, but a different word than the original plaintext. In fact, parity detects all errors involving an odd number of bit flips - one, three, five... - but is blind to all even numbers of bit flips.
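That blindness is easy to demonstrate. A minimal Python sketch, using a hypothetical 8-bit word as the data:

```python
def parity(word: int) -> int:
    """Even-parity bit: 1 if the word has an odd number of 1-bits."""
    return bin(word).count("1") % 2

data = 0b1011_0010            # hypothetical 8-bit plaintext word
stored_parity = parity(data)  # stored alongside the data

# Single-bit error: the check catches it.
one_flip = data ^ 0b0000_0100
assert parity(one_flip) != stored_parity   # mismatch -> error detected

# Double-bit error: the result is another valid codeword - parity is blind.
two_flips = data ^ 0b0001_0100
assert parity(two_flips) == stored_parity  # check passes, data is wrong
```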
Adding more bits of ECC checking lets you start detecting more errors, cutting down that "blindness" to errors.
The ECC process used in computer RAM is designed to be single-error-correcting, double-error-detecting (often abbreviated SECDED). What it does for three, four, or more errors depends on exactly what coding is done in the ECC process. It may or may not detect triple errors, quadruple errors, and so on.
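One classic way to get that single-correct/double-detect behavior is a Hamming code plus one extra parity bit over the whole codeword. Here's a self-contained toy sketch in Python on 4 data bits - a hypothetical miniature; real RAM ECC does the equivalent on 64-bit words in hardware:

```python
# Toy SECDED: Hamming(7,4) plus an overall parity bit = 8-bit codeword.
# Syndrome nonzero + overall parity bad  -> single error, correctable.
# Syndrome nonzero + overall parity OK   -> double error, detected only.

def encode(d1, d2, d3, d4):
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    cw = [p1, p2, d1, p3, d2, d3, d4]
    cw.append(sum(cw) % 2)        # overall parity over the 7 Hamming bits
    return cw

def check(cw):
    p1, p2, d1, p3, d2, d3, d4, pov = cw
    syndrome = (p1 ^ d1 ^ d2 ^ d4) + 2 * (p2 ^ d1 ^ d3 ^ d4) \
             + 4 * (p3 ^ d2 ^ d3 ^ d4)
    overall_bad = (sum(cw) % 2) != 0      # odd number of flips overall?
    if syndrome == 0 and not overall_bad:
        return "ok"
    if overall_bad:
        return "single error (correctable)"
    return "double error (detected, uncorrectable)"

cw = encode(1, 0, 1, 1)
one = cw[:]; one[2] ^= 1                  # one flipped bit
two = cw[:]; two[2] ^= 1; two[5] ^= 1     # two flipped bits
assert check(cw) == "ok"
assert check(one) == "single error (correctable)"
assert check(two) == "double error (detected, uncorrectable)"
```

The overall parity bit is what breaks the tie: any single flip makes whole-word parity fail, while any double flip leaves it passing but the syndrome nonzero - so the logic knows to report rather than "correct" to the wrong word. Beyond two flips, this little code can and will misclassify, which is exactly the caveat above.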
Heat-death-of-the-universe arguments are fun, but they depend on the chances of each bit being in error being independent. The business of speculating about what errors can happen and how many bits are likely to be in error (and, of course, being right about the speculations! :) ) supports many math and computer architecture professionals. What happens if, say, four bits are right next to each other on the RAM chip and a cosmic ray hits in the middle of the four, not just one? Or, in a more pedestrian possibility, what happens if one entire RAM chip dies? The chip design and computer architecture pros have considered such matters, and so both the layout inside the RAM chips and the ECC codes themselves are tweaked to minimize the exposures, but Mother Nature still has many ways to corrupt your data. For instance, one asteroid strike will likely get it all. :D
My own personal "right answer" is to do all I practically can to keep my data correct, but to spend only as much as the data is worth to do so. Backups are the first line of defense. Next is good quality hardware. Next is cleverness in selecting how your data is handled and stored. In all of this I feel that I need to stay flexible, because there is no The Answer, only more steps in the path.