One or more devices has experienced an error resulting in data corruption.

Status
Not open for further replies.

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
At $OldJob, we used to qualify embedded computers for operation from -40C to +85C. It was only that high because they were sealed, and therefore the internal temperature got up to 85 or so.

Yeah, I come from the land of hospital monitoring devices. It's a lot more entertaining when you ratchet up the paranoia levels to reach outside the normal environmental range.

-40C... something outdoorsy?
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
-40C... something outdoorsy?

Yeah, oil field equipment that quite commonly operated in northern Alberta and the territories. Because it was sealed, and could very well be in direct sunlight at 30-35C, its internal temperature could end up at 85.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
When I worked at a nuclear power plant we had a few embedded systems that were a little computer with something like rubber cement poured into the cavity so that it filled the entire "case". For us, that was part of the ratings needed to ensure proper operation under water, in high humidity (such as steam... whoops), and in other extreme conditions. It was pretty cool to see, but there was no fixing it. If it didn't work, you replaced the whole thing.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I remember when companies like Commodore used to fill their power supply bricks with epoxy, just to be sphincters.
 

SirMaster

Patron
Joined
Mar 19, 2014
Messages
241
Even if his RAM is bad, it doesn't necessarily mean the data on his pool is corrupted (even though ZFS thinks that it is).

Bad RAM can certainly trigger cksum errors to be logged to the pool during a scrub, but due to the precise way ZFS is designed, it's pretty unlikely that any data corruption due to bad RAM would actually be committed to disk for files that were already on the disk before the RAM went bad.

Either the RAM has to be so horribly broken that I don't know how the system hasn't crashed yet, or you have to hit the unlikely scenario of a checksum collision, for existing data already on disk to be overwritten with corrupted data because of bad RAM.

In my time with ZFS I've certainly seen systems and pools that reported files as corrupt, but when the disks were transplanted into a non-faulting system, the files were recoverable and, when compared to the original files, were actually not corrupted at all. ZFS can and does report false corruption in the face of bad RAM.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
That's not necessarily true. ZFS does attempt to correct errors when a read results in a corrupt block.
 

SirMaster

Patron
Joined
Mar 19, 2014
Messages
241
That's not necessarily true. ZFS does attempt to correct errors when a read results in a corrupt block.

But that relies on a bad HDD. I was really referring to the case where the disks are working fine and the RAM is the only thing malfunctioning. The OP posted his SMART status, and there is no real reason at this time to think his HDDs are returning any errors.

If ZFS detects a block being read doesn't match the checksum, but the disk didn't return a hard read error, then ZFS will only repair the block in memory so it can serve the user a "repaired" copy of that block. It doesn't actually edit the block on-disk in a case such as this.

A read will only cause ZFS to overwrite a block on the disk if the read is accompanied by a URE.

Note that when I say overwrite, I mean write a new block and update the block pointer to that new block.

A scrub can update blocks on-disk, but only if ZFS knows that it found and verified an un-corrupted "copy" of the block to replace the block it thinks is corrupt with.

It works like this. The scrub reads a block into RAM, then it calculates the checksum and compares it with the checksum ZFS stored back when the block was originally written. Say the RAM is bad and it either corrupted the block as it was read into RAM, or corrupted the checksum as it was read into RAM. Either way, when ZFS compares the two it will say there is a cksum error since they don't match (even though the block on the disk is actually fine). ZFS will log this cksum error to the zpool status.

Now it will attempt to repair the data. First it reads in the parity or mirror data for the block and, from that, reconstructs a redundant "copy" of the block. It also reads in the checksum for this redundant copy that was stored when the mirror or parity blocks were first created. Then it compares the checksum it just computed for the redundant block with the checksum stored when it was originally written.

Two things can happen here. Either this redundant copy was read into good RAM this time and ZFS finds that it matches its original checksum (thus verifying the redundant copy is "good"), in which case it replaces the original "bad" data block with this newly verified good block.

Or, when ZFS reads in the redundant copy and its checksum, the bad RAM also corrupts one of them. When ZFS compares the two this time, it sees that they don't match either. In this case ZFS still doesn't overwrite anything on-disk, because it hasn't yet found a verified good copy to use. It then checks the next redundant copy (a higher n-way mirror, or RAIDZ2/Z3).

If all of these additional copies also get corrupted as they are read into RAM, then ZFS will ultimately abort the repair operation for that block and will not overwrite anything on-disk. It will then report that there are corrupt files in the pool that it cannot repair. But in reality, the original block is still on the disk and is in fact not corrupt, even though ZFS thinks it is.
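To make the decision flow easier to follow, here's a rough Python sketch of what I'm describing. It's purely illustrative pseudocode, not actual ZFS code; the helper names and the random bit-flip standing in for bad RAM are made up for the example:

import hashlib
import random

def cksum(block: bytes) -> str:
    # Stand-in for the block checksum ZFS stored at write time (fletcher4/sha256 in reality).
    return hashlib.sha256(block).hexdigest()

def read_into_ram(block: bytes, ram_is_bad: bool) -> bytes:
    # Simulate bad RAM sometimes flipping a bit as the block lands in memory.
    if ram_is_bad and random.random() < 0.5:
        return bytes([block[0] ^ 0x01]) + block[1:]
    return block

def scrub_block(on_disk_block, stored_cksum, redundant_copies, ram_is_bad):
    """Model of the scrub/repair decision flow described above (not real ZFS internals)."""
    in_ram = read_into_ram(on_disk_block, ram_is_bad)
    if cksum(in_ram) == stored_cksum:
        return "block verified, nothing to do"

    print("cksum error logged to zpool status")       # possibly a false positive
    for copy, copy_cksum in redundant_copies:         # mirror halves / RAIDZ reconstructions
        copy_in_ram = read_into_ram(copy, ram_is_bad)
        if cksum(copy_in_ram) == copy_cksum:
            # Only a copy that verifies against its own checksum ever gets written back.
            return "repair: block rewritten from verified copy"

    # No copy verified (bad RAM mangled every read): the on-disk block is left alone.
    return "reported as unrecoverable, but the on-disk data may actually be fine"

# Example: scrub a perfectly healthy block through bad RAM a few times.
block = b"original data"
copies = [(block, cksum(block))]
for _ in range(3):
    print(scrub_block(block, cksum(block), copies, ram_is_bad=True))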
 

devnullius

Patron
Joined
Dec 9, 2015
Messages
289
It feels to me like that is exactly what's happening here: a haunted consumer PC with some small but fatal error 'somewhere'. I'd love to hear whether importing the disks into another machine would solve all the problems. I must say I really like what you wrote down here! Gives a secure feeling too :)

Thanks!

Devvie
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
But that relies on a bad HDD. I was really referring to the case where the disks are working fine and the RAM is the only thing malfunctioning. The OP posted his SMART status, and there is no real reason at this time to think his HDDs are returning any errors.

If ZFS detects a block being read doesn't match the checksum, but the disk didn't return a hard read error, then ZFS will only repair the block in memory so it can serve the user a "repaired" copy of that block. It doesn't actually edit the block on-disk in a case such as this.

A read will only cause ZFS to overwrite a block on the disk if the read is accompanied by a URE.

My understanding of this is that ZFS will 'self-heal' the disks whenever it detects checksum errors. It doesn't matter if this happens via a regular CIFS read of the file or via a scrub.

Take a two-drive mirror. If you manually erase some sectors from one of the disks, then reimport the pool and read back a file that occupied some of those blocks, ZFS will realize that one of the disks is returning bad data. It will then read the data from the other disk, which will correctly match the checksum. The good data is returned to the client, and also written back to the 'bad' disk so that full redundancy is restored.

This is one of the points of a self-healing filesystem. But you do have to trust the memory subsystem, because if the memory corrupts the data, and not the disk, ZFS has no way to know.
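If it helps, here's a tiny Python model of that read path. It's just an illustration of the idea, not ZFS internals; the in-memory "disks" and the function names are invented for the example:

import hashlib

def cksum(block: bytes) -> str:
    # Stand-in for the checksum ZFS stored when the block was originally written.
    return hashlib.sha256(block).hexdigest()

def read_from_mirror(disks, offset, length, stored_cksum):
    """Illustrative self-healing read from a mirror (RAM is trusted here)."""
    copies = [bytes(d[offset:offset + length]) for d in disks]
    good = next((c for c in copies if cksum(c) == stored_cksum), None)
    if good is None:
        raise IOError("no copy matches the checksum: unrecoverable")
    # Self-heal: any copy that failed verification is rewritten with the good data.
    for disk, copy in zip(disks, copies):
        if copy != good:
            disk[offset:offset + length] = good
    return good  # the client only ever sees verified data

# "Manually erase some sectors" on one half of a two-drive mirror, then read back.
data = b"important file block"
disk_a, disk_b = bytearray(data), bytearray(data)
stored = cksum(data)
disk_a[:] = b"\x00" * len(data)            # corrupt one disk, not the RAM
print(read_from_mirror([disk_a, disk_b], 0, len(data), stored))
print(disk_a == disk_b)                    # True: the bad half was healed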

This was gone through step by step in one of Jeff Bonwick and Bill Moore's presentations on ZFS.

I think this is about where in the video they describe it:

https://youtu.be/NRoUC9P1PmA?t=57m45s

Edit: Specifically time 1:00:15 is where he describes the 'bad' disks data being updated with good data after an application read. 57:45 is where the self healing section starts though.
 
Last edited:

SirMaster

Patron
Joined
Mar 19, 2014
Messages
241
Yes, but the repair process ZFS uses is as I described whether it happens during a read or a scrub. It verifies that the redundant copy (the one ZFS wants to use to repair the original block it thinks is corrupt) is itself valid. If the redundant copy is not verified valid (because RAM also corrupted it), then it won't overwrite the original block. If it does check out as valid (the bad RAM didn't corrupt it that time), then ZFS overwrites the block with that data, whether or not the block it is fixing was actually bad in the first place.

My original point is just that bad RAM can also create false positive "corrupt" data as far as ZFS is concerned.

People readily accept that ZFS can't function entirely as it should if the RAM is bad; it's obvious that your data can't really be trusted anymore. But in the case of bad RAM, they for some reason still seem to fully trust "zpool status". What they fail to realize is that ZFS's bookkeeping, the metadata that tracks which blocks on disk are good and which aren't, can also be wrongly influenced by bad RAM. "zpool status" can easily be lying because it got confused by the bad RAM. When you have bad RAM, all sorts of funky ZFS failure scenarios can happen, including the case where the data is fine but ZFS incorrectly thinks it's corrupt.

At least in my experience debugging ZFS with faulty RAM, I saw ZFS log cksum errors and mark data as corrupt far more often than it actually corrupted data that was already good on-disk. Of course, new incoming data can easily be written corrupt from the start, and there isn't really anything reasonable ZFS can do about that.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Well, I have a stick of ECC RAM (server-grade stuff) that is bad. It has lots of multi-bit errors that should have triggered an MCE for the user, but for some reason didn't. I plan to put it in a test system (along with known-good RAM) so this kind of thing can be put to rest for good. ;)
 

SirMaster

Patron
Joined
Mar 19, 2014
Messages
241
Well, I have a stick of ECC RAM (server-grade stuff) that is bad. It has lots of multi-bit errors that should have triggered an MCE for the user, but for some reason didn't. I plan to put it in a test system (along with known-good RAM) so this kind of thing can be put to rest for good. ;)

Sweet. More data points on this stuff are great, as there aren't many people testing this kind of thing publicly. No doubt the Sun devs tested this stuff a LOT while developing ZFS, but it's not like we can simply read through all their old test results, unfortunately.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Yeah. I've been trying to get my hands on a stick of RAM that was ECC and bad so we can test the living heck out of it. I'm glad that the person was willing to ship it to me. More will come once I get some time to test this further. :D
 

devnullius

Patron
Joined
Dec 9, 2015
Messages
289
Well, please keep us posted in this thread with any results? :)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I will be making a thread based on my findings, because the discussion is *far* larger than this thread.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I will be making a thread based on my findings, because the discussion is *far* larger than this thread.
Looking forward to it. Some weird cases have left me a bit worried...
 
Joined
Nov 11, 2014
Messages
1,174
You, uh, are nowhere near 53% done. You might be 53% done with a single pass, but you need to run memtest for days or even weeks to be reasonably assured of detecting problem memory.

Do you run memtest86 v4 in BIOS mode or memtest86 v6 in UEFI mode?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I'm actually not too fussy. Since the systems I'm testing are usually ECC, the big thing is simply to get memtest pounding on the memory subsystem as hard as it can. All CPUs, all memory.
 
Joined
Nov 11, 2014
Messages
1,174
I'm actually not too fussy. Since the systems I'm testing are usually ECC, the big thing is simply to get memtest pounding on the memory subsystem as hard as it can. All CPUs, all memory.

By default it runs on one CPU core unless the user changes it. Are you saying it's better to run it on all cores?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
By default it runs on one CPU core unless the user changes it. Are you saying it's better to run it on all cores?

Yes, it generates more traffic to the memory subsystem and exercises more parts of the system. The goal is to tease out any flaws in the system. Driving it harder as part of the burn-in process means that it shouldn't have any trouble shouldering a lighter NAS load later.
 