Rob N. said: The "write" part of the file change can be anything - write, clone, fill, etc. What may make a difference is the relative speed of those operations, and obviously a clone is much faster than a write. There may also be a second bug in cloning that contributed to this that we haven't found yet. This is part of the reason that cloning is still disabled in 2.2.2. (Emphasis added.)

@HarambeLives here's my analysis of how the bug impacts copies: The bug does not corrupt data as it is being written (e.g., by cp). Instead, the corruption occurs when ZFS tells an application the wrong thing at read time (saying there is a "hole" in a file when, in fact, there is data there).
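To make the read-time failure concrete, here is a minimal sketch of the "hole aware" copy loop that sparse-file-aware tools (GNU cp among them) build on lseek(SEEK_DATA)/lseek(SEEK_HOLE). The helper name and the trimmed error handling are mine, not coreutils code:

```c
/*
 * Minimal sketch of a "hole aware" copy loop, in the spirit of what
 * sparse-file-aware tools do with lseek(SEEK_DATA)/lseek(SEEK_HOLE).
 * Illustrative only; error handling abbreviated.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

static int copy_sparse(int in, int out)
{
    char buf[1 << 16];
    off_t end = lseek(in, 0, SEEK_END);
    off_t pos = 0;

    while (pos < end) {
        /* Ask the filesystem where the next data region starts... */
        off_t data = lseek(in, pos, SEEK_DATA);
        if (data < 0)
            break;              /* ENXIO: only a trailing hole remains */
        /* ...and where it ends. */
        off_t hole = lseek(in, data, SEEK_HOLE);
        if (hole < 0)
            return -1;

        /*
         * Only [data, hole) is copied; the gap before it is recreated
         * as a hole in the destination by seeking past it. If the
         * filesystem *wrongly* reports a hole here, the real source
         * bytes are never read and the copy silently gets zeros.
         */
        if (lseek(out, data, SEEK_SET) < 0)
            return -1;
        for (off_t off = data; off < hole; ) {
            size_t want = sizeof(buf);
            if ((off_t)want > hole - off)
                want = (size_t)(hole - off);
            ssize_t n = pread(in, buf, want, off);
            if (n <= 0)
                return -1;
            if (write(out, buf, (size_t)n) != n)
                return -1;
            off += n;
        }
        pos = hole;
    }
    /* Preserve the overall length, including any trailing hole. */
    return ftruncate(out, end);
}
```

The failure mode follows directly: if a hole is misreported over real data, those bytes are never read at all, so no ZFS checksum is ever verified against them, and the destination is "correctly" written with zeros in that range.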
Hence the lack of checksum failures, because, if I understood correctly, when the bug hits, data is written correctly to some destination, even if that's corrupted data, and then at a later time that destination checksums OK when read once again and/or when scrubbed… correct?

What about zfs replications to e.g. another TrueNAS?

Put otherwise: The bug does not corrupt data that is already in the pool; it's a "read bug" which returns wrong data, which may result in corrupted copies being stored; the originals are safe.

Scrubs are of no use: The original is fine, but the copy was written with its own checksum and is "properly corrupted".
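Because scrub only verifies the pool's own (internally consistent) checksums, the practical way to find copies damaged this way is to compare each copy against its original while the original is still around. A minimal sketch of such a comparison; the program and its structure are illustrative only:

```c
/*
 * Minimal sketch: compare a copy against its original byte-for-byte,
 * since scrub cannot flag a copy that was "correctly" written with
 * wrong (zeroed) contents. Illustrative; error handling trimmed.
 */
#include <stdio.h>
#include <string.h>

static int files_differ(const char *orig, const char *copy)
{
    FILE *a = fopen(orig, "rb");
    FILE *b = fopen(copy, "rb");
    if (!a || !b)
        return -1;

    char ba[1 << 16], bb[1 << 16];
    int differ = 0;
    for (;;) {
        size_t na = fread(ba, 1, sizeof(ba), a);
        size_t nb = fread(bb, 1, sizeof(bb), b);
        if (na != nb || memcmp(ba, bb, na) != 0) {
            differ = 1;         /* length or content mismatch */
            break;
        }
        if (na == 0)
            break;              /* both hit EOF together: identical */
    }
    fclose(a);
    fclose(b);
    return differ;
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s original copy\n", argv[0]);
        return 2;
    }
    int d = files_differ(argv[1], argv[2]);
    puts(d ? "files differ" : "files match");
    return d;
}
```

Plain cmp(1) does the same job; the point is only that detection has to involve the original, not the pool's checksums.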
"What about zfs replications to e.g. another TrueNAS?"

Not affected by this bug.

How come? Because if the bug is about incorrectly reporting holes, wouldn't it stand to reason it could also occur while reading the blocks that make up a snapshot?

Replication works at the block level. The bug is about user space programs operating at the file/vnode interface actively using "hole aware" system calls. ZFS replication is completely file agnostic.
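To illustrate the distinction: the "hole aware" system calls are the file-level interface shown below, which is what the affected copy tools consult; zfs send/recv streams the dataset's blocks and never asks this question. A small illustrative probe that prints the data/hole map a file reports through lseek(2):

```c
/*
 * Minimal sketch: print the data/hole map the filesystem reports for
 * a file via lseek(SEEK_DATA)/lseek(SEEK_HOLE), the same "hole aware"
 * interface the affected copy tools rely on. Illustrative only.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 2;
    }
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    off_t end = lseek(fd, 0, SEEK_END);
    off_t pos = 0;
    while (pos < end) {
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data < 0)
            break;              /* nothing but a trailing hole left */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (data > pos)
            printf("hole: %lld..%lld\n", (long long)pos, (long long)data);
        printf("data: %lld..%lld\n", (long long)data, (long long)hole);
        pos = hole;
    }
    if (pos < end)
        printf("hole: %lld..%lld\n", (long long)pos, (long long)end);
    close(fd);
    return 0;
}
```

Replication, by contrast, streams the dataset's blocks directly and never consults this map, which is why send/recv streams of the original are not affected.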
Right, I understand that replication works at the block level. But, at the end of the day, supplying the data for a file that a userland tool intends to copy translates into reading the blocks that make up that file when the ZFS layer is asked for its data.

My understanding is that it occurs at the file system layer and not in ZFS' operation on blocks. Now that you keep asking, I wonder if I might be wrong ...

Well, I don't pretend to be too knowledgeable about ZFS internals, even if I consider myself experienced enough with the filesystem to the point of having managed to pull myself out of several rabbit… holes over the years (debugging pools a few times with zdb, migrating from GELI to ZFS native encryption via replication, etc.).
I was hoping for an earlier update shipping the current ZFS version as a "hotfix". Unraid, for example, already has ZFS 2.1.14.
"The original bug lasted 15 years without detection."

This is a very important point to keep in mind. In fact, nobody seems to have come forward with any "now that you mention it, back in the day I saw this in the wild" stories.