How OpenZFS Provides Data Integrity Where Other File Systems Don’t


July 17, 2015

The most important feature customers expect from a storage array is data integrity protection. This is why we base TrueNAS and FreeNAS on OpenZFS, the enterprise-grade, open source file system. Unfortunately, the file systems used by other vendors and projects rarely take the same precautions as OpenZFS and can blindly store and return corrupt data. The root causes of on-disk data corruption range from interrupted or “shorn” writes, with or without hardware RAID devices, to interference from cosmic radiation.

By checksumming data blocks upon write and verifying those checksums upon read, OpenZFS will never return corrupt data as if it were good data. In addition to extensive checksumming, OpenZFS is a “Copy on Write” file system that includes various redundancy strategies to guarantee the integrity of your data.

Addressing the Infamous “Write Hole”

When writing, modifying, or reading files, most traditional file systems and hardware RAID controllers assume that these operations succeed rather than fail. This can lead to a number of problems, including a false sense of security. To begin with, if a write to a file is interrupted mid-write by something like a power failure, the remaining data is simply lost and an incomplete file is left on disk. The file is available to users but is effectively corrupt.

To guard against this scenario, OpenZFS checksums every new data block when it is written and verifies that checksum whenever the block is read. If the checksum verification fails, the read operation fails and the user is presented with an error or the previous version of the file, rather than corrupt data. This strategy has the added benefit of revealing silent data corruption, which is critical for archive and backup storage arrays.

A CERN study showed that hard disks can exhibit a bit error or bad sector as often as once in every eight terabytes of data stored. Active storage arrays can transfer eight terabytes in a matter of weeks or even days, making this a common occurrence that we simply never notice until it is too late.

By verifying data block checksums with every read operation, OpenZFS will only return valid data. Should a redundant copy of the same data exist elsewhere, such as on a RAID-Z array, OpenZFS will not only return the valid copy but will also repair the invalid one.
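
The verify-on-read and self-repair behavior described above can be illustrated with a small sketch. This is not OpenZFS code: the `MirroredStore` class, its method names, and the use of in-memory dictionaries as "disks" are all hypothetical, and SHA-256 from Python's standard library simply stands in for whichever checksum algorithm a real pool would use.

```python
import hashlib


class ChecksumError(Exception):
    """Raised when no stored copy of a block matches its checksum."""


class MirroredStore:
    """Toy mirror (hypothetical): each dict stands in for one disk."""

    def __init__(self, copies=2):
        self.disks = [dict() for _ in range(copies)]
        # Checksums are kept apart from the data they describe, so a damaged
        # data block cannot also damage its own checksum.
        self.checksums = {}

    def write(self, block_id, data):
        # Record the checksum at write time, then store every copy.
        self.checksums[block_id] = hashlib.sha256(data).digest()
        for disk in self.disks:
            disk[block_id] = data

    def read(self, block_id):
        expected = self.checksums[block_id]
        good = None
        bad_disks = []
        for disk in self.disks:
            data = disk.get(block_id)
            if data is not None and hashlib.sha256(data).digest() == expected:
                good = data
            else:
                bad_disks.append(disk)
        if good is None:
            # Never hand back data that fails verification.
            raise ChecksumError(f"block {block_id!r}: no valid copy found")
        for disk in bad_disks:
            disk[block_id] = good  # repair the damaged copy from the good one
        return good
```

In this sketch, silently flipping a byte in one "disk" (for example `store.disks[0]["b0"] = b"hellp"`) is detected on the next `read`, which returns the intact copy and rewrites the damaged one. If every copy is damaged, the read raises an error instead of returning bad data.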

Furthermore, while a hardware RAID card may take precautions such as generating parity data for the data it stores, the write operation for that parity data could be interrupted even though the data blocks it represents were successfully written to disk. The result will be either immediately corrupt data or parity data that cannot successfully rebuild a failed member disk of the array. This scenario is most closely associated with RAID 5 storage arrays and is known as the “RAID 5 Write Hole”. It is important to note that this can also occur in RAID 4 and RAID 6 arrays, and even in RAID 1 mirrors, because of the data caching that takes place at various levels.
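
To make the Write Hole concrete, here is a minimal sketch using simple XOR parity over two data blocks, which is the basic idea behind RAID 5 parity. The block contents and variable names are made up for illustration, and a real controller works on disk sectors rather than short byte strings.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# A consistent stripe: two data blocks plus their parity block.
d0, d1 = b"AAAA", b"BBBB"
parity = xor_blocks(d0, d1)

# With consistent parity, a lost d1 can be rebuilt from d0 and the parity.
assert xor_blocks(d0, parity) == d1

# Now d0 is overwritten in place, but power is lost before the parity
# block is rewritten, so the parity on disk still describes the old d0.
d0_new = b"CCCC"
stale_parity = parity

# Later the disk holding d1 fails and is rebuilt from what is left.
rebuilt_d1 = xor_blocks(d0_new, stale_parity)
print(rebuilt_d1 == d1)  # False: the rebuild silently produces wrong data
```

Nothing in the stripe itself records that the parity is stale, so the array has no way to know that the rebuilt block is wrong. That gap is what block checksums are meant to close.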

How OpenZFS Eliminates the Write Hole Problem with Copy on Write

To provide this unprecedented level of data integrity protection while maintaining a high level of performance, OpenZFS organizes its on-disk data blocks in a special hash tree called a “Merkle tree” consisting of parent and child data blocks. Each parent block contains the metadata and checksum information of its child blocks. When a data block is modified, the original data always stays in place and the modified data is written to a new location. Only when the new block has been successfully written are the related parent blocks updated to reference it, all the way up to the top level of the tree.
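
The following sketch shows the Copy on Write idea in a heavily simplified form. It assumes a tree with only two levels (a single root holding pointers to leaf blocks), and the `CowStore` class, its methods, and the `crash_before_commit` flag are hypothetical names used only for illustration. The real on-disk structure has many more levels and details that are not modeled here.

```python
import hashlib


def checksum(data):
    return hashlib.sha256(data).hexdigest()


class CowStore:
    """Toy two-level tree: a root of (address, checksum) pointers to leaves."""

    def __init__(self, leaves):
        self.blocks = {}  # address -> bytes; an existing address is never overwritten
        self.root = tuple(self._write_new(leaf) for leaf in leaves)

    def _write_new(self, data):
        # Always allocate a fresh address instead of modifying in place.
        addr = len(self.blocks)
        self.blocks[addr] = data
        return (addr, checksum(data))

    def update(self, index, data, crash_before_commit=False):
        pointer = self._write_new(data)  # step 1: write the new block elsewhere
        if crash_before_commit:
            return  # simulated power loss: the root still references the old block
        # Step 2: only now switch the parent to the new block and checksum.
        self.root = self.root[:index] + (pointer,) + self.root[index + 1:]

    def read(self, index):
        addr, expected = self.root[index]
        data = self.blocks[addr]
        if checksum(data) != expected:
            raise IOError(f"checksum mismatch at address {addr}")
        return data
```

With `store = CowStore([b"old-a", b"old-b"])`, calling `store.update(0, b"new-a", crash_before_commit=True)` leaves `store.read(0)` returning the old, still-valid block, because the parent was never switched. After a completed `update`, the same read returns the new data, and the old block remains untouched at its original address.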

Conclusion

Hopefully you now have a better understanding of the steps OpenZFS takes to guarantee the integrity of your data and why you will never want to use a legacy file system or hardware RAID card again. Data corruption, whether caused by shorn writes, the Write Hole, or silent bit errors, occurs far more often than we realize, and most file systems simply take no measures to tell us that we have lost data. We base TrueNAS and FreeNAS on OpenZFS because it provides these unprecedented data integrity protection strategies. For more information on TrueNAS, visit staging-www.ixsystems.com:8084/truenas or call 1-855-GREP-4-IX.
