Sorry about the "?!" but that's about how I feel.
I've just been testing my FreeNAS install. It's on good hardware (SuperMicro, Xeon, 96 GB ECC, Chelsio+Intel), was clean installed on 9.10.2, then upgraded to the latest 11.0 via the GUI, and has never been configured or modded other than via the GUI. Not much functionality is in use at the moment either - one share type, SSH, no VMs or extensions/plugins/jails. It's sharing the data store via Samba to a Windows network.
The permissions/ACLs and Samba setup do have issues which I can't quite fix as I'm still hazy on a bunch of perms+ACL stuff, but that's a whole different problem, and wouldn't cause this problem (they would affect file access not file content, meaning perhaps the share or some files/dirs couldn't be browsed/traversed/read/written as desired).
So anyhow, as part of testing the NAS, which I do periodically, I copied a directory of about 3000 photos and files (13 GB) from Windows 8.1 to the server and back. Of the 3000, about 500 had different hashes when I compared the originals to the round-trip copies. I've never seen that before. Things I've done to try and figure out what's going on:
- Does the corruption occur on saving to FreeNAS, reading from FreeNAS, or both? Checked by hashing the original data and final data on the workstation, and the intermediate copy on the server using "find . -type f -exec sha1 {} \; >> hashes.txt". Result - the corruption seems to be on the workstation -> file server trip, when FreeNAS receives and writes the files. (Workstation original != Server data; Server data == Workstation copy)
- Is the corruption always on the same files or the same changes? Copied the same dataset 3 times successively from client to server, into 3 dirs in the same parent dir on FreeNAS. Dirs named SHARE_ROOT/dir1 through SHARE_ROOT/dir3 to ensure no effect due to casing, parent dir, or dir name. Result - different files were affected each time, and the data in the 3 copies didn't always have the same hashes as each other.
- Is the workstation at fault? Repeated using 2 other networked Windows machines. Same kinds of results.
- Are there any reported integrity errors on the FreeNAS server? Checked server integrity using GUI, and ran a scrub on the boot/system volume. Result - both passed, no errors detected.
- Is it related to long filenames, long paths, casing, or unusable characters? Seems not: the photos have standard ASCII filenames < 30 chars long and folders not nested deeply. As the folders are newly created the files won't have multiple copies with different cases, and the count of files and their names in each dataset copy is identical.
- Is it related in any way to file system metadata or ADS on the Windows side? Unlikely - if this were a problem then the hashes would still all be the same on the server, because they're computed the same way from static data at rest, even if the server-side hash was different from the hashes I got on Windows. (I've also used hash checking for years and never found metadata or ADS to be included in the hash that's computed on Windows, whatever software I use).
- Is the network reliable? Not entirely sure how best to check this. I can't see any obvious signs of errors, and SSH does its own per-packet integrity checking, which would be sensitive to network issues I guess. Any suggestions on how to test the network end-to-end, in case of something weird, would be worthwhile.
- Does the issue seem to be linked to a specific file transfer/share mechanism? Repeat using (for example) WinSCP or sharing via NFS instead of Samba, and see if the data is reliable or also corrupted? I should do this, and will add it above when done.
- Any consistency to the corruption when it happens? For example, single bytes, truncation, always a similar block or anything else in common, when checked with a hex editor? Not done yet, can do if needed.
- Clean install FreeNAS and import the data drives + config? Can do. A bit apprehensive though.
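In case it helps anyone reproduce the hash comparison, here is a minimal Python sketch of the idea: hash every file in two directory trees and list the files whose hashes differ. It is not the exact tooling used above, the function names are hypothetical, and it assumes both trees are reachable as ordinary paths from the machine running it.

```python
import hashlib
import os
import sys


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of one file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root, '/' separated) to its SHA-1."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            manifest[rel] = sha1_of_file(full)
    return manifest


def compare_manifests(original, copy):
    """Return sorted lists: (hash mismatches, missing from copy, extra in copy)."""
    mismatched = sorted(p for p in original if p in copy and original[p] != copy[p])
    missing = sorted(p for p in original if p not in copy)
    extra = sorted(p for p in copy if p not in original)
    return mismatched, missing, extra


# Hypothetical usage: python compare_trees.py ORIGINAL_DIR COPY_DIR
if __name__ == "__main__" and len(sys.argv) == 3:
    bad, gone, added = compare_manifests(
        build_manifest(sys.argv[1]), build_manifest(sys.argv[2])
    )
    for label, paths in (("MISMATCH", bad), ("MISSING", gone), ("EXTRA", added)):
        for rel_path in paths:
            print(label, rel_path)
```

Running it once per copy (dir1, dir2, dir3) against the originals would show whether the same files are affected each time.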
I've never heard of FreeNAS (or indeed anything FreeBSD) having an issue like this. The data on it seems safe, but right now I'm scared to trust my NAS until this is resolved. What's going on?
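For the hex-level check that hasn't been done yet, a rough Python sketch of how the corruption pattern could be summarized instead of eyeballing it in a hex editor: it reports each run of differing bytes as (offset, length), plus both file sizes so truncation shows up. The function name is hypothetical and this is only one way to do it.

```python
import os


def differing_runs(path_a, path_b, chunk_size=1 << 20):
    """Return (runs, size_a, size_b).

    runs is a list of (offset, length) spans where the two files differ,
    considering only the length they have in common.
    """
    runs = []
    run_start = None  # offset where the current differing run began
    offset = 0        # absolute offset of the current chunk
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            block_a = fa.read(chunk_size)
            block_b = fb.read(chunk_size)
            common = min(len(block_a), len(block_b))
            if common == 0:
                break
            if block_a[:common] == block_b[:common]:
                # Whole chunk matches: close any run left open by the last chunk.
                if run_start is not None:
                    runs.append((run_start, offset - run_start))
                    run_start = None
            else:
                for i in range(common):
                    if block_a[i] != block_b[i]:
                        if run_start is None:
                            run_start = offset + i
                    elif run_start is not None:
                        runs.append((run_start, offset + i - run_start))
                        run_start = None
            offset += common
    if run_start is not None:
        runs.append((run_start, offset - run_start))
    return runs, os.path.getsize(path_a), os.path.getsize(path_b)
```

If the runs turn out to be single bytes, or fixed-size blocks at aligned offsets, or the sizes differ, that would say something about where in the chain the damage happens.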