Reliability of a single disk ZFS volume

NASbox · Aug 24, 2017

TL;DR:
Is ZFS suitable for single disk use, or does ZFS make total data loss more likely than file systems like NTFS/FAT?

Why I'm asking / what I am trying to achieve and understand:
I need some major space for nightly backup images. I plan on setting up 2 separate 1 drive pools (BPOOL1/BPOOL2) and imaging my desktop systems nightly on a rotating basis. That would give me 2 copies of my data. If one pool gets corrupted/disk dies, I would have another copy from the prior day on the other pool. This data will only be accessed if something happens to one of the desktop systems.
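For illustration only, the nightly rotation described above could be driven by a small helper like the following. This is a sketch, not something posted in the thread: the pool names BPOOL1 and BPOOL2 come from the post, while the function name `target_pool` and the idea of alternating on the calendar day number are assumptions.

```python
from datetime import date

# Pool names as given in the post; everything else here is hypothetical.
POOLS = ("BPOOL1", "BPOOL2")


def target_pool(day, pools=POOLS):
    """Return the pool that should receive the backup taken on `day`."""
    # date.toordinal() increases by exactly one per calendar day, so the
    # remainder alternates between pools even across month and year ends.
    return pools[day.toordinal() % len(pools)]


print(target_pool(date.today()))
```

Alternating on the day number rather than the day of the month avoids writing to the same pool twice in a row when a month has an odd number of days.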

Can someone comment on the relative data security of a single disk ZFS pool vs keeping the same data on a NTFS/FAT32 volume?

ZFS can detect bit rot whereas NTFS can't, but is there a downside to ZFS over other file systems?

While I know it is possible to lose a whole disk (head crash/motor/electronics failure), my experience has been that I usually get an error on one or two files before the whole drive goes, and I can copy the data off the rest of the disk.

What about ZFS? If the disk happens to develop a bad spot and one or two files get corrupted, does this prevent access to the rest of the volume, or would I be able to copy the data off the rest of the disk? In the "old days" I used a brain-dead tape backup drive that could not work past an error. If there was a tape read error near the start of the tape, everything after that point would be lost. Is ZFS reasonably fault tolerant, or does it depend on redundant disks to prevent total data loss? (Obviously redundancy is required to protect individual files from corruption, but I'm OK with chancing that.)

Thanks in advance for any advice/assistance.

Arwen · Aug 24, 2017

Yes, using ZFS on single disks has its uses and benefits.

ZFS always stores 2 copies of metadata (like directory entries) and 3 copies of critical metadata. Thus, ZFS is pretty fault tolerant of simple bad blocks. It will tell you when it detects bad blocks that are in use. If there is no redundancy (plain data blocks when copies=1), then those file pieces are gone. But if there is redundancy, it will fix them automatically. With NTFS/FAT32, by contrast, you never know whether the data files are good unless you have an outside source of verification.
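To make the detect-versus-repair distinction concrete, here is a toy model in Python. It is not how ZFS is implemented: SHA-256 stands in for whatever checksum the pool uses, and `read_block` is a made-up name. It only shows why a checksum plus a second copy allows recovery, while a checksum with a single copy allows detection only.

```python
import hashlib


def checksum(data):
    """Checksum of a block (SHA-256 is a stand-in chosen for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def read_block(copies, expected):
    """Return the first stored copy whose checksum matches `expected`.

    Raises IOError when every copy is damaged, which is the situation
    for a plain data block stored with a single copy.
    """
    for data in copies:
        if checksum(data) == expected:
            return data
    raise IOError("all copies failed checksum verification")


good = b"directory entry"
damaged = b"directory entrx"  # one corrupted byte

# Two copies: the damaged one is detected and the good one is returned.
print(read_block([damaged, good], checksum(good)))

# One copy: the damage is detected, but nothing is left to recover from.
try:
    read_block([damaged], checksum(good))
except IOError as exc:
    print("unrecoverable:", exc)
```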

I use 4 single disks for backups using ZFS. Two are 2.5" 320GB for client backups (3 Linux clients). Then 1 x 3.5" 8TB for my entire FreeNAS backups, and 1 x 3.5" 750GB for my FreeNAS backup which does not include media files. Thinking back to when I made the decision to use ZFS, I would rather lose the entire disk (because I have multiple backups) than have corrupt data. Of course, I trust ZFS not to corrupt the disk, but disks do go bad.

Just remember, if your main clients are MS-Windows, then you need a *nix server of some type to allow ZFS. FreeNAS, FreeBSD or Linux.

melloa · Aug 24, 2017

NASbox said:
That would give me 2 copies of my data

As you are doing two one drive volumes, I'd go with a mirror...

NASbox · Aug 24, 2017

Arwen said:
Yes, using ZFS on single disks has its uses and benefits.

ZFS always stores 2 copies of metadata (like directory entries) and 3 copies of critical metadata. Thus, ZFS is pretty fault tolerant of simple bad blocks. It will tell you when it detects bad blocks that are in use. If there is no redundancy (plain data blocks when copies=1), then those file pieces are gone. But if there is redundancy, it will fix them automatically. With NTFS/FAT32, by contrast, you never know whether the data files are good unless you have an outside source of verification.

I use 4 single disks for backups using ZFS. Two are 2.5" 320GB for client backups (3 Linux clients). Then 1 x 3.5" 8TB for my entire FreeNAS backups, and 1 x 3.5" 750GB for my FreeNAS backup which does not include media files. Thinking back to when I made the decision to use ZFS, I would rather lose the entire disk (because I have multiple backups) than have corrupt data. Of course, I trust ZFS not to corrupt the disk, but disks do go bad.

Just remember, if your main clients are MS-Windows, then you need a *nix server of some type to allow ZFS. FreeNAS, FreeBSD or Linux.

Hi Arwen

Thanks for the reply. I don't have any experience with disk problems on a single drive pool. I have had problems on RaidZ2, and I just swapped the drive and resilvered... easy once I understood the procedure. If I recall correctly, when an error was detected, the bad disk dropped offline. That's great for RaidZ2, but not so great for a single disk.

If there are redundant copies of the metadata, then it should be possible to read all the non-corrupted data as long as there is a mechanism to allow it. Will I still be able to access a single drive pool with a few corrupted files? (Assuming readable metadata and drive is recognized by BIOS/FreeNAS?)

What is the best way to copy BadDisk -> GoodDisk skipping/logging file names of files that have read errors?
It would be crazy to lose a multi-terabyte drive if there were only a few corrupted files.
I know there are programs like that for Windows; what about FreeBSD/FreeNAS? Suggestions?

This may be poorly charted waters as I would assume most pools are redundant.

My plan is to use FreeNAS as a centralized backup server to pull from the Windows clients. I just saw a reference to UrBackup ( https://www.urbackup.org ) in the FreeNAS forum. Looks like this may be a good program to use for multi-platform backup.

Stux · Aug 24, 2017

You know, the best way is to avoid the problem by having redundancy in the first place. Then you would only *need* to do this if your redundancy failed.

NASbox · Aug 24, 2017

Stux said:
You know, the best way is to avoid the problem by having redundancy in the first place. Then you would only *need* to do this if your redundancy failed.

I'm going back to the quote I've seen (don't remember who said it, likely several people) RAID is not BACKUP. These pools are essentially backup copies so in most cases more copies at different times is better than more copies of one backup.

Stux · Aug 24, 2017

NASbox said:
I'm going back to the quote I've seen (don't remember who said it, likely several people) RAID is not BACKUP. These pools are essentially backup copies so in most cases more copies at different times is better than more copies of one backup.

yes, raid is not backup, but a backup on a raid, is a backup.

personal example time, my primary nas backs up to my backup nas and to my offsite nas. All use RaidZ2 for redundancy. And snapshots for history. Offsite NAS has more duties than just being an offsite backup, so it backs up to the primary as well.

NASbox · Aug 25, 2017

Stux said:
yes, raid is not backup, but a backup on a raid, is a backup.

personal example time, my primary nas backs up to my backup nas and to my offsite nas. All use RaidZ2 for redundancy. And snapshots for history. Offsite NAS has more duties than just being an offsite backup, so it backs up to the primary as well.

I get your point, and that is an ideal way to handle backup. I don't have the luxury of going that far. Unless I'm dealing with a pool that is very unsafe (like a 2 disk striped volume, which has a much higher chance of failure than a single disk), I'm comfortable with a single disk pool, as long as it's equal to or better than storing the same data on an NTFS drive.

What I really need to know is how workable is a single drive pool with a minor defect (absent defective electronics, or a head crash that takes out the whole disk)?
How will the system react?
Are there utilities to get the readable files off?

As long as the ZFS pool will continue to function (just skipping the failed items), then my gut feeling is that backups taken at different times on different volumes would be better. RaidZ is going to mean I will need twice as many drives for the same amount of storage.

Stux · Aug 25, 2017

you should be able to copy the files which aren't corrupted as if they weren't corrupted.

When a pool detects an issue in a file, it's as easy as deleting the file to recover the pool.

Unless the corruption is in the pool itself. But the pool metadata is doubly or triply redundantly stored.

And you can use copies=2 to store two copies of everything on a single disk.
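To find out which files need deleting, `zpool status -v` lists files with permanent errors under its `errors:` line. The sketch below pulls those paths out of that text. The function names and the sample output are illustrative assumptions (the sample is hand-written, not captured from a real pool), and the exact layout may differ between ZFS versions.

```python
import subprocess

# Hand-written sample in the general shape of `zpool status -v` output.
SAMPLE = """\
  pool: BPOOL1
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
errors: Permanent errors have been detected in the following files:

        /mnt/BPOOL1/images/desktop1.img
        /mnt/BPOOL1/images/desktop2.img
"""


def files_with_permanent_errors(status_text):
    """Return the entries listed after the 'Permanent errors' line."""
    paths = []
    in_error_list = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("errors:"):
            in_error_list = "Permanent errors" in stripped
        elif stripped.startswith("pool: "):
            in_error_list = False  # the report for another pool begins
        elif in_error_list and stripped:
            paths.append(stripped)
    return paths


def pool_status(pool):
    """Run `zpool status -v` for one pool (needs ZFS and enough privileges)."""
    result = subprocess.run(["zpool", "status", "-v", pool],
                            capture_output=True, text=True, check=False)
    return result.stdout


print(files_with_permanent_errors(SAMPLE))
```

On a real system the parser would be fed `pool_status("BPOOL1")` instead of `SAMPLE`.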

Arwen · Aug 25, 2017

NASbox said:
...
Will I still be able to access a single drive pool with a few corrupted files? (Assuming readable metadata and drive is recognized by BIOS/FreeNAS?)
...

Yes.

NASbox said:
...
What is the best way to copy BadDisk -> GoodDisk skipping/logging file names of files that have read errors?
...

As @Stux said, simply erase the offending file and you can use any normal copy tool.
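If you would rather leave the damaged disk untouched and still get a list of what could not be read, a short script can do the skip-and-log copy. This is a minimal sketch under the assumption that reading a damaged file raises an `OSError` (an I/O error); the function name `copy_readable` and the log format are made up for illustration.

```python
import os
import shutil


def copy_readable(src_root, dst_root, log_path):
    """Copy every readable file under src_root into dst_root.

    Files whose copy raises OSError are skipped and their source paths
    are written to log_path, one per line. Returns the list of failures.
    """
    failed = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            try:
                shutil.copy2(src, dst)  # copies data and timestamps
            except OSError:
                failed.append(src)
                # Do not leave a truncated copy behind on the good disk.
                if os.path.exists(dst):
                    os.remove(dst)
    with open(log_path, "w") as log:
        for path in failed:
            log.write(path + "\n")
    return failed
```

A call such as `copy_readable("/mnt/BPOOL1", "/mnt/BPOOL2/rescue", "/tmp/unreadable.txt")` would use it; those paths are examples only. Directories that cannot be listed at all are silently skipped by `os.walk` in this sketch.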

Further, once you have erased the bad file, you can revive the disk. You see, without redundancy, no file system can recover bad blocks. But all modern disks have spare blocks, generally hundreds if not thousands of them. The way SATA (and old IDE/PATA) disks handle bad block sparing is on write: when you write to a block the disk knows is bad, it puts the newly written (and good) data in a spare block, then makes a note of the remapping for any future reads and writes. So, if your disk that experienced an error passes SMART short and long tests, and seems to have enough spare blocks, there is no reason not to continue using it.
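One way to keep an eye on the spare-block situation is to read the sector counters from the SMART attribute table that `smartctl -A` prints. The sketch below parses that table from text. The sample is hand-written for illustration, attribute names vary between drive vendors, and `sector_counters` is a made-up helper name, so treat the watched list as an assumption to adjust for your drive.

```python
# Attribute names commonly seen on ATA disks; vendors differ.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector",
           "Offline_Uncorrectable")

# Hand-written sample in the shape of the `smartctl -A` attribute table.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   064   055   000    Old_age   Always       -       36 (Min/Max 20/45)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       8
"""


def sector_counters(smartctl_text, watched=WATCHED):
    """Return {attribute_name: raw_value} for the watched attributes."""
    counters = {}
    for line in smartctl_text.splitlines():
        fields = line.split()
        # Data rows start with the numeric attribute ID; the raw value
        # is the tenth column.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in watched:
            try:
                counters[fields[1]] = int(fields[9])
            except ValueError:
                pass  # raw value is not a plain integer; ignore it
    return counters


print(sector_counters(SAMPLE))
```

A pending-sector count that returns to zero after the bad file is deleted and the space rewritten, with the reallocated count staying low, fits the recovery described above. Counts that keep climbing suggest the disk should be replaced.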

Last, as @Stux also said, you can have one dataset on the single disk pool that has copies=2 turned on. This dataset can be used for critical files, like wedding or children's photographs. While everything else can remain at copies=1.
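As a sketch of that setup, the helper below builds the `zfs create` command for a dataset with the `copies` property set. The dataset name `BPOOL1/critical` and the helper names are hypothetical. Keep in mind that `copies` only applies to data written after it is set, that each extra copy uses additional space for that dataset, and that it does not protect against losing the whole disk.

```python
import subprocess


def create_dataset_cmd(dataset, copies=2):
    """Build the argument list for creating a dataset with extra copies."""
    if copies not in (1, 2, 3):
        raise ValueError("copies must be 1, 2 or 3")
    return ["zfs", "create", "-o", "copies={}".format(copies), dataset]


def run(cmd):
    """Execute the command (needs ZFS installed and enough privileges)."""
    subprocess.run(cmd, check=True)


# Hypothetical dataset for irreplaceable files on the single-disk pool.
print(" ".join(create_dataset_cmd("BPOOL1/critical")))
```

Calling `run(create_dataset_cmd("BPOOL1/critical"))` would create the dataset; everything else on the pool stays at the default of one copy.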
