USB Drive as offline backup

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
In practical terms, it's unlikely to get to that extreme, since there is functionality that depends on having an NTFS driver (importing data).
 

harsh

Dabbler
Joined
Feb 6, 2024
Messages
32
The only supported file system for TrueNAS is ZFS
Since support for NTFS (and possibly extFS) is built into the kernel that TrueNAS Scale uses, I think it is safe to assume that it will remain there for a very long time.

I can't stress enough that I'm not interested in hearing about ZFS, as it isn't relevant to writing to my NTFS-formatted USB drive.

I'm seeking information on why writing to my backup drive is glacial, not how wonderful (or flawed) ZFS is. Please stick to the subject.
 

kiriak

Contributor
Joined
Mar 2, 2020
Messages
122
I use snapshot replication to a USB HDD (ZFS and encrypted), so I have a complete backup of my data along with its previous states, PLUS the benefits of ZFS like data verification.

I also back up my data with a Windows backup program to an NTFS (also encrypted) USB HDD, just in case I mess up the replication procedure, and as an easy-to-read backup for my kids in case I'm no longer around.

Both are kept offsite.
I think it is the best backup solution for home users if a second offsite backup server is not available.

As for data verification with file hashes using third-party utilities: I have been there, and it is too much work to do right. That was the trigger for me to go to Synology (for BTRFS) and afterwards to come here to TrueNAS and ZFS.
 

harsh

Dabbler
Joined
Feb 6, 2024
Messages
32
As for data verification with file hashes using third-party utilities: I have been there, and it is too much work to do right. That was the trigger for me to go to Synology (for BTRFS) and afterwards to come here to TrueNAS and ZFS.
Hashing is a feature of rsync, so it probably isn't fair to claim that it is "too much work" to get some level of certainty.
 

kiriak

Contributor
Joined
Mar 2, 2020
Messages
122
Hashing is a feature of rsync, so it probably isn't fair to claim that it is "too much work" to get some level of certainty.
By hashing I mean verifying the data integrity of my source data and also my backups.
Many backup methods do verification using hashes.
But what about corruption before and after the backup procedure? That is the "too much work" that ZFS now does for me.
 

harsh

Dabbler
Joined
Feb 6, 2024
Messages
32
By hashing I mean verifying the data integrity of my source data and also my backups.
My source data is on a ZFS volume. I'm only interested in whether the backed-up data has the same hash as the source.

The integrity of the data has absolutely nothing to do with my question. My question is why my backups are taking so long.

Can we concentrate on my question?
 

kiriak

Contributor
Joined
Mar 2, 2020
Messages
122
It has to do with the integrity of the backup data, and with backup speed as well,
since you wrote
I can't imagine why one would use ZFS on a USB drive.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
If I remember correctly, ALL foreign file systems are being deprecated in TrueNAS. That would include NTFS and the "import data" function. That is why I assumed ZFS on the external drive, and persisted with that thought until @harsh was clear about using NTFS.

The reason "import data" is being deprecated, (again IF I REMEMBER CORRECTLY), is that file attributes are not fully maintained. Thus, any attempt to use the files afterwards in a share would require manual intervention to potentially change owner, group, permissions and or ACLs.

Now whether deprecating "import data" is a good decision or bad, is beyond my knowledge.


If I am wrong about the deprecating of "import data" function, feel free to correct me. I will not be offended for being wrong.
 

CJRoss

Contributor
Joined
Aug 7, 2017
Messages
139
The only supported file system for TrueNAS is ZFS; plan for any others to be turned off and locked out at some point. If you want to use a backup drive formatted with another file system, I would recommend setting up a workflow where you connect the drive to some other machine and transfer the data via a network protocol. The other option is a single-device pool formatted with ZFS; this is what I use for my external backup.

I would expect that one could consider something like ext4 to be supported nearly indefinitely, but I'll agree that Windows-based file systems would be a gamble and not something to risk.
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
If I remember correctly, ALL foreign file systems are being deprecated in TrueNAS. That would include NTFS and the "import data" function. That is why I assumed ZFS on the external drive, and persisted with that thought until @harsh was clear about using NTFS.

The reason "import data" is being deprecated, (again IF I REMEMBER CORRECTLY), is that file attributes are not fully maintained. Thus, any attempt to use the files afterwards in a share would require manual intervention to potentially change owner, group, permissions and or ACLs.

Now whether deprecating "import data" is a good decision or bad, is beyond my knowledge.


If I am wrong about the deprecating of "import data" function, feel free to correct me. I will not be offended for being wrong.
I have used the import feature from an NTFS-formatted drive, and some files or folders couldn't be migrated to ZFS due to special characters in their names. I gave up on that use case after that, as it wasn't reliable.

By hashing I mean verifying the data integrity of my source data and also my backups.
My source data is on a ZFS volume. I'm only interested in whether the backed-up data has the same hash as the source.

The integrity of the data has absolutely nothing to do with my question. My question is why my backups are taking so long.

Can we concentrate on my question?
Probably because you are not using ZFS replication. Though, to be fair, most of my iocage root datasets crawl to a stop during replication.
If you are using rsync, each file being copied needs to be read in order to calculate the hash, so the more data you have, the more time it will take. This has a compounding effect, which doesn't exist with ZFS when performing incremental replication.
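
For reference, a minimal sketch of what that incremental flow looks like, assuming a source dataset tank/data and a destination pool named backup (both names are placeholders):

Code:
# One-time full send of the initial snapshot:
zfs snapshot tank/data@base
zfs send tank/data@base | zfs receive backup/data

# Later runs only transmit the blocks changed since the previous snapshot:
zfs snapshot tank/data@next
zfs send -i tank/data@base tank/data@next | zfs receive backup/data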
 

harsh

Dabbler
Joined
Feb 6, 2024
Messages
32
An update on things I've tried to obtain acceptable performance:

I stopped the in-progress rsync, remounted the NTFS-formatted USB drive using "mount -t ntfs /dev/sdi2 <dest>" instead of "ntfs-3g /dev/sdi2 <dest>", and the speed has picked up by a factor of 17 (now just under 24 MB/s). Still not what I was hoping for, but it will cut days off the backup.
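
For anyone else trying this, one way to confirm which NTFS driver is actually in use (with <dest> being the same mountpoint placeholder as above):

Code:
# fuseblk = ntfs-3g (FUSE, userspace, slow); ntfs or ntfs3 = in-kernel driver
findmnt -no FSTYPE <dest>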

I found this suggestion elsewhere because most seem to think that discussing ZFS ad nauseam will be the answer.

Until better information surfaces, this may help others in a similar situation.
 

CJRoss

Contributor
Joined
Aug 7, 2017
Messages
139
An update on things I've tried to obtain acceptable performance:

I stopped the in-progress rsync, remounted the NTFS-formatted USB drive using "mount -t ntfs /dev/sdi2 <dest>" instead of "ntfs-3g /dev/sdi2 <dest>", and the speed has picked up by a factor of 17 (now just under 24 MB/s). Still not what I was hoping for, but it will cut days off the backup.

I found this suggestion elsewhere because most seem to think that discussing ZFS ad nauseam will be the answer.

Until better information surfaces, this may help others in a similar situation.

Are the speeds you're seeing during the scan or transfer portion of rsync? Also, what rsync command are you using?

One of the benefits of using zfs snapshots instead is that you wouldn't need to do the comparison scan and instead would just have the transfer speed. I'm guessing a good part of the problem you're running into is all of the random access as rsync attempts to determine what's changed in order to update the drive.

I've never tried it, but you should be able to send a zfs snapshot to a file on the usb drive.

Code:
zfs send pool/fs@snap | gzip > backupfile.gz


That should give you higher speeds via a sequential write. However, I'm not entirely sure how that works for restore. You might have to provide every snapshot file in order to get a full restore. Like I said, it's not something I've ever tried.
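
If it helps, I'd expect the restore direction to be roughly the reverse, assuming the restoring machine has ZFS and a pool named tank; an untested sketch:

Code:
# Recreate the dataset from the compressed stream
gunzip -c backupfile.gz | zfs receive tank/restored
# Any incremental streams would have to be received in order,
# on top of their matching base snapshots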

 

harsh

Dabbler
Joined
Feb 6, 2024
Messages
32
Are the speeds you're seeing during the scan or transfer portion of rsync?
During the actual writing of files to the USB drive. There are no writes during the scan phase and no existing files on the destination to read.
Also, what rsync command are you using?
rsync -acv
One of the benefits of using zfs snapshots instead is that you wouldn't need to do the comparison scan and instead would just have the transfer speed. I'm guessing a good part of the problem you're running into is all of the random access as rsync attempts to determine what's changed in order to update the drive.
There's nothing "changed". The files aren't on the destination. Mine is the worst-case scenario for an incremental backup approach. Have you ever tried to do a ground-up restore from incremental backups?
I've never tried it, but you should be able to send a zfs snapshot to a file on the usb drive.
I want an archival backup that isn't tied to Truenas or ZFS. I may decide to go another direction if performance is this poor on something as simple as a full backup. I'm pretty sure I can get much better archival performance from other operating systems: I archived this server once before with a similar setup when it was running Windows Server 2016, and it went a lot faster. My goal here is to get Truenas Scale to at least approach that level of performance.

I've checked the CPU usage and it is barely above idle.

Please, please, please -- no more theories about ZFS on the destination. If I can't figure out how to make a portable backup that doesn't take weeks, my Truenas experiment will be over. I've been stuck with backups that I couldn't read on other systems before (previously because of tape formats) and I felt like all my backup efforts were wasted.

I'm contemplating doing some testing with a different USB adapter that might have better support from Truenas Scale if it is a driver issue.
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
During the actual writing of files to the USB drive. There are no writes during the scan phase and no existing files on the destination to read.

rsync -acv

There's nothing "changed". The files aren't on the destination. Mine is the worst-case scenario for an incremental backup approach. Have you ever tried to do a ground-up restore from incremental backups?

I want an archival backup that isn't tied to Truenas or ZFS. I may decide to go another direction if performance is this poor on something as simple as a full backup. I'm pretty sure I can get much better archival performance from other operating systems: I archived this server once before with a similar setup when it was running Windows Server 2016, and it went a lot faster. My goal here is to get Truenas Scale to at least approach that level of performance.

I've checked the CPU usage and it is barely above idle.

Please, please, please -- no more theories about ZFS on the destination. If I can't figure out how to make a portable backup that doesn't take weeks, my Truenas experiment will be over. I've been stuck with backups that I couldn't read on other systems before (previously because of tape formats) and I felt like all my backup efforts were wasted.

I'm contemplating doing some testing with a different USB adapter that might have better support from Truenas Scale if it is a driver issue.
You stated in your earlier posts that you are using a Dell server with the USB drive attached to it, and you also said you are using a desktop. So I would suggest you experiment with rsync only between the server and the desktop internal drive over LAN. This way you exclude the USB interface altogether. Let's see what kind of performance you can get.
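
Something along these lines would do it, with the host and paths as placeholders:

Code:
# Push a test directory from the TrueNAS box to the desktop over SSH;
# --stats prints throughput numbers at the end
rsync -av --stats /mnt/tank/data/testdir/ user@desktop:/tmp/rsync-lan-test/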
 

harsh

Dabbler
Joined
Feb 6, 2024
Messages
32
So I would suggest you experiment with rsync only between the server and the desktop internal drive over LAN.
That's my next step. My power is going to be out tomorrow morning so I don't want to interrupt the progress of the current rsync session.

To truly determine if TrueNAS' USB driver support is the problem, I should hook the same USB drive up to a desktop computer and compare that with the speeds of transfer to a local drive. In any event, I remain hopeful that there's a better way that isn't going to require additional computers and mass quantities of LAN bandwidth.
 

CJRoss

Contributor
Joined
Aug 7, 2017
Messages
139
During the actual writing of files to the USB drive. There are no writes during the scan phase and no existing files on the destination to read.

rsync -acv

There's nothing "changed". The files aren't on the destination. Mine is the worst-case scenario for an incremental backup approach. Have you ever tried to do a ground-up restore from incremental backups?

I want an archival backup that isn't tied to Truenas or ZFS. I may decide to go another direction if performance is this poor on something as simple as a full backup. I'm pretty sure I can get much better archival performance from other operating systems: I archived this server once before with a similar setup when it was running Windows Server 2016, and it went a lot faster. My goal here is to get Truenas Scale to at least approach that level of performance.

I've checked the CPU usage and it is barely above idle.

Please, please, please -- no more theories about ZFS on the destination. If I can't figure out how to make a portable backup that doesn't take weeks, my Truenas experiment will be over. I've been stuck with backups that I couldn't read on other systems before (previously because of tape formats) and I felt like all my backup efforts were wasted.

I'm contemplating doing some testing with a different USB adapter that might have better support from Truenas Scale if it is a driver issue.

rsync still needs to scan all of the files before it starts copying them, especially if it's calculating checksums. That's why I asked.
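
One rough way to separate the two phases is a dry run; -n (--dry-run) is a standard rsync flag, and the paths here are placeholders:

Code:
# Walks the file list and decides what would be copied, but writes nothing;
# compare its runtime against the real run to isolate the scan cost
time rsync -acvn /mnt/tank/data/ /mnt/usb-backup/data/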

Also, my suggestion didn't involve ZFS at the destination. I was just pointing out a possibly faster way to back up via snapshot to a file.

Have you tried testing rsync from one pool to another within TrueNAS? Hopefully you can easily add a single drive to do so. That way you can determine the maximum rsync performance with no other variables such as usb, network, etc.
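
A minimal version of that test, assuming a spare pool mounted at /mnt/testpool (names are placeholders):

Code:
# Same flags as the real job, but pool to pool, so no USB or network involved
rsync -acv --stats /mnt/tank/data/ /mnt/testpool/rsync-test/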

I know that I've had rsync take a while when doing an initial copy, and that was ext to ext. I've also run into problems compressing and uncompressing to/from USB drives, both with ext and NTFS.
 

harsh

Dabbler
Joined
Feb 6, 2024
Messages
32
Also, my suggestion didn't involve ZFS at the destination. I was just pointing out a possibly faster way to back up via snapshot to a file
It would defeat the purpose of making a portable archive of the files if I had to build a ZFS-capable machine to read an all-inclusive snapshot.
Have you tried testing rsync from one pool to another within TrueNAS? Hopefully you can easily add a single drive to do so.
This also defeats the purpose, as it produces a ZFS-formatted hard drive. Portability is an absolute must when archiving.
I know that I've had rsync take a while when doing an initial copy, and that was ext to ext. I've also run into problems compressing and uncompressing to/from USB drives, both with ext and NTFS.
Whether it is rsync or any other application, it is going to take a while to do a full dump. The issue here is that "while" was amounting to weeks.
 

CJRoss

Contributor
Joined
Aug 7, 2017
Messages
139
It would defeat the purpose of making a portable archive of the files if I had to build a ZFS-capable machine to read an all-inclusive snapshot.

This also defeats the purpose, as it produces a ZFS-formatted hard drive. Portability is an absolute must when archiving.

Whether it is rsync or any other application, it is going to take a while to do a full dump. The issue here is that "while" was amounting to weeks.

Please take five minutes to go back and read what I wrote instead of assuming my intent. I'll restate it since you seem to have missed it.

Have you tried testing rsync from one pool to another within TrueNAS? Hopefully you can easily add a single drive to do so. That way you can determine the maximum rsync performance with no other variables such as usb, network, etc.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
The normal way I use RSync is:

Code:
rsync -aAHSXxv --delete --stats SOURCE DESTINATION

Using "c" causes excessive reads because RSync will need to read each and every file, on BOTH sides, before determining via checksum what blocks in a file to update. This may not mater when the source & destination are separated by a slow network link.

A better method is to use size plus date & time stamp, as I've shown above. If the source & destination files have the same size, plus date & time stamp, then skip. The caveat is that both source and destination need to use the same date & time stamping. I've had trouble with FAT32 date & time stamps.
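
For that caveat, rsync has a standard option for filesystems with coarse timestamp resolution; whether a given NTFS mount needs it is something to verify:

Code:
# Treat modification times within 1 second as equal, avoiding needless
# re-copies to FAT/NTFS targets with coarser timestamps
rsync -aAHSXxv --delete --stats --modify-window=1 SOURCE DESTINATION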

Last, the options I use for RSync are for *nix sources and destinations. An NTFS destination may do things differently.
 