Corruption on backup and rsync

oguruma

Patron
Joined
Jan 2, 2016
Messages
226
I am curious how silent corruption on a backup destination is handled by rsync-based backup tasks.

Suppose your RAID-Z NAS backs up to another JBOD NAS, and a file on the JBOD backup gets corrupted. Does the corruption get fixed at the next rsync run?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Only if you run rsync in checksum ("-c") mode, which makes rsync read the entire file on both sides and compare checksums. Normally, rsync only looks at the size and modification time of each file, which will not catch corruption. That, of course, is nice and fast; checksum mode is inherently very slow and I/O-intensive.
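
To make the difference concrete, here is a minimal sketch; the paths and the "backup" hostname are placeholder assumptions. The first command trusts size and mtime, the second reads every byte on both ends, and adding "-n -i" (dry run, itemize) reports what checksum mode would re-transfer without actually changing anything:

#! /bin/sh -
# quick pass: compares size and mtime only; fast, but misses silent corruption
rsync -a /mnt/tank/photos/ backup:/mnt/backup/photos/
# checksum pass, dry run: reads and checksums every file on both sides,
# then itemizes the files that differ without modifying the destination
rsync -acni /mnt/tank/photos/ backup:/mnt/backup/photos/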
 

oguruma

Patron
Joined
Jan 2, 2016
Messages
226
jgreco said:
Only if you run rsync in checksum ("-c") mode, which makes rsync read the entire file on both sides and compare checksums. Normally, rsync only looks at the size and modification time of each file, which will not catch corruption. That, of course, is nice and fast; checksum mode is inherently very slow and I/O-intensive.

For, say, 24 TB of data and a nightly backup, would checksum mode's I/O load really make a meaningful difference to the rest of the system?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Depends on what you're doing with the system, I guess.

If you had a JBOD NAS with 12 drives, that might give you a theoretical maximum of about 12 * 200 MBytes/sec = 2.4 GBytes/sec of I/O, which suggests you could read a 24 TB dataset in 10,000 seconds, or about an eighth of a day. But the fact of the matter is that most systems do not read at anywhere near their theoretical maximum, so if it were only reading at one eighth of that, you would be hard-pressed to complete one nightly rsync cycle before starting the next.
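
A quick back-of-the-envelope check of those figures; the drive count and per-drive throughput are assumptions you would adjust for your hardware:

#! /bin/sh -
# 12 drives * 200 MB/s = 2400 MB/s aggregate; 24 TB = 24,000,000 MB (decimal)
echo "best case:      $((24000000 / 2400)) seconds"   # 10000 s, ~2.8 hours
echo "at 1/8th speed: $((24000000 / 300)) seconds"    # 80000 s, ~22 hours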
 

oguruma

Patron
Joined
Jan 2, 2016
Messages
226
jgreco said:
Depends on what you're doing with the system, I guess.

If you had a JBOD NAS with 12 drives, that might give you a theoretical maximum of about 12 * 200 MBytes/sec = 2.4 GBytes/sec of I/O, which suggests you could read a 24 TB dataset in 10,000 seconds, or about an eighth of a day. But the fact of the matter is that most systems do not read at anywhere near their theoretical maximum, so if it were only reading at one eighth of that, you would be hard-pressed to complete one nightly rsync cycle before starting the next.

I see... My photo collection is growing, and I am looking for more economical ways to back up to spinning rust that don't require redundancy on the backup side, while still avoiding silent corruption on that side.

As of now, I back up 18-ish TB of data to a separate RAID-Z NAS.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
oguruma said:
I see... My photo collection is growing, and I am looking for more economical ways to back up to spinning rust that don't require redundancy on the backup side, while still avoiding silent corruption on that side.

As of now, I back up 18-ish TB of data to a separate RAID-Z NAS.

Thanks, some context is extremely helpful.

In general, rsync is insanely frigging fast at crawling a filesystem tree and doing date/size-based comparisons.

You might want to consider writing a couple of little scripts that run from cron and do mutex locking against each other: one that does your regular, very frequent rsync to back up the data, and one that runs weekly to validate the backup with checksums. You probably don't want these to run at the same time, so you serialize them with a shared lock, something like

#! /bin/sh -
# frequent backup: quick size/mtime comparison; -t 0 makes lockf exit
# immediately if another run already holds the lock
lockf -t 0 /tmp/mylockfile rsync -a <src> <dst>

and then

#! /bin/sh -
# weekly validation: -c rereads and checksums every file on both sides,
# re-transferring anything that silently differs on the destination
lockf -t 0 /tmp/mylockfile rsync -ac <src> <dst>

And call these scripts from cron. Now this isn't really intended as a fully featured solution, just an example of a technique...
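
If it helps, a sketch of what the cron side might look like; the script names and schedule here are placeholders, not a recommendation:

# /etc/crontab entries: quick backup hourly, checksum validation Sunday at 03:00
0 * * * * root /root/backup-quick.sh
0 3 * * 0 root /root/backup-verify.sh

With lockf -t 0 in the scripts, whichever job starts while the other is still running simply exits instead of stacking up behind it.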
 
Joined
Oct 22, 2019
Messages
3,641
Wouldn't corruption of a particular image file be detected by the other NAS, assuming it uses ZFS (scheduled scrubs, attempting to read the data, etc.)?
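
For reference, checking for that kind of corruption on a ZFS pool looks like this; a minimal sketch, assuming the backup pool is named "backup":

# start a scrub on the backup pool, then check progress and any errors found
zpool scrub backup
zpool status -v backup

The catch is that without redundancy on that pool, ZFS can detect the corruption but has nothing to repair it from; an rsync -c pass from the healthy source is what actually fixes the file.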
 