Backup Basics: via SMB or is this a 'Replication'?

NumberSix

Contributor
Joined
Apr 9, 2021
Messages
188
Hi
I am a fairly low tech user of TrueNAS Core. Mine consists of two mirrored 8TB drives (for a theoretical 16 in total). I have organised them as one pool with 3 datasets.

My (on a budget) plan is to buy a third 8TB drive, mount it in an external drive enclosure and use this as backup. How should I best go about the backup? My instinct is to connect it to my Windows 10 PC and send files to it from TrueNAS via SMB, but that's my instinct because I know how to do that! I wonder though if there's a better way - one that preserves everything on the TrueNAS drives as they are now, including the ZFS file system? Is that 'replication'? Or would I need a second TrueNAS machine to do that? Help! At the end of the day, it's my files that are paramount, and ZFS could be rebuilt from scratch, but as I plainly am not aware of all the issues, I wonder what the 'best' way of doing this is? Thank you for your thoughts.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
@Arwen has a resource for you:

And there's a previous thread with screenshots. Basically, you create a pool on the single external drive, create a (recursive) replication task to replicate all relevant datasets (possibly the top dataset to catch them all in one go), run this task manually, and then export the backup pool (without destroying it!!!).
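For readers curious what those GUI steps do underneath, here is a rough command-line sketch. It only prints the commands for review and runs nothing. The pool names `tank` and `backup`, the device `da3` and the snapshot label `manual-1` are placeholders (assumptions, not taken from this thread), and on TrueNAS the GUI remains the usual route.

```shell
#!/bin/sh
# Print (do not run) the commands behind "create pool, replicate, export".
# All names passed in are placeholders chosen for illustration.
backup_plan() {
    src_pool=$1      # pool to back up, e.g. tank
    dst_pool=$2      # pool on the external drive, e.g. backup
    device=$3        # external disk, e.g. da3
    label=$4         # snapshot label, e.g. manual-1

    # 1. One-disk pool on the external drive (this wipes that drive).
    echo "zpool create $dst_pool $device"
    # 2. Recursive snapshot of the top dataset and everything below it.
    echo "zfs snapshot -r $src_pool@$label"
    # 3. Replicate the whole tree; -u leaves the received datasets unmounted.
    echo "zfs send -R $src_pool@$label | zfs receive -u $dst_pool/$src_pool"
    # 4. Cleanly detach the backup pool; export does not erase any data.
    echo "zpool export $dst_pool"
}

backup_plan tank backup da3 manual-1
```

After the export the drive can be unplugged. A later `zpool import backup` (same placeholder name) would bring the pool back for the next run.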
 

kiriak

Contributor
Joined
Mar 2, 2020
Messages
122
I am also a home user without IT background.

I use both ways and keep 3 encrypted USB disks at my job's place. I bring them home one by one and do the backups.

I see advantages in both ways.
Replicated snapshots have all the info of my NAS datasets, the integrity checks of ZFS, are superfast and I think I forget sth.
Using a backup utility on my PC is easier for noobs like me: I see with my own eyes what is to be backed up (sometimes that is a lifesaver, as I have spotted errors in the setup of the backup or realized that sth is missing from the source), and it is easier for my kids to find the data if I go away.

Whatever method you choose, make sure that you can restore from the backup and if the backup is encrypted share the keys with a couple of people that you trust and they do not have access to the backup disks.
 

NumberSix

Contributor
Joined
Apr 9, 2021
Messages
188
and then export the backup pool (without destroying it!!!).
Hi
Thank you for that/those resources. I shall read them a few times before stepping through them. I have very little idea what I'm doing so I hope on hope I don't screw up. At one point for example, @Arwen says something about 'CD' (which I know means change directory - that's about my level!) to a particular directory ending in '[USER]'. I realise it may be scandalous, but I have no idea what to substitute in USER since, when I set up TrueNAS and added various things like Syncthing and Grafana and one or two others, everything gets to have a user name. Personally, I always access TrueNAS via the GUI and am, I think, an unrestricted account. No idea what the CLI might think that is though. Hmm. I might be talking myself into using Windows via SMB here - which would be a shame, because I really want to back up the whole environment - Grafana and friends as well. That's my general rumination - but I did have one specific question. You end by saying "then export the backup pool (without destroying it!!!)" - can you tell me, how do I export the backup pool, and what pratfalls are to be avoided so's not to destroy anything? Note, I imagined I could, after copying, power down and unplug my USB connected external drive, then restart. I know the USB will be painfully slow, but speed is less important than DEFINITELY having a backup.
 

NumberSix

Contributor
Joined
Apr 9, 2021
Messages
188
I am also a home user without IT background.

I use both ways and keep 3 encrypted USB disks at my job's place. I bring them home one by one and do the backups.

I see advantages in both ways.
Replicated snapshots have all the info of my NAS datasets, the integrity checks of ZFS, are superfast and I think I forget sth.
Using a backup utility on my PC is easier for noobs like me: I see with my own eyes what is to be backed up (sometimes that is a lifesaver, as I have spotted errors in the setup of the backup or realized that sth is missing from the source), and it is easier for my kids to find the data if I go away.

Whatever method you choose, make sure that you can restore from the backup and if the backup is encrypted share the keys with a couple of people that you trust and they do not have access to the backup disks.
Hi
I thought I did have an IT background - until I ran into TrueNAS! So please tell me, what is 'sth' which you mention a couple of times? Also, you seem to say that you 'replicate snapshots'. While I have no idea how to do that, are you saying it's a complete form of backup? If so, what advantage does it have over creating a pool on an external drive then replicating the files to it? Also, if your method is used, what would be needed to restore from it, and how can one verify the integrity of your backup (the non-Windows kind)? Thank you!
 

kiriak

Contributor
Joined
Mar 2, 2020
Messages
122
You can do the snapshot replication to a USB disk using the GUI; no CLI is needed.
 

kiriak

Contributor
Joined
Mar 2, 2020
Messages
122
Hi
I thought I did have an IT background - until I ran into TrueNAS! So please tell me, what is 'sth' which you mention a couple of times? Also, you seem to say that you 'replicate snapshots'. While I have no idea how to do that, are you saying it's a complete form of backup? If so, what advantage does it have over creating a pool on an external drive then replicating the files to it? Also, if your method is used, what would be needed to restore from it, and how can one verify the integrity of your backup (the non-Windows kind)? Thank you!

I mean I have no professional or academic IT background.

I think snapshot replication is the way to replicate my datasets to an external USB disk. Is there another way to replicate? I don't know. But the way I do it, I have my datasets and their snapshots on my USB disk.
But as I said, I also use the PC backup method via the SMB shares. This is the way to go until you are familiar with the replication method.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
So please tell me, what is 'sth' which you mention a couple of times?
"Something"?
(In capitals, it would be ServeTheHome.com, and its very informative forum.)

Also, you seem to say that you 'replicate snapshots'. While I have no idea how to do that, are you saying it's a complete form of backup? If so, what advantage does it have over creating a pool on an external drive then replicating the files to it?
Replication works on snapshots, and a replication task will create an ad hoc snapshot if needed.
Irrespective of replication, periodic snapshots are essential to preserve your data against user errors and ransomware attacks. You should have a snapshot policy in place :wink:
For instance, multiple snapshot tasks creating 'auto-mypool-YYYY-MM-DD_HH-MM'
  • monthly, retained for some years;
  • weekly, retained for two months;
  • daily, retained for two weeks;
  • hourly, retained for… you get the idea.
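As a small illustration of that naming pattern, the sketch below builds one such name and prints the manual snapshot command that would use it. The pool name `mypool` comes from the example above; the helper `snapshot_name` is made up for this sketch, and in practice the GUI's periodic snapshot tasks generate the names for you.

```shell
#!/bin/sh
# Build a snapshot name following the 'auto-mypool-YYYY-MM-DD_HH-MM' pattern.
# The timestamp is passed in so the function is easy to check by hand.
snapshot_name() {
    pool=$1
    stamp=$2    # e.g. "$(date +%Y-%m-%d_%H-%M)"
    echo "$pool@auto-$pool-$stamp"
}

# Show (do not run) the command a manual, recursive snapshot would use right now.
echo "zfs snapshot -r $(snapshot_name mypool "$(date +%Y-%m-%d_%H-%M)")"
```

Because the timestamp sorts chronologically, listing the snapshots by name also lists them in the order they were taken.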
A snapshot freezes a dataset at a point in time. As long as the snapshot is retained, the corresponding data is immutable—and could be restored if needed. A snapshot is not a backup in itself, but "replicating a snapshot" involves copying the data, so the replica on another system or drive becomes a backup.
Periodic snapshots are an incremental affair. If the destination already contains an earlier snapshot in the series, ZFS only transmits the difference, and ZFS knows the difference: it's in the snapshot metadata. So, while rsync or other backup programs would need to traverse your entire storage and your entire backup, and work out, file by file, what has changed before proceeding to back up, ZFS just knows what to do.

You can do it all in the GUI: Periodic snapshots and replication task.

Also, if your method is used, what would be needed to restore from it, and how can one verify the integrity of your backup (the non-Windows kind)? Thank you!
 

NumberSix

Contributor
Joined
Apr 9, 2021
Messages
188
Replication works on snapshots, and a replication task will create an ad hoc snapshot if needed.
Periodic snapshots are an incremental affair. If the destination already contains an earlier snapshot in the series, ZFS only transmits the difference
So am I right in thinking then that in order to have certainty that your replicated snapshots, between them, give all the data required to fully restore a system, it's vital you have a) the very first snapshot, and b) every single intermediate snapshot, up to the latest one? A corollary of b then would be that if there was a 'hole' in the middle of the snapshot collection, where, say, one month's worth of snapshots (in a monthly snapshot cycle) were missing for any reason, then it would only be possible to restore as far as the most recent differential backup prior to the date of loss? Forgive me if I labour my question, but I can't find a clearer, less plodding way of asking this as I suspect I lack a confident vocabulary with which to discuss this differently.

Aside from seeking a better understanding of snapshots, the practical reason I ask this is that I was assuming I should delete ALL snapshots from my system, then do an immediate manual top down recursive snapshot, prior to replicating it all to a USB drive. The intention being to have as succinct a full set as possible, and to sidestep the issue where I miss 'something from the middle' of a collection, as I described above.

Lastly, assuming I can set up an automated replication (monthly sounds fine to me as my data changes very little; I might add a new film and a couple of (music) albums over the course of a month, not much more than that), and on the date and time of the next scheduled replication, either the NAS itself was powered down, or the external USB disk was powered down, how would the system handle it? Wait until the next month, then back up its backlog and the current snapshots, or would it commence replication next time everything was powered on?

Cheers!
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Not sure if this answers your question satisfyingly but I am also not quite sure what you are trying to say.

Any snapshot of a ZFS dataset or a recursive hierarchy of datasets contains all data present exactly at the time the snapshot was taken.
So if you miss a couple of snapshots in a chain of them over a longer period of time you will simply not be able to "go back" to that particular moment.

Create file A
Snapshot A
Create file B
Snapshot B
Create file C
Delete file B
Snapshot C

If you have only snapshots A and C you will not be able to restore file B. Files A and C will be there, but they will both be in the single snapshot C, anyway.
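The same sequence can be tried without touching a pool. The sketch below imitates it with plain directories, where each "snapshot" is simply a full copy made with `cp -R`. That is only a stand-in (real ZFS snapshots share blocks rather than copying them), but what you see inside each one matches the description above. The directory names are invented for the example.

```shell
#!/bin/sh
# Imitate the A/B/C sequence with plain directories.
# "Snapshots" here are full copies made with cp -R; real ZFS snapshots
# share blocks instead of copying, but the visible contents match.
work=$(mktemp -d)
mkdir "$work/live"

touch "$work/live/A";  cp -R "$work/live" "$work/snap-A"   # Snapshot A
touch "$work/live/B";  cp -R "$work/live" "$work/snap-B"   # Snapshot B
touch "$work/live/C"
rm "$work/live/B";     cp -R "$work/live" "$work/snap-C"   # Snapshot C

# List what each stand-in snapshot holds.
for s in snap-A snap-B snap-C; do
    echo "$s: $(ls "$work/$s" | tr '\n' ' ')"
done
# The scratch directory $work can be deleted afterwards.
```

File B shows up only in `snap-B`, so losing that one snapshot loses the only way back to B, while A and C are both present in `snap-C` alone.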
 

NumberSix

Contributor
Joined
Apr 9, 2021
Messages
188
Any snapshot of a ZFS dataset or a recursive hierarchy of datasets contains all data present exactly at the time the snapshot was taken.
Thank you! I understand what you are describing there exactly - um, except for that opening statement. Perhaps I'm being pedantic but I'd like to tie this down if I can. You say - 'Any' snapshot? I understand snapshots to be differential after the first one, so 'any' contains only differences - except the very first 'root' snapshot. So (in my understanding) it's not true to say that 'any snapshot contains all data present at the time it was taken'. Surely you mean 'Any later snapshot, when combined together with the original snapshot'? Thanks!
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Thank you! I understand what you are describing there exactly - um, except for that opening statement. Perhaps I'm being pedantic but I'd like to tie this down if I can. You say - 'Any' snapshot? I understand snapshots to be differential after the first one, so 'any' contains only differences - except the very first 'root' snapshot. So (in my understanding) it's not true to say that 'any snapshot contains all data present at the time it was taken'. Surely you mean 'Any later snapshot, when combined together with the original snapshot'? Thanks!
Nope. To repeat myself: any snapshot contains all data present exactly at the time the snapshot was taken.

Try it:
Code:
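zfs create pool/data/set
# write some files
zfs snap pool/data/set@time-a
# create some more files
zfs snap pool/data/set@time-b

# each snapshot is browsable as a read-only directory
cd /mnt/pool/data/set/.zfs/snapshot/time-a
ls -l   # only the first batch of files
cd /mnt/pool/data/set/.zfs/snapshot/time-b
ls -l   # both batches: the complete state at time-b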


The "differential" nature only comes into play when you send the snapshots to a remote machine or a different pool. In that case if snapshot @time-a is already present at the destination you can instruct zfs send to send only the difference from snapshot @time-a to snapshot @time-b to save time and bandwidth.

Still all snapshots contain all data at their time of creation.
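To make the incremental case concrete, here is a sketch that prints (but does not run) the two commands involved, using the dataset and snapshot names from the example above. The destination `backup/set` is a placeholder I made up; substitute whatever pool and dataset hold your copy.

```shell
#!/bin/sh
# Print (do not run) an incremental replication between two snapshots.
incremental_plan() {
    dataset=$1   # e.g. pool/data/set
    older=$2     # snapshot already present at the destination, e.g. time-a
    newer=$3     # snapshot to bring the destination up to, e.g. time-b
    target=$4    # destination dataset, e.g. backup/set (placeholder)

    # Dry run first: -n sends nothing, -v reports the estimated size.
    echo "zfs send -n -v -i $dataset@$older $dataset@$newer"
    # The real transfer carries only the changes between the two snapshots.
    echo "zfs send -i $dataset@$older $dataset@$newer | zfs receive $target"
}

incremental_plan pool/data/set time-a time-b backup/set
```

The incremental receive only works if the destination still holds `time-a`. That is the sense in which the older snapshot matters for replication, even though each snapshot on its own shows the full state of the dataset.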
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
To be fully pedantic, snapshots do not "contain" the data; snapshots are metadata referencing the data. But "sending" or "replicating" a snapshot actually sends the referenced data along with the metadata.
The referenced data is immutable. The metadata can actually pass through.

Let's have a dataset and fill it with 1 TB of data.
Take snapshot auto-000. It references 1 TB, and maybe uses 10 MB on disk to hold the metadata. Replicating it to a backup server will take the time to transfer 1 TB over the network.
Take auto-001 the next day. It references 1 TB and takes no space. Replicating it to backup is instantaneous: ZFS knows there was no change.
Add/modify 100 GB of data. Snapshot auto-002 references 1.1 TB and uses e.g. 1 MB (metadata for the additional data; for the rest, auto-002 refers to auto-001, which refers to auto-000). Replicating it transfers 100 GB.
Delete 200 GB. Snapshot auto-003 references 0.9 TB and uses e.g. 2 MB to record the difference. Replicating it takes the time to transmit the 2 MB of metadata to mean "these files are no longer in the active dataset". But all data is still on disk: It is retained by auto-002 and earlier; the pool and its backup have the whole 1.1 TB.
Delete snapshot auto-000. 'auto-001' still holds the data, and now uses 10 MB of metadata on disk: It has taken over all relevant metadata from auto-000 instead of referring to it.
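The bookkeeping in that walkthrough can be redone with a few lines of shell arithmetic. This is only a tally under simplifying assumptions (1 TB taken as 1000 GB, the 100 GB treated as purely added data, metadata overhead ignored); the `step` helper is invented for the sketch and has nothing to do with ZFS itself. On a real system, `zfs list -t snapshot -o name,used,referenced` shows the actual figures.

```shell
#!/bin/sh
# Recompute the walkthrough in whole gigabytes (1 TB taken as 1000 GB).
# Metadata overhead (a few MB per snapshot) is ignored.
live=0      # data referenced by the active dataset
on_disk=0   # data the pool must keep while all snapshots exist

step() {    # step <snapshot> <GB added> <GB deleted>
    live=$((live + $2 - $3))
    on_disk=$((on_disk + $2))   # deleted blocks stay, held by older snapshots
    echo "$1: references ${live} GB, replication sends $2 GB, pool holds ${on_disk} GB"
}

step auto-000 1000 0
step auto-001 0 0
step auto-002 100 0
step auto-003 0 200
```

The last line reports 900 GB referenced but 1100 GB held on the pool, which is the point of the example: deleting files frees nothing while an older snapshot still references them.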
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
To be fully pedantic, snapshots do not "contain" the data; snapshots are metadata referencing the data.
The filesystem abstraction on top of the snapshot as presented in pool/data/set/.zfs/snapshot contains all the data. Better? :wink:
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
In my view the first order of business for backups is simplicity. Then comes nothing for a very long time.

You need to be able to explain your backup approach to someone who wakes you up in the middle of the night after a really good party. That's how well you should have understood and internalized things. Because when things go south you are under tremendous stress and very likely to make mistakes you would normally never make. Reducing that risk is the biggest factor in getting things up and running again.

Because the critical part is not the backup but the restore. My gut feeling tells me that for starters I would recommend using SMB to an encrypted USB hard disk with an easy-to-use backup software.

And don't forget that you should have at least 2 external disks, in case one gets fried together with your NAS due to a lightning strike while just performing the backup.
 

NumberSix

Contributor
Joined
Apr 9, 2021
Messages
188
Still all snapshots contain all data at their time of creation.
I see. Thank you for explaining that so cogently! So later snapshots are just descriptions of change while they reside on the origin machine (hence being small files), but transition into not mere descriptions but embodiment of that change (and large files), when copied to an external machine. Ingenious!
Thank you Patrick!!
 

NumberSix

Contributor
Joined
Apr 9, 2021
Messages
188
[Everything you said] @ChrisRJ
That's my thoughts nailed exactly! Thank you.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Because the critical part is not the backup but the restore. My gut feeling tells me that for starters I would recommend using SMB to an encrypted USB hard disk with an easy-to-use backup software.
Which easy to use backup software backs up a TrueNAS server to a USB hard disk via SMB? I do not understand your use case. I think we are considering TrueNAS? You place all your data from your Windows systems or Macs on your TrueNAS and you want to backup THAT. No?
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
@Patrick M. Hausen , one of my TrueNAS backups goes to a USB disk (I actually rotate 3) that is connected to a Windows client. I use SyncBack Pro there, which has worked well for me for a long time.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Waitaminute ... TrueNAS --> USB. OK. But what is SyncBack Pro? That runs on TrueNAS? And ... erm ... no ...

OK, so guessing:

A Windows machine is running this software, reading files from TrueNAS via SMB and writing them to a USB device connected to that Windows machine?

And this is better than having backups done by the TrueNAS server which is on 24x7 anyway, and has plenty of cycles to spare without any particular client system having to be up and running ... how exactly?

I want my backups to be set and forget and running automatically. SSD pool with jails and VMs to HDD pool. SSD pool with jails and VMs to HDD pool on a second system in a remote location. That's what ZFS is for :wink:

Seriously, I would never rely on a client system to perform a task that is best done by the server itself. All our >100 hosting machines in Frankfurt and near Nuremberg backup a nightly snapshot of all customer jails to a large storage system in Helsinki. And if that procedure fails for any jail for any reason I have an alarm in the morning.

I apply the same line of thinking to my private backups. Mac dumps everything on TrueNAS. TrueNAS takes care ... me no worry.
 