Some basic questions on TrueNAS replications

jms123 · Jan 21, 2023

Apologies for the basic questions, but I have done a fair bit of reading and cannot seem to find definitive answers one way or the other, so I thought I would post here.

I have a Hyper-V cluster with a TrueNAS backend iSCSI share and am testing replication to another TrueNAS server, and I have some basic questions about replication and snapshots.

The pool is 1TB in size and the zvol is approx 800GB in size (based on leaving 20% free) and from within Failover Cluster Manager the reported usage of the CSV is approx 400GB, so about half of the zvol.

So when I do the first replication a snapshot is going to be created to replicate to the backup TrueNAS server -

1) Is the size of the first snapshot going to be approx 400GB, i.e. the actual disk usage of the zvol?

2) Where is this snapshot stored, i.e. within the zvol or outside? If outside, there is obviously not enough space.

3) Can you delete snapshots on the source TrueNAS server once they have been sent to the backup server?

Because we use dynamic disks in Hyper-V, I am beginning to think TrueNAS replication is not going to work for us without a lot of management, and that a replica cluster using Hyper-V replication may be better. If anyone has experience of this, I would appreciate any insights.

Thanks

jgreco · Jan 21, 2023

jms123 said:
based on leaving 20% free

I get the feeling you didn't run across the guide on how to do this.

The path to success for block storage

It seems like I haven't written a sticky for awhile, but just in the last week I've had to cover this topic several times. ZFS does two different things very well. One is storage of large sequentially-written files, such as archives, logs, or data files, where the file does not have the middle...

www.truenas.com

Also, please provide some more details about your setup, including a hardware manifest, because there are a lot of unanswered questions here.

Whattteva · Jan 21, 2023

jms123 said:
1) Is the size of the first snapshot going to be approx 400GB, i.e. the actual disk usage of the zvol?

Snapshots aren't a backup. They just represent the state of the system at that exact point in time. As such, they are very lightweight and do not take up any space so long as the underlying data has not been mutated. So a fresh snapshot will take no space. Obviously, as you use the system, things will start to diverge and the snapshots will start taking up space from that point on.

jms123 said:
2) Where is this snapshot stored, i.e. within the zvol or outside? If outside, there is obviously not enough space.

As I've said earlier, it takes no space initially.

jms123 said:
3) Can you delete snapshots on the source TrueNAS server once they have been sent to the backup server?

Yes you can. However, you will want to keep at least 1 common snapshot that exists on both systems so your future backups will be incremental and lightweight instead of full system replications.
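To illustrate why a common snapshot matters, here is a minimal sketch of how a replication step could choose between a full and an incremental send. The function names, the dataset name `tank/vm` and the snapshot names are hypothetical, and this is not how TrueNAS itself implements replication; it only mirrors the `zfs send` and `zfs send -i` forms.

```python
from typing import List, Optional


def latest_common_snapshot(source: List[str], target: List[str]) -> Optional[str]:
    """Return the newest snapshot name present on both sides, or None.

    Both lists are assumed to be ordered oldest-first.
    """
    on_target = set(target)
    for name in reversed(source):
        if name in on_target:
            return name
    return None


def plan_send(dataset: str, source: List[str], target: List[str]) -> str:
    """Describe the send needed to bring the target up to the newest source snapshot."""
    newest = source[-1]
    base = latest_common_snapshot(source, target)
    if base is None:
        # No shared starting point: everything must be sent again.
        return f"full: zfs send {dataset}@{newest}"
    if base == newest:
        return "nothing to send"
    # Only the changes between the common snapshot and the newest one are sent.
    return f"incremental: zfs send -i {dataset}@{base} {dataset}@{newest}"


if __name__ == "__main__":
    src = ["auto-01", "auto-02", "auto-03"]
    print(plan_send("tank/vm", src, []))                      # first replication
    print(plan_send("tank/vm", src, ["auto-01", "auto-02"]))  # common snapshot kept
    print(plan_send("tank/vm", ["auto-03"], ["auto-01"]))     # common snapshot deleted
```

In the last case the source no longer holds any snapshot that the backup server also has, so the sketch falls back to a full send.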

jms123 said:
Because we use dynamic disks in Hyper-V, I am beginning to think TrueNAS replication is not going to work for us without a lot of management, and that a replica cluster using Hyper-V replication may be better. If anyone has experience of this, I would appreciate any insights.

Not actually sure what you're trying to ask here.

jms123 · Jan 21, 2023

No, I had not seen that, and it is very helpful, thanks.

We are using multiple mirrored vdevs as recommended, but if I am understanding you correctly, are you saying that when we created the zvol to share out we should only have used 50% of the pool?

In terms of hardware we are using a Dell 720 with 2 TB SSDs in multiple vdevs as mentioned.

I appreciate the help, but from reading the link (and I have only read your initial sticky so far) I am not seeing anything relating to replication and snapshots?

jms123 · Jan 21, 2023

So a fresh snapshot will take no space. Obviously, as you use the system, things will start to diverge and the snapshots will start taking up space from that point on.

I don't really follow that. In my example the zvol has already got approx 400GB of data in it so surely the first snapshot cannot be empty otherwise what am I replicating to the backup server ?

Did you mean the snapshot would be empty on a newly created pool/zvol ?

I am just trying to understand with the initial snapshot on an existing zvol with data already in it would the snapshot be the size of the data in the zvol ?

Whattteva · Jan 21, 2023

jms123 said:
I don't really follow that. In my example the zvol has already got approx 400GB of data in it so surely the first snapshot cannot be empty otherwise what am I replicating to the backup server ?

Snapshots themselves don't take up any space. When you replicate to another server, you're just transferring the initial chunk of data (which in your case is, yes, 400 GB). However, the snapshot itself will still take up nothing on the backup server initially. As you work and write/delete stuff, you will notice that the snapshot starts to grow and take up more space. Basically, a snapshot just keeps track of what has changed from the initial point of reference.

jms123 said:
Did you mean the snapshot would be empty on a newly created pool/zvol ?

The replicated server would just be an identical copy of what you have currently on your original server, which is the initial chunk of data/files plus a snapshot, which initially takes up no space.

jms123 said:
I am just trying to understand with the initial snapshot on an existing zvol with data already in it would the snapshot be the size of the data in the zvol ?

No, a snapshot is the size of the DIFFERENCE between what was in the pool at the point the snapshot was taken and what is currently in the pool after the snapshot was taken.

This distinction is what makes snapshots very lightweight and near instant to generate and what also enables future backups to be incremental and very fast.
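The difference between what a snapshot occupies and what a replication transfers can be sketched with plain sets of block ids. This is a toy model under the assumption that one block stands for roughly 1 GB; the function names are made up for illustration and are not ZFS APIs.

```python
from typing import FrozenSet, Iterable, Set


def unique_to_snapshot(snap: FrozenSet[int], live: Set[int],
                       others: Iterable[FrozenSet[int]] = ()) -> int:
    """Count blocks referenced by `snap` and by nothing else.

    This is roughly what a snapshot's "used" figure reports.
    """
    shared = set(live)
    for other in others:
        shared |= other
    return len(snap - shared)


def full_send_blocks(snap: FrozenSet[int]) -> int:
    """A first (full) replication transfers every block the snapshot references."""
    return len(snap)


def incremental_send_blocks(old: FrozenSet[int], new: FrozenSet[int]) -> int:
    """A later replication transfers only blocks that appeared after `old`."""
    return len(new - old)


if __name__ == "__main__":
    live = set(range(400))           # 400 blocks of existing data
    snap1 = frozenset(live)          # fresh snapshot: same blocks as the live zvol
    print(unique_to_snapshot(snap1, live))   # space held only by snap1
    print(full_send_blocks(snap1))           # blocks sent by the first replication

    # Overwrite 10 blocks: copy-on-write allocates new block ids 400..409.
    live -= set(range(10))
    live |= set(range(400, 410))
    snap2 = frozenset(live)
    print(unique_to_snapshot(snap1, live, [snap2]))  # old blocks pinned by snap1
    print(incremental_send_blocks(snap1, snap2))     # blocks sent by the next replication
```

The fresh snapshot holds no space of its own, yet the first replication still has to move all 400 blocks, because the backup server has none of them yet.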

jms123 · Jan 21, 2023

Whattteva said:
Snapshots themselves don't take up any space. When you replicate to another server, you're just transferring the initial chunk of data (which in your case is, yes, 400 GB). However, the snapshot itself will still take up nothing on the backup server initially. As you work and write/delete stuff, you will notice that the snapshot starts to grow and take up more space. Basically, a snapshot just keeps track of what has changed from the initial point of reference.

The replicated server would just be an identical copy of what you have currently on your original server, which is the initial chunk of data/files plus a snapshot, which initially takes up no space.

No, a snapshot is the size of the DIFFERENCE between what was in the pool at the point the snapshot was taken and what is currently in the pool after the snapshot was taken.

This distinction is what makes snapshots very lightweight and near instant to generate and what also enables future backups to be incremental and very fast.

Thanks for that and it does make sense but it does not tie in with what I saw in my test lab.

So I had a 1 TB pool and an 800 GB zvol, which leaves approx 200GB free (outside of the zvol). I first created small VMs, so 2 VMs = approx 50GB, and I enabled replication; it replicated fine and created a snapshot. I then remapped the iSCSI portals in Hyper-V so the cluster was using the backup server, and the VMs were fine and had the data in them I expected.

I then reset the lab and created the same pool and zvol on the backup server but this time created 2 much larger VMs - approx 150GB each so 300GB in total.

When I tried to replicate it came back and said it couldn't because there was not enough storage space available but within the zvol there clearly was as none of it had been used at that time.

Which is why I was asking about the size of the initial snapshot and where it is stored because I couldn't work out why there wasn't enough space.

I understand that snapshots are in effect incrementals but made the assumption that just the first snapshot was in effect a copy of the data and that it was not stored within the zvol which is why it said it did not have enough space to replicate.

Whattteva · Jan 21, 2023

jms123 said:
Thanks for that and it does make sense but it does not tie in with what I saw in my test lab.

So I had a 1 TB pool and an 800 GB zvol, which leaves approx 200GB free (outside of the zvol). I first created small VMs, so 2 VMs = approx 50GB, and I enabled replication; it replicated fine and created a snapshot. I then remapped the iSCSI portals in Hyper-V so the cluster was using the backup server, and the VMs were fine and had the data in them I expected.

I then reset the lab and created the same pool and zvol on the backup server but this time created 2 much larger VMs - approx 150GB each so 300GB in total.

When I tried to replicate it came back and said it couldn't because there was not enough storage space available but within the zvol there clearly was as none of it had been used at that time.

Which is why I was asking about the size of the initial snapshot and where it is stored because I couldn't work out why there wasn't enough space.

I understand that snapshots are in effect incrementals but made the assumption that just the first snapshot was in effect a copy of the data and that it was not stored within the zvol which is why it said it did not have enough space to replicate.

You need to be more specific. How big is the pool on the backup server and how much is already on it?

jms123 · Jan 21, 2023

Whattteva said:
You need to be more specific. How big is the pool on the backup server and how much is already on it?

In my tests the pool was the same size ie. 1 TB and the zvol was the same size - 800 GB.

There was nothing on the pool/zvol at all, I had created it fresh so no space used.

Based on the "not enough space" error, I assumed that either -

1) there was not enough space on the source server to create the snapshot (assuming the first snapshot is a copy)

or

2) there was not enough on the backup server.

I am now wondering if I should set the lab up again just so I can post the exact error message as I still have the test VMs.

jms123 · Jan 21, 2023

Just set up the lab again and got the same error message, which is -

"Failed to snapshot TestVPS-Pool/TestVPS-Zvol@auto-2023-01-21_20-33: out of space"

The pool and zvol on the backup server are new so nothing is being used.

I have an approx 711 GB zvol on the prod TrueNAS and Hyper-V is saying that approx 300 GB of that space has been used.

Whattteva · Jan 21, 2023

Ah, I see. Unfortunately, this is a major disadvantage of zvols: snapshots behave this way on a zvol, whereas they don't on datasets. For this reason, I usually use normal VM disk image files instead of zvols, because those let you take a lightweight snapshot of the dataset as normal.
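One likely explanation for the "out of space" error, assuming the zvol was created non-sparse so that its `refreservation` is roughly equal to its size: the ZFS documentation states that when `refreservation` is set, a snapshot is only allowed if there is enough free pool space outside the reservation to hold the dataset's current `referenced` bytes. The sketch below applies that rule to the round numbers from this thread. The function name is hypothetical, and pool overhead and other datasets are ignored.

```python
def snapshot_allowed(pool_usable_gb: float, refreservation_gb: float,
                     referenced_gb: float) -> bool:
    """Approximate the space check applied when snapshotting a reserved zvol.

    With a refreservation set, a snapshot is only allowed if the free pool
    space outside the reservation can hold the data the zvol currently
    references. Pool overhead and other datasets are ignored here.
    """
    if refreservation_gb == 0:
        # Sparse zvol: no reservation, so this particular check does not apply.
        return True
    free_outside_reservation = pool_usable_gb - refreservation_gb
    return free_outside_reservation >= referenced_gb


if __name__ == "__main__":
    # Round numbers taken from the thread; real usable capacity will be lower.
    print(snapshot_allowed(1000, 800, 50))    # two small VMs, approx 50 GB
    print(snapshot_allowed(1000, 800, 300))   # two larger VMs, approx 300 GB
```

Under these assumptions the 50 GB case fits in the roughly 200 GB left outside the zvol, while the 300 GB case does not, which matches what was seen in the lab. On a real system the relevant figures can be read with `zfs get refreservation,referenced,usedbyrefreservation` on the zvol.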

jms123 · Jan 21, 2023

Ok thanks, but as I understand it I have to use a zvol if I want to use iSCSI for the cluster connectivity.

I am also still not sure why I don't have enough space because it isn't clear to me where the snapshots are being stored, if within the zvol there is indeed enough space.

Do you (or anyone else) know of any good documentation on zvol snapshots ie. how they behave, where they are stored etc. as I cannot at the moment understand how it is working it out.

jgreco · Jan 21, 2023

jms123 said:
I am also still not sure why I don't have enough space because it isn't clear to me where the snapshots are being stored,

Snapshots are not "stored". Without being totally technically accurate here, think about it like this: a block in ZFS can be used by one or more consumers, just like when you use a UNIX hardlink, where you have two or more filenames pointing at the same file contents (which therefore takes no additional space for the second filename and beyond).

When you take a snapshot, ZFS does a clever thing where it assigns the current metadata tree for the dataset (or zvol in your case) to a label. This happens almost instantaneously, because it's a very easy operation. It doesn't make a copy of the data. It just lets it sit where it was. However, because ZFS is a copy-on-write filesystem, when you write a NEW block to the zvol, a new block is allocated, the OLD block is not freed (because it is a member of the snapshot), and the metadata tree for the live zvol is updated to accommodate the new block. NO changes are made to the snapshot, which remains identical to the way it was when the snapshot was taken.

So it is really data from the live zvol which is "stored", and when you take a snapshot, it just freezes the metadata view of the zvol. You can then read either the live zvol or any snapshot you'd prefer. If this sounds like a visualization nightmare for the metadata, ... well, yeah.

When you destroy a ZFS snapshot, the system will then free blocks to which no other references exist.
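The copy-on-write behaviour described above can be modelled in a few lines. This is a deliberately simplified sketch, not how ZFS is implemented: the `Volume` class and its methods are invented for illustration, and real ZFS does not copy a mapping when it snapshots, it just keeps a reference to the existing metadata tree.

```python
from typing import Dict


class Volume:
    """Toy copy-on-write volume: logical offset -> physical block id."""

    def __init__(self) -> None:
        self.live: Dict[int, int] = {}                  # live metadata view
        self.snapshots: Dict[str, Dict[int, int]] = {}  # frozen metadata views
        self.refcount: Dict[int, int] = {}              # physical block -> referrers
        self._next_block = 0

    def _unref(self, block: int) -> None:
        self.refcount[block] -= 1
        if self.refcount[block] == 0:
            del self.refcount[block]                    # nothing points here: freed

    def write(self, offset: int) -> None:
        """Always allocate a new block; never overwrite in place."""
        new_block = self._next_block
        self._next_block += 1
        self.refcount[new_block] = 1
        old_block = self.live.get(offset)
        self.live[offset] = new_block
        if old_block is not None:
            self._unref(old_block)                      # freed only if no snapshot holds it

    def snapshot(self, name: str) -> None:
        """Label the current view; no data blocks are copied."""
        self.snapshots[name] = dict(self.live)
        for block in self.live.values():
            self.refcount[block] += 1

    def destroy(self, name: str) -> None:
        """Drop a snapshot and free blocks nothing else references."""
        for block in self.snapshots.pop(name).values():
            self._unref(block)

    def allocated(self) -> int:
        return len(self.refcount)


if __name__ == "__main__":
    vol = Volume()
    for offset in range(4):
        vol.write(offset)
    vol.snapshot("s1")
    print(vol.allocated())   # snapshot added no blocks
    vol.write(0)             # overwrite: old block stays because s1 references it
    print(vol.allocated())
    vol.destroy("s1")        # the old block is now unreferenced and is freed
    print(vol.allocated())
```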

jms123 · Jan 21, 2023

I may well be misunderstanding this but you seem to be saying snapshots are simply metadata and do not contain the actual contents of the blocks which makes me think I am not really understanding how the replication works. If I set the replication to run hourly are two things happening every hour -

1) an actual copy of the new data blocks that have been written since the last replication is sent to the backup server

and

2) a snapshot of the current metadata is also sent to the backup server which is just the view of the zvol on the primary server at the time the replication ran

whereas I assumed everything was contained in the snapshot itself.

All of that said I still cannot understand why with a zvol on the primary server of approx 700 GB size and 300 GB used (as reported by hyper-v) I cannot replicate to the same size zvol on the backup server with absolutely no data in at all.

Sorry to keep asking the same questions and this thread has been very helpful so far but I still feel I am not really understanding what is going on :)

garm · Jan 21, 2023

Why Do Snapshots Increase in Size?

I don't understand why snapshots increase in size. (I'm reading through the User Guide and I'm only as far as the introduction, but it is suggesting I read the ZFS Primer -- maybe I shouldn't worry about this right now and keep on plugging away?) Let's make this simple. Let's say my pool is...

www.truenas.com

jms123 · Jan 21, 2023

garm said:
Why Do Snapshots Increase in Size?

I don't understand why snapshots increase in size. (I'm reading through the User Guide and I'm only as far as the introduction, but it is suggesting I read the ZFS Primer -- maybe I shouldn't worry about this right now and keep on plugging away?) Let's make this simple. Let's say my pool is...

www.truenas.com

So when you say "you have a snapshot worth 5 MB", do you mean you have a snapshot that points to 5 MB worth of blocks, as opposed to the snapshot actually containing the contents of the blocks? As I understand it, the actual snapshots themselves are just metadata.

If so I can understand how keeping a lot of snapshots can eat up your disk as you can't free up unused blocks that are referenced within snapshots.

jgreco · Jan 24, 2023

jms123 said:
I may well be misunderstanding this but you seem to be saying snapshots are simply metadata and do not contain the actual contents of the blocks

Well, that's sorta true. The data blocks are just blocks that reside on the pool. When you "modify" a block, a new block is actually written, along with updated metadata, and the old data block and its supporting metadata are still there too. It's just that the old metadata shows the "old" data block as a member of the "old" snapshot, while the new metadata shows the "new" data block in the live view. There isn't some separate holding area for "snapshot" data.

From a certain point of view, it isn't really that much different from when you go into a directory. Why do your files from directory "a/" not show up in "b/"? Because of the metadata. But it's all stored on the shared disk space.

The trick here is that ZFS is doing the snapshotting with what is essentially fancy bookkeeping in the metadata, which takes excellent advantage of one of the sometimes annoying properties of ZFS - the CoW stuff. So when you take a snapshot, the delta between the snapshot and the live image starts at "no difference", but as you write, delete, or modify files, those structural changes get pushed out as new metadata blocks. The data blocks themselves are written out as ordinary blocks, but if you can imagine them as being "written in directory b/ instead of a/", then you might be able to picture the workings a bit better.

winnielinnie · Jan 24, 2023

If you're a visual person, such as myself (curse the rest of this analytical world!), then perhaps this might help. Remember that a "snapshot" is in fact a read-only filesystem at the exact moment in time that the snapshot was taken.

GREY SQUARES represent unused space

WHITE SQUARES represent ZFS records

* Even though all squares are the "same size", it's meant to simplify things for the sake of illustration. ZFS records can differ in size. What's not shown in these examples are other datasets, which also share the same storage space that the pool provides. This illustration doesn't show the "current live filesystem". Destroying all snapshots does not destroy the live filesystem, of course. So how much space will be used up in your dataset will depend on the records that still exist in the live filesystem minus any records that only existed in (destroyed) snapshots.

[Illustrations: three snapshots taken after creating, saving, modifying, and deleting data, followed by the records that remain after destroying snapshot1; snapshot2; snapshot3; snapshot1 and snapshot2; snapshot1 and snapshot3; snapshot2 and snapshot3.]
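As a rough stand-in for those destroy scenarios, the sketch below enumerates the same combinations over a made-up layout. The record names `A` to `E` and the contents of each snapshot are hypothetical and do not reproduce the original illustrations. Unlike the illustrations, it includes a live filesystem, assumed here to be unchanged since snapshot3.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Set

# Hypothetical records referenced by each snapshot.
SNAPSHOTS: Dict[str, Set[str]] = {
    "snapshot1": {"A", "B", "C"},
    "snapshot2": {"B", "C", "D"},
    "snapshot3": {"C", "D", "E"},
}
LIVE: Set[str] = {"C", "D", "E"}   # assumed identical to snapshot3


def freed_by_destroying(destroyed: Iterable[str]) -> List[str]:
    """Return the records released when the given snapshots are destroyed.

    A record is released only if no remaining snapshot and no live data
    still references it.
    """
    destroyed = set(destroyed)
    still_referenced = set(LIVE)
    for name, records in SNAPSHOTS.items():
        if name not in destroyed:
            still_referenced |= records
    candidates: Set[str] = set()
    for name in destroyed:
        candidates |= SNAPSHOTS[name]
    return sorted(candidates - still_referenced)


if __name__ == "__main__":
    names = sorted(SNAPSHOTS)
    for count in (1, 2):
        for combo in combinations(names, count):
            print(" and ".join(combo), "->", freed_by_destroying(combo))
```

In this layout, destroying snapshot2 on its own frees nothing, while destroying snapshot1 and snapshot2 together frees both `A` and `B`, because `B` was only shared between those two.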
