KVM VM live backup (tutorial)

Joined
Jan 8, 2017
Messages
27
Would someone please be so kind as to say whether live backup of KVM VMs is practically possible? If so, a tutorial would be fantastic.

For KVM, the libvirt middleware describes this process as follows: https://libvirt.org/kbase/live_full_disk_backup.html

TrueNAS SCALE Bluefin seems to include libvirt 7.0.0, so the "pull" method would apply. On plain Debian systems, this is fully reliable.

However, this implies using the qcow2 file format. Unfortunately, TrueNAS does not seem to be geared towards implementing the qcow2 file format (https://www.truenas.com/community/threads/advanced-vm-configuration-editing.90257/post-681611).

Quite obviously, one would want a consistent image file of each VM's disk, created without stopping the VM, that can be stored and copied to any file system. Just as one can bring in disk images from outside systems (https://www.truenas.com/community/t...-qcow2-format-truenas-scale.94609/post-739289), one should also be able to migrate them out.

Is this possible under current TrueNAS SCALE Bluefin and if so, how?
 

abbbi

Cadet
Joined
Mar 2, 2023
Messages
2
hi,

the backup function you describe (pull- and push-based models) depends on the qcow image format. From a middleware point of view,
libvirt creates a so-called checkpoint. This makes qemu create a bitmap, stored in the qcow image's metadata, that tracks the
changed blocks, which allows third-party vendors to create incremental/differential backups.
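
To make the changed-block idea concrete, here is a toy Python model of a dirty bitmap. It only illustrates the concept and is not QEMU's on-disk bitmap format; the class name `DirtyBitmap`, its methods and the 64 KiB granularity are assumptions chosen for the example.

```python
from typing import List, Set, Tuple


class DirtyBitmap:
    """Toy changed-block tracker (illustrative only, not QEMU's format)."""

    def __init__(self, disk_size: int, granularity: int = 64 * 1024) -> None:
        self.granularity = granularity
        # Number of tracked blocks, rounded up to cover the whole disk.
        self.nblocks = -(-disk_size // granularity)
        self.dirty: Set[int] = set()

    def record_write(self, offset: int, length: int) -> None:
        """Mark every block touched by a guest write as dirty."""
        if length <= 0:
            return
        first = offset // self.granularity
        last = (offset + length - 1) // self.granularity
        self.dirty.update(range(first, last + 1))

    def dirty_extents(self) -> List[Tuple[int, int]]:
        """Return (offset, length) byte ranges an incremental backup must copy."""
        g = self.granularity
        return [(block * g, g) for block in sorted(self.dirty)]

    def clear(self) -> None:
        """Reset tracking, as would happen when a new checkpoint starts."""
        self.dirty.clear()
```

An incremental backup then only needs to read the ranges returned by `dirty_extents()` instead of the whole disk, which is why the bitmap has to be stored somewhere persistent next to the image.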

If TrueNAS uses ZVOLs as underlying storage and attaches the volumes directly as raw devices, this won't work, because neither libvirt
nor qemu can currently store the metadata alongside the disk image; this has not (yet) been implemented.

So if you want to use that backup mode, the virtual machine must use qcow images for its disks. I'm authoring a backup utility
that makes the new backup features easier to use: https://github.com/abbbi/virtnbdbackup but obviously this won't
work with TrueNAS if no qcow-based disk images are used.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
I am backing up VMs using Synology Active Backup and an agent inside the VM. It's a bit old school and far from ideal, but it does work so far.
 
Joined
Jan 8, 2017
Messages
27
Thank you very much @abbbi!

On my regular Debian KVM hypervisor servers, I am using the libvirt tools on qcow2 images. I do agree that this will not work with TrueNAS SCALE Bluefin.

However, my aim is to hyperconverge KVM hypervisor servers and NAS. My understanding was that this would be among the features promised by TrueNAS. However, without live VM backup, hypervisors are either risky or not fully functional.

Hence, I am most interested in a live VM backup solution that definitely works on TrueNAS SCALE Bluefin. Does this exist? Ideally, it should be open source and produce a file that can quickly be restored to set a VM back in time.

Also thanks a lot to @NugentS

If I am not mistaken, Synology is a manufacturer of proprietary NAS systems. Does Synology Active Backup depend on using Synology hardware as a target?
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
If I am not mistaken, Synology is a manufacturer of proprietary NAS systems. Does Synology Active Backup depend on using Synology hardware as a target?
Yes it does - I just happen to have one or two dotted around

What you could do is snapshot the zvol and then replicate it elsewhere. That would back up the VM.
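
For anyone who prefers a script, a minimal Python sketch of snapshot-and-replicate with the standard `zfs snapshot`, `zfs send` and `zfs receive` commands might look like the following. The function names, the zvol path, the snapshot names and the SSH host are hypothetical placeholders, and the sketch assumes passwordless SSH to a host that also runs ZFS. Note that a snapshot taken this way while the VM is running is only crash consistent.

```python
import subprocess
from typing import List, Optional, Tuple


def snapshot_cmd(zvol: str, snap: str) -> List[str]:
    """Build the command that snapshots a zvol, e.g. tank/vms/vm1@backup1."""
    return ["zfs", "snapshot", f"{zvol}@{snap}"]


def replication_cmds(
    zvol: str, snap: str, host: str, target: str, prev: Optional[str] = None
) -> Tuple[List[str], List[str]]:
    """Build the sender and receiver halves of a `zfs send | ssh ... zfs receive` pipe.

    If `prev` names an earlier snapshot that exists on both sides,
    only the changes since that snapshot are sent (-i).
    """
    send = ["zfs", "send"]
    if prev is not None:
        send += ["-i", f"{zvol}@{prev}"]
    send.append(f"{zvol}@{snap}")
    recv = ["ssh", host, "zfs", "receive", target]
    return send, recv


def replicate(
    zvol: str, snap: str, host: str, target: str, prev: Optional[str] = None
) -> None:
    """Take the snapshot and stream it to the remote pool."""
    subprocess.run(snapshot_cmd(zvol, snap), check=True)
    send, recv = replication_cmds(zvol, snap, host, target, prev)
    sender = subprocess.Popen(send, stdout=subprocess.PIPE)
    receiver = subprocess.run(recv, stdin=sender.stdout)
    sender.stdout.close()
    if sender.wait() != 0 or receiver.returncode != 0:
        raise RuntimeError("replication failed")
```

A call such as `replicate("tank/vms/vm1", "backup2", "backuphost", "backup/vms/vm1", prev="backup1")` would send only the blocks changed since `backup1`; all of these names are made up for the example.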
 
Joined
Jan 8, 2017
Messages
27
Yes @NugentS - that does seem to be the way. The followup questions from my point of view are:

- Does iXsystems have such a feature on its radar? If so, when, and also for the community version? Would a standard qcow2 format file on a zvol not be the better solution, one that iXsystems should make available via the GUI?

- As this is obviously not in the GUI, can it be implemented as a shell script run as a cron job? Does someone have a model, or does everyone need to develop one independently?

- Does it make a difference that the ordinary libvirt method gets the machine into a consistent state, so that you can really just move a .qcow2 file saved from the backup into place and start it with no issues, for all OSs (Debian, Windows Server ...)? If I am not mistaken, the snapshot would be close to a point in time, but less like a systematically consistent state and more like a disk frozen at the moment the power plug was pulled.

I wonder whether nobody, or only very few users, actually use the much-announced hypervisor functionality. If there are many users, do they ignore the idea of VM backups? Do they then recreate their VMs using Ansible scripts, if needed? Does everyone have backup scripts as indicated by NugentS? If so, do they work well in practice? Without such features, this is not going to be competition for Proxmox and the like.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
@Michael Schefczyk Your first two questions are unclear - exactly what are you referring to?
ZVOL snapshot and replicate - that's in the GUI - it's standard ZFS and built into TrueNAS
qcow2 format - I can't comment - I use VMware mostly

I have to say that I don't like the backup-from-inside solution that I am using for VMs on TN, as restoring is difficult in the event of complete failure - which is why I also use the snapshot and replicate mechanism for a total restore, and the file-by-file backup for the ability to restore an individual file when I screw up
 
Joined
Jan 8, 2017
Messages
27
Replicating snapshots to a second pool of disks and to a secondary device is what I am doing all the time with TrueNAS. My question is: Does this work as well for VMs as the good old libvirt method?
 

abbbi

Cadet
Joined
Mar 2, 2023
Messages
2
Replicating snapshots to a second pool of disks and to a secondary device is what I am doing all the time with TrueNAS. My question is: Does this work as well for VMs as the good old libvirt method?

this depends. If you take a storage snapshot while the virtual machine is running, you end up with a crash-consistent backup. This means that filesystems can be inconsistent after restore. If you use storage snapshots to back up the virtual machine you must:

1) have a qemu guest agent installed in the virtual machine
2) freeze the filesystems of the virtual machine via the agent
3) take a snapshot of the underlying storage volume
4) thaw the filesystems via the agent
5) send the created storage snapshot to another device

depending on what services are running in the VM it might also be necessary to pause certain services like databases to have them
in a consistent state.

Steps 1, 2 and 4 need to be done with pull/push based backups too to ensure consistency.
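
Steps 2 to 4 above can be sketched in Python roughly as follows. This assumes the guest agent from step 1 is already running in the VM and that `virsh` can reach the hypervisor (on some systems a connection URI may have to be added). It uses the existing `virsh domfsfreeze` / `virsh domfsthaw` commands and `zfs snapshot`; the function name and its parameters are made up for the example.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def _run(cmd: Sequence[str]) -> None:
    """Default runner: execute the command and raise if it fails."""
    subprocess.run(list(cmd), check=True)


def consistent_snapshot(
    domain: str, zvol: str, snap: str, run: Runner = _run
) -> str:
    """Freeze guest filesystems, snapshot the zvol, then always thaw.

    `domain` is the libvirt domain name, `zvol` the ZFS volume backing
    its disk; both are placeholders for whatever the system uses.
    """
    run(["virsh", "domfsfreeze", domain])  # step 2: freeze via guest agent
    try:
        run(["zfs", "snapshot", f"{zvol}@{snap}"])  # step 3: storage snapshot
    finally:
        # step 4: thaw even if the snapshot failed, so the guest never
        # stays frozen because of a backup error
        run(["virsh", "domfsthaw", domain])
    return f"{zvol}@{snap}"
```

The `try`/`finally` is the important design choice: a guest left frozen blocks all writes inside the VM. Step 5 (sending the snapshot elsewhere) can happen afterwards, while the VM is already running normally again.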
 

Belperite

Dabbler
Joined
Feb 21, 2023
Messages
26
I would say yes, as long as the VM is shut down beforehand. Taking a ZFS snapshot of the volume while the VM is running would mean the snapshot would look like the "dirty" filesystem of a system that had its power yanked away. Consistent and atomic at the on-disk level, but not for the FS inside the VM. Some applications, e.g. databases, can take a very dim view of this. I don't imagine some OSs such as Windows like it either.

Libvirt does some magic to "quiesce" the VM before the backup is done and can also save RAM state, to allow the backed-up running VM to have a "clean" state.
 
Joined
Jan 8, 2017
Messages
27
Thank you @Belperite. I very much share this concern. "Live" backup as indicated in the thread title means that backup should be possible without shutting down and restarting the VM. Hence, I think that we are at a point where TrueNAS does provide hypervisor capabilities without one very critical competence (called an Achilles heel in another thread). It seems that iXsystems is either not aware of this or tends to ignore it.
 

Belperite

Dabbler
Joined
Feb 21, 2023
Messages
26
@Michael Schefczyk you'd have to script it I suppose. Run the appropriate libvirt commands to quiesce the VM (I believe you need the libvirt extensions installed in the VM), take the ZFS snapshot, then unquiesce. I'm not sure what you'd do about the RAM state as I've not looked into it that much, but at least the FS inside the VM should be clean.
 
Joined
Jan 8, 2017
Messages
27
Thank you @Belperite. That is probably not impossible, but far from a trivial script. In libvirt at the command line level, quiesce is integrated with "snapshot-create" or its variant "snapshot-create-as". That in turn assumes qcow2 files. There, the unquiesce step runs "under the hood", invisible to the user. Replacing the qcow2 snapshot with a ZFS snapshot would require using the API to interact directly with the guest agent running in the VM. Making that as reliable as required for backups is a substantial hurdle, I think. In fact, it would be an extension to the snapshot-create command to also accept ZFS - while dealing with the RAM issue so that everything is really "live". People like @abbbi might be able to initiate this, but for the general admin, it would be a bit much.
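
As a rough illustration of what talking to the guest agent directly could look like, here is a hedged Python sketch built on `virsh qemu-agent-command` and the guest agent's `guest-fsfreeze-freeze`, `guest-fsfreeze-thaw` and `guest-fsfreeze-status` commands. The helper names are invented for the example, it assumes the agent is installed and its channel is configured for the VM, and it does nothing about the RAM state.

```python
import json
import subprocess
from typing import Any, Callable, List, Optional

Runner = Callable[[List[str]], str]


def agent_command(domain: str, execute: str, run: Optional[Runner] = None) -> Any:
    """Send one guest agent command through virsh and return its "return" value."""
    payload = json.dumps({"execute": execute})
    cmd = ["virsh", "qemu-agent-command", domain, payload]
    if run is None:
        out = subprocess.run(
            cmd, check=True, capture_output=True, text=True
        ).stdout
    else:
        out = run(cmd)  # injectable for testing without a hypervisor
    return json.loads(out)["return"]


def freeze(domain: str, run: Optional[Runner] = None) -> int:
    """Freeze guest filesystems; the agent reports how many were frozen."""
    return agent_command(domain, "guest-fsfreeze-freeze", run)


def thaw(domain: str, run: Optional[Runner] = None) -> int:
    """Thaw guest filesystems; the agent reports how many were thawed."""
    return agent_command(domain, "guest-fsfreeze-thaw", run)


def is_frozen(domain: str, run: Optional[Runner] = None) -> bool:
    """Check the agent's freeze status ("frozen" or "thawed")."""
    return agent_command(domain, "guest-fsfreeze-status", run) == "frozen"
```

A backup script could call `is_frozen()` after `freeze()` and before taking the ZFS snapshot, and refuse to continue if the guest did not actually freeze. That covers only part of the reliability concern raised above.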

If one does want a scale-out multi-node cluster - as TrueNAS seems to "promise" - the storage hierarchy might be VM disk on Gluster on ZFS at least. Then the step to qcow2 VM disk on Gluster on ZFS would be a very small one, I think. But that step would make it a professional virtualization cluster. Is this view shared, and can it be conveyed to the developers?
 