You can simply use FreeNAS "replication", which is ZFS send and receive under the hood. To make this work, there are three main things to set up in FreeNAS:
- Plan everything out. Because you are coordinating two separate tasks, make sure the first has plenty of time to finish before the second starts. Do a few test runs of just the first task and add 50% more time to be safe. Until there is better integration for VMware backups, or some sort of event hooks for chaining tasks, we have to time things out manually.
- FreeNAS-triggered VMware-Snapshots - This sets up a schedule for FreeNAS to log into your ESXi host or vCenter and tell every VM on the selected datastore to take a VMware snapshot. Once the VMware snapshots are done, it takes a ZFS snapshot of your dataset/pool. Once the ZFS snapshot is done, it will delete the VMware snapshots. NOTE: Be sure to select the ZFS dataset/pool that your VMware datastore resides on.
- FreeNAS ZFS Replication - This part can take a bit more work depending on your desired settings, but the important part is that you check the box for "Recursively replicate child dataset’s snapshots". This is because the FreeNAS VMware snapshot task does not coordinate with the replication task.
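Since the timing has to be worked out by hand, here is a minimal Python sketch of the "test runs plus 50%" rule from the first bullet. The function name, the sample durations, and the start time are all made up for illustration; nothing here talks to FreeNAS, it only does the arithmetic.

```python
from datetime import datetime, timedelta


def replication_start(snapshot_start, test_run_minutes, margin=0.5):
    """Earliest time to schedule replication after the snapshot task.

    snapshot_start   -- when the VMware/ZFS snapshot task is scheduled
    test_run_minutes -- durations (in minutes) measured in your test runs
    margin           -- safety margin; 0.5 means "add 50% more time"
    """
    # Pad the slowest observed run, not the average, to stay on the safe side.
    longest = max(test_run_minutes)
    padded = longest * (1 + margin)
    return snapshot_start + timedelta(minutes=padded)


# Hypothetical numbers: snapshots start at 01:00, test runs took 12-20 minutes.
start = datetime(2024, 1, 1, 1, 0)
print(replication_start(start, [12, 15, 20]))  # 2024-01-01 01:30:00
```

Padding the slowest run rather than the average is just my own conservative choice; use whatever margin matches how much your snapshot times vary.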
The FreeNAS VMware snapshots are a great and flexible way to run backups of your VMs. There are a few obvious limitations though. The big one I see is that you must do a full dataset/datastore at a time. This kind of makes sense, since we are ultimately just backing up a snapshot of the underlying ZFS dataset. However, there may be cases where VMware snapshotting large VMs causes noticeable performance degradation, even with a full set of VMware Tools and drivers installed: if the snapshot captures memory state, a VM with 64GB of RAM wired will need to write a 64GB file. This is compounded by the number of VMs on your datastore.
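To put a rough number on how this compounds, here is a small Python sketch that adds up the worst-case memory data written per snapshot task. It assumes every VM on the datastore writes its full RAM during the VMware snapshot, which is only true when memory state is captured; the VM names and sizes are hypothetical.

```python
def memory_snapshot_gb(vm_ram_gb):
    """Worst-case GB written for memory state in one snapshot task.

    vm_ram_gb -- mapping of VM name to configured RAM in GB,
                 for every VM on the datastore being snapshotted
    """
    # Each VM writes up to its full RAM, so the totals simply add up.
    return sum(vm_ram_gb.values())


# Hypothetical datastore with one large VM and two smaller ones.
datastore = {"db01": 64, "app01": 16, "app02": 16}
print(memory_snapshot_gb(datastore))  # 96
```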
Best practice in VMware is to keep datastores relatively small, with fewer VMs on each. This has a number of benefits in addition to keeping the number of VMware snapshot events per FreeNAS VMware snapshot task to a minimum. For example, if you are using iSCSI, each datastore will be a separate iSCSI session/connection, allowing better use of multiple network interfaces and LAGGs. Another benefit comes in the form of more IO queues, which helps alleviate issues with microbursts causing IO to pause even under low throughput.
I know I got a little off the rails there at the end but I'm sure someone can use a few of those nuggets.
If any of this is incorrect, wrong, or there is a better way, I'm all ears!