Random nesting behaviour for encrypted replication tasks

rusty6285 (Cadet, joined Feb 21, 2023, 5 messages)
Hi all, this is an odd problem to articulate, but I wonder if anyone can help with it. I am on TrueNAS-SCALE-22.12.0.

To set the stage:
  • I have 3 pools across 3 different groups of physical drives, and each pool is encrypted using native TrueNAS encryption (passkeys)
  • I have 3 replication tasks intended to sequentially back up data from one pool to another. Each uses its own snapshots, and their execution times do not overlap; staggering them was a troubleshooting step I took for the issue at hand.
    • 1. Pool1's unique data is backed up to Pool2 in a dataset called 'Pool1 backups'
    • 2. Pool1's unique data is backed up to Pool3 in a dataset called 'Pool1 backups'
    • 3. Pool2's unique data is backed up to Pool3 in a dataset called 'Pool2 backups'
  • The unique data in Pool1 is in a dataset that is further encrypted with native TrueNAS encryption, but using a passphrase. The 'PoolX backups' datasets in pools 2 and 3 are also encrypted with passphrases to receive the data. This is to ensure that all sensitive data is encrypted with a passphrase, regardless of which pool it lives on. "Include Dataset Properties" is unchecked on the replication tasks to try to reduce potential issues with encryption.
    • With the above encryption in place, when I restart TrueNAS, I must unlock the 3 passphrase datasets to allow the replications to function as expected. This is my desired level of security - I don't want these datasets auto-unlocking on boot.

Now the problem I am experiencing... this will all function perfectly for a number of days at a time, but then I will wake up to a 'random reboot' alert. Of course, the datasets are re-locked after the reboot, as I would expect. But when reviewing the logs, I actually see multiple further reboots at exactly the times the replications subsequently tried to run against the locked datasets.

I also observe that when the 'Pool1 backups' replications fail in this manner, they create another 'Pool1 backups' dataset as a child of the locked 'Pool1 backups' dataset on the destination. These child datasets contain no data and give error messages when I click on them. The only way I can resume normal operation is to delete the newly created child datasets and unlock the passphrase datasets, and then everything works again for a few days.
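
For anyone trying to reproduce or check for this state, here is a minimal sketch of how the locked datasets and the stray nested children could be listed from the shell. It assumes the tab-separated output of `zfs get -H -r -t filesystem -o name,value keystatus <pool>`, and it assumes the nested child has exactly the same name as its parent, as described above. The function names and the `Pool2` argument are purely illustrative, and this has not been run on the system in question.

```python
import subprocess


def parse_keystatus(zfs_output):
    """Map dataset name -> keystatus from tab-separated `zfs get -H` output."""
    status = {}
    for line in zfs_output.splitlines():
        if not line.strip():
            continue
        # Split on the last tab only: dataset names such as
        # 'Pool1 backups' contain spaces.
        name, value = line.rsplit("\t", 1)
        status[name] = value.strip()
    return status


def locked_datasets(zfs_output):
    """Datasets whose encryption key is not loaded (keystatus 'unavailable')."""
    return [n for n, v in parse_keystatus(zfs_output).items() if v == "unavailable"]


def nested_duplicates(names):
    """Datasets whose last path component repeats their parent's name,
    e.g. 'Pool2/Pool1 backups/Pool1 backups'."""
    found = []
    for name in names:
        parts = name.split("/")
        if len(parts) >= 2 and parts[-1] == parts[-2]:
            found.append(name)
    return found


def query_keystatus(pool):
    """Run zfs for one pool (hypothetical helper; needs the zfs CLI and privileges)."""
    cmd = ["zfs", "get", "-H", "-r", "-t", "filesystem",
           "-o", "name,value", "keystatus", pool]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def report(pool):
    """Print locked datasets and suspicious nested children for one pool."""
    output = query_keystatus(pool)
    for name in locked_datasets(output):
        print("locked:", name)
    for name in nested_duplicates(parse_keystatus(output)):
        print("nested duplicate:", name)
```

Calling `report("Pool2")` in a root shell on the NAS would print one line per locked dataset and per nested duplicate. Running something like this before each scheduled replication would at least show whether the destination was locked at the time the task fired.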

My system configuration is in my signature below but please let me know if I can provide additional information to help diagnose this issue.

Thanks!
 