ZFS Replication to Backup Server - "Authentication Failed" After Update to Bluefin 22.12.4

pr0927

Dabbler
Joined
Jul 16, 2023
Messages
23
Hi all, posting this here since I may have posted it previously in a half-dead sub-forum. I'm deleting the thread there.

I have been doing ZFS send replications from my main TrueNAS SCALE server to a backup TrueNAS SCALE server, and after updating to Bluefin 22.12.4, I got this error.

image.png


Any idea why this might be the case? Appreciate any guidance here, relatively new to TrueNAS Scale. Thanks!
 
Joined
Oct 22, 2019
Messages
3,641
Very little to go by. You'll only have users taking potshots into the wind.

Maybe the update caused a mismatch between the (required/minimum) supported ciphers over SSH?
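If it is a cipher mismatch, a quick way to compare is to list what each end's OpenSSH build supports and diff the output between the two servers (a sketch, assuming a stock OpenSSH client on both):

```shell
# List the ciphers, key-exchange algorithms, and MACs this OpenSSH
# build supports; compare the output between the two servers to spot
# an algorithm one side no longer offers after the update.
ssh -Q cipher
ssh -Q kex
ssh -Q mac
```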
 

pr0927

Dabbler
Joined
Jul 16, 2023
Messages
23
Hmm, where might I find more useful info on the cause? When I click on the failed replication, it just shows a tiny window in which I expected more detail, but it only says "authentication failed," haha.
 
Joined
Oct 22, 2019
Messages
3,641

But this is just a shot in the dark. If you didn't configure anything in this section, it's not likely the reason.

You can also read the logs under /var/log/
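Failed SSH authentications usually land in the auth log. A sketch of what to grep for (using a fabricated sample file here, since the exact log contents vary; on the real system, point grep at /var/log/auth.log):

```shell
# Create a sample of the kind of lines sshd writes on an auth failure
# (fabricated content for illustration only).
cat > /tmp/auth_sample.log <<'EOF'
Oct  7 04:00:00 Chimaera sshd[1234]: Failed publickey for admin from 192.168.0.10 port 50514 ssh2
Oct  7 04:00:00 Chimaera sshd[1234]: Connection closed by authenticating user admin 192.168.0.10 port 50514 [preauth]
EOF

# Filter for sshd authentication failures around the replication window.
grep -E 'sshd\[[0-9]+\]: (Failed|Connection closed by authenticating)' /tmp/auth_sample.log
```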
 

pr0927

Dabbler
Joined
Jul 16, 2023
Messages
23
So I had followed the ZFS replication guide by TrueNAS themselves, and I don't remember setting up much in that section - but it's been a hot minute, and I definitely did some configuring of things without fully understanding what I was doing.

On my main server, the SSH service is not enabled; on the backup (receiving) server it is. That said, I experimentally tried turning on SSH on the main server, unsure whether it had been on earlier to make it work - nothing. Here is how both of them appear, configuration-wise:

1696719687409.png


The ZFS send is on an internal network and I prioritized speed, so I remember not setting encryption or anything of the sort for it. Not sure if something got reset on that front.

I checked my /var/log directory and under syslog, for the time period of the last error, I see the following:

Code:
Oct  7 04:00:00 Chimaera systemd[1]: Starting sysstat-collect.service - system activity accounting tool...
Oct  7 04:00:00 Chimaera systemd[1]: sysstat-collect.service: Deactivated successfully.
Oct  7 04:00:00 Chimaera systemd[1]: Finished sysstat-collect.service - system activity accounting tool.
Oct  7 04:00:01 Chimaera systemd-udevd[1385047]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct  7 04:00:01 Chimaera systemd-udevd[1385047]: Using default interface naming scheme 'v247'.
Oct  7 04:00:01 Chimaera kernel: IPv6: ADDRCONF(NETDEV_CHANGE): vetha82d4a66: link becomes ready
Oct  7 04:00:01 Chimaera kernel: kube-bridge: port 11(vetha82d4a66) entered blocking state
Oct  7 04:00:01 Chimaera kernel: kube-bridge: port 11(vetha82d4a66) entered disabled state
Oct  7 04:00:01 Chimaera kernel: device vetha82d4a66 entered promiscuous mode
Oct  7 04:00:01 Chimaera kernel: kube-bridge: port 11(vetha82d4a66) entered blocking state
Oct  7 04:00:01 Chimaera kernel: kube-bridge: port 11(vetha82d4a66) entered forwarding state
Oct  7 04:00:01 Chimaera systemd-udevd[1385047]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct  7 04:00:02 Chimaera systemd-udevd[1385047]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct  7 04:00:02 Chimaera kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth241c1b6e: link becomes ready
Oct  7 04:00:02 Chimaera kernel: kube-bridge: port 34(veth241c1b6e) entered blocking state
Oct  7 04:00:02 Chimaera kernel: kube-bridge: port 34(veth241c1b6e) entered disabled state
Oct  7 04:00:02 Chimaera kernel: device veth241c1b6e entered promiscuous mode
Oct  7 04:00:02 Chimaera kernel: kube-bridge: port 34(veth241c1b6e) entered blocking state
Oct  7 04:00:02 Chimaera kernel: kube-bridge: port 34(veth241c1b6e) entered forwarding state
Oct  7 04:00:02 Chimaera systemd-udevd[1385047]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct  7 04:00:03 Chimaera kernel: kube-bridge: port 11(vetha82d4a66) entered disabled state
Oct  7 04:00:03 Chimaera kernel: device vetha82d4a66 left promiscuous mode
Oct  7 04:00:03 Chimaera kernel: kube-bridge: port 11(vetha82d4a66) entered disabled state
Oct  7 04:00:04 Chimaera kernel: kube-bridge: port 34(veth241c1b6e) entered disabled state
Oct  7 04:00:04 Chimaera kernel: device veth241c1b6e left promiscuous mode
Oct  7 04:00:04 Chimaera kernel: kube-bridge: port 34(veth241c1b6e) entered disabled state
Oct  7 04:00:35 Chimaera smartd[5675]: Device: /dev/sdb [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 63 to 62
Oct  7 04:00:35 Chimaera smartd[5675]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 63 to 62
Oct  7 04:05:01 Chimaera systemd-udevd[1442716]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct  7 04:05:01 Chimaera systemd-udevd[1442716]: Using default interface naming scheme 'v247'.
Oct  7 04:05:01 Chimaera kernel: IPv6: ADDRCONF(NETDEV_CHANGE): vethbd28ca17: link becomes ready
Oct  7 04:05:01 Chimaera kernel: kube-bridge: port 11(vethbd28ca17) entered blocking state
Oct  7 04:05:01 Chimaera kernel: kube-bridge: port 11(vethbd28ca17) entered disabled state
Oct  7 04:05:01 Chimaera kernel: device vethbd28ca17 entered promiscuous mode
Oct  7 04:05:01 Chimaera kernel: kube-bridge: port 11(vethbd28ca17) entered blocking state
Oct  7 04:05:01 Chimaera kernel: kube-bridge: port 11(vethbd28ca17) entered forwarding state
Oct  7 04:05:01 Chimaera systemd-udevd[1442716]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.


No idea if there is anything of value in there. I also checked the error and auth logs, and didn't see anything that stood out at all.
 
Joined
Oct 22, 2019
Messages
3,641
Elsewhere to check would be the "SSH Connections" page on both ends.

I'm assuming you used the "semi-auto" setup to create an SSH Connection to be used for a TrueNAS-to-TrueNAS replication?

(In a pure command-line setting, a verbose flag would give you a better hint why the authentication failed.)
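Concretely (hypothetical user and host names - substitute the replication user and the backup server's address):

```shell
# Print the effective client config OpenSSH would use for this host
# (identity file, ciphers, port); this does not open a connection.
ssh -G admin@backup-server

# Then try the connection itself with maximum verbosity; the -vvv
# output shows each auth method offered and why each one is rejected.
# (Commented out here since it needs the live backup server.)
# ssh -vvv -i /path/to/replication_key admin@backup-server 'echo ok'
```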
 
Joined
Jul 27, 2023
Messages
6
Did you get this working? I have the same issue: replicating from my main TrueNAS to my backup TrueNAS has failed since updating the backup one (I thought I would update that one first, as it's less important, in case of issues).
 

pr0927

Dabbler
Joined
Jul 16, 2023
Messages
23
Sorry, got occupied this past weekend. Nothing yet.

Elsewhere to check would be the "SSH Connections" page on both ends.

I'm assuming you used the "semi-auto" setup to create an SSH Connection to be used for a TrueNAS-to-TrueNAS replication?

(In a pure command-line setting, a verbose flag would give you a better hint why the authentication failed.)
However, per this recommendation to investigate (thank you), I do see the following on my main server, but NOT on my backup server (all blanks for all three):

1696884435947.png
I don't know if it should also be appearing on my backup server, or if it ever did. It was a ZFS send replication (push).

Maybe this is the culprit?
 
Joined
Jul 27, 2023
Messages
6
Thanks, yes, got it working again - this seems like a pretty major bug. Is it common for updates in TrueNAS to just break stuff? I have not been using it long and this is the first update I have done...
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,553
Thanks, yes, got it working again - this seems like a pretty major bug. Is it common for updates in TrueNAS to just break stuff? I have not been using it long and this is the first update I have done...
What account were you using for the replication job? I just fixed an installer bug where it wasn't preserving the "admin" user's home directory on upgrade.
 

pr0927

Dabbler
Joined
Jul 16, 2023
Messages
23
You might have to re-configure the SSH pair between the two servers, based on this:

Then select this new (working?) SSH credential for the Replication Task.

This may have something to do with recent changes to the non-root admin accounts.
Oh, O.K., I'll try to do this today, thanks!

Thanks, yes, got it working again - this seems like a pretty major bug. Is it common for updates in TrueNAS to just break stuff? I have not been using it long and this is the first update I have done...
Anything important to know?

What account were you using for the replication job? I just fixed an installer bug where it wasn't preserving the "admin" user's home directory on upgrade.
Such as this, haha?
 

pr0927

Dabbler
Joined
Jul 16, 2023
Messages
23
Well...re-doing the SSH connection and selecting it in the replication task did not work. If I try to manually run it - same thing - nearly instant "authentication failed."

Not sure why this is happening, then. I'm using the admin account with sudo enabled for everything; I also checked the built-in admins group and configured it the same way.

I see an option in the replication task for logging. If I set that to something beyond "default," will I get more useful info perhaps?
 

pr0927

Dabbler
Joined
Jul 16, 2023
Messages
23
Here are the details of the replication task:

1697073963265.png


Here are the details of the SSH connection - still only has something on the main server (Chimaera), not on the backup server (Lusankya):

1697073989018.png


And here are the details of the admin account (on the main server):

1697074019819.png


Oh, and here are the network configurations for both servers:

1697074277319.png
1697074305131.png


Really not sure what I'm doing wrong here. Maybe something leaps out to someone smarter than me?
 


pr0927

Dabbler
Joined
Jul 16, 2023
Messages
23
I still have not resolved this. Do I just delete the replication tasks and start over?

Is something supposed to appear on the backup server side under SSH pairs or connections?

And do I need to be copying the public or private key from either server to the other?
 

pr0927

Dabbler
Joined
Jul 16, 2023
Messages
23
Thanks, yes, got it working again - this seems like a pretty major bug. Is it common for updates in TrueNAS to just break stuff? I have not been using it long and this is the first update I have done...
Did you do anything uniquely to resolve this? Any changes to the admin account?
 

pr0927

Dabbler
Joined
Jul 16, 2023
Messages
23
I still haven't got this resolved. I don't know what else to do except delete the entire replication and try to set it up from scratch.
 

pr0927

Dabbler
Joined
Jul 16, 2023
Messages
23
Well, I updated both servers and was going to take one more stab at deleting the SSH keypair/connection, and...this happens:

1698130816750.png
 