ZFS Replication to Proxmox server

Inxsible

Guru
Joined
Aug 14, 2017
Messages
1,123
I am trying to set up ZFS replication between my TrueNAS box and my new & shiny RAIDZ2 pool that I created on my Proxmox server. I tried setting up the SSH connection and then setting up a Replication task, but it has failed each time.

First thing I tried was
  1. I created a Replication task.
  2. I selected the Manual option, since my destination is not a TrueNAS device.
  3. I provided the Proxmox IP address and generated a new key in the dialog.
  4. When I expand the Destination folders, it gives this error:
Code:
Error: Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zettarepl.py", line 654, in _handle_ssh_exceptions
    yield
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zettarepl.py", line 409, in list_datasets
    datasets = await self.middleware.run_in_thread(list_datasets, shell)
  File "/usr/local/lib/python3.9/site-packages/middlewared/utils/run_in_thread.py", line 10, in run_in_thread
    return await self.loop.run_in_executor(self.run_in_thread_executor, functools.partial(method, *args, **kwargs))
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/dataset/list.py", line 13, in list_datasets
    return [dataset["name"] for dataset in list_datasets_with_properties(shell, dataset, recursive)]
  File "/usr/local/lib/python3.9/site-packages/zettarepl/dataset/list.py", line 30, in list_datasets_with_properties
    output = shell.exec(args)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/interface.py", line 89, in exec
    return self.exec_async(args, encoding, stdout).wait(timeout)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/interface.py", line 93, in exec_async
    async_exec.run()
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/base_ssh.py", line 27, in run
    client = self.shell.get_client()
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/base_ssh.py", line 123, in get_client
    client.connect(
  File "/usr/local/lib/python3.9/site-packages/paramiko/client.py", line 435, in connect
    self._auth(
  File "/usr/local/lib/python3.9/site-packages/paramiko/client.py", line 764, in _auth
    raise saved_exception
  File "/usr/local/lib/python3.9/site-packages/paramiko/client.py", line 664, in _auth
    self._transport.auth_publickey(username, pkey)
  File "/usr/local/lib/python3.9/site-packages/paramiko/transport.py", line 1580, in auth_publickey
    return self.auth_handler.wait_for_response(my_event)
  File "/usr/local/lib/python3.9/site-packages/paramiko/auth_handler.py", line 250, in wait_for_response
    raise e
paramiko.ssh_exception.AuthenticationException: Authentication failed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 138, in call_method
    result = await self.middleware._call(message['method'], serviceobj, methodobj, params, app=self,
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1213, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 975, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/replication.py", line 642, in list_datasets
    return await self.middleware.call("zettarepl.list_datasets", transport, ssh_credentials)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1256, in call
    return await self._call(
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1213, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zettarepl.py", line 409, in list_datasets
    datasets = await self.middleware.run_in_thread(list_datasets, shell)
  File "/usr/local/lib/python3.9/contextlib.py", line 199, in __aexit__
    await self.gen.athrow(typ, value, traceback)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zettarepl.py", line 657, in _handle_ssh_exceptions
    raise CallError(repr(e).replace("[Errno None] ", ""), errno=errno.EACCES)
middlewared.service_exception.CallError: [EACCES] AuthenticationException('Authentication failed.')



The second way I tried was
  1. I created an SSH keypair under System-->SSH Keypairs on TrueNAS.
  2. I copied the public key and pasted it into the Proxmox root user's authorized_keys file.
  3. I created a new SSH Connection in TrueNAS using System-->SSH Connections.
  4. I tried to create a Replication task using the previously created SSH Connection.
However, when I try to expand the Destination folder, I get the following error.
Code:

Error: Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zettarepl.py", line 654, in _handle_ssh_exceptions
    yield
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zettarepl.py", line 409, in list_datasets
    datasets = await self.middleware.run_in_thread(list_datasets, shell)
  File "/usr/local/lib/python3.9/site-packages/middlewared/utils/run_in_thread.py", line 10, in run_in_thread
    return await self.loop.run_in_executor(self.run_in_thread_executor, functools.partial(method, *args, **kwargs))
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/dataset/list.py", line 13, in list_datasets
    return [dataset["name"] for dataset in list_datasets_with_properties(shell, dataset, recursive)]
  File "/usr/local/lib/python3.9/site-packages/zettarepl/dataset/list.py", line 30, in list_datasets_with_properties
    output = shell.exec(args)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/interface.py", line 89, in exec
    return self.exec_async(args, encoding, stdout).wait(timeout)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/interface.py", line 93, in exec_async
    async_exec.run()
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/base_ssh.py", line 27, in run
    client = self.shell.get_client()
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/base_ssh.py", line 123, in get_client
    client.connect(
  File "/usr/local/lib/python3.9/site-packages/paramiko/client.py", line 435, in connect
    self._auth(
  File "/usr/local/lib/python3.9/site-packages/paramiko/client.py", line 764, in _auth
    raise saved_exception
  File "/usr/local/lib/python3.9/site-packages/paramiko/client.py", line 664, in _auth
    self._transport.auth_publickey(username, pkey)
  File "/usr/local/lib/python3.9/site-packages/paramiko/transport.py", line 1580, in auth_publickey
    return self.auth_handler.wait_for_response(my_event)
  File "/usr/local/lib/python3.9/site-packages/paramiko/auth_handler.py", line 250, in wait_for_response
    raise e
paramiko.ssh_exception.AuthenticationException: Authentication failed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 138, in call_method
    result = await self.middleware._call(message['method'], serviceobj, methodobj, params, app=self,
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1213, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 975, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/replication.py", line 642, in list_datasets
    return await self.middleware.call("zettarepl.list_datasets", transport, ssh_credentials)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1256, in call
    return await self._call(
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1213, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zettarepl.py", line 409, in list_datasets
    datasets = await self.middleware.run_in_thread(list_datasets, shell)
  File "/usr/local/lib/python3.9/contextlib.py", line 199, in __aexit__
    await self.gen.athrow(typ, value, traceback)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zettarepl.py", line 657, in _handle_ssh_exceptions
    raise CallError(repr(e).replace("[Errno None] ", ""), errno=errno.EACCES)
middlewared.service_exception.CallError: [EACCES] AuthenticationException('Authentication failed.')


What am I doing wrong?
 

Inxsible

Any pointers here please.....
 

bigjay517

Dabbler
Joined
Jan 14, 2015
Messages
14
Do you have a way to validate that you copied the SSH key correctly? The line [EACCES] AuthenticationException('Authentication failed.') indicates that there was an SSH authentication error.
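One way to check for a copy-paste problem is to compare key fingerprints rather than eyeballing the base64 text. The sketch below is only an illustration, assuming plain `keytype base64 comment` lines with no leading options; `fingerprint` and `is_authorized` are hypothetical helper names, not part of TrueNAS or Proxmox. The fingerprint is computed the way `ssh-keygen -lf` reports it (SHA256 of the decoded key blob, base64 without padding).

```python
import base64
import hashlib


def fingerprint(public_key_line):
    """Return the SHA256 fingerprint of one OpenSSH public key line."""
    # Assumes the layout "keytype base64blob [comment]" with no options prefix.
    key_type, blob_b64 = public_key_line.split()[:2]
    # validate=True rejects stray characters; a damaged paste fails here.
    blob = base64.b64decode(blob_b64, validate=True)
    digest = hashlib.sha256(blob).digest()
    return "SHA256:" + base64.b64encode(digest).decode().rstrip("=")


def is_authorized(public_key_line, authorized_keys_text):
    """Check whether the key appears intact in authorized_keys content."""
    wanted = fingerprint(public_key_line)
    for line in authorized_keys_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        try:
            if fingerprint(line) == wanted:
                return True
        except ValueError:
            # Covers malformed base64 and lines with too few fields.
            continue
    return False
```

Feeding it the public key from the TrueNAS UI and the text of /root/.ssh/authorized_keys from the Proxmox side should return True if the paste survived intact. A key that was wrapped across two lines or truncated will not match.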
 

Inxsible

How would I validate it?

I generated an SSH key on the Proxmox server and copied it over to the TrueNAS root user -- the SSH connection from Proxmox to TrueNAS works via CLI.

Then I also generated an SSH key using the command line on TrueNAS, copied that over to Proxmox, and can again connect from TrueNAS to Proxmox using the CLI. However, that key is not visible under SSH Keypairs.

When I create the SSH key from System --> SSH Keypairs, the key is not available in /root/.ssh, so I am not sure how to validate the connection using a key generated in the UI. I have copied the public key and pasted it into authorized_keys for the root user on Proxmox.

When creating an SSH Connection in the UI, I am using the Manual option because my remote system is not TrueNAS. If I put
  1. just the IP address of Proxmox in the Host URL, I get the error "xxx.yyy.zzz.ppp is not found in known_hosts".
  2. just the hostname of Proxmox in the Host URL, I get the error "proxmox is not found in known_hosts".
  3. https://proxmox.domain.com in the Host URL, I get the error "Name does not resolve".
  4. https://<IP of Proxmox> in the Host URL, I get the error "Name does not resolve".

In each case I use the key created in the UI (System-->SSH Keypairs) when setting up the Replication Task.
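The "Name does not resolve" traceback in this post ends in `socket.getaddrinfo`, which suggests the Host value is looked up as a literal host name, so a leading `https://` would become part of the name. A minimal sketch of stripping a value down to the bare host, assuming that reading is right; `bare_host` is a hypothetical helper, not something TrueNAS provides:

```python
from urllib.parse import urlsplit


def bare_host(value):
    """Strip any URL scheme, port and path, leaving only the host part."""
    value = value.strip()
    # urlsplit only fills in .hostname when a "//" netloc marker is present.
    parts = urlsplit(value if "://" in value else "//" + value)
    if not parts.hostname:
        raise ValueError(f"no host found in {value!r}")
    return parts.hostname


print(bare_host("https://proxmox.domain.com"))
print(bare_host("192.168.1.5"))
```

With the scheme removed, options 3 and 4 reduce to options 1 and 2, which leaves only the known_hosts error to deal with.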

Not found in known_hosts error:
Code:
Error: Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zettarepl.py", line 654, in _handle_ssh_exceptions
    yield
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zettarepl.py", line 409, in list_datasets
    datasets = await self.middleware.run_in_thread(list_datasets, shell)
  File "/usr/local/lib/python3.9/site-packages/middlewared/utils/run_in_thread.py", line 10, in run_in_thread
    return await self.loop.run_in_executor(self.run_in_thread_executor, functools.partial(method, *args, **kwargs))
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/dataset/list.py", line 13, in list_datasets
    return [dataset["name"] for dataset in list_datasets_with_properties(shell, dataset, recursive)]
  File "/usr/local/lib/python3.9/site-packages/zettarepl/dataset/list.py", line 30, in list_datasets_with_properties
    output = shell.exec(args)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/interface.py", line 89, in exec
    return self.exec_async(args, encoding, stdout).wait(timeout)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/interface.py", line 93, in exec_async
    async_exec.run()
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/base_ssh.py", line 27, in run
    client = self.shell.get_client()
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/base_ssh.py", line 123, in get_client
    client.connect(
  File "/usr/local/lib/python3.9/site-packages/paramiko/client.py", line 415, in connect
    self._policy.missing_host_key(
  File "/usr/local/lib/python3.9/site-packages/paramiko/client.py", line 823, in missing_host_key
    raise SSHException(
paramiko.ssh_exception.SSHException: Server '192.168.1.5' not found in known_hosts

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 138, in call_method
    result = await self.middleware._call(message['method'], serviceobj, methodobj, params, app=self,
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1213, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 975, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/replication.py", line 642, in list_datasets
    return await self.middleware.call("zettarepl.list_datasets", transport, ssh_credentials)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1256, in call
    return await self._call(
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1213, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zettarepl.py", line 409, in list_datasets
    datasets = await self.middleware.run_in_thread(list_datasets, shell)
  File "/usr/local/lib/python3.9/contextlib.py", line 199, in __aexit__
    await self.gen.athrow(typ, value, traceback)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zettarepl.py", line 657, in _handle_ssh_exceptions
    raise CallError(repr(e).replace("[Errno None] ", ""), errno=errno.EACCES)
middlewared.service_exception.CallError: [EACCES] SSHException("Server '192.168.1.5' not found in known_hosts")


Name does not resolve error:
Code:
Error: Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zettarepl.py", line 654, in _handle_ssh_exceptions
    yield
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zettarepl.py", line 409, in list_datasets
    datasets = await self.middleware.run_in_thread(list_datasets, shell)
  File "/usr/local/lib/python3.9/site-packages/middlewared/utils/run_in_thread.py", line 10, in run_in_thread
    return await self.loop.run_in_executor(self.run_in_thread_executor, functools.partial(method, *args, **kwargs))
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/dataset/list.py", line 13, in list_datasets
    return [dataset["name"] for dataset in list_datasets_with_properties(shell, dataset, recursive)]
  File "/usr/local/lib/python3.9/site-packages/zettarepl/dataset/list.py", line 30, in list_datasets_with_properties
    output = shell.exec(args)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/interface.py", line 89, in exec
    return self.exec_async(args, encoding, stdout).wait(timeout)
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/interface.py", line 93, in exec_async
    async_exec.run()
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/base_ssh.py", line 27, in run
    client = self.shell.get_client()
  File "/usr/local/lib/python3.9/site-packages/zettarepl/transport/base_ssh.py", line 123, in get_client
    client.connect(
  File "/usr/local/lib/python3.9/site-packages/paramiko/client.py", line 340, in connect
    to_try = list(self._families_and_addresses(hostname, port))
  File "/usr/local/lib/python3.9/site-packages/paramiko/client.py", line 203, in _families_and_addresses
    addrinfos = socket.getaddrinfo(
  File "/usr/local/lib/python3.9/socket.py", line 954, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] Name does not resolve

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 138, in call_method
    result = await self.middleware._call(message['method'], serviceobj, methodobj, params, app=self,
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1213, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 975, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/replication.py", line 642, in list_datasets
    return await self.middleware.call("zettarepl.list_datasets", transport, ssh_credentials)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1256, in call
    return await self._call(
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1213, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zettarepl.py", line 409, in list_datasets
    datasets = await self.middleware.run_in_thread(list_datasets, shell)
  File "/usr/local/lib/python3.9/contextlib.py", line 199, in __aexit__
    await self.gen.athrow(typ, value, traceback)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zettarepl.py", line 657, in _handle_ssh_exceptions
    raise CallError(repr(e).replace("[Errno None] ", ""), errno=errno.EACCES)
middlewared.service_exception.CallError: [EACCES] gaierror(8, 'Name does not resolve')



I didn't think setting up replication would be this difficult.
 

awasb

Patron
Joined
Jan 11, 2021
Messages
415
It is a nightmare, even between TrueNAS (Core) boxes (IMHO). I switched my offsite replication targets to omnios/nappit and have used zrep ever since I tried to implement offsite pull replication. Nothing beats the command line with stdout. I was so fed up with that plethora of Python exits I had to dig into, just to find out that there is a (comparatively) trivial problem … again and again. Most of the time it was trial and error, because the docs are insufficient too (again: IMHO) once you skip "half automation". A section on typical use cases with some simple howtos (including some caveats) would have been quite nice. Maybe I was too stupid to find them back then. But it seems to me you'll still have to dig through the forums, keeping your fingers crossed and hoping that the methods/howtos/hints for your use case dating back to 2009 (random number!) are still valid.

But as I said: I gave up on this and chose the KISS way. In the time it would probably take to evaluate and write the docs/hints, you could set up several replication host systems with a less integrated, far better documented (speaking of quality, not word count) and therefore more transparent approach.
 

bigjay517

I followed the same steps you described and was able to connect my TrueNAS to my Proxmox server on the first try. I used the same option #1 you tried. I didn't encounter any of the errors you did at any of those steps; it just worked.

I have a button under SSH Connection labeled "Discover Remote Host Key". I pressed it after entering my host IP, username, and port, and TrueNAS populated the Remote Host Key field immediately, without any "not found in known_hosts" errors.

I wonder what is different about our setups? I did this on TrueNAS-12.0-U7.
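For context, the "not found in known_hosts" traceback earlier in the thread is raised by paramiko's missing-host-key policy, meaning the connection had no stored host key for that server. The sketch below parses host key lines in the plain `host keytype base64` layout used by known_hosts files and checks which hosts have an entry. It is an assumption that the Remote Host Key field holds lines in this layout, and hashed entries are not handled; `parse_host_keys` is a hypothetical name.

```python
def parse_host_keys(text):
    """Map each host name to the set of key types listed for it."""
    known = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # not a "host keytype base64" entry
        hosts, key_type = fields[0], fields[1]
        # One entry may list several comma-separated host names.
        for host in hosts.split(","):
            known.setdefault(host, set()).add(key_type)
    return known


# Placeholder key material, for illustration only.
sample = """
192.168.1.5 ssh-rsa AAAAplaceholder
192.168.1.5 ecdsa-sha2-nistp256 AAAAplaceholder
192.168.1.5 ssh-ed25519 AAAAplaceholder
"""
known = parse_host_keys(sample)
print(sorted(known.get("192.168.1.5", [])))
print("proxmox" in known)
```

A server commonly offers one host key per algorithm, so several discovered keys for a single host are not unusual. An entry stored under the IP address does not cover the host name, and vice versa.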
 

Inxsible

Ok... I might have to clean up everything that I have done up until now and try again. Which option did you try? Are the following steps all you took?

  1. You generated the key under System-->SSH Keypairs but also used the Discover Remote Host Key button.
  2. Created an SSH connection using the key that you just created.
  3. Used that connection when trying to create the Replication task.

On the Proxmox side, did you not have to copy the public key to the root user's authorized_keys file?
What's the current status of your zpool? My Proxmox zpool is brand new. I have created tank, but it doesn't have anything else in it. The replication would have copied all the data over. After the initial replication, my aim was to make it incremental.
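For reference, the full-then-incremental pattern can be sketched with plain `zfs send` and `zfs receive`. This only illustrates the underlying ZFS mechanism; the thread does not show how the TrueNAS replication task invokes it. The source dataset `pool/documents` and the snapshot names `auto-1` and `auto-2` are made up for the example, and nothing here is executed.

```python
import shlex


def send_command(dataset, snapshot, previous=None):
    """Build the argument list for a full or incremental `zfs send`."""
    args = ["zfs", "send"]
    if previous is not None:
        # -i sends only the changes between the two snapshots.
        args += ["-i", f"{dataset}@{previous}"]
    args.append(f"{dataset}@{snapshot}")
    return args


def pipeline(dataset, snapshot, target, host, previous=None):
    """Render the send | ssh receive pipeline as a shell string."""
    send = shlex.join(send_command(dataset, snapshot, previous))
    recv = shlex.join(["ssh", f"root@{host}", "zfs", "receive", target])
    return f"{send} | {recv}"


# Initial full copy, then an incremental update from auto-1 to auto-2.
print(pipeline("pool/documents", "auto-1", "tank/documents", "192.168.1.5"))
print(pipeline("pool/documents", "auto-2", "tank/documents", "192.168.1.5",
               previous="auto-1"))
```

The incremental send only works if the earlier snapshot still exists on both sides.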
 

bigjay517

Correct. I copied the SSH public key to the Proxmox server root user's ~/.ssh/authorized_keys file. I have copied the steps you described and added the two things I did in addition.

  1. You generated the key under System-->SSH Keypairs but also used the Discover Remote Host Key button.
    Copied the public key from TrueNAS to Proxmox.
  2. Created an SSH connection using the key that you just created.
    Used the 'Discover Remote Host Key' button to get the Remote Host Key.
  3. Used that connection when trying to create the Replication task.

Both of my systems, the TrueNAS box and the Proxmox server, already have datasets in their pools. When I expand the folder structure of the remote Proxmox server from TrueNAS, I see the pool and its sub-datasets.
 

Inxsible

I followed the exact same steps but I still get an EACCES - Authentication failed exception.

  1. I created a key in System --> SSH Keypairs using the Generate Keypair option. This generated an ssh-rsa key.
  2. I copied the public key and pasted it into /root/.ssh/authorized_keys on the Proxmox server.
  3. I created an SSH Connection, used the Manual option, and put the IP address of Proxmox under Host. I selected the private key that I generated in step 1, then clicked Discover Remote Host Key. This pulled in 3 different keys, so I tried the connection with all of them listed and also with each one individually, but every option gave me an Authentication failed exception.
  4. I used the connection when creating the Replication Task.
  5. The result is "EACCES - Authentication Failed."
What am I doing wrong that it keeps failing each time for me? What are you using as the Host URL in the SSH Connection? Just the IP of Proxmox, or something else?
 

bigjay517

I am using the IP directly in the Host URL of the SSH connection. I was able to keep the 3 keys discovered by the 'Discover Remote Host Key' operation and still connect properly.

Could it be some problem with copy-pasting the public key? Maybe try using the download public key option to copy the key rather than copying it from the Web GUI. You could even download the private key and use it to log in to the Proxmox server from TrueNAS via CLI to ensure that the public key was copied correctly.
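The CLI check could look roughly like the sketch below, which builds an `ssh` command that uses only the downloaded private key and never falls back to a password prompt. The key path is a made-up example and `key_login_works` is a hypothetical helper. The options are standard OpenSSH client options: `-i` selects the key file, `IdentitiesOnly=yes` ignores other keys, and `BatchMode=yes` disables interactive prompts.

```python
import subprocess


def ssh_test_command(host, key_path, user="root"):
    """Build an ssh command that authenticates with one specific key."""
    return [
        "ssh",
        "-i", key_path,              # the private key downloaded from the UI
        "-o", "IdentitiesOnly=yes",  # do not try agent or default keys
        "-o", "BatchMode=yes",       # fail instead of asking for a password
        f"{user}@{host}",
        "true",                      # harmless remote command
    ]


def key_login_works(host, key_path, user="root"):
    """Return True if key-based login succeeds, False otherwise."""
    result = subprocess.run(
        ssh_test_command(host, key_path, user),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


# Example path only; point this at wherever the key was saved.
print(" ".join(ssh_test_command("192.168.1.5", "/root/replication_key")))
```

Note that ssh refuses private key files that other users can read, so the downloaded file needs mode 600 before the test means anything.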
 

Inxsible

I was pasting the public key over an SSH connection to the Proxmox box and then restarting sshd. It turned out that I needed to kill sshd completely and then use the Proxmox UI to restart it. It then worked. I should have thought of that a bit earlier!

I ran a task to copy my "documents" dataset, but that changed the zpool size to the size of the documents dataset, so I guess I made some mistake. I just destroyed my pool on Proxmox and am starting all over again. This time, instead of only creating the pool on Proxmox, I might create the pool and the datasets as well, and then create 4 separate replication tasks for the 4 datasets.
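A rough sketch of that plan follows, pairing each source dataset with a same-named child dataset on the target and refusing a bare pool name as target. Whether targeting the pool root is what caused the size change is only a guess; the thread does not confirm it. The source pool name `pool` and the dataset names other than `documents` are invented for the example.

```python
def plan_tasks(source_pool, target_pool, datasets):
    """Pair each source dataset with a same-named child of the target pool."""
    return [
        (f"{source_pool}/{name}", f"{target_pool}/{name}")
        for name in datasets
    ]


def check_target(target):
    """Reject a target that is the pool root rather than a child dataset."""
    if "/" not in target.strip("/"):
        raise ValueError(f"{target!r} is a pool root, not a child dataset")
    return target


# Hypothetical names: only "documents" and "tank" come from the thread.
for source, target in plan_tasks("pool", "tank",
                                 ["documents", "media", "photos", "backups"]):
    print(source, "->", check_target(target))
```

Each pair then corresponds to one replication task with its own destination dataset.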
 