cmh (Explorer; joined Jan 7, 2013; 75 messages)
I've got several replications running from my FreeNAS system (currently 9.10.2) to a CentOS system running current ZFS on Linux. I had a drive failure on the Linux system, so I'm redoing all the transfers. It's taking some time, as there are several TB of data, but it was working well until this afternoon.
Looking in the "Replication Tasks" tab under Storage, I see the largest replication job listed as "Running", but when I go to the command line and look for any zfs send processes to check the progress, I see nothing. On the target system, I see no inbound ssh connections and no related zfs receive processes. I had checked earlier in the day and saw that the zfs send process showed 87%, so I initially thought it had completed, but not all the snapshots have been sent.
Source snapshots:
Code:
[root@nas /var/log]# zfs list -t snapshot -r sto/dcp
NAME                            USED  AVAIL  REFER  MOUNTPOINT
sto/dcp@auto-20161213.2107-2w  2.87M      -   873G  -
sto/dcp@auto-20161214.0907-2w  2.87M      -   873G  -
sto/dcp@auto-20161214.2107-2w  2.87M      -   873G  -
sto/dcp@auto-20161215.0907-2w      0      -   873G  -
sto/dcp@auto-20161215.2107-2w      0      -   873G  -
sto/dcp@auto-20161216.0907-2w   120K      -   873G  -
sto/dcp@auto-20161216.2107-2w   184K      -   887G  -
sto/dcp@auto-20161217.0907-2w    96K      -   887G  -
sto/dcp@auto-20161217.2107-2w    96K      -   887G  -
sto/dcp@auto-20161218.0907-2w      0      -   887G  -
sto/dcp@auto-20161218.2107-2w      0      -   887G  -
sto/dcp@auto-20161219.0907-2w      0      -   887G  -
sto/dcp@auto-20161219.2107-2w      0      -   887G  -
sto/dcp@auto-20161220.0907-2w  3.53M      -   887G  -
sto/dcp@auto-20161221.0012-2w    96K      -   887G  -
sto/dcp@auto-20161221.1212-2w    96K      -   887G  -
sto/dcp@auto-20161222.0012-2w    96K      -   887G  -
sto/dcp@auto-20161222.1212-2w    96K      -   887G  -
sto/dcp@auto-20161223.0119-2w    96K      -   887G  -
sto/dcp@auto-20161223.1319-2w    88K      -   887G  -
sto/dcp@auto-20161224.0119-2w      0      -   887G  -
sto/dcp@auto-20161224.1319-2w      0      -   887G  -
sto/dcp@auto-20161225.0119-2w  3.96M      -   887G  -
sto/dcp@auto-20161225.1319-2w  2.80M      -   887G  -
sto/dcp@auto-20161226.0119-2w   255M      -   892G  -
sto/dcp@auto-20161226.1319-2w   184K      -   900G  -
sto/dcp@auto-20161227.0119-2w  3.15M      -   901G  -
sto/dcp@auto-20161227.1319-2w  4.08M      -   901G  -
sto/dcp@auto-20161228.0119-2w    96K      -   901G  -
sto/dcp@auto-20161228.1319-2w    96K      -   901G  -
sto/dcp@auto-20161229.0119-2w      0      -   901G  -
sto/dcp@auto-20161229.1319-2w      0      -   901G  -
Only one target snapshot seems to have made it over:
Code:
8-ewok:~> sudo zfs list -t snapshot -r bak/nas/dcp
NAME                               USED  AVAIL  REFER  MOUNTPOINT
bak/nas/dcp@auto-20161213.2107-2w     0      -   873G  -
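To see exactly which snapshots still need to be sent, the two listings can be compared by snapshot name. The following is only a minimal sketch, not something FreeNAS provides: `missing_snapshots` is a hypothetical helper, and it assumes each listing was saved to a file with `zfs list -H -o name -t snapshot -r <dataset>` (one full snapshot name per line) and that the dataset has no children, as is the case here.

```shell
#!/bin/sh
# Hypothetical helper: print the snapshot names (the part after '@') that are
# present in the source listing but absent from the target listing.
# $1: file with source snapshot names, $2: file with target snapshot names.
missing_snapshots() {
    src_names=$(mktemp) || return 1
    dst_names=$(mktemp) || { rm -f "$src_names"; return 1; }
    # Drop everything up to and including '@' so that sto/dcp@X and
    # bak/nas/dcp@X compare as equal.
    sed 's/^[^@]*@//' "$1" | sort -u > "$src_names"
    sed 's/^[^@]*@//' "$2" | sort -u > "$dst_names"
    # comm -23 keeps only the lines unique to the first (source) file.
    comm -23 "$src_names" "$dst_names"
    rm -f "$src_names" "$dst_names"
}
```

With the listings above, this would print every snapshot name except auto-20161213.2107-2w, since that is the only one present on the target.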
I opened the replication task in the FreeNAS UI and closed it, no change. Opened it again and unchecked "enabled", saved it, still no change. It shows "Running" under status, but it's not. No autorepl processes are active.
Not much regarding replication in the syslog, just this:
Code:
Dec 29 01:19:05 nas autosnap.py: [tools.autosnap:615] Autorepl running, skip destroying snapshots
Dec 29 01:19:06 nas autorepl.py: [tools.autorepl:184] Checking if process 18333 is still alive
Dec 29 01:19:06 nas autorepl.py: [tools.autorepl:188] Process 18333 still working, quitting
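These log lines suggest autorepl quit because it believed PID 18333 was still alive. That can be checked independently with a sketch like the one below. `check_pid` is a hypothetical helper; it only tests whether a process with that PID currently exists and does not reproduce whatever check autorepl.py performs internally.

```shell
#!/bin/sh
# Hypothetical helper: report whether a process with the given PID exists.
check_pid() {
    # kill -0 delivers no signal; it only tests whether the PID can be signalled.
    # Run this as root: for a process owned by another user, a non-root caller
    # gets a permission error even though the process exists.
    if kill -0 "$1" 2>/dev/null; then
        echo "alive"
    else
        echo "not running"
        return 1
    fi
}
```

Calling `check_pid 18333` on the FreeNAS box would show whether that PID is still in use. If it is, `ps -p 18333 -o pid,etime,command` shows what the process actually is, which matters because the PID may have been reused by something unrelated to replication.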
There are two replication tasks which show "Succeeded" and "Up to date" in the status, and three more which are blank as they still need to be run.
"Recursively replicate" is checked (but there are no child datasets) and "delete stale snapshots" is unchecked. No limits, no dedicated user, and fast ciphers selected.
Any ideas what may have happened, or what to do to get it right?
Thanks!