OK, I have a strange problem here. One of my TrueNAS boxes (SAN2) is acting up and I'm not sure where to look to find the answers. So, here's the scenario:
Replication between SAN1 and SAN2 never fails. Works every time.
I installed a large disk in SAN2 and formatted it as NTFS so I could make an archival single-disk backup that could be read anywhere (hence NTFS).
Since I can't seem to find a way to mount it from the GUI, and fstab gets reset on reboot, I SSH in and manually mount the drive. Then, using tmux, I start an rsync job to move the data over. This job just trucks right along for a while until it fails.
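The manual steps above can be sketched roughly as follows. This is an illustration, not the exact commands used: the device node, mount point, source path and tmux session name are all hypothetical placeholders, and it assumes `ntfs-3g` is the NTFS mount helper available on the box. By default it only prints the commands; set `DRY_RUN=0` to actually run them.

```shell
#!/bin/sh
# Hypothetical placeholders; substitute the real device, mount point and source.
DEV="${DEV:-/dev/sdX1}"          # NTFS partition on the archive disk (assumed name)
MNT="${MNT:-/mnt/archive}"       # where the NTFS volume gets mounted (assumed path)
SRC="${SRC:-/mnt/tank/data/}"    # data to copy (assumed path)
SESSION="${SESSION:-archive}"    # tmux session name (assumed)

# Print each command instead of executing it unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Mount the NTFS disk by hand, since it is not managed from the GUI.
run mkdir -p "$MNT"
run ntfs-3g "$DEV" "$MNT"

# Start rsync in a detached tmux session so it should outlive the SSH login.
run tmux new-session -d -s "$SESSION" "rsync -a --partial '$SRC' '$MNT/'"
```

After logging back in, `tmux attach -t archive` should normally reconnect to the running job, which is what makes the behavior described below so odd.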
The failure:
The remote system (SAN2) drops the connection. No problem, right? You set this all up in a tmux window, so you just log back in, do a tmux attach, and your job should still be running.
NOPE.
Not only does it kill the SSH session, it also kills the tmux session, which kills the rsync job. Then it UNMOUNTS the NTFS volume!
When this happens, if I happen to have the GUI open, it goes to the "waiting for the interface to load" screen. Once it's back up and the GUI is available, I can log back in, start tmux, mount the drive and restart the rsync job.
Yeah, sure, it will eventually all get copied, but I'm trying to figure out what would cause SSH, tmux AND a mounted drive to all go away at the same time.
The server is NOT rebooting. It comes back WAY too fast for that. I have had replication tasks (zfs send) run for 15+ hours without ever dropping, even when the SSH/tmux/rsync/mount problem occurs.
So SOMETHING is resetting.
Attached is the support dump file.