SOLVED How to cancel a task?

Eric999

Dabbler
Joined
Jan 3, 2021
Messages
16
TrueNAS-SCALE-22.02.2.1: I have a replication.run task that I want to cancel. I have been searching but cannot find how to do this. I have seen a couple of suggestions to go to the shell, use htop to identify the process, and then use the kill command. But surely there must be a way to cancel a task from the TrueNAS UI? Thanks for any pointers!

tasks.jpg
 


sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
That's a replication task... so Data Protection | and under the list in Replication Tasks... maybe there's something that can be done there...
 

Eric999

Dabbler
Joined
Jan 3, 2021
Messages
16
Thanks, that would be the right place ... if I had done things correctly. When I created that replication task, I think I did it all wrong, and it is not being recognized as a configured replication task. If it were, and showed up in the Replication Tasks pane, there would be a little trash can to delete it. Also note that I have a SMART test running, which I kicked off in Storage -> Disks (three dots beside a disk, then "Manual Test"), and it is not showing up in the S.M.A.R.T. Tests pane. I would also like to kill this test. Note that rebooting does not get rid of either of these.

Maybe I am stuck with htop and kill. Problem is, I don't know which out of the hundreds of processes relates to these two tasks.

1658505055333.png
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694

If you think you set up the task correctly and can't kill it, we should treat that as a bug and report it.
 

Eric999

Dabbler
Joined
Jan 3, 2021
Messages
16
@morganL I can start with the simpler of the two: I can recreate the behavior of a manual SMART Test but I am unsure what I did wrong with the replication task.

SMART Test: I made a nice little mp4 of what I did but, unfortunately, cannot load it here. :( So I will attempt to describe it.
  1. I go to "Data Protection" and see that the "S.M.A.R.T. Tests" pane says "No S.M.A.R.T. Tests configured".
  2. I go to "Storage" -> "Disks" -> "Disks", click on the three dots beside one of my HDDs, and select "Manual Test".
  3. I ask for a LONG test and "Start". It says it is expected to run until a few hours from now.
  4. In the Tasks Manager, I can see this new test running as smart.test.wait at 10%.
    I go back to "Data Protection" and see that the "S.M.A.R.T. Tests" pane still says "No S.M.A.R.T. Tests configured". If my new task were to show up here, I could trash it. I have no idea how I can stop it.
Is this surprising? Does it make sense to report this?
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
You misunderstand the Data Protection tab. It’s essentially a front end for the system crontab, and doesn’t track running jobs. The GUI tracks running jobs in the Task Manager on the top bar, under the icon that looks like a clipboard.
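
For reference, the job list that the Task Manager displays can in principle also be queried from an SSH session. The sketch below assumes that the middleware client `midclt` and its `core.get_jobs` and `core.job_abort` methods are available on your release, and that the job in question supports being aborted; none of this is confirmed in this thread. `1234` is a placeholder job ID. The script is a dry run by default: it prints the commands instead of executing them (argument quoting is not preserved in the printout).

```shell
#!/bin/sh
# Dry run by default: print each command instead of running it.
# Set DRY_RUN=0 to actually execute (as root, over SSH).
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# List the jobs the middleware currently reports as running.
# Assumption: core.get_jobs accepts a query filter in this form.
run midclt call core.get_jobs '[["state","=","RUNNING"]]'

# Ask the middleware to abort one job by its numeric ID.
# 1234 is a placeholder; take the real ID from the listing above.
run midclt call core.job_abort 1234
```

Not every job type can be aborted this way, so a refusal from the middleware would not be surprising.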
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
You've started a single manual test.... if you created a periodic task, you could delete it on this page. Is that the confusion?
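
As a possible way to stop a one-off test from the command line: smartmontools provides `smartctl -X` to abort a self-test that is in progress. The sketch below assumes `smartctl` is installed on the system and uses `/dev/sdX` as a placeholder device name. It is a dry run by default and only prints the commands. Whether the `smart.test.wait` entry in the Task Manager clears on its own afterwards is not something this thread establishes.

```shell
#!/bin/sh
# Dry run by default: print each command instead of running it.
# Set DRY_RUN=0 to actually execute (as root).
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Placeholder device; replace with the disk the manual test was started on.
DISK="/dev/sdX"

# Show drive capabilities, which include the self-test execution status.
run smartctl -c "$DISK"

# Abort the self-test currently running on that disk.
run smartctl -X "$DISK"
```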
 

Eric999

Dabbler
Joined
Jan 3, 2021
Messages
16
Yes, I guess that is my confusion. A periodic task can be killed off. A manual test cannot be killed off even if it will run for the next few hours/days.

My badly-defined replication task will run forever and I have no way of killing it. I think I will back up my config, re-install TrueNAS Scale, and apply the new config. Maybe that will get rid of it!

Is there a way to delete items in the Task Manager?
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
It's not a normal problem.... it's a bad replication task that might never complete.
I would agree there needs to be a way to resolve this.. probably via CLI.

Looking through the ZFS commands, there is a way to abort a ZFS receive
(assuming you don't live in a state where abortion is now illegal)


If this doesn't work and no-one else has an answer, then "report a bug", but let's assume a CLI solution is reasonable.
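
A rough sketch of what such a CLI approach might look like, with caveats: `zfs receive -A` discards the saved state of an interrupted resumable receive (one started with `zfs receive -s`); it does not stop a receive that is actively running. The `pgrep -af` form assumes the Linux (SCALE) version of `pgrep`. The dataset name `tank/backup/data` is hypothetical. The script is a dry run by default and only prints the commands.

```shell
#!/bin/sh
# Dry run by default: print each command instead of running it.
# Set DRY_RUN=0 to actually execute (as root).
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical receive target; replace with the real destination dataset.
TARGET="tank/backup/data"

# 1. Check whether any zfs send/receive processes are actually alive.
#    -a prints the full command line, -f matches against it (Linux procps pgrep).
run pgrep -af 'zfs (send|recv|receive)'

# 2. On the receiving system: discard the partial state left behind by an
#    interrupted resumable receive (zfs receive -s).
run zfs receive -A "$TARGET"
```

If step 1 finds nothing, the replication is probably not running at the ZFS level at all, whatever the Task Manager shows.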
 
Joined
Oct 22, 2019
Messages
3,641
Sounds similar to the Middleware bug that even affects TrueNAS Core.

Can you SSH into the server and restart the middleware daemon?

Then re-login to the web GUI and see if it clears up?

What I'm implying is that the replication is not really "running", but has concluded, even though the Task Manager claims it's still in progress.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
I agree it may be similar... but that specific bug was claimed to be fixed in SCALE 22.02.2

Restarting middleware seems like a reasonable workaround to try...
 

Eric999

Dabbler
Joined
Jan 3, 2021
Messages
16
Thanks for the comments and suggestions on this.

I just did a re-install with the current config. That got rid of the tasks.

"Restarting middleware" sounds above my head. :)
 
Joined
Oct 22, 2019
Messages
3,641
You issue the command in an SSH session.

In Core (i.e., "FreeBSD") it would be the command:
service middlewared restart

In SCALE (i.e., "Debian Linux") it would be the command... I think...?
systemctl restart middlewared

Regardless, it must be run as the root user (or with sudo).
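
To avoid mixing up the two variants, a small helper can pick the command from the kernel name. This is only a sketch: the function name `middleware_restart_cmd` is made up for illustration, and it merely prints the command so you can check the spelling before running it yourself as root over SSH.

```shell
#!/bin/sh
# Print the middleware restart command for the given OS name
# (defaults to this host's kernel name from `uname -s`).
# FreeBSD -> TrueNAS Core, Linux -> TrueNAS SCALE.
middleware_restart_cmd() {
    os="${1:-$(uname -s)}"
    case "$os" in
        FreeBSD) echo "service middlewared restart" ;;
        Linux)   echo "systemctl restart middlewared" ;;
        *)       echo "unsupported OS: $os" >&2; return 1 ;;
    esac
}

# Show the command for this host without executing it.
middleware_restart_cmd || true
```

Restarting the middleware logs you out of the web UI, so run the printed command from an SSH session rather than from the Shell page of the web UI.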
 

emsicz

Explorer
Joined
Aug 12, 2021
Messages
78
Just chiming in that I currently have a replication job running. Its periodic task has been removed, it's currently copying 11TB of data over the network, and I have no way of killing it.

The above command of
Code:
systemctl restart middewared
produces the following response:
Code:
Failed to restart middewared.service: Unit middewared.service not found.


In my case, I will have to drive to the server room, disable the target system, let the replication task fail and then see what happens. Again, how does one implement jobs with no way of killing them... I don't know :smile:

It is not the only shortcoming I have found with jobs. Today, I tried to access the list of jobs in /ui/jobs. The UI froze for minutes and nothing was clickable. Refreshing or logging in again didn't work, because right after login the UI put me right back into the /ui/jobs section, even if I refreshed the browser cache. So I had no way of doing anything with the whole system. Fortunately, after a few minutes, the UI came back to life, and it turned out it had been stuck listing all of the historical and current job runs.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
You misspelled the command. It's systemctl restart middlewared.
 

emsicz

Explorer
Joined
Aug 12, 2021
Messages
78
That is incorrect. I haven't made one mistake, but two.

1. The first mistake was copy-pasting the command from this forum; the typo is in @winnielinnie's post.
2. The second was running it in the first place. Executing it kicked me out of the web UI, and after logging back in I see an endless loop of some random notification job running. I can't VNC into VMs anymore and a bunch of stuff doesn't work. I'm just praying that I will be able to shut everything down gracefully, do a restart and find my data intact.

Seriously, put a warning on that post or something. This was terrible advice; it added more problems than I had.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Hmm. Typically a middleware restart is very benign. This sounds like your zombie task has confused the middleware so badly that it's stuck in a loop trying to locate it in its jobs list.
 
Joined
Oct 22, 2019
Messages
3,641
@emsicz, that was for a different issue based on the GUI quirk that @Eric999 (the thread poster) was seeing. What you described makes more sense in its own thread, since it's a different issue. (I also said to SSH into the server, not to use the Shell, since the Shell is part of the web GUI itself.)

The OP decided to just re-install TrueNAS all over again, and thus this thread is moot.


It also sounds like there were other issues already:
In my case, I will have to drive to the server room, disable the target system, let the replication task fail and then see what happens.
Today, I tried to access the list of jobs in /ui/jobs. The UI froze for minutes and nothing was clickable. Refreshing or logging in again didn't work, because right after login the UI put me right back into the /ui/jobs section, even if I refreshed the browser cache. So I had no way of doing anything with the whole system. Fortunately, after a few minutes, the UI came back to life, and it turned out it had been stuck listing all of the historical and current job runs.


Typically a middleware restart is very benign.
This is also true.

You can also try to SSH into the server and tell it to reboot (or do a hard power cycle, or plug in a monitor and keyboard to view the console menu.)
 