Best way to restart a service running in a jail (automated)

Status
Not open for further replies.

ViciousXUSMC

Dabbler
Joined
May 12, 2014
Messages
49
I have searched around a bit and not found a good final answer.
Take Transmission running in a jail: it flakes out once in a while, so rather than being reactive, finding it down, and manually restarting the jail or the service, I think I will just schedule a daily restart of that service/jail.

A few of the answers I have found were laced with "you should not do that, it might break something", and I have broken things before, so I am checking in first.

My first idea:
warden stop %JailName%
warden start %JailName%

I think Warden is pretty safe; I used it just the other day to fix a missing-template problem I had after moving my jail root.

Next idea:
jexec %jailid% tcsh
service transmission restart

I just need to make sure I get the right jail ID and I think this should work well.

Last one:
/etc/rc.d/jail onestart
/etc/rc.d/jail onestop

I do not know exactly what to do with this one, and it seems like it's pretty old hat, but I am tossing it in as a possible answer.

So is there a best (safe) way? And how do you do it?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Hum, the best way to do it is to fix your installation of transmission so it's not crashing.

But that was not your question, so... The answer depends on what crashed. If you want something simple and you are unsure how the jail crashed, then I'd stop and restart the jail via CRON. I'd use a script executed by CRON myself, but you could also enter the two commands into CRON as separate jobs and run them a few minutes apart. The script would be something like this...

Code:
# This script stops a jail, waits for it to shut down, and then starts it again.
warden stop transmission

# Wait 120 seconds to allow the jail to shut down. Why so long? Because I have no idea why the system crashed.
sleep 120

# Now start the jail up again.
warden start transmission
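
If you would rather confirm that the jail is really gone than rely on a fixed delay, a minimal sketch might look like the following. It assumes the jail is known to both warden and jls by the same name; the function name restart_jail and the WARDEN and JLS override variables are made up for illustration and are not part of any tool.

```shell
#!/bin/sh
# Sketch only: stop a jail, poll until jls no longer lists it, then start it.
# WARDEN and JLS are hypothetical override knobs, handy for dry runs.
WARDEN="${WARDEN:-warden}"
JLS="${JLS:-jls}"

restart_jail() {
    jail="$1"
    timeout="${2:-120}"   # maximum number of seconds to wait for shutdown

    $WARDEN stop "$jail"

    waited=0
    # jls -j exits non-zero once the named jail no longer exists.
    while $JLS -j "$jail" >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "jail $jail still running after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done

    $WARDEN start "$jail"
}
```

Calling restart_jail transmission from the CRON script would then replace the stop, sleep, and start lines, with 120 seconds acting as an upper bound rather than a fixed wait.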
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Or you can simply restart the service within the jail. jexec (jail name or number) service transmission restart.

Edit: You can use the jexec command with jail names as well as numbers. So, if the jail's name is "transmission", the command would simply be jexec transmission service transmission restart.
 

ViciousXUSMC

Dabbler
Joined
May 12, 2014
Messages
49
It is a very basic install; the built-in plugin and a manually created jail both behave the same way.
The jail itself is not crashing, and the transmission service is usually still running, so nothing is actually crashing. Transmission just becomes unresponsive: I can't access the web interface and such.

A friend at work who uses the plugin has the same issue, especially with large torrents.

If you have any links to debugging or troubleshooting guides, I would be more than happy to look into why and how it's breaking, but for now, yes, a band-aid is better than bleeding out.

Thanks for the script.
 

ViciousXUSMC

Dabbler
Joined
May 12, 2014
Messages
49
I found out about restarting the service with jexec by jail name earlier, just by looking at the help file, and that is exactly how I decided to do it. The name does not change, so it is more reliable than a jail ID, since new IDs get generated when jails are created and destroyed.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Both ways restart the transmission service. The difference is that with warden you stop the entire jail and start it back up, which clears any errors that could be happening within the jail. Using jexec only stops and restarts the transmission service within the jail. If that is all you need and things work fine, then I would recommend the jexec command @danb35 pointed out. The only reason I didn't list it was that you have no idea why your jail stopped working. If jexec does not work every time, then I would recommend using the warden command.
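
One way to combine the two approaches is to try the service restart first and only fall back to a full jail restart when that fails. This is a rough sketch, not a tested recipe: the function name and the JEXEC and WARDEN override variables are invented for illustration, and the exit status of the service command is only a rough signal (a hung daemon might still report a clean restart).

```shell
#!/bin/sh
# Sketch only: restart the service first, restart the whole jail if that fails.
# JEXEC and WARDEN are hypothetical override knobs, handy for dry runs.
JEXEC="${JEXEC:-jexec}"
WARDEN="${WARDEN:-warden}"

restart_transmission() {
    jail="${1:-transmission}"

    # Light touch: restart only the service inside the jail.
    if $JEXEC "$jail" service transmission restart; then
        echo "service restarted inside jail $jail"
        return 0
    fi

    # Heavier fallback: stop and start the entire jail.
    echo "service restart failed, restarting jail $jail" >&2
    $WARDEN stop "$jail"
    $WARDEN start "$jail"
}
```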

The jail itself is not crashing and the transmission service usually is running so nothing is actually crashing but transmission becomes unresponsive, can't access the web interface and such.
Sounds like transmission had a crash by definition. It may have got stuck in some loop, gone off to La La Land. And this happens even when manually installing the current version of transmission? Odd. How much RAM do you have? Is your system using swap space? If this is happening with the plug-in also, hopefully there is a bug report on it (nope, I haven't looked myself).
 

ViciousXUSMC

Dabbler
Joined
May 12, 2014
Messages
49
Yes, just restarting the service works. It's quite random: it may work 5 days straight or go out twice in one day.
The system currently has 64GB of RAM; a few days ago it had 32GB.

Very basic setup, no VMs or enterprise-level workloads, just my media server that downloads, shares, and runs Plex.

Swap usage right now looks flat at zero, but next time I see it crash I'll check whether that has changed.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
That is a good idea, when it crashes then try to note anything and everything. If you create a CRON job to restart transmission and it happens again, well I'm not sure where to go from there. I have lots of ideas but not sure what the best one for you would be. One idea would be to create an Ubuntu VM and place transmission on it. This may work much better than the jail but that means you will need to figure out how to create a VM.
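
For the "note anything and everything" part, a small snapshot script run at the moment of the hang can save some typing. This is only a sketch: the log path and function names are made up, and it assumes the FreeBSD tools swapinfo and sockstat are available on the host (they are skipped if not).

```shell
#!/bin/sh
# Sketch only: append a timestamped snapshot of swap, listening sockets and
# transmission processes to a log file for later comparison.

run_if_present() {
    # Skip tools that are not installed instead of failing.
    if command -v "$1" >/dev/null 2>&1; then
        echo "--- $*"
        "$@" 2>&1
    else
        echo "--- $1 not found, skipped"
    fi
}

snapshot() {
    log="${1:-/var/log/transmission-hang.log}"   # hypothetical log location
    {
        echo "=== $(date) ==="
        run_if_present swapinfo -h
        run_if_present sockstat -4 -l
        echo "--- transmission processes"
        # The [t] trick keeps grep from matching its own process.
        ps auxww | grep '[t]ransmission' || echo "(none found)"
    } >> "$log"
}
```

Running snapshot once while everything is healthy gives a baseline to compare against the next time the web interface stops responding.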
 

indivision

Guru
Joined
Jan 4, 2013
Messages
806
Restarting the service within the jail with jexec, as @danb35 suggested above (jexec transmission service transmission restart), doesn't seem to work reliably from a cron task. Oddly, it does work if you run the cron task with the "Run Now" button, and it does seem to work the first time it fires. But subsequent tries seem to fail.

I can verify that the cron timing is correct because it does send output email on the correct schedule. Perhaps a FreeNAS bug in cron tasks?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
You did not specify what version of FreeNAS you are running, which is pretty important data. Also, are you saying that this fails with transmission or with some other jail? Details about the jail could be important too. It could be somewhat broken: some jails can get wrapped around the axle and require the job to be killed. Who are you logged on as when you run the command and it fails? It could be a privileges thing.
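
To gather that kind of detail from the scheduled runs themselves, the cron task could call a small wrapper that records the user and the exit status each time. This is a sketch under assumptions: the log path and function name are invented, and setting PATH explicitly is just one thing worth ruling out (cron jobs often run with a shorter PATH than an interactive shell), not a confirmed cause of the failure described above.

```shell
#!/bin/sh
# Sketch only: cron wrapper that logs who ran the restart and how it ended.
# Prepend the usual system directories in case cron supplies a short PATH.
PATH="/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin${PATH:+:$PATH}"
export PATH

# JEXEC is a hypothetical override knob, handy for dry runs.
JEXEC="${JEXEC:-jexec}"

logged_restart() {
    jail="$1"
    log="${2:-/var/log/jail-restart.log}"   # hypothetical log location
    rc=0
    {
        echo "=== $(date) user=$(id -un) jail=$jail"
        # Capture both stdout and stderr of the restart in the log.
        $JEXEC "$jail" service transmission restart 2>&1 || rc=$?
        echo "exit status: $rc"
    } >> "$log"
    return "$rc"
}
```

Comparing the log entries from a "Run Now" execution with those from a scheduled run should show whether the user, the output, or the exit status differs between the two.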
 

indivision

Guru
Joined
Jan 4, 2013
Messages
806
Thank you.

FreeNAS version: FreeNAS-11.1-U5

It is a transmission plugin jail. It was set up with the plugin GUI, with some modification since then to update packages.

The command is being run as root from the Cron GUI. I'm not sure how those permissions translate into the jail, but it does run successfully if you use the "Run Now" button in the GUI, so it seems like the permissions are OK.
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
My solution to this general issue (an rc started service crashing) is the excellent tool Monit: https://mmonit.com/monit/

It can monitor a PID file, process name, process CPU usage, etc. and run commands to restart the task as appropriate. There are ways to trigger when an intermittent condition persists (X number of times across Y checks) and you can make monitored tasks dependent on others such that a single process can restart just that process or a whole suite of tasks.

The only issue I've run into is that with multiple jails I have multiple monit instances, so I'd need to pay a decent amount of money for M/Monit. I've thought about requesting that monit be added to the base FreeNAS install, so that one instance could run commands to track tasks in all the jails at once.

Oh, you can configure notification services. Every time a service is restarted (rarely) I get a push notification letting me know.

It's a really great tool.
 

indivision

Guru
Joined
Jan 4, 2013
Messages
806
Thank you. Looks promising.

Would you be up to outlining the steps for installing it in FreeNAS and setting it to restart transmission if/when it crashes?

A few questions I have:

Do you install monit within the jail to monitor? Or, do you make a separate jail for it? How does either case affect accessing the web panel for it?

Can it be installed via a package manager? Does a repository have to be added?
 

indivision

Guru
Joined
Jan 4, 2013
Messages
806
I tried installing monit from the port. But when I try to make it, it gives an unsupported version error. I don't know FreeBSD well enough to work around that one without further research. Maybe I'm not approaching it in the easiest way?
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
I installed monit via packages (pkg install monit) inside the relevant jail.

To configure, I took the sample config (/usr/local/etc/monitrc.sample), copied it to /usr/local/etc/monitrc and made the following changes (line numbers are based off a diff from the sample file, so they might not be exact):
  • 20: uncomment startup delay
The startup delay gives your monitored tasks some time to get started.
  • 34: uncomment and change pid to /var/run/monit/monit.pid
  • 40: uncomment and change idfile to /var/run/monit/monit.id
  • 48: uncomment and change state file to /var/run/monit/monit.state
None of those are required, but I felt they kept things organized. In fact, changing the pidfile location will break service monit status unless you also create the /var/run/monit directory and change the pidfile location in /usr/local/etc/rc.d/monit
  • 84: add mailserver configuration (so alerts can get out):
Code:
set mailserver smtp.host.com port ####
	username "source_user@host.com" password "password"
	using ssl

  • 141: add alert destination email address (I'm using a Pushover email alias)
Code:
set alert destination_user@host.com    # receive all alerts

  • eof: add an entry to monitor the transmission daemon (assuming that's where the transmission pid file is):
Code:
check process transmission with pidfile /var/run/transmission/daemon.pid
	start program = "/usr/sbin/service transmission restart" with timeout 60 seconds
	stop program  = "/usr/sbin/service transmission stop"



Enable the service with sysrc monit_enable="YES" and you can then start monit with service monit start.

Check the status with monit -c /usr/local/etc/monitrc summary. You can omit the '-c /path/to/config' if you execute the command from the same directory as the config file.
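
If you do move the pid, id and state files to /var/run/monit as described above, the directory has to exist before monit starts. A minimal sketch of that preparation step follows. The function name and its optional root argument are made up for illustration (the argument only exists so the function can be tried against a scratch directory first), and tightening the config permissions is there because monit refuses a control file that group or others can read.

```shell
#!/bin/sh
# Sketch only: prepare the paths used by the monitrc changes above.
# The optional first argument is a hypothetical root prefix for dry runs;
# leave it out when running inside the jail.

prepare_monit() {
    root="${1:-}"
    run_dir="$root/var/run/monit"
    conf="$root/usr/local/etc/monitrc"

    # monit does not create the directory for its pid/id/state files.
    mkdir -p "$run_dir" || return 1

    # Restrict the control file to its owner if it is already in place.
    if [ -f "$conf" ]; then
        chmod 600 "$conf" || return 1
    fi
}
```

After prepare_monit, running monit -t checks the config syntax before service monit start. I believe /var/run is cleaned at boot, so the directory may need to be recreated after a jail restart; keeping the default file locations from the sample config avoids that.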

Let me know if any of this doesn't work or you need clarification.
 

indivision

Guru
Joined
Jan 4, 2013
Messages
806
Thank you! This is great. I want to try this out. But, I get stuck on the first line.

When I run "pkg install monit", I get this:

Code:
Updating FreeBSD repository catalogue...
pkg: http://pkg.FreeBSD.org/freebsd:11:x86:64/latest/meta.txz: No address record
repository FreeBSD has no meta file, using default settings
pkg: http://pkg.FreeBSD.org/freebsd:11:x86:64/latest/packagesite.txz: No address record
Unable to update repository FreeBSD
Error updating repositories!


I'm not sure from that text what my next steps are to fix this.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504

indivision

Guru
Joined
Jan 4, 2013
Messages
806
Fix the networking problem in your jail.

I got it to work simply by restarting the jail, which is why I want monit to monitor and restart things in the first place.

Will see how far I can get in this guide now...
 

indivision

Guru
Joined
Jan 4, 2013
Messages
806
I get through to the end of the steps. Entering "service monit start" results in:

Code:
Starting monit.
Error opening the idfile '/var/run/monit/monit.id' -- No such file or directory
Starting Monit 5.25.2 daemon with http interface at [localhost]:2812
Monit start delay set to 240s


It looks like there could be a path issue. But, maybe this is expected and it creates the path?

I tried reaching the http interface by entering [serverip]:2812 and also [jailip]:2812 and neither showed anything. Is that an indication that it isn't running right? Or, are we just looking to work with it without access to the http interface?
 

indivision

Guru
Joined
Jan 4, 2013
Messages
806
"monit -c /usr/local/etc/monitrc summary" gives:

Code:
Monit: the monit daemon is not running 


Is it that the files don't include "." in front of their names? I noticed that the sample had it as ".monit.id". But, I wanted to follow your guide exactly and changed to "monit.id".

[EDIT: looks like there is no /var/run/monit directory... Do I create it all? Or, point somewhere else?]
 