A new message with 11.1-RC1 and now 11.1, 11.1-U1, 11.1-U2, too

Status
Not open for further replies.

zemun2

Explorer
Joined
Mar 15, 2017
Messages
72
ctrl W (for write) I believe it shows at the bottom of the nano screen the different commands
These are all the options I get.
sh.png
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
Ctrl+X to exit, then Y to save. It'll prompt you for the save and show the correct file name.
 

zemun2

Explorer
Joined
Mar 15, 2017
Messages
72
Awesome thank you sir
 

zemun2

Explorer
Joined
Mar 15, 2017
Messages
72
I did ^O and now I get these options.

sh2.png
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I did ^O and now I get these options.
See in the white bar where it's asking for the file name to write, and the correct file name is filled in? Press Enter.
 

zemun2

Explorer
Joined
Mar 15, 2017
Messages
72
See in the white bar where it's asking for the file name to write, and the correct file name is filled in? Press Enter.
That worked. For some reason I thought I needed to select one of the options on the bottom.

Thank you all
 

NASbox

Guru
Joined
May 8, 2012
Messages
650
No, the configuration has always reloaded at midnight, that is normal for a long time.
The consul version is a new thing but it is only indicating that there is a newer version of the software available and not an issue.
Thanks for this... I didn't realize that this was normal behavior or the implications of there being a newer release.
If the GUI is not alerting you to an error, why are you convinced that there is an error?
If the code shown here:

https://forums.freenas.org/index.php?threads/a-new-message-with-11-1-rc1.58844/page-3#post-433317

is the code responsible for detecting errors:

Then my question is what about this?
Code:
/usr/local/bin/midclt call notifier.get_alerts > /tmp/.alert-health
if [ $? -ne 0 ] ; then
   exit 1
fi
What is /usr/local/bin/midclt call notifier.get_alerts doing?

I assume that sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0]) is spawning a bunch of processes, but I have no idea what is actually happening. (Have there been any changes to the reporting code? Could a poorly reported error give a blank line instead of a proper message?)

If the supplied patch is working, then it appears that /tmp/.alert-health contains only one or more empty lines. As long as midclt returning blank lines and a zero return code means no error was detected, I concur with you that there is no error message. If whatever scripts are executed by midclt are emitting a blank line instead of an error message, then there is something to worry about.

Having said all that, I'm inclined to agree with you it is unlikely there is an error.
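
A minimal sketch of one way to check what a saved copy of the midclt output actually contains. It counts blank lines, lines starting with "OK", and everything else. The function name `summarize_alerts` and the saved file path are assumptions for illustration, not part of FreeNAS.

```shell
#!/bin/sh
# summarize_alerts FILE (hypothetical helper, not shipped with FreeNAS):
# count blank lines, lines starting with "OK", and all other lines.
# "read" trims leading/trailing whitespace, as the health check's loop does,
# so whitespace-only lines are counted as blank.
summarize_alerts() {
  blank=0 ok=0 other=0
  while read -r line; do
    case "$line" in
      "")  blank=$((blank + 1)) ;;
      OK*) ok=$((ok + 1)) ;;
      *)   other=$((other + 1)) ;;
    esac
  done < "$1"
  echo "blank=$blank ok=$ok other=$other"
}
```

For example, after saving the output with `/usr/local/bin/midclt call notifier.get_alerts > /tmp/alerts.txt` (path chosen here only as an example), `summarize_alerts /tmp/alerts.txt` would show whether the file holds nothing but blank lines.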
 

toolchain

Cadet
Joined
Jul 13, 2016
Messages
8
Thanks @Mamdoh I will try this in an hour, appreciate the fix.

Edit: Seems like the only thing missing were the lines

if [ -z $line ] ; then
continue
fi


Gone for now, thanks

Just building on this thought process, but I just wanted to neuter the script and be done with it.

So in this section of the code where it sets have_alert=1

Code:
fi
  echo "$line"
  have_alert=1
done < /tmp/.alert-health
rm /tmp/.alert-health


I replaced the 1 with a 0 and it stopped spamming dead in its tracks ¯\_(ツ)_/¯

Code:
fi
  echo "$line"
  have_alert=0
done < /tmp/.alert-health
rm /tmp/.alert-health
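
A rough sketch of what that one-character change does, assuming the script ends with the same `have_alert` test as the full version posted elsewhere in this thread. The function `check_tail` is a hypothetical stand-in for the tail of the script, and its argument stands in for the value assigned inside the loop.

```shell
#!/bin/sh
# check_tail VALUE (hypothetical): read alert lines from stdin, assign VALUE
# to have_alert for every non-blank line, then apply the final test the
# way the health check is assumed to.
check_tail() {
  alert_value=$1
  have_alert=0
  while read -r line; do
    [ -z "$line" ] && continue
    echo "$line"
    have_alert=$alert_value
  done
  if [ "$have_alert" -eq 0 ]; then
    echo "No Alerts"
    return 0
  else
    return 1
  fi
}
```

Under that assumption, piping a real alert line into `check_tail 1` returns 1, while `check_tail 0` returns 0 for the same input. So the change silences the messages by making the check report success even when an alert line is present.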
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504

ykhodo

Explorer
Joined
Oct 19, 2017
Messages
52
I just applied a proper fix. Full credit goes to @David Steinberger (see link). Steps I followed:

Step one:
cd /usr/local/etc/consul-checks/
nano freenas_health.sh


I edited the contents to the following. Go through the edit line by line and make sure it's correct; a simple paste can drop some escape characters, resulting in multiple commands on one line:

Code:
#!/bin/sh

PATH="${PATH}:/usr/local/bin:/usr/local/sbin"
export PATH

/usr/local/bin/midclt call notifier.get_alerts > /tmp/.alert-health
if [ $? -ne 0 ] ; then
   exit 1
fi

have_alert=0

while read line
do
  if [ -z $line ] ; then
    continue
  fi
  echo $line | grep -q "^OK"
  if [ $? -eq 0 ] ; then
    continue
  fi
  echo "$line"
  have_alert=1
done < /tmp/.alert-health
rm /tmp/.alert-health

if [ $have_alert -eq 0 ] ; then
  echo "No Alerts"
  exit 0
else
  exit 1
fi


Step two: (not needed if you haven’t edited /usr/local/etc/consul.d/freenas.json)
cd /usr/local/etc/consul.d
nano freenas.json


I changed the interval back to 120s.

Code:
"interval": "120s"


Step three:
Executed the following in shell:

service consul stop
rm -rf /var/db/system/consul
service consul start
Alternatively, just wget the file and overwrite your old one:

Code:
cd /usr/local/etc/consul-checks
mv freenas_health.sh freenas_health.sh.bak
wget https://raw.githubusercontent.com/davidsteinberger/freenas/8bf0718c1c3a09f8189c5150c0e22f3dbc4e77e9/src/freenas/usr/local/etc/consul-checks/freenas_health.sh
# grant execute to the file
chmod +x freenas_health.sh
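
The loop in Step one tests `$line` unquoted, so an alert line containing spaces hands `[` several words instead of one. A sketch of the same filtering logic with quoted expansions is shown here as a standalone function. The name `check_alert_file` is hypothetical and this is not the code from the linked patch.

```shell
#!/bin/sh
# check_alert_file FILE (hypothetical): print every line that is neither
# blank nor starts with "OK". Return 0 if there were none, 1 otherwise.
check_alert_file() {
  have_alert=0
  while read -r line; do
    # Quoting "$line" keeps the test well-formed when a line has spaces.
    if [ -z "$line" ]; then
      continue
    fi
    # Skip lines that begin with "OK" without spawning grep.
    case "$line" in
      OK*) continue ;;
    esac
    printf '%s\n' "$line"
    have_alert=1
  done < "$1"
  if [ "$have_alert" -eq 0 ]; then
    echo "No Alerts"
    return 0
  fi
  return 1
}
```

It would be called as `check_alert_file /tmp/.alert-health` after the midclt call has written that file. The pass and fail results are the same as in the posted script; only the quoting and the `case` match differ.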
 

I-Tech

Dabbler
Joined
Aug 14, 2015
Messages
36
ok.. so.. did this..
two questions:
1- removing the /var/db/system/consul folder .. well.. that disappears and doesn't come back.. right?
2- edits are not persistent through reboots, right?
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
1. Yes, assuming it was there in the first place (-f suppresses any output).
2. Right.
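
Because `rm -rf` stays silent whether or not its target exists, it can help to check for the directory explicitly before and after restarting consul. A small sketch, with `report_dir` as a hypothetical helper name:

```shell
#!/bin/sh
# report_dir PATH (hypothetical): say whether PATH exists as a directory.
report_dir() {
  if [ -d "$1" ]; then
    echo "$1: present"
  else
    echo "$1: missing"
  fi
}
```

Running `report_dir /var/db/system/consul` once before `service consul stop` and again after `service consul start` would show whether the directory was ever there and whether anything recreates it.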
 

I-Tech

Dabbler
Joined
Aug 14, 2015
Messages
36
1. Yes, assuming it was there in the first place (-f suppresses any output).
2. Right.
was kinda concerned that the /var/db/system/consul folder didn't get re-created by anything ..
hope it wasn't important :p
 

NASbox

Guru
Joined
May 8, 2012
Messages
650
I applied the patch, but I think something must have happened (I accidentally turned /usr/local/etc/consul-checks/freenas_health.sh into a null file for about 5 minutes when I was preparing a script to reinstall the patch) because I kept getting warning messages:
Code:
Jan 26 09:31:07 freenas daemon[16739]:	 2018/01/26 09:31:07 [WARN] agent: Check 'service:nas-health' is now warning
Jan 26 09:33:07 freenas daemon[16739]:	 2018/01/26 09:33:07 [WARN] agent: Check 'service:nas-health' is now warning
Jan 26 09:35:07 freenas daemon[16739]:	 2018/01/26 09:35:07 [WARN] agent: Check 'service:nas-health' is now warning
I thought I should follow up this point that I made earlier regarding:
It doesn't say what the issue is, because there isn't really an issue. It's an erroneous entry.
I'm currently experiencing a latent error that is not being handled properly.

I decided to apply a debug patch to: /usr/local/etc/consul-checks/freenas_health.sh and capture /tmp/.alert-health:
Code:
#!/bin/sh

PATH="${PATH}:/usr/local/bin:/usr/local/sbin"
export PATH

/usr/local/bin/midclt call notifier.get_alerts > /CUSTOM/temp-alert-health-debug
/usr/local/bin/midclt call notifier.get_alerts > /tmp/.alert-health
if [ $? -ne 0 ] ; then
   exit 1
fi

have_alert=0

while read line
do
  if [ -z $line ] ; then
	continue
  fi
  echo $line | grep -q "^OK"
  if [ $? -eq 0 ] ; then
	continue
  fi
  echo "$line"
  have_alert=1
done < /tmp/.alert-health
rm /tmp/.alert-health

if [ $have_alert -eq 0 ] ; then
  echo "No Alerts"
  exit 0
else
  exit 1
fi
and I got the following:
Code:
[ENOMETHOD] Method "get_alerts" not found in "notifier"
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/middlewared/plugins/notifier.py", line 66, in __getattr__
	return object.__getattribute__(self, attr)
AttributeError: 'NotifierService' object has no attribute 'get_alerts'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 881, in _method_lookup
	methodobj = getattr(serviceobj, method_name)
  File "/usr/local/lib/python3.6/site-packages/middlewared/plugins/notifier.py", line 68, in __getattr__
	return getattr(_n, attr)
AttributeError: 'notifier' object has no attribute 'get_alerts'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 150, in call_method
	result = await self.middleware.call_method(self, message)
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 109, in __next__
	return self.gen.send(None)
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 889, in call_method
	serviceobj, methodobj = self._method_lookup(message['method'])
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 883, in _method_lookup
	raise CallError(f'Method "{method_name}" not found in "{service}"', CallError.ENOMETHOD)
middlewared.service_exception.CallError: [ENOMETHOD] Method "get_alerts" not found in "notifier"
I also noticed that /var/db/system/consul
Code:
#>ls -la /var/db/system/consul
ls: /var/db/system/consul: No such file or directory
did not get regenerated after
Code:
  service consul stop
  rm -rf /var/db/system/consul
  service consul start
I rebooted the system and the problem cleared. (It should be noted that /var/db/system/consul is still non-existent after a reboot.)

Any idea what the python errors are about? Does this give any clues about the ongoing issue(s)?

My thinking is that if this is a predictable error, then it should be properly handled with try/except, and if not, there should be an improvement to /usr/local/etc/consul-checks/freenas_health.sh (maybe grep for Traceback) to capture the error for troubleshooting.

EDIT: added /usr/local/bin/midclt call notifier.get_alerts > /CUSTOM/temp-alert-health-debug and removed cp statement that would have cleared $?.
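
A minimal sketch of the grep idea, assuming the traceback ends up in the redirected output file as it did in the capture above. The function name `capture_traceback` and the destination path are hypothetical.

```shell
#!/bin/sh
# capture_traceback FILE DEST (hypothetical): if FILE contains a Python
# traceback header, copy FILE to DEST for later inspection and return 1.
# Otherwise leave DEST untouched and return 0.
capture_traceback() {
  if grep -q '^Traceback (most recent call last):' "$1"; then
    cp "$1" "$2"
    return 1
  fi
  return 0
}
```

In the health check this could run right after the midclt call, for example as `capture_traceback /tmp/.alert-health /CUSTOM/last-traceback || exit 1`, so the failing output is kept before /tmp/.alert-health is removed.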
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
I rebooted the system and the problem cleared. (It should be noted that /var/db/system/consul is still non-existent after a reboot.)

Yes, we talked about this above. We need someone on 11.1-U1 who did not attempt this fix (or at least did not run "rm -rf /var/db/system/consul") to tell us if the file exists at all in a "clean" 11.1-U1. My present assumption is that it does not.

My attempts to use this "fix" did not result in cessation of the messages BTW.
 
Joined
Jan 7, 2018
Messages
8
Yes, we talked about this above. We need someone on 11.1-U1 who did not attempt this fix (or at least did not run "rm -rf /var/db/system/consul") to tell us if the file exists at all in a "clean" 11.1-U1. My present assumption is that it does not.

My attempts to use this "fix" did not result in cessation of the messages BTW.


Hi,

I did a fresh install of 11.1-U1 and cannot find /var/db/system/consul; in fact, there is no "system" folder at all. I have not applied any changes or modified the system in any way yet.

Code:
root@freenas:/var/db # la
./                      hyperv/                 nut/        samba4/       syslog-ng.persist
../                     ipf/                    pbi/        services.db   zfsd/
collectd/               locate.database         pkg/        sss/          zoneinfo
dhclient.leases.vmx0    netdata/                ports/      sss_mc/
entropy/                ntp/                    portsnap/   sudo/
freebsd-update/         ntpd.leap-seconds.list  samba/      syslog-ng.ctl=
root@freenas:/var/db #
 

I-Tech

Dabbler
Joined
Aug 14, 2015
Messages
36
Yes, we talked about this above. We need someone on 11.1-U1 who did not attempt this fix (or at least did not run "rm -rf /var/db/system/consul") to tell us if the file exists at all in a "clean" 11.1-U1. My present assumption is that it does not.

My attempts to use this "fix" did not result in cessation of the messages BTW.
I just rebooted in 11.1 release.. deleted 11.1-U1.. (with the applied fix).. re-updated to 11.1-U1
.. and the /var/db/system/consul folder is (still) not there..
(which was the main reason I performed these steps)
 