A new message with 11.1-RC1 and now 11.1, 11.1-U1, 11.1-U2, too

Status
Not open for further replies.

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
So you are suggesting to update anyway?
I would say that log spam shouldn't really enter into the decision of whether to upgrade or not.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
So you are suggesting to update anyway?
I did update and I don't plan to roll back. I don't care that much about some messages that are basically erroneous errors.
 

J Crnac

Cadet
Joined
Sep 18, 2016
Messages
2
Really, you think some extra entries in the log is a reason to reject an update that fixes so many other things?
Hi Chris,
most likely you are right saying that log spam is no sufficient reason to roll back. I use FreeNAS for 3 years (with growing enthusiasm), but having unstoppable warning messages is a kind of irritating. Furthermore (if I understand it right reading the bug-report) the Devs judge this bug differently in its significance (or the need to remove it).
Given the fact, that I had no problems with the previous version - I decided to roll back and wait (yeah, I know lot of bugs were removed in the actual version-but they have been no issue - at least for me). In the end it´s all about security of my data .... and I´m concerned
Lastly, using common sense - why should "we" ignore an alert?
Thanks for your comment
JC
 
Last edited by a moderator:

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Hi Chris,
most likely you are right saying that log spam is no sufficient reason to roll back. I use FreeNas for 3 years (with growing enthusiasm), but having unstoppable warning messages is a kind of irritating. Furthermore (if I understand it right reading the bug-report) the Devs judge this bug differently in its significance (or the need to remove it).
Given the fact, that I had no problems with the previous version - I decided to roll back and wait (yeah, I know lot of bugs were removed in the actual version-but they have been no issue - at least for me). In the end it´s all about security of my data .... and I´m concerned
Lastly, using common sense - why should "we" ignore an alert?
Thanks for your comment
JC
What is the alert about? It is a false alarm.

Sent from my SAMSUNG-SGH-I537 using Tapatalk
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
the Devs judge this bug differently in its significance (or the need to remove it).
...and correctly so. It's a minor issue. It doesn't affect performance, reliability, stability, or functionality. Yes, it's an annoyance. Yes, it clutters the log and makes it harder to find important stuff. And yes, I wish the devs would place a higher priority on minimizing spam, both to the logs and to the console, as this has been a recurring issue for years. But it harms nothing.
In the end it´s all about security of my data .... and I´m concerned
Then you'd probably be better off just turning off the option to show log messages in the GUI. Nothing about that message should make you concerned about your data.
Lastly, using common sense - why should "we" ignore an alert?
There is no alert. There's a message in the system log file that you're interpreting as an alert.
 

NASbox

Guru
Joined
May 8, 2012
Messages
650
Seriously? Roll back just for log spam? You're never going to be happy with FreeNAS, then--log spam (and even console spam), in various forms, has been an issue for a long time, with many versions, and the devs clearly don't care. It's annoying, to be sure, but it doesn't hurt anything.
This one is bad enough to make my logs totally useless... I've got 200K of this and almost nothing else.
 

NASbox

Guru
Joined
May 8, 2012
Messages
650
grep -v is your friend.
Yup... if there is something to grep... log keeps getting rotate out... I believe the default is a 200K ring buffer is it not? Or does it just archive after 200K? I think I'm going to try that patch mentioned above.

I've got to say it's a pretty lousy message in that it doesn't say what the issue is.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Or does it just archive after 200K?
I'm not seeing 200KB rollovers, but it does archive the last ten logs--messages.0.bz2 to messages.9.bz2.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Yup... if there is something to grep... log keeps getting rotate out... I believe the default is a 200K ring buffer is it not? Or does it just archive after 200K? I think I'm going to try that patch mentioned above.

I've got to say it's a pretty lousy message in that it doesn't say what the issue is.

It doesn't say what the issue is, because there isn't really an issue. It's an erroneous entry.

Sent from my SAMSUNG-SGH-I537 using Tapatalk
 

NASbox

Guru
Joined
May 8, 2012
Messages
650
It doesn't say what the issue is, because there isn't really an issue. It's an erroneous entry.

I think you may be jumping to conclusions on this.... maybe you are right, maybe not.

There is no way of knowing based on the information available so far. Under 11.0U4 I have had real warnings and "bug warnings" and in both cases I get the same message. Two cases come to mind.

  1. I attached/used and then detached/removed a single volume pool BACKUP01 on /dev/ada3 (using the GUI), and sometime later I notice a warning (in the GUI) can't connect to device /dev/ada3, and that the log was full of these messages.
  2. I also about the pool being above 80% full (it was)-so it was a valid warning-I later noticed the log full of these messages.
The message didn't provide any helpful info as to what the real problem was. If I had not seen the messages in GUI, I wouldn't have had a clue what the warning was about.

The point is that there may be have been a bug introduced in some code (which may or may not be significant) that is causing that error to be issued.

Ideally instead of this:

2018/01/22 19:16:27 [WARN] agent: Check 'service:nas-health' is now warning

the message would be something like

2018/01/22 19:16:27 [WARN] agent: DESCRIPTIVE MESSAGE HERE

or

2018/01/22 19:16:27 [WARN] agent: Check 'service:nas-health' is now warning - ERROR CODE= SOMETHING

In either case it should point back to the condition and/or code responsible for giving rise to the message.

As an aside I did a quick grep of the older logs and I noticed these messages...
Code:
Jan 18 00:00:00 freenas newsyslog[70622]: logfile turned over due to size>200K
Jan 18 00:00:00 freenas syslog-ng[29910]: Configuration reload request received, reloading configuration;
Jan 18 13:14:34 freenas daemon[3691]: ==> Newer Consul version available: 1.0.2 (currently running: 1.0.0)
Jan 19 00:00:00 freenas syslog-ng[29910]: Configuration reload request received, reloading configuration;
Jan 19 16:30:10 freenas daemon[3691]: ==> Newer Consul version available: 1.0.2 (currently running: 1.0.0)
Do they have anything to do with the issue?
 

NASbox

Guru
Joined
May 8, 2012
Messages
650
@hervon: Thanks for confirming the emergence of this issue with 11.1-U1.
@Redcoat: I see. Somehow stopping a service completely, even if it's for a few weeks, makes me a bit nervous; especially that I don't know the inners of the software > Regular user! I'm not even sure if it's a big issue to have this message appear every two minutes in my log or if I should worry about it. Input from the FreeNAS experts would be much appreciated.

However, I kinda chose a halfway solution for now. I edited /usr/local/etc/consul.d/freenas.json and changed the Interval field from 120s to 1800s.

Code:
{																			
  "disable_update_check": true,												
  "service": {																
	"name": "nas-health",													
	"tags": ["primary"],														
	"address": "",															
	"port": 80,																
	"enableTagOverride": false,												
	"check": [																
	  {																		
		"id": "freenas_health",												
		"name": "FreeNAS health check",										
		"args": [ "/usr/local/etc/consul-checks/freenas_health.sh" ],		
		"interval": "1800s"													
	  }																		
	]																		
  }																			
}


Then executed the following:
service consul stop
rm -rf /var/db/system/consul
service consul start

Thanks for this... I followed the directions and just wanted to make sure I haven't created a "real" problem. Can someone with experience tell me if these messages are a problem?
Code:
Jan 22 19:12:20 freenas daemon[3691]:	 2018/01/22 19:12:20 [WARN] agent: Check 'service:nas-health' is now warning
Jan 22 19:14:24 freenas daemon[3691]:	 2018/01/22 19:14:24 [WARN] agent: Check 'service:nas-health' is now warning
Jan 22 19:16:27 freenas daemon[3691]:	 2018/01/22 19:16:27 [WARN] agent: Check 'service:nas-health' is now warning
Jan 22 19:18:31 freenas daemon[3691]:	 2018/01/22 19:18:31 [WARN] agent: Check 'service:nas-health' is now warning
--------------------------- DO I NEED TO WORRY ABOUT ANY OF THIS BELOW
Jan 22 19:20:25 freenas daemon[3691]:	 2018/01/22 19:20:25 [WARN] serf: Shutdown without a Leave
Jan 22 19:20:25 freenas daemon[3691]:	 2018/01/22 19:20:25 [WARN] serf: Shutdown without a Leave
Jan 22 19:20:26 freenas daemon[3691]:	 2018/01/22 19:20:26 [WARN] agent: Timeout stopping HTTP server 127.0.0.1:8500 (tcp)
Jan 22 19:20:27 freenas daemon[74214]: BootstrapExpect is set to 1; this is the same as Bootstrap mode.
Jan 22 19:20:27 freenas daemon[74214]: bootstrap = true: do not enable unless necessary
Jan 22 19:20:27 freenas daemon[74214]: ==> Starting Consul agent...
Jan 22 19:20:27 freenas daemon[74214]: ==> Consul agent running!
Jan 22 19:20:27 freenas daemon[74214]:			Version: 'v1.0.0'
Jan 22 19:20:27 freenas daemon[74214]:			Node ID: 'a10------fb7'
Jan 22 19:20:27 freenas daemon[74214]:		  Node name: '----'
Jan 22 19:20:27 freenas daemon[74214]:		 Datacenter: 'dc1' (Segment: '<all>')
Jan 22 19:20:27 freenas daemon[74214]:			 Server: true (Bootstrap: true)
Jan 22 19:20:27 freenas daemon[74214]:		Client Addr: [127.0.0.1] (HTTP: 8500, HTTPS: -1, DNS: 8600)
Jan 22 19:20:27 freenas daemon[74214]:	   Cluster Addr: 127.0.0.1 (LAN: 8301, WAN: 8302)
Jan 22 19:20:27 freenas daemon[74214]:			Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false
Jan 22 19:20:27 freenas daemon[74214]:
Jan 22 19:20:27 freenas daemon[74214]: ==> Log data will now stream in as it occurs:
Jan 22 19:20:27 freenas daemon[74214]:
Jan 22 19:20:27 freenas daemon[74214]:	 2018/01/22 19:20:27 [WARN] serf: Failed to re-join any previously known node
Jan 22 19:20:27 freenas daemon[74214]:	 2018/01/22 19:20:27 [WARN] serf: Failed to re-join any previously known node
Jan 22 19:20:34 freenas daemon[74214]:	 2018/01/22 19:20:34 [WARN] raft: Heartbeat timeout from "" reached, starting election
Jan 22 19:21:04 freenas daemon[74214]:	 2018/01/22 19:21:04 [WARN] consul.kvs: Rejecting lock of consul-alerts/leader due to lock-delay until 2018-01-22 19:21:19.87925621 -0500 EST m=+52.559008210
Jan 22 19:21:09 freenas daemon[74214]:	 2018/01/22 19:21:09 [WARN] consul.kvs: Rejecting lock of consul-alerts/leader due to lock-delay until 2018-01-22 19:21:19.87925621 -0500 EST m=+52.559008210
Jan 22 19:21:14 freenas daemon[74214]:	 2018/01/22 19:21:14 [WARN] consul.kvs: Rejecting lock of consul-alerts/leader due to lock-delay until 2018-01-22 19:21:19.87925621 -0500 EST m=+52.559008210
--------------------------- ONLY @ MESSAGES / HOUR - A GOOD COMPROMISE
Jan 22 19:38:06 freenas daemon[74214]:	 2018/01/22 19:38:06 [WARN] agent: Check 'nas-health' is now warning
Jan 22 20:08:09 freenas daemon[74214]:	 2018/01/22 20:08:09 [WARN] agent: Check 'nas-health' is now warning
Jan 22 20:38:13 freenas daemon[74214]:	 2018/01/22 20:38:13 [WARN] agent: Check 'nas-health' is now warning
Jan 22 21:08:17 freenas daemon[74214]:	 2018/01/22 21:08:17 [WARN] agent: Check 'nas-health' is now warning
Jan 22 21:38:20 freenas daemon[74214]:	 2018/01/22 21:38:20 [WARN] agent: Check 'nas-health' is now warning
This seems like a nice work around as long as the solution is not ultimately worse than the problem.
 

zemun2

Explorer
Joined
Mar 15, 2017
Messages
72
How do I roll back?

Thanks

Screenshot 2018-01-23 10.01.26.png
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
Reboot and you will go back to 11.0-U4
 

zemun2

Explorer
Joined
Mar 15, 2017
Messages
72
Rebooted 3 times already and no luck. Thought i was missing a step.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I think you may be jumping to conclusions on this.... maybe you are right, maybe not.

There is no way of knowing based on the information available so far. Under 11.0U4 I have had real warnings and "bug warnings" and in both cases I get the same message. Two cases come to mind.

  1. I attached/used and then detached/removed a single volume pool BACKUP01 on /dev/ada3 (using the GUI), and sometime later I notice a warning (in the GUI) can't connect to device /dev/ada3, and that the log was full of these messages.
  2. I also about the pool being above 80% full (it was)-so it was a valid warning-I later noticed the log full of these messages.
The message didn't provide any helpful info as to what the real problem was. If I had not seen the messages in GUI, I wouldn't have had a clue what the warning was about.

The point is that there may be have been a bug introduced in some code (which may or may not be significant) that is causing that error to be issued.

Ideally instead of this:

2018/01/22 19:16:27 [WARN] agent: Check 'service:nas-health' is now warning

the message would be something like

2018/01/22 19:16:27 [WARN] agent: DESCRIPTIVE MESSAGE HERE

or

2018/01/22 19:16:27 [WARN] agent: Check 'service:nas-health' is now warning - ERROR CODE= SOMETHING

In either case it should point back to the condition and/or code responsible for giving rise to the message.

As an aside I did a quick grep of the older logs and I noticed these messages...
Code:
Jan 18 00:00:00 freenas newsyslog[70622]: logfile turned over due to size>200K
Jan 18 00:00:00 freenas syslog-ng[29910]: Configuration reload request received, reloading configuration;
Jan 18 13:14:34 freenas daemon[3691]: ==> Newer Consul version available: 1.0.2 (currently running: 1.0.0)
Jan 19 00:00:00 freenas syslog-ng[29910]: Configuration reload request received, reloading configuration;
Jan 19 16:30:10 freenas daemon[3691]: ==> Newer Consul version available: 1.0.2 (currently running: 1.0.0)
Do they have anything to do with the issue?
No, the configuration has always reloaded at midnight, that is normal for a long time.
The consul version is a new thing but it is only indicating that there is a newer version of the software available and not an issue.
If the GUI is not alerting you to an error, why are you convinced that there is an error?
 

Mamdoh

Dabbler
Joined
Aug 12, 2017
Messages
12
I just applied a proper fix. Full credit goes to @David Steinberger (see link). Steps I followed:

Step one:
Code:
cd /usr/local/etc/consul-checks/
nano freenas_health.sh


Edited contents to (go through the edit line by line and make sure it's correct; simple paste can delete some escape characters, resulting in multiple commands in one line):

Code:
#!/bin/sh																														 
																																	
PATH="${PATH}:/usr/local/bin:/usr/local/sbin"																					 
export PATH																														
																																	
/usr/local/bin/midclt call notifier.get_alerts > /tmp/.alert-health																
if [ $? -ne 0 ] ; then																											
   exit 1																														 
fi																																
																																	
have_alert=0																														
																																	
while read line																													
do																																
  if [ -z $line ] ; then																											
	continue																														
  fi																																
  echo $line | grep -q "^OK"																										
  if [ $? -eq 0 ] ; then																											
	continue																														
  fi																																
  echo "$line"																													
  have_alert=1																													
done < /tmp/.alert-health																										 
rm /tmp/.alert-health																											 
																																	
if [ $have_alert -eq 0 ] ; then																									
  echo "No Alerts"																												
  exit 0																															
else																																
  exit 1																															
fi 


Step two: (not needed if you haven’t edited usr/local/etc/consul.d/freenas.json)
Code:
cd /usr/local/etc/consul.d
nano freenas.json


I changed the the interval back to 120s.

Code:
"interval": "120s"


Step three:
Executed the following in shell:

Code:
service consul stop
rm -rf /var/db/system/consul
service consul start
 
Last edited by a moderator:

picklefish

Explorer
Joined
Mar 13, 2016
Messages
62
Thanks @Mamdoh I will try this in an hour, appreciate the fix.

Edit: Seems like the only thing missing were the lines
Code:
  if [ -z $line ] ; then																											
   continue																														
  fi		


Gone for now, thanks
 
Last edited by a moderator:

zemun2

Explorer
Joined
Mar 15, 2017
Messages
72
How do I save the freenas_health.sh?

Thanks
 
Status
Not open for further replies.
Top