Script to pagein any used swap to prevent kernel crashes

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
I tested hot-swap on my server that I'm commissioning today, and the kernel crashed because of this known issue:
https://forums.freenas.org/index.php?threads/swap-with-9-10.42749/

In response I've written a script:

Code:
# This script is designed to page in used swap on any device that has swap in use on
# a FreeNAS system. You would want to run this periodically to help ensure that swap
# is not in use on a device which may fail at any time, for example, a drive in a raid set


I'm simply running it off a cron job every hour at the moment. If there's no swap in use, it takes no time.

setting up a cron job.png


If you have "Redirect Output" unticked, you'll get an email if there was swap in use.

enabling output.png


This was based on a simpler version of the script in this post, but this version is much better and will actually work reliably.

Here is the script:

Code:
#!/usr/local/bin/perl

# This script is designed to page in used swap on any device that has swap in use on
# a FreeNAS system. You would want to run this periodically to help ensure that swap
# is not in use on a device which may fail at any time, for example, a drive in a raid set
#
# It accomplishes this by swapoff/swapon a given device rather than swapoff all devices,
# the idea being that if swap was actually necessary, then the system will always have access
# to swap, as swap is never fully disabled.
#
# This is in response to an issue that became apparent on FreeNAS 9.10
#	 https://forums.freenas.org/index.php?threads/swap-with-9-10.42749/
#
# The script does nothing if there is no swap in use.
#
# To use, simply run via cron. Or directly when needed.
#
# If debug = 1, and you untick "Redirect Output" in FreeNAS Add Cron Job,
# you'll get an email only when swap was paged in.
#
# This script lives at:
# https://forums.freenas.org/index.php?threads/script-to-pagein-any-used-swap-to-prevent-kernel-crashes.46206/
#
# stux


# VERSION HISTORY
#####################
# 2016-09-21	 Initial Version
# 2016-09-21	 Tightened swapinfo grep to only use /dev/*.eli devices. This prevents a potential
#		issue after hot swapping devices.
# 2017-02-02	Fixed cosmetic issue. Bytes -> KiBs

###############################################################################################
## CONFIGURATION
################

## DEBUG LEVEL
## 0 means no output. 1,2,3,4 provide more verbosity
## 1 output only if paging in. If run from cron, this means you'll only get an email if
##   swap was in use.
## 2 output startup message and at least nothing done message
## 3 more.
## 4 etc.
$debug = 1;

## PAGEIN ALL
## A debug option which is used to verify functionality even if there is no swap in use
## 0 means only pagein active swap devices. You should use 0.
## 1 means pagein all devices.
$pagein_all = 0;


#modify nothing below here
###############################################################
use POSIX qw(strftime);

dprint( 1, "Paging in swap...\n" );

pagein_swap_devices( get_swap_devices());



######### SUBS

# returns a nested list of active swap devices. Each item in the list is an array,
# [0] is the device
# [1] is the KiB used (swapinfo reports 1K-blocks)
sub get_swap_devices
{
	my $swapinfo_output = `swapinfo | grep "/dev/.*\.eli"`;
	chomp $swapinfo_output;
	dprint(1,"swapinfo:\n$swapinfo_output\n");
	
	my @swap_lines = split("\n", $swapinfo_output );
	dprint_list(3,"swap_lines", @swap_lines);	
	

	my @swap_devs = ();
	foreach my $line (@swap_lines)
	{
		dprint(3,"$line\n");
		
		my @vals = split(" ", $line);
		
		# [0] = device
		# [1] = 1K-blocks
		# [2] = Used
		# [3] = Avail
		# [4] = Capacity

		if( $vals[2] > 0 || $pagein_all )
		{
			dprint(2, "Adding $vals[0] with $vals[2]KiB of swap to page in list.\n" );
			push @swap_devs, [$vals[0],$vals[2]];
		}		
	}

	return @swap_devs;
}

#pages in a device
sub pagein_swap_device
{
	my ($device,$used) = @_;

	dprint(0,"Paging in $used KiBs on $device\n");
	
	`swapoff $device`;
	`swapon $device`;

	return;
}

sub pagein_swap_devices
{
	my @devs = @_;

	if( @devs > 0 )
	{
		foreach my $devinfo (@devs)
		{
			my $dev = $devinfo->[0];
			my $used = $devinfo->[1];
				
			#dprint(0, "dev: $dev used: $used\n");
			pagein_swap_device($dev,$used);
		}
	}
	else
	{
		dprint(1,"No swap was paged in as no swap was used on any device\n");
	}

	return;
}


sub build_date_string
{
	my $datestring = strftime "%F %H:%M:%S", localtime;
	
	return $datestring;
}

sub dprint_list
{
	my ( $level,$name,@output) = @_;
		
	if( $debug > $level )
	{
		dprint($level,"$name:\n");

		foreach my $item (@output)
		{
			dprint( $level, " $item\n");
		}
	}

	return;
}

sub dprint
{
	my ( $level,$output) = @_;
	
#	print( "dprintf: debug = $debug, level = $level, output = \"$output\"\n" );
	
	if( $debug > $level )
	{
		my $datestring = build_date_string();
		print "$datestring: $output";
	}

	return;
}

 

Attachments

  • page_in_swap.pl.zip
    1.8 KB · Views: 787

Stux

I just got an email informing me that swap had been paged back in again... due to the script :)

page_in_fired.png


You can see 33M of swap began being used at 7:30pm ish, and at 8pm, page_in_swap was fired and cleared up the problem for the moment... of course, 200k is being used again.
 

MrToddsFriends

Documentation Browser
Joined
Jan 12, 2015
Messages
1,338
count_of_cron_users += 1;

All swap devices are detected properly on my system. Thanks for your effort.
 

Stux

Can someone move this script into Useful Scripts please :)

I think it's very useful, and it should see more use ;)
 

MrToddsFriends

Stux said:
    Can someone move this script into Useful Scripts please :)

    I think it's very useful, and it should see more use ;)

Please check the output of this script for a flaw in units. Compared to the reported Swap Utilisation in the FreeNAS GUI I'm pretty sure that "KiloBytes" in the output would be correct instead of "Bytes".

Besides this minor problem the script reliably anticipates unwanted swap usage after each and every scrub for months now on my system, currently running FreeNAS-9.10.2-U1 (86c7ef5).

Unfortunately there's no progress in bug report #17690. Would it make sense to look at the swap behavior of the FreeNAS 9.10 with FreeBSD 11 / 12 test builds?
 

Stux

MrToddsFriends said:
    Please check the output of this script for a flaw in units. Compared to the reported Swap Utilisation in the FreeNAS GUI I'm pretty sure that "KiloBytes" in the output would be correct instead of "Bytes".

    Besides this minor problem the script reliably anticipates unwanted swap usage after each and every scrub for months now on my system, currently running FreeNAS-9.10.2-U1 (86c7ef5).

Hmmm. That would explain a few things...

Will take a look at this

MrToddsFriends said:
    Unfortunately there's no progress in bug report #17690. Would it make sense to look at the swap behavior of the FreeNAS 9.10 with FreeBSD 11 / 12 test builds?

Sigh
 

Stux

Code:
/dev/da7p1.eli	2097152		0  2097152


Yeah, silly me. The values are in 1K blocks (KiB), not bytes.
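
For anyone reading the numbers by hand, here is a minimal sketch of the unit conversion, assuming swapinfo is printing its default 1K-blocks columns. The helper name parse_swapinfo_line and the sample line are made up for illustration and are not part of the script above.

```perl
use strict;
use warnings;

# Split one swapinfo data line into named fields.
# Columns: Device, 1K-blocks, Used, Avail, Capacity (sizes assumed to be KiB).
sub parse_swapinfo_line
{
	my ($line) = @_;
	my ($device, $blocks, $used, $avail, $capacity) = split(" ", $line);
	return { device => $device, total_kib => $blocks, used_kib => $used };
}

# Hypothetical sample line; the values are invented for this example.
my $info = parse_swapinfo_line("/dev/da7p1.eli 2097152 1384 2095768 0%");

# Used is in KiB, so multiply by 1024 for bytes or divide by 1024 for MiB.
printf "%s: %d KiB used = %d bytes (%.2f MiB) of %d KiB\n",
	$info->{device}, $info->{used_kib}, $info->{used_kib} * 1024,
	$info->{used_kib} / 1024, $info->{total_kib};
```

The script itself only compares the Used column against zero, so the unit only matters for the message it prints.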
 

Stux

Updated the script. It's a cosmetic change only: Bytes -> KiBs.
 

Scharbag

Guru
Joined
Feb 1, 2012
Messages
620
I am having issues with this script no longer paging in the swap in 9.10.2-U2. Not sure why.

To manually execute, I use ./page_in_swap.pl. The output is (when using debug level 3):

2017-04-16 19:15:49: Paging in swap...
2017-04-16 19:15:49: swapinfo:
2017-04-16 19:15:49: No swap was paged in as no swap was used on any device

yet TOP shows me this:

Swap: 4096M Total, 190M Used, 3905M Free, 4% Inuse

Not sure why.

I do use a swap file as per this command (instead of on the drives):

Code:
echo "md99 none swap sw,file=/usr/swap0,late 0 0" >> /etc/fstab && swapon -aL


I am sure this used to work. Any help would be groovy.

Cheers,
 

Stux

@Scharbag, the script uses the following command to get the swap device list
swapinfo | grep "/dev/.*\.eli"

It's specifically looking for the encrypted swap devices on HDs that FreeNAS makes.
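
To see why a file-backed md device is skipped, here is a small sketch that applies the same pattern to some sample lines in Perl rather than through grep. The swapinfo lines, device names and the helper eli_swap_lines are hypothetical and only there for illustration.

```perl
use strict;
use warnings;

# Hypothetical swapinfo output; device names and sizes are examples only.
my @lines = (
	"Device          1K-blocks     Used    Avail Capacity",
	"/dev/da0p1.eli    2097152        0  2097152     0%",
	"/dev/da1p1.eli    2097152    33792  2063360     2%",
	"/dev/md99         4194304   194560  3999744     5%",
);

# Same intent as: swapinfo | grep "/dev/.*\.eli"
# Keeps only lines that name a /dev/ device ending in .eli.
sub eli_swap_lines
{
	return grep { m{/dev/.*\.eli} } @_;
}

my @matched = eli_swap_lines(@lines);
print "$_\n" for @matched;
print scalar(@matched), " of ", scalar(@lines) - 1, " swap devices matched\n";
```

The /dev/md99 line has no .eli suffix, so it never reaches the page-in list even though it has swap in use.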
 

Scharbag

Ahh, that makes sense.

Would you happen to know the syntax to determine the swap files that are created when using this type of swap setup:

Code:
echo "md99 none swap sw,file=/usr/swap0,late 0 0" >> /etc/fstab && swapon -a


I would assume that the following should work:

Code:
my $swapinfo_output = `swapinfo | grep "/dev/md99"`;


I tested this and it seems to work.

Code:
# ./page_in_swap.pl
2017-08-14 20:47:27: Paging in swap...
2017-08-14 20:47:27: swapinfo:
/dev/md99		 4194304	 1384  4192920	 0%
2017-08-14 20:47:27: swap_lines:
2017-08-14 20:47:27:  /dev/md99		 4194304	 1384  4192920	 0%
2017-08-14 20:47:27: /dev/md99		 4194304	 1384  4192920	 0%
2017-08-14 20:47:27: Adding /dev/md99 with 1384KiB of swap to page in list.
2017-08-14 20:47:27: Paging in 1384 KiBs on /dev/md99


Just want to make sure I am not going to cause any major SNAFUs.

Thank you very much for the response!!

Cheers,
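
On the worry about side effects, one cautious variant is to check the exit status of swapoff before calling swapon. The following is only a sketch under that assumption: pagein_checked and its optional runner argument are hypothetical names, not part of the posted script, and the example call is a dry run that executes nothing.

```perl
use strict;
use warnings;

# Run swapoff then swapon for one device, stopping at the first failure.
# $run is a code ref that executes a command list and returns its exit
# status (0 = success); it defaults to Perl's built-in system().
sub pagein_checked
{
	my ($device, $run) = @_;
	$run ||= sub { return system(@_) };

	foreach my $cmd ("swapoff", "swapon")
	{
		my $status = $run->($cmd, $device);
		if ($status != 0)
		{
			warn "$cmd $device failed (status $status)\n";
			return 0;
		}
	}
	return 1;
}

# Dry run with a stand-in runner that only records the commands.
my @log;
my $ok = pagein_checked("/dev/md99", sub { push @log, "@_"; return 0 });
print "would run: $_\n" for @log;
```

Calling pagein_checked with only a device name would run the real commands, so try the dry-run form first. If swapoff fails, the device is left as it was and swapon is not attempted.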
 

Stux

Of course, if your swap device is an SSD or boot device that you don't expect to fail, then it doesn't make much sense to run this script. Just let it swap if it wants to.
 

Scharbag

That is very true... Thanks. The SSD is on the ESXi system and should be reliable.

Cheers,
 

Blackie

Dabbler
Joined
Aug 10, 2017
Messages
14
hi,
Setup is as shown but I get: "Permission Denied"
Tried it manually in a shell and get the same.
I used Putty/SSH to make the file page_in_swap.pl in the root folder. Then I used nano to edit the file and pasted in the information.
Could this have caused a problem?
 

Stux

Have you run "chmod a+x" on the file?
 

Blackie

Thanks for the reply. I have not done that since I don't understand. Very new at this.

Edit: Ok I found chmod a-x removes execute permission for all classes so I am guessing that +x allows execute.

I will go read more and do that to the file.

Thanks
 

LIGISTX

Guru
Joined
Apr 12, 2015
Messages
525

Stux

No,

The assumption is that your boot drive is an SSD and thus more reliable than the pool drives.
 

LIGISTX

Stux said:
    No,

    The assumption is that your boot drive is an SSD and thus more reliable than the pool drives.

I am running an SSD. So, that would make this something I want to employ?
 

Stux

LIGISTX said:
    I am running an SSD. So, that would make this something I want to employ?

If I had moved swap to an SSD boot drive, I would not use this script.

It won’t do any harm though.
 