SOLVED VM running very slow, lots of swap_pager_getswapspace(##): failed messages

Joined
Mar 5, 2022
Messages
224
I have an Ubuntu VM that occasionally runs very slowly. It is set up with 4 CPUs, 4 threads each. When this happens, I get swap_pager_getswapspace(##): failed errors on the console (MANY of them!). (Why don't I get these messages in the alerts section?)

This is from top -> o -> res while this is going on (note that bhyve is going nuts!):
[screenshot: swap.jpg]



Just killed the VM from the TrueNAS GUI and waited a few minutes:

[screenshot: swap2.jpg]


Still getting tons of swap_pager_getswapspace(##): failed messages. Is there an issue with bhyve?
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
We're going to need a bit more information on your system config, and the config of your VM. I do note that 91% of your swap space appears to be in use. So you're off into the realm of extreme memory starvation. The question is why?
 
We're going to need a bit more information on your system config, and the config of your VM. I do note that 91% of your swap space appears to be in use. So you're off into the realm of extreme memory starvation. The question is why?
What info do you need (and what am I missing from my signature)?
 
This is what top -> o -> res looks like after a reboot:
[screenshot: 1649469091211.png]
 

rvassar

What info do you need (and what am I missing from my signature)?

That's pretty complete. I'm thinking we need a bhyve expert to look at this. The bhyve process doesn't seem to be consuming the memory, yet the swap is clearly in use at critical threshold levels, and is freed by a reboot. There's some kind of bug lurking here...
 
That's pretty complete. I'm thinking we need a bhyve expert to look at this. The bhyve process doesn't seem to be consuming the memory, yet the swap is clearly in use at critical threshold levels, and is freed by a reboot. There's some kind of bug lurking here...
Just messaged you (question on your signature)
 
Just started to interact with the VM:
[screenshot: swap4.jpg]


Note that bhyve is still going crazy, but I am not getting any errors on the console.
 

rvassar

You're not chewing through swap space yet. That bottom line:

Swap: XXXX Total, XXXX Free

Treat it like a credit card balance. If it's unused, you're good. If it's maxed out, you're in trouble.
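As a rough illustration of that "balance" (the 4096M/368M figures below are assumptions chosen to match the ~91% usage noted earlier in the thread; the real values were only in a screenshot):

```python
# Sketch: estimate swap utilization from the Total/Free numbers on
# top's "Swap:" line. The inputs are illustrative, not the OP's data.
def swap_used_percent(total_mb: float, free_mb: float) -> float:
    """Percent of swap in use -- the 'credit card balance'."""
    return 100.0 * (total_mb - free_mb) / total_mb

print(round(swap_used_percent(4096, 368)))  # ~91% used: deep memory starvation
```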
 
Forgot to mention that when I boot, on the console screen, I get:
Code:
Solaris: WARNING: ignoring tunable zfs_arc_max (using 30064771072 instead)


At one time I had played around with zfs_arc_max, but never got anywhere, and everyone suggested removing it and letting the system tune itself. I did remove it:
[screenshot: 1649473254379.png]


But I continue to get the warning on boot. Is this a clue?
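One possible explanation, as a guess: the old value may still be set in one of the usual FreeBSD tunable locations rather than (or in addition to) the GUI tunables list. A lingering entry would look something like this (the paths are the stock FreeBSD ones, and the value shown is only a placeholder, not the OP's setting; note that on TrueNAS these files are normally managed from the GUI's tunables database):

```
# /boot/loader.conf -- a leftover boot-time setting would look like:
vfs.zfs.arc_max="12884901888"

# /etc/sysctl.conf -- or a leftover runtime setting like:
vfs.zfs.arc_max=12884901888
```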
 
I just shut the VM down from the TrueNAS GUI and this time, for whatever reason, the errors stopped filling my console. Of course, I am not running the VM, because it seems to definitely be the problem.

Any suggestions would be most welcome!
 
I am unable to start my VM (I have not tried rebooting TrueNAS yet). When I try to start the VM, I get the dreaded "ERROR: Not Enough Memory" error. If I ignore this, I get "libvirtError internal error: client socket is closed" and the VM does not start.
 

rvassar

There's some kind of memory contention problem here that is preventing your 32 GB system from allocating 4 GB for the VM. The warning about the zfs_arc_max tunable probably shouldn't be appearing, but it also should be treated as a limit, not a "go grab all this memory and lock it up" parameter. I would make an effort to figure out whether that tunable is truly back to the default and not some value that is forcing the system to guess incorrectly what it should be.

Failing that, I would probably set up a memory budget for all my jails and VMs, and set that tunable to something reasonable. It looks like the system is picking roughly 30 GB (30064771072 bytes) as the limit. Maybe set it to 25769803776 to limit the ARC to 24 GiB, then watch your ARC stats to make sure you haven't impacted ZFS filesystem performance.

But understand I'm just making an educated guess here. I fiddled with bhyve a few years back and decided to stick with ESXi for things that need full virtualization. The BSD jails work well, but bhyve itself isn't particularly robust when compared to virtualization solutions like ESXi, Linux KVM, Windows Hyper-V, etc.
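For what it's worth, the byte counts quoted above are exact GiB multiples (the auto-picked value is exactly 28 GiB, which is roughly 30 GB in decimal units). A quick arithmetic sanity check, using only the values from the post:

```python
# Sanity-check the ARC limit values quoted above (pure arithmetic).
GIB = 1024 ** 3  # bytes per GiB

auto_limit = 30064771072   # the value the system picked on its own
suggested = 25769803776    # the suggested lower cap

print(auto_limit // GIB)       # 28 -> the auto-picked limit is exactly 28 GiB
print(suggested // GIB)        # 24 -> the suggested cap is exactly 24 GiB
print(24 * GIB == suggested)   # True
```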
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I have an Ubuntu VM that occasionally runs very slowly. It is set up with 4 CPUs and 4 threads
I only poked at bhyve a few times (I prefer external hypervisors), but do you mean to say that you've set a config like below:

Code:
hw.vmm.topology.cores_per_package: 4
hw.vmm.topology.threads_per_core: 4


Because that's going to try to emulate a CPU that 1) doesn't exist and 2) will be trying to use sixteen threads (4 cores, 4 threads per core), which will heavily starve out your physical CPU.

Try setting it to 2+2 (two cores with HT) if you're going for four threads in-guest.
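In the same tunable notation quoted above, the suggested 2+2 layout would read as follows (a sketch only; on TrueNAS the topology is normally set through the VM's CPU settings in the GUI rather than by hand):

```
hw.vmm.topology.cores_per_package: 2
hw.vmm.topology.threads_per_core: 2
```

Two cores with two threads each still gives the guest four threads, without asking bhyve to emulate a 16-thread package.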
 
Try setting it to 2+2 (two cores with HT) if you're going for four threads in-guest.
Done and restarted the VM. Seems to be running OK so far (no performance hit on the VM anyway).

On another note, TrueNAS unexpectedly rebooted this morning... I looked at the crash logs and could not really understand it all, but I saw this, which gave me pause:
Code:
KDB: enter: panic


I can post the crash files, but am not sure if there is any sensitive data in them (suggestions?)
 

rvassar

That's a kernel panic. The interesting bits are probably a few lines above that point. I would not attempt to post the crash dump files for the reasons you've stated, but the logs should be OK to share. Was the VM running when you hit this?
 
The VM was probably running (if it was, it was just on and not really doing any work). Which logs should I post?
 
This is driving me crazy. It is so bad that it shuts down my VM and I am unable to start it without restarting the entire NAS...
 
So, FWIW, I upgraded to TrueNAS-12.0-U8.1 (I had to re-point my BIOS to my mirrored SSD drives - that gave me a bit of a fright) and I am not seeing these messages any more (yay!). I am going to close this thread and hope that everyone else experiencing this problem will be able to benefit from this.
 