Bhyve with Ubuntu 19.04 - keeps locking up?

spiceygas

Explorer
Joined
Jul 9, 2020
Messages
63
ok, never mind, it's still happening with this kernel.
Maybe I should try to go back to Ubuntu 20.04 and an earlier kernel
For what it's worth, the problem completely went away for me on Ubuntu 20.04.2 using Linux kernel 5.4.0.050400. I had multiple VMs experiencing lock-ups, and they all immediately resolved with this kernel version.

You can make the change pretty easily in grub.
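
For anyone unsure what "the change in grub" might look like, here is a rough sketch, not a tested recipe. It assumes a stock Ubuntu guest where `/etc/default/grub` contains a `GRUB_DEFAULT=` line and the older kernel is listed under the "Advanced options for Ubuntu" submenu. The menu entry title is a guess at the usual naming and has to be checked against your own `/boot/grub/grub.cfg`; the helper name `set_grub_default` is made up for illustration.

```python
import re

def set_grub_default(config_text: str, entry: str) -> str:
    """Return config_text with GRUB_DEFAULT pointing at `entry`.

    If no uncommented GRUB_DEFAULT line exists, one is appended.
    """
    new_line = f'GRUB_DEFAULT="{entry}"'
    pattern = re.compile(r"^GRUB_DEFAULT=.*$", flags=re.MULTILINE)
    if pattern.search(config_text):
        # A callable replacement avoids backslash interpretation in `entry`.
        return pattern.sub(lambda _m: new_line, config_text, count=1)
    return config_text.rstrip("\n") + "\n" + new_line + "\n"

# Assumed menu path in "submenu title>entry title" form; verify it locally.
EXAMPLE_ENTRY = (
    "Advanced options for Ubuntu>"
    "Ubuntu, with Linux 5.4.0-050400-generic"
)

if __name__ == "__main__":
    sample = "GRUB_DEFAULT=0\nGRUB_TIMEOUT=5\n"
    print(set_grub_default(sample, EXAMPLE_ENTRY))
```

After writing the result back to `/etc/default/grub`, you would still need to run `sudo update-grub` and reboot the VM for the older kernel to be picked.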
 

coolnodje

Explorer
Joined
Jan 29, 2016
Messages
66

I keep having CPU soft lockup issues with Bhyve, even though I'm using one of the latest kernels, 5.19.5.xxx

The issue now comes back about once a day.

I'm really at a loss about what to do to solve this, or even how to start understanding what is causing it. I'm trying to connect with the Bhyve people on IRC but I haven't been very successful.

Any ideas on a strategy to pinpoint the root cause of this?
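
One possible starting point, sketched under assumptions: collect the guest's kernel log and tally which virtual CPUs report soft lockups and for how long, so there is something concrete to compare across kernels and VM settings. The regular expression targets the Linux watchdog message text ("soft lockup - CPU#N stuck for Ns"); the exact prefix varies between kernel versions, and the function name `summarize_lockups` is made up for illustration.

```python
import re
from collections import Counter

# Matches the guest kernel's watchdog report, for example:
# "watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [kworker/2:1:123]"
LOCKUP_RE = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s")

def summarize_lockups(log_lines):
    """Count soft-lockup reports per guest CPU and track the longest stall.

    Returns (reports_per_cpu, longest_stall_seconds).
    """
    per_cpu = Counter()
    longest = 0
    for line in log_lines:
        match = LOCKUP_RE.search(line)
        if match:
            cpu, seconds = int(match.group(1)), int(match.group(2))
            per_cpu[cpu] += 1
            longest = max(longest, seconds)
    return dict(per_cpu), longest

if __name__ == "__main__":
    import sys
    # Reads log text from stdin and prints the summary.
    print(summarize_lockups(sys.stdin))
```

If the guest keeps a persistent journal, something like `journalctl -k -b -1 | python3 lockups.py` (script name assumed) would feed it the kernel messages from the boot that froze.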
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
What's a CPU soft lockup?

I run multiple VMs with Ubuntu - uptime in months unless there are security-relevant updates. All currently have in common:
  • no VGA/VNC device
  • 1 VirtIO network interface
  • 1 VirtIO disk
  • 1 CPU / 2 cores / 1 thread
  • Ubuntu 20.04.4 LTS
  • all updates just as they come, current kernel: linux-generic 5.4.0.132.132

No problems whatsoever. I wouldn't run this in production if there were.
 

spiceygas

Explorer
Joined
Jul 9, 2020
Messages
63
I keep having CPU soft lockup issues with Bhyve, even though I'm using one of the latest kernels, 5.19.5.xxx

The issue now comes back about once a day.

I'm really at a loss about what to do to solve this, or even how to start understanding what is causing it. I'm trying to connect with the Bhyve people on IRC but I haven't been very successful.

Any ideas on a strategy to pinpoint the root cause of this?
This is an underwhelming response, but...

Did you try what I suggested in post #38 earlier in this thread and roll back the kernel version? I was having the exact same problem as you on multiple Linux VMs. After rolling back the kernel version to 5.4.0, all of the VM freezing stopped.

I fully understand why that's not an appealing answer, but it might be worth an experiment just to check if it works. If yes, then you at least know it's something wrong between the kernel and Bhyve. And that gives you something else to report to the "Bhyve people" when you chat with them on IRC.
 
Joined
Jan 27, 2020
Messages
577
What's a CPU soft lockup?

I run multiple VMs with Ubuntu - uptime in months unless there are security-relevant updates. All currently have in common:
  • no VGA/VNC device
  • 1 VirtIO network interface
  • 1 VirtIO disk
  • 1 CPU / 2 cores / 1 thread
  • Ubuntu 20.04.4 LTS
  • all updates just as they come, current kernel: linux-generic 5.4.0.132.132

No problems whatsoever. I wouldn't run this in production if there were.
Hey Patrick, what sort of storage backs the ZVOLs for these Ubuntu VMs? Some kind of flash, I suppose? Single disk? Mirror? Could you elaborate? Thank you!
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Hey Patrick, what sort of storage backs the ZVOLs for these Ubuntu VMs? Some kind of flash, I suppose? Single disk? Mirror?
Mirror of NVMe SSDs. Single mirror of Samsung 970 EVO Plus at home, more spiffy 3x mirror vdev of Intel DC P4510 at work.
 

jixam

Dabbler
Joined
May 1, 2015
Messages
47
I know this will not help everyone but I fixed my lockups by avoiding CPU oversubscription. The configuration was changed from 14 vCPU with 1 core to 2 vCPU with 7 cores (physical hardware is 2 CPU with 8 cores).

Otherwise using Zvol storage, virtio disk, virtio net and VNC enabled. Experimenting with these did not solve the issue.

Basic Ubuntu 22.04 LTS server install in the VM, no kernel change.
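
A small sketch of how one might sanity-check a guest CPU layout against the host before booting the VM. Both layouts in the post add up to 14 virtual CPUs, so this check looks at the socket count as well as the total. Whether the socket layout is what actually matters is an assumption, not something established in this thread; the class and function names are made up, and the host is assumed to have no hyperthreading because the post does not say.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Topology:
    sockets: int
    cores: int        # cores per socket
    threads: int = 1  # threads per core

    @property
    def total(self) -> int:
        """Total logical CPUs implied by this layout."""
        return self.sockets * self.cores * self.threads

def check_guest(guest: Topology, host: Topology) -> list:
    """Return warnings; an empty list means no obvious mismatch."""
    warnings = []
    if guest.total > host.total:
        warnings.append(
            f"guest has {guest.total} vCPUs but host has {host.total} logical CPUs"
        )
    if guest.sockets > host.sockets:
        warnings.append(
            f"guest presents {guest.sockets} sockets but host has {host.sockets}"
        )
    return warnings

if __name__ == "__main__":
    host = Topology(sockets=2, cores=8)     # "2 CPU with 8 cores"
    before = Topology(sockets=14, cores=1)  # "14 vCPU with 1 core"
    after = Topology(sockets=2, cores=7)    # "2 vCPU with 7 cores"
    print("before:", check_guest(before, host))
    print("after:", check_guest(after, host))
```

With the numbers from the post, only the "before" layout triggers a warning, and it is the socket-count one.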
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
I know this will not help everyone but I fixed my lockups by avoiding CPU oversubscription. The configuration was changed from 14 vCPU with 1 core to 2 vCPU with 7 cores (physical hardware is 2 CPU with 8 cores).
OMG! I don't use more than 4 cores for VMs ... that would never have occurred to me. Thanks for reporting back.
 

FosCo

Dabbler
Joined
Sep 20, 2020
Messages
23
This still seems to be a current issue; I'm experiencing it with kernel 5.19.
As many others report, the Ubuntu Docker VM keeps freezing up on my Atom C3758, with exactly 8 cores and 1 thread provisioned (the CPU is getting bored with the current load).

So overprovisioning is probably not the cause, but the Ubuntu VM keeps locking up every few days.

Does anybody have some new info to spread about this?

Edit: now trying virtio with VNC disabled, as suggested in this thread. Crossing fingers.
 

FosCo

Dabbler
Joined
Sep 20, 2020
Messages
23
Lasted about 2 days, keeps locking up :(
 

apwiggins

Dabbler
Joined
Dec 23, 2016
Messages
41
Haven't used bhyve for a couple of years. Switching video to use serial port (link below), disabling vnc and switching VM's disks/video/etc to use virtio was helpful at the time.

Eventually, I switched to Proxmox for virtualization and made TrueNAS a VM under Proxmox. BSD/bhyve was unstable compared to Linux/KVM under Proxmox in my experience. With Proxmox, I did physical disk passthrough and TrueNAS imported the physical disks and ZFS volumes.
 

FosCo

Dabbler
Joined
Sep 20, 2020
Messages
23
Tried the 6.2 kernel with Ubuntu Lunar and no lockup for 3 days so far - crossing fingers!

Edit: Locked up again :(
 
Joined
Apr 12, 2023
Messages
3
Tried the 6.2 kernel with Ubuntu Lunar and no lockup for 3 days so far - crossing fingers!

Edit: Locked up again :(
Could you apply and try this patch D39620 for vmm?

BTW, what are the values of the sysctls 'hw.vmm.vmx.cap.posted_interrupts' and 'hw.vmm.vmx.cap.virtual_interrupt_delivery' on your hosts?
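
For reporting those two values, a minimal sketch that shells out to `sysctl` on the host and parses its `name: value` output. It is only meaningful on the FreeBSD-based host with the vmm module loaded; the function names are made up for illustration, and the sketch makes no claim about which values are "good".

```python
import subprocess

VMX_CAPS = (
    "hw.vmm.vmx.cap.posted_interrupts",
    "hw.vmm.vmx.cap.virtual_interrupt_delivery",
)

def parse_sysctl(output: str) -> dict:
    """Parse 'name: value' lines as printed by `sysctl name ...`."""
    values = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            values[name.strip()] = value.strip()
    return values

def read_vmx_caps() -> dict:
    """Query the running host; raises CalledProcessError if sysctl fails."""
    result = subprocess.run(
        ["sysctl", *VMX_CAPS],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_sysctl(result.stdout)

if __name__ == "__main__":
    for name, value in read_vmx_caps().items():
        print(f"{name} = {value}")
```

Running plain `sysctl hw.vmm.vmx.cap` in the TrueNAS shell should show the same information without any script.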
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
@gusev.vitaliy, most TrueNAS users don't have the capability of patching their kernels. Please register with iX's Jira, and submit a bug report, so this patch can be incorporated into the latest nightlies for testing.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Oh Good. I was seeing this too.

Reducing contention helped, but was not ideal.

I thought things improved a bit when I switched from the generic to aws kernels.

BUT that patch actually looks like a bug and fix.
 