VM Frequently Crashing

nhabbott

Dabbler
Joined
Jan 23, 2023
Messages
10
I am running DW Spectrum VMS in a VM on TrueNAS Core. Seemingly at random, the VM either has all cores jump to 100% utilization and then crashes, or crashes with no warning at all. I can't get it to stay online for longer than an hour. While this happens, the TrueNAS dashboard and OS remain fully responsive. The only fix is a hard reboot of the VM.

So far I have tried switching the NIC and disk drivers to VirtIO, changing the VM specs so it isn't allocated more resources than it needs, and moving the VM to its own pool of two SSDs in a mirror. Ubuntu's /var/crash sometimes contains a log file stating that one of the CPU cores was stuck for X milliseconds. To rule out hardware, I reproduced the issue on two different machines with identical specs. I have seen others report similar VM stability issues, but none of the fixes I found worked. At this point, I don't know what else to check.

Hardware
TrueNAS Core Version: 13.0-U3
CPU: 1 Intel Xeon Silver 4216
RAM: 64GB DDR4 ECC
Motherboard: Supermicro X11SPL-F
HDDs: 14 WDC WD181PURP-86 (Video Pool)
SSDs: 2 PNY CS900 (VM Pool), 2 HDSTOR HSAV25ST250AX (Boot Pool)
HBA: 2 LSI 9305

Pools
Video_Pool: 2 RAIDZ2 vDevs, each consisting of 7 18TB HDDs
VM_Pool: Mirror consisting of 2 250GB SSDs.

OS: Ubuntu 22.04.5
vCPUs: 1
Cores: 4
Threads: 2
RAM: 20 GiB
Devices (Order): NIC (1002), Disk - Boot (1000), Disk - Video Pool (1001), VNC (1009)
 

nhabbott

Dabbler
Joined
Jan 23, 2023
Messages
10
Here is a screenshot of one of the crash logs
Crash Log (012323-2113).png
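For anyone who would rather search the text logs than read a screenshot, here is a minimal sketch that pulls the CPU numbers out of stuck-CPU messages. It assumes the lines resemble the kernel's soft-lockup wording (`CPU#N stuck for ...`). The sample lines and the function name `stuck_cpus` are made up for illustration and are not taken from the log above.

```python
import re
from collections import Counter

# Assumed message shape: "... soft lockup - CPU#3 stuck for 22s! ..."
STUCK_RE = re.compile(r"CPU#(\d+) stuck for (\d+)\s*(ms|s)")

def stuck_cpus(lines):
    """Count how many stuck-CPU messages mention each CPU number."""
    counts = Counter()
    for line in lines:
        match = STUCK_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts

# Hypothetical sample lines, not copied from the real crash log.
sample = [
    "watchdog: BUG: soft lockup - CPU#9 stuck for 22s! [example:1234]",
    "watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [example:1234]",
    "unrelated kernel message",
]

if __name__ == "__main__":
    for cpu, hits in sorted(stuck_cpus(sample).items()):
        print(f"CPU#{cpu}: {hits} stuck message(s)")
```

If the same CPU number keeps turning up, that is worth comparing against the number of CPUs the VM is configured with.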
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,702
Interesting that the VM thinks it has 9 (or maybe 10) CPUs...

According to your post, it should only have 8, so indeed CPU 9 or 10 will respond slowly.

I would suggest running with only 2 cores, 2 threads and see if you get stability.
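The arithmetic behind the "should only have 8" figure, as a rough sketch: the guest should see vCPUs x cores x threads logical CPUs. The helper name `expected_logical_cpus` is made up here; `os.cpu_count()` is the standard-library call and would need to be run inside the Ubuntu guest to be meaningful.

```python
import os

def expected_logical_cpus(vcpus, cores, threads):
    """Logical CPUs the guest should see for a given VM topology."""
    return vcpus * cores * threads

if __name__ == "__main__":
    # Settings from the first post: 1 vCPU, 4 cores, 2 threads.
    expected = expected_logical_cpus(vcpus=1, cores=4, threads=2)
    seen = os.cpu_count()  # what the OS running this script reports
    print(f"expected {expected}, guest reports {seen}")
    if seen != expected:
        print("mismatch: the guest sees a different CPU count than configured")
```

With the suggested 2 cores and 2 threads on 1 vCPU, the same calculation gives 4.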
 

nhabbott

Dabbler
Joined
Jan 23, 2023
Messages
10
So, I get noticeably better performance, but the VMS maxes out the CPU. I also tried 6 cores with 1 thread, and that maxes out the CPU too. It is no longer crashing, but the performance graphs are all over the place. Nothing is steady unless it's maxed out.

Does TrueNAS having an IP on the same adapter that the VM is using cause any issues that you know of?

This is another issue (forgive me for hijacking my own thread), but I believe it may be part of my problem. I was trying to change IPs in the TrueNAS dashboard, but it doesn't take the changes. For example, if I set a static IP and disable DHCP, it accepts the static IP. However, it also re-enables DHCP and somehow ends up with both the static IP and a DHCP IP, while the dashboard only shows the adapter as being set up for DHCP.
 