TrueNAS 12 Bhyve, Pool, or Jail causing panic on boot (most of the time)

dfalke

Dabbler
Joined
Mar 12, 2021
Messages
31
I built a TrueNAS server for our non-profit; see the build details below. At some point the server started having trouble booting, and I now think it might be related to adding a VM or jail. Most of the time the kernel panics on boot after being unable to start "ntpd" (see screenshots). I replaced the power supply thinking it might be a power issue, but the issue persists. I turned off autostart for the jail the last time the server booted successfully, so I am not sure it is jail related, although I had several issues where creating or starting the jail would cause a panic.

I need help troubleshooting a few things...
1) How do I turn off autostart for my VM (bhyve) from single-user mode in the boot options? At the moment I haven't been lucky enough to get a good boot today.
2) How can I turn on capturing of the core dumps or the panic error? The screen refreshes almost too fast to read (see screenshots below).

FYI - I got two USB drives, created new install media with 12.0-U2.1 on one, and used the other as the boot device. When I reinstalled TrueNAS it worked fine; I rebooted about 20 times with no issue. When I restored my configuration, the issue came back. So it is either a VM, jail, or pool issue, and probably not hardware at this point, since the new install worked fine.

TrueNAS version: TrueNAS-12.0-U2.1 (the issue was replicated on most, if not all, 12.x versions, and on 11.x)

SuperMicro X9SCL+/X9SCM
E3-1230 V2 @ 3.30 GHz
32GB RAM
PNY 128/256 - Boot Device
6x 6TB IronWolf - Storage
 

Attachments

  • screen1.PNG (3.2 MB)
  • screen2.PNG (3.9 MB)
  • screen3.PNG (3.7 MB)

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
1) How do I turn off autostart for my VM (bhyve) from single-user mode in the boot options? At the moment I haven't been lucky enough to get a good boot today.

Run midclt call vm.update <VM ID> '{"autostart": false}', where <VM ID> is the integer ID associated with the VM. You can get this via midclt call vm.query | jq.

To stop the VM if running, run midclt call vm.stop <VM ID>.
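If there are several VMs, it may help to generate the commands first and review them before running anything. The following is only a sketch: the function name print_autostart_off is made up for illustration, and it assumes vm.update takes the VM ID as its first positional argument followed by a JSON object of the fields to change. The id, name, and autostart field names in the jq filter are likewise assumed from typical vm.query output.

```shell
# Dry-run helper (hypothetical name): prints, but does not run, one
# "midclt call vm.update" command per VM ID passed as an argument.
# Assumption: vm.update takes the ID first, then a JSON object of changes.
print_autostart_off() {
    for id in "$@"; do
        printf "midclt call vm.update %s '{\"autostart\": false}'\n" "$id"
    done
}

# With the middleware running, list the IDs first (field names assumed):
#   midclt call vm.query | jq '.[] | {id, name, autostart}'
# then print the commands, check them, and paste the ones you want:
#   print_autostart_off 1 2
```

Printing instead of executing keeps the helper harmless if an ID is mistyped. Note that it still needs a running middleware daemon when the printed commands are actually executed.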

2) How can I turn on capturing of the core dumps or the panic error? The screen refreshes almost too fast to read (see screenshots below).

Core dumps are in /var/db/system/cores. You can also get the console logs at /var/log/console.log*.
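Console logs can be long, so once the files are reachable it may be quicker to pull out just the lines around each panic. A minimal sketch, assuming the panic message contains the word "panic" (case-insensitive); the function name show_panics and the 20-line context window are arbitrary choices, not anything TrueNAS provides.

```shell
# show_panics FILE...: print every line mentioning "panic" together with the
# 20 lines that follow it, prefixed with line numbers.
show_panics() {
    grep -i -n -A 20 'panic' "$@"
}

# Example use, with the log path mentioned above:
#   show_panics /var/log/console.log
# Newest core files first:
#   ls -lt /var/db/system/cores
```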
 

dfalke

Thank you Samuel, let me pull some more information from the system and see if I can see what is causing it. I will post additional info back later today.
 

dfalke

Here is what I get when I load into single user mode from boot options...

1) "midclt call vm.query | jq"

Failed to run middleware call. Daemon not running? - How do I start the Daemon?

2) There doesn't seem to be a /var/log directory, any thoughts on where the console.log is? There is also no cores directory under /var/db/system...

Please advise, sorry if my questions are newbieish....
 

Samuel Tai


dfalke

I get middlewared does not exist in /etc/rc.d or the local startup directories (/etc/ix.rc.d /usr/local/etc/rc.d), or is not executable

For the logs, the directory is /var/db/system/samba4/private

No files under system, samba4, or private.... nothing with syslog...
 

Samuel Tai

OK, I had to try a single-user boot myself to see some of your difficulties. I've not yet figured out how to start the middleware, but I have figured out how to mount the logs.
  1. After booting into single-user, you'll need to prepare the boot pool to host additional mount points. The root file system is mounted read-only, so you'll have to change it to read-write to have successful mounts afterwards:

    zfs set readonly=off /

  2. At this point, you can mount your data pool. Assuming your pool is named tank, run: zpool import -f -R /mnt tank.
  3. From the behavior described above, you've set your system dataset to reside in your data pool. Run the following to mount all the parts of the system dataset:

    Code:
    mount -t zfs tank/.system /var/db/system
    mount -t zfs tank/.system/cores /var/db/system/cores
    mount -t zfs tank/.system/samba4 /var/db/system/samba4
    mount -t zfs tank/.system/syslog-<system-specific string> /var/db/system/syslog-<system-specific string>
    mount -t zfs tank/.system/configs-<system-specific string> /var/db/system/configs-<system-specific string>
    mount -t zfs tank/.system/rrd-<system-specific string> /var/db/system/rrd-<system-specific string>
    mount -t zfs tank/.system/webui /var/db/system/webui
    mount -t zfs tank/.system/services /var/db/system/services
    
Now you should be able to view the logs under /var/db/system/syslog-<system-specific string>/log.
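The system-specific strings in the dataset names are awkward to type at a single-user prompt. One way around that, sketched here under the assumption that the pool is named tank as in the example above, is to let zfs list supply the names and generate the mount commands from them. The function name print_system_mounts is hypothetical, and it only prints commands so they can be checked before being run.

```shell
# print_system_mounts POOL: read dataset names (one per line) on stdin and
# print a "mount -t zfs" command for POOL/.system and each dataset below it,
# mapped under /var/db/system. Other datasets are ignored.
print_system_mounts() {
    pool=$1
    prefix="$pool/.system/"
    while IFS= read -r ds; do
        case $ds in
            "$pool/.system")
                printf 'mount -t zfs %s /var/db/system\n' "$ds" ;;
            "$prefix"*)
                rel=${ds#"$prefix"}
                printf 'mount -t zfs %s /var/db/system/%s\n' "$ds" "$rel" ;;
        esac
    done
}

# After importing the pool:
#   zfs list -H -o name -r tank/.system | print_system_mounts tank
```

Because zfs list sorts by name, the parent dataset comes out first, which is the order the mounts need to happen in.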
 

dfalke

First, thank you so much for your help....

When I run the first command "zfs set readonly=off /"

cannot open '/' : leading slash in name
 

Samuel Tai

OK, try zfs set readonly=off freenas-boot/ROOT/12.0-U2.1.
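If the boot environment is not named 12.0-U2.1, the dataset mounted on / can be read from the output of mount rather than guessed. A small sketch, assuming the usual "<dataset> on <mountpoint> (...)" line format; the function name root_dataset is made up for illustration.

```shell
# root_dataset: read `mount` output on stdin and print the name of whatever
# is mounted on "/".
root_dataset() {
    awk '$2 == "on" && $3 == "/" { print $1; exit }'
}

# Example use:
#   mount | root_dataset
#   zfs set readonly=off "$(mount | root_dataset)"
```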
 

dfalke

I was able to get the following command to work...

zpool status

The name of the pool is "freenas-boot"

Then I used "zfs set readonly=off freenas-boot"

That seemed to work maybe....
 

dfalke

Then I was going to import -f <main pool name>

how do I find out the name of the main boot pool? on boot, it might list it, will reboot and see if I see it...
 

dfalke

When I rebooted, it listed RGC_STORAGE as my boot pool, but when I run the import command you provided (and another one I tried), I get the following...

cannot import 'RGC_STORAGE' : no such pool available

any thoughts on how I can confirm my pool name?
 

Samuel Tai

zpool import just by itself should show the names of candidate pools to import.
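The listing from zpool import is fairly verbose. If only the names are wanted, they can be filtered out, assuming each candidate is introduced by a "pool: <name>" line as in the usual output. The function name importable_pools is hypothetical.

```shell
# importable_pools: read `zpool import` output on stdin and print just the
# pool names, one per line.
importable_pools() {
    awk '$1 == "pool:" { print $2 }'
}

# Example use:
#   zpool import | importable_pools
```

If this prints nothing, the disks holding the pool are probably not visible to the system at that point, which is worth checking before retrying the import by name.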
 

Samuel Tai

After looking over your screen shots, I suspect your boot thumb drive has become corrupted. Instead of trying to chase down the exact fault, it may be quicker to just reinstall to a new boot disk. Note, thumb drives are no longer recommended as boot media; a small SSD is now best practice. After you've achieved a working installation, import your pool from the GUI, and then restore your config.
 

dfalke

The original issue occurred on my mirrored PNY 256GB SSDs. Thinking they could be corrupted, I grabbed a USB drive and installed a fresh copy of TrueNAS on it to troubleshoot. That worked fine until I imported the configuration, and then the issue returned.

What do you suggest to troubleshoot the issue? I think it could be related to the VM or jail I have.
 

Samuel Tai

If your saved config is corrupt or truncated, that could explain what's going on. After installation, you may want to manually mount the tank/.system/configs-<system-specific string> dataset while in single-user. In that directory are the daily saved configuration databases. Try picking a backup that's earlier than when you created the VM or the jail, and copy it to /data/freenas-v1.db. Then reboot to see if you're back in business.
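Picking the right backup by eye is error-prone when there are many daily files. The sketch below assumes the backups are named by date as YYYYMMDD.db somewhere under the mounted configs directory; that naming is an assumption and should be checked with ls first. The function name pick_backup is hypothetical. It prints the newest backup dated strictly before a cutoff and leaves the copy as a separate manual step.

```shell
# pick_backup DIR CUTOFF: print the path of the newest YYYYMMDD.db file under
# DIR whose date is earlier than CUTOFF (also YYYYMMDD). Prints nothing if
# no such file exists. Assumes date-named backups (unverified).
pick_backup() {
    dir=$1
    cutoff=$2
    find "$dir" -type f -name '*.db' | while IFS= read -r f; do
        d=$(basename "$f" .db)
        case $d in
            [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
            *) continue ;;   # skip files that are not named by date
        esac
        if [ "$d" -lt "$cutoff" ]; then
            printf '%s %s\n' "$d" "$f"
        fi
    done | sort -n | tail -n 1 | cut -d ' ' -f 2-
}

# Example use, keeping a copy of the current database first:
#   f=$(pick_backup "/var/db/system/configs-<system-specific string>" 20210301)
#   cp /data/freenas-v1.db /data/freenas-v1.db.bak
#   cp "$f" /data/freenas-v1.db
```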
 

dfalke

I reinstalled TrueNAS on a USB thumb drive (I know, not recommended), but for troubleshooting I think it will be helpful. I installed the latest version of TrueNAS and imported my pool using the GUI successfully. I restarted a number of times and everything works well with the pool, with no panics, so I don't think the pool is the issue. I have my one jail turned off and my one VM turned off. I re-created the VM and re-attached its existing virtual disk, and as soon as I start the VM I get the same panic as before. The jail seems to start fine without a panic.

So the issue is with the VM? I'm looking for help with a process to troubleshoot: how do we get the error message in text or dump files?

The VM is a Windows server, so I do need to rescue it or get it running. I'm looking for advice on how to troubleshoot the panic with the VM.
 

dfalke

When I start the VM I am able to see the Windows loading screen in VNC, and then shortly after I get the following error message. It took a little while to reboot, so I think it core dumped.
IMG_5897.jpg
 

dfalke

OK, one additional plot twist: I was creating an SMB share to get the core dumps off the server, and when I started the SMB service I got a very similar panic, so the issue doesn't seem to be exclusive to the VM. Do I have BIOS settings incorrect on the SuperMicro?

Please help me understand the cause of the panic, my hardware is below...

SuperMicro X9SCL+/X9SCM
Xeon E3-1230 V2 @ 3.30 GHz
32GB RAM (SuperTalent 8GB W1333EB8GM DDR3-1333 PC3-10600 ECC)
PNY 256GB - CS900 - Boot Device
6x 6TB IronWolf - Storage
HBA Card SAS9207-8i
 

dfalke

Attached are the files from the /data/crash directory for the latest panic, from when I started the SMB service.
 

Attachments

  • config.txt (4 KB)
  • ddb.txt (688 KB)
  • msgbuf.txt (74 KB)
  • panic.txt (10 bytes)
  • version.txt (50 bytes)