Cannot boot or edit VM configuration after upgrade to 13.0-U2

cb88

Dabbler
Joined
May 11, 2022
Messages
24
Code:
root@truenas[~]# midclt call vm.start 2
'pfSenese'
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 139, in call_method
    result = await self.middleware._call(message['method'], serviceobj, methodobj, params, app=self)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1246, in _call
    return await self.run_in_executor(prepared_call.executor, methodobj, *prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1151, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 979, in nf
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/vm.py", line 1598, in start
    self.vms[vm['name']].start(vm_data=vm)
KeyError: 'pfSenese'
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
OK, I see the problem. There's a non-printing character at the end of the VM name. This is confusing the middleware in 13. Reboot back to 12, and rename VM 2 without the non-printing character.

The error report in post 21 shows the KeyError: 'pfSenese ' <- note extra space between the end of the VM name and the single quote.
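
One way to settle whether a name really ends in a space or a non-printing character is to inspect its code points rather than trusting how a terminal or forum renders it. Below is a minimal Python sketch; the helper name `suspicious_chars` is made up for illustration and is not part of TrueNAS.

```python
import unicodedata


def suspicious_chars(name):
    """Return (index, description) for each whitespace or non-printing character in name."""
    hits = []
    for i, ch in enumerate(name):
        if ch.isspace() or not ch.isprintable():
            # Fall back to the code point when the character has no Unicode name.
            desc = unicodedata.name(ch, f"U+{ord(ch):04X}")
            hits.append((i, desc))
    return hits


if __name__ == "__main__":
    # Hypothetical candidates: a clean name, a trailing space, a zero-width space.
    for candidate in ("pfSenese", "pfSenese ", "pfSenese\u200b"):
        print(repr(candidate), suspicious_chars(candidate))
```

An empty list means the name contains nothing but visible characters.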
 

cb88

OK, let's give this a shot... I did rename the VM in 12 earlier, but maybe I retained that character.
 

cb88

Actually, there is no extra space; look back at post 21... your eyes are playing tricks on you. Apparently, instead of copying the line from post 21, you retyped it in your post with the imaginary space. Anyway, I already retyped the VM name and did a fresh update from 12.0-U8.1 to 13.0-U2, but I fear that was pointless.
 

cb88

Yeah, definitely no extra space. I rebooted into 13.0-U2 and the problem persists, and no extra space is shown anywhere in 12 or 13.
 

Juan Manuel Palacios

Contributor
Joined
May 29, 2017
Messages
146
I also use tunables to load the vmm kernel module (vmm_load) and to pass my NIC PCI devices through to the VM. Everything worked out of the box when upgrading to TrueNAS 13; the PCI addresses didn't change.
 

Samuel Tai

OK, I apologize for the misidentification. There's something funny going on with that VM's entry in the configuration database. The errors KeyError and Unable to locate domain indicate a problem in the configuration database file /data/freenas-v1.db table vm_vm. Can you run sqlite3 /data/freenas-v1.db, and then SELECT * from vm_vm;, and report the output?
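
The same query can also be run from Python, which makes hidden characters easier to spot because `repr()` shows them explicitly. This is only a sketch: the database path and table name come from the post above, the function name `dump_vm_rows` is invented here, and the database is opened read-only so nothing can be changed by accident.

```python
import sqlite3


def dump_vm_rows(db_path="/data/freenas-v1.db"):
    """Return every row of the vm_vm table as a list of dicts."""
    # mode=ro opens the file read-only, so this cannot modify the config DB.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    conn.row_factory = sqlite3.Row
    try:
        rows = conn.execute("SELECT * FROM vm_vm").fetchall()
        return [dict(row) for row in rows]
    finally:
        conn.close()


def printable_rows(rows):
    """Map each value through repr() so trailing spaces and control characters are visible."""
    return [{key: repr(value) for key, value in row.items()} for row in rows]
```

Calling `printable_rows(dump_vm_rows())` on the TrueNAS host would list each VM row with its values quoted.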
 

cb88

1662476737792.png


Unfortunately, they closed the Jira ticket without resolving it. Note that the VM boots after a reboot once the PCIe devices are removed, but if I try to add the PCIe passthrough devices back, it fails with:
File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/vm.py", line 2228, in get_iommu_type
raise CallError(f'Failed to check support for iommu ({key}): {sp.stderr.decode()}')
middlewared.service_exception.CallError: [EFAULT] Failed to check support for iommu (VT-d): acpidump: DSDT is corrupt

Honestly, I'm not sure why ACPI or the DSDT is relied on for this information, as it is wrong in many BIOS implementations. It should rather probe all the information it can. The kernel itself doesn't rely on ACPI to determine this, AFAIK.
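
To make the failure mode concrete, here is a rough Python sketch of the kind of check the traceback suggests: run `acpidump`, and if the command itself fails, give up before any table is inspected. The command arguments, the DMAR pattern, and both function names are assumptions for illustration; the actual `get_iommu_type` code is not shown in this thread.

```python
import re
import subprocess


def has_dmar_table(acpidump_output):
    """Return True if the text mentions DMAR, the ACPI table that describes VT-d remapping."""
    return re.search(r"\bDMAR\b", acpidump_output) is not None


def check_vtd(cmd=("/usr/sbin/acpidump", "-t")):
    """Run the dump command and look for DMAR; raise if the command itself fails."""
    proc = subprocess.run(cmd, capture_output=True)
    if proc.returncode != 0:
        # Presumably what happens here: acpidump exits with an error on a
        # bad DSDT, so the check aborts even if a DMAR table exists.
        raise RuntimeError(
            f"Failed to check support for iommu (VT-d): {proc.stderr.decode()}"
        )
    return has_dmar_table(proc.stdout.decode(errors="replace"))
```

Under these assumptions, a firmware with a valid DMAR table but a malformed DSDT would still be reported as an error, which matches the behaviour described above.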
 

Juan Manuel Palacios

Wow, today I learned that the TrueNAS GUI supports PCI passthrough devices! When was this feature added? I've been managing my pfSense router VM entirely from the command line as a raw bhyve guest for years out of lack of support for this feature! Perhaps now I can migrate to something a little bit more robust…? (yeah, not a hardware router just yet ;)
 

cb88

Perhaps, if they stop relying on the DSDT tables. Otherwise... it has a moderately high chance of being broken out of the box on many machines due to bad tables.
 

Juan Manuel Palacios

Yeah, your current experience is a big warning sign that I shouldn't attempt to use the feature just yet. This virtualized router that I run on my TrueNAS server is my home's actual router, everything goes through it, hence the need for PCI passthrough for the NICs. Moving it to something broken would be terribly disruptive.

But for a while I have indeed wanted to either overhaul my management scripts, or move my setup to something a little bit more robust on various fronts, and having TrueNAS (working) support for PCI passthrough would fit that bill, for the time being at least. Moving the router to a box of its own, which is the ultimate goal, is at the tail end of a full-blown 10Gb/WiFi6E all out migration for my home network, which is expensive!
 

cb88

Yeah, also... it works perfectly on 12.0-U8.1, so something changed in 13 to break it.
 

Samuel Tai

That's what I thought. You have a mismatch between the UI and the DB. The DB thinks your VM name is pfSense, but the UI thinks it's pfSenese (2nd column after the VM ID).

I'm not in front of my system currently, but I have mocked up a fix. When I get home tonight, I'll post the specific SQL commands. We'll need to make the DB match the UI, so the UI will be able to resolve the KeyError.
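
The specific commands are not shown here, so the following is only a guess at their general shape, written as a Python sketch that works on a copy of the database rather than the live file. The column names `id` and `name`, the copy path, and the function name `rename_vm_in_copy` are all assumptions.

```python
import shutil
import sqlite3


def rename_vm_in_copy(db_path, copy_path, vm_id, new_name):
    """Copy the config DB, rename one VM in the copy, and return the number of rows changed."""
    shutil.copyfile(db_path, copy_path)  # leave the live database untouched
    conn = sqlite3.connect(copy_path)
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "UPDATE vm_vm SET name = ? WHERE id = ?",  # assumed column names
                (new_name, vm_id),
            )
        return cur.rowcount
    finally:
        conn.close()
```

For example, `rename_vm_in_copy("/data/freenas-v1.db", "/tmp/freenas-v1-copy.db", 2, "pfSenese")` would apply the rename to the copy only. A return value of 0 means no row had that ID.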
 

cb88

No... we changed the name a few troubleshooting steps back. If you check the more recent error messages I posted, it has been trying to start 'pfSense' and failing. I attempted to ensure that all the settings matched between the 12 and 13 boot environments by deleting my 13 boot environment and re-upgrading, which resulted in the same error.

Ultimately, removing PCIe passthrough and rebooting got it booting again... but now I can't add my PCIe devices back.

Thanks.
1662484322506.png
 

cb88

The PiHole VM also ends up inaccessible from the GUI after a while. I get this after trying to start it (it's set to autostart, so it should already be running, and it is running after a fresh reboot... after a while it ends up like this).

1662485857723.png
 

HazardousHD

Cadet
Joined
May 4, 2022
Messages
3
I am also having this exact same issue. I have a W10 VM that I used to pass USB devices into, and now it cannot boot with the USB PCIe device passed in. It boots perfectly without it.

I need this to work. Did you come to a resolution, or did you just downgrade back to 12?
If you did downgrade, are there any good instructions for doing so? I would like to get my VM running again, as 13 has broken it.
 

Juan Manuel Palacios

I am running a similar setup successfully under TrueNAS 13.0-U2, with PCI passthrough for three NICs, but all of it from the command-line, skipping the UI (not quite) entirely.

It relies on somewhat clunky scripts I wrote to operate bhyve and novnc, which I've been meaning to overhaul for a while already, but at least I can say they've kept my VM (my home's pfSense router) up-and-running non-stop for a few years already quite successfully.

Let me know if you'd like more info on my setup and my scripts, as this'd be the opportunity I've been looking for to improve my setup.
 

HazardousHD

Did they fix PCIe passthrough in 13.0-U2?
I'm currently getting errors like "[EFAULT] Failed to check support for iommu (VT-d): acpidump: DSDT is corrupt" whenever I try to pass through a device.
 

Juan Manuel Palacios

Not to my knowledge. But I can't really say because, as I explained above, my bhyve VM with PCI passthrough is not managed by the TrueNAS UI or middleware in any way; it's fully managed via command-line scripts that I wrote. That is to say, as far as the TrueNAS UI and database are concerned, my pfSense VM doesn't even exist.

The only things I do through the TrueNAS UI for that VM, and that's just to reduce deployment complexity because I could also handle them through the command-line, are:
  1. Loading the "vmm" kernel module through a System Tunable.
  2. Configuring my PCI devices of interest for passthrough via another System Tunable.
  3. Starting & stopping the VM via Init/Shutdown Script entries that call out to my corresponding scripts.
So in none of that do I rely on the TrueNAS middleware's support for PCI passthrough; I'm tapping the FreeBSD layer directly for that. That being said, and for the sake of continuing to reduce deployment complexity, I would very much like to migrate my pfSense VM to the TrueNAS middleware once support for PCI passthrough is fixed in 13.0.
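
As a rough illustration of the first two items in the list above, the FreeBSD loader tunables involved are `vmm_load` and `pptdevs`, where `pptdevs` takes a space-separated list of bus/slot/function addresses. The small Python sketch below builds those values; the helper names and the PCI addresses in the usage note are hypothetical, not taken from the poster's setup.

```python
def format_pptdevs(devices):
    """Build the pptdevs value from (bus, slot, function) tuples."""
    return " ".join(f"{bus}/{slot}/{func}" for bus, slot, func in devices)


def loader_tunables(devices):
    """Return the two loader tunables as a name -> value mapping."""
    return {"vmm_load": "YES", "pptdevs": format_pptdevs(devices)}
```

For three NICs at made-up addresses, `loader_tunables([(2, 0, 0), (3, 0, 0), (4, 0, 0)])` gives the values to enter as System Tunables of type loader.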
 