NoVNC Broken Scale 22.12 BlueFin Beta 2?

NickF

Guru
Joined
Jun 12, 2014
Messages
763
Not sure if anyone else is experiencing this in BETA 2. I actually tried the first BETA and had the exact same issue, except none of the VMs worked. I waited for BETA 2 and am now reporting, as some of the VMs work now...but not all. Some of my VMs have a problem where I can't get to the VNC window, and some don't. It's really odd behavior.


When I press this:

scalevms.jpg


I get this:
scale.jpg


But when I press the same button on this one:
emmett.jpg


I get to the VM:
working.jpg



The problem goes away after reverting to TrueNAS-SCALE-22.02.3
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
Problem still exists in RC1, but now it's worse. I can't access the VNC monitor of any of my VMs.

New Ticket:

Old Ticket:

Was supposed to have been fixed here:

1668567581135.png
 

tprelog

Patron
Joined
Mar 2, 2016
Messages
297
I can't reach the display either

1668690447483.png


I installed my VM while on BETA 2. I could access the display once, but since rebooting the host I get this error.

I just upgraded to RC 1 and the issue remains.

Looks like the new Jira ticket for RC 1 has already been marked closed by engineering, but there is no fixed version listed.

EDIT: OK, the fixed version was listed under the BETA 2 issue and should be resolved in the release.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Found this thread after @NickF's discussion on another thread.
So, no evidence it's been fixed in 22.12.0?

@waqarahmed Is this replicable or do you see NoVNC working for you?
 

tprelog

Patron
Joined
Mar 2, 2016
Messages
297
There is another thread for the same issue, started on RC 1


I opened a new ticket after upgrading to the release and still not having VNC access


In my case, I can get VNC working (until I reboot SCALE) for my existing VM by restarting nginx

Forgot to mention: I can reproduce this consistently.
 

Evan Richardson

Explorer
Joined
Dec 11, 2015
Messages
76
Confirmed this issue still exists in release 22.12.0. I haven't tried restarting nginx yet, though.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
Confirmed this issue still exists in release 22.12.0. I haven't tried restarting nginx yet, though.
Just curious, I know it binds VNC to 0.0.0.0 by default. My SCALE box is connected with a LACP LAG to my switch. It has several VLANs tagged/trunked, and my management network is native/untagged.

I can externally connect to the VNC server for the individual VMs.

Is yours also similar? I'm just trying to figure out why iX can't reproduce the issue on their end. Maybe nginx doesn't like something about the network interface configuration on my system and yours.
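Since external VNC clients can reach the guests while the web console cannot, one way to separate the two paths is to probe the VNC port directly from another machine. A minimal sketch (the host and port values below are hypothetical; substitute your VM's display address):

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False

# Hypothetical example: probe a VM's VNC display on the default port range.
# print(port_reachable("192.168.1.50", 5900))
```

If this returns True while the web viewer still fails, the VNC server itself is up and the problem is in the nginx/NoVNC proxy hop rather than the VM.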
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Confirmed that iX's normal VNC tests passed.

I'd suggest documenting the networking setups accurately on this thread. Also document what works and what does not. I'll forward this thread to the engineering team so they can see whatever data you provide.
 

Evan Richardson

Explorer
Joined
Dec 11, 2015
Messages
76
Just curious, I know it binds VNC to 0.0.0.0 by default. My SCALE box is connected with a LACP LAG to my switch. It has several VLANs tagged/trunked, and my management network is native/untagged.

I can externally connect to the VNC server for the individual VMs.

Is yours also similar? I'm just trying to figure out why iX can't reproduce the issue on their end. Maybe nginx doesn't like something about the network interface configuration on my system and yours.

I have a bridge, br0, attached to my main NIC, using 2 VLANs: 192.168.1.0/24 and 192.168.4.6/29. This DID work in SCALE Angelfish; it only broke after the upgrade.
 

Attachments

  • interfaces.jpg

NickF

Guru
Joined
Jun 12, 2014
Messages
763
Confirmed that iX's normal VNC tests passed.

I'd suggest documenting the networking setups accurately on this thread. Also document what works and what does not. I'll forward this thread to the engineering team so they can see whatever data you provide.
Hi, just for clarity, this is my config:
1671153756326.png


I hadn't noticed this before, but one of my interfaces, enp68s0f0, is down. According to my switch, it's been down about 10 days, which is about when I was running RC1:
1671153981992.png


I shut/no-shut the port and it's back. I have no idea why that interface was down.
1671154020938.png


1671154033787.png


But this seems wholly unrelated to the problem; I am just documenting it for the sake of full transparency in my troubleshooting.

In any case, I restarted nginx manually in the CLI, via
"systemctl reload nginx"
1671154105365.png


Which DID give me access to some of my VMs.

1671154190721.png



But in others, I am getting 404 errors.

1671154222615.png


In the timeline of what I have seen, this is similar to the behavior I saw in BETA 2 before Engineering made a change, as documented in my original ticket. This is the pull request by the engineer assigned to the ticket: https://github.com/truenas/middleware/pull/10084

Since RC 1 and now 22.12.0, I was only seeing 500 errors, not 404s.

Some further troubleshooting I did was to set the VNC address to something other than 0.0.0.0:
1671154575524.png

1671154592417.png


But that does not appear to have made any impact.
 

Attachments

  • 1671154545906.png

NickF

Guru
Joined
Jun 12, 2014
Messages
763
I can only attach 10 pictures at a time. The next thing I noticed while going down this rabbit hole is that on one of the affected VMs, the "Device Order" values for the "Display" device and "Network" device were duplicates: they had the same value. I'm not sure why:
1671154849125.png


But changing that also didn't have any effect.

1671155005133.png


I then tried restarting NGINX, and still got the same result with that change.
1671155026653.png


@morganL I hope that helps; I wish I had more information.

Also, for the record my NIC is an Intel X710-DA2. The ticket is also updated with my current information and new debug logs.

From a networking standpoint, to further clarify,
1671156321284.png


Switch:
1671156369771.png



The IP Alias terminates directly on bond0, and I can see the MAC address of the NICs in that VLAN.
1671156577574.png

1671156535773.png
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
restarting nginx solved this for me...now if it survives a reboot, that's a different story, but at least it works now =)
You can probably add that as an Init/Shutdown script
1671156659029.png


Which is under System Settings -> Advanced.

Just run
systemctl reload nginx

on POSTINIT.

I still have the problem outlined above, though.
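For the POSTINIT command field, a slightly more defensive variant of the workaround might look like the sketch below (this assumes the service name is nginx, as in the command above; it is an illustration, not an official fix):

```shell
# POSTINIT workaround sketch: reload nginx so the NoVNC proxy picks up
# the VM display sockets created at boot. Reload keeps existing
# connections; fall back to a full restart only if the reload fails.
systemctl reload nginx || systemctl restart nginx
```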
 

tprelog

Patron
Joined
Mar 2, 2016
Messages
297
Confirmed that iX's normal VNC tests passed.

I'd suggest documenting the networking setups accurately on this thread. Also document what works and what does not. I'll forward this thread to the engineering team so they can see whatever data you provide.

I've also noticed that I can connect to VNC using a client (KRDC) from my desktop even while the web-based access gives a 502 error. My configuration, for the record... nothing complicated, just the required bridge to allow VMs to access the NAS.
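A 502 from the web UI while a direct VNC client succeeds points at the nginx-to-display proxy hop rather than the VNC server itself. A small probe that reports the HTTP status code without raising can help log which error (404, 500, 502) you get across reboots; the URL below is hypothetical:

```python
from urllib.request import urlopen
from urllib.error import HTTPError, URLError

def http_status(url: str, timeout: float = 5.0):
    """Return the HTTP status code for url, or None if unreachable."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        # The server answered, but with a 4xx/5xx status.
        return err.code
    except URLError:
        # Connection-level failure (refused, DNS, timeout).
        return None

# Hypothetical example: probe a NoVNC display URL on the NAS.
# print(http_status("http://truenas.local/vm/display/1"))
```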

1671198872363.png


VM device list
1671198939089.png


And the VM setting for Network and display
1671199011909.png
1671199051658.png
 

Attachments

  • 1671199038337.png

TimB

Cadet
Joined
Dec 20, 2022
Messages
3
I am seeing the same behavior. My setup is quite basic as well, basically the same as tprelog's. Nothing fancy. Restarting the nginx service does fix it, but only until a reboot, as others have noted.
 

tprelog

Patron
Joined
Mar 2, 2016
Messages
297
If restarting nginx resolves the issue until reboot, there's some good news...

I did a TeamViewer session with an iXsystems employee, and they were able to fix the issue on my NAS.

Another fix will be coming in 22.12.1.
 

TimB

Cadet
Joined
Dec 20, 2022
Messages
3
If restarting nginx resolves the issue until reboot, there's some good news...

I did a TeamViewer session with an iXsystems employee, and they were able to fix the issue on my NAS.

Another fix will be coming in 22.12.1.

Awesome.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
So progress seems to have been made for some, but not for me specifically.



Per engineering, this change will be made in the next release, and you can "hotfix" it yourself by modifying a single line in a Python script, like below:
Modified this file:
root@prod[/]# cd ./usr/lib/python3/dist-packages/middlewared/plugins/vm/
root@prod[.../dist-packages/middlewared/plugins/vm]# nano vm_display_info.py

The last line now reads:
return '127.0.0.1:700' instead of localhost.

You are then supposed to restart your server, which should resolve the issue. The pull request itself even states that the devs were confused about why this resolves the problem, because it doesn't seem to make sense.
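One plausible reason a literal 127.0.0.1 behaves differently from localhost (this is my reading, not engineering's stated explanation): on a dual-stack host, localhost goes through the resolver and can yield ::1 (IPv6) as well as, or even before, 127.0.0.1, so a proxy dialing localhost may try IPv6 while the display server listens only on IPv4. A quick way to see the difference:

```python
import socket

# A literal IPv4 address bypasses name resolution entirely:
# getaddrinfo can only ever return AF_INET entries for it.
v4 = {info[0] for info in socket.getaddrinfo("127.0.0.1", 5900)}

# "localhost" goes through the resolver; on many dual-stack systems
# it returns ::1 (AF_INET6) in addition to 127.0.0.1, and the
# ordering determines which address a client tries first.
loopback = {info[0] for info in socket.getaddrinfo("localhost", 5900)}

print(v4)
print(loopback)
```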

However, for me, the problem persists...

5d0bfa20-49fa-42e3-b6b8-90716b55c303.jpg


Anyone else??
 

tprelog

Patron
Joined
Mar 2, 2016
Messages
297
At this point, I'm guessing they must be different issues. It seems like the fix you are referring to may only address the 502 Bad Gateway error.
 