NoVNC Broken Scale 22.12 BlueFin Beta 2?

NickF

Guru
Joined
Jun 12, 2014
Messages
763
Not sure if anyone else is experiencing this in BETA 2. I actually tried the first BETA and had the exact same issue, except none of the VMs worked. I waited for BETA 2 and am now reporting, as some of the VMs work now...but not all. Some of my VMs have a problem where I can't get to the VNC window, and some don't. It's really odd behavior.


When I press this:

scalevms.jpg


I get this:
scale.jpg


But when I press the same button on this one:
emmett.jpg


I get to the VM:
working.jpg



The problem goes away after reverting to TrueNAS-SCALE-22.02.3
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
Problem still exists in RC1, but now it's worse. I can't access the VNC monitor of any of my VMs.

New Ticket:

Old Ticket:

Was supposed to have been fixed here:

1668567581135.png
 

tprelog

Patron
Joined
Mar 2, 2016
Messages
297
I can't reach the display either

1668690447483.png


I installed my VM while on BETA 2. I could access the display once, but since rebooting the host I get this error.

I just upgraded to RC 1 and the issue remains.

Looks like the new Jira ticket for RC 1 has already been marked closed by engineering, but there is no fixed version listed.

EDIT: OK, the fixed version was listed under the BETA 2 issue and should be resolved in the release.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Found this thread after @NickF's discussion on another thread.
So, no evidence it's been fixed in 22.12.0?

@waqarahmed Is this replicable or do you see NoVNC working for you?
 

tprelog

Patron
Joined
Mar 2, 2016
Messages
297
There is another thread for the same issue, started on RC 1


I opened a new ticket after upgrading to the release and still not having VNC access


In my case, I can get VNC working (until I reboot SCALE) for my existing VM by restarting nginx

Forgot to mention: I can reproduce this consistently.
 

Evan Richardson

Explorer
Joined
Dec 11, 2015
Messages
76
Confirmed this issue still exists in release 22.12.0. I haven't tried restarting nginx yet, though.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
Confirmed this issue still exists in release 22.12.0. I haven't tried restarting nginx yet, though.
Just curious, I know it binds VNC to 0.0.0.0 by default. My SCALE box is connected with a LACP LAG to my switch. It has several VLANs tagged/trunked, and my management network is native/untagged.

I can externally connect to the VNC server for the individual VMs.

Is yours also similar? I'm just trying to figure out why iX can't reproduce the issue on their end. Maybe nginx doesn't like something about the network interface configuration on my system and yours.
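Since external VNC clients can reach the guests while the web console cannot, one way to separate the two paths is to probe the VNC port directly from another machine. A minimal sketch (the host and port values below are hypothetical; substitute your VM's display address):

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False

# Hypothetical example: probe a VM's VNC display on the default port range.
# print(port_reachable("192.168.1.50", 5900))
```

If this returns True while the web viewer still fails, the VNC server itself is up and the problem is in the nginx/NoVNC proxy hop rather than the VM.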
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Confirmed that iX's normal VNC tests passed.

I'd suggest documenting the networking setups accurately on this thread. Also document what works and what does not. I'll forward this thread to the engineering team so they can see whatever data you provide.
 

Evan Richardson

Explorer
Joined
Dec 11, 2015
Messages
76
Just curious, I know it binds VNC to 0.0.0.0 by default. My SCALE box is connected with a LACP LAG to my switch. It has several VLANs tagged/trunked, and my management network is native/untagged.

I can externally connect to the VNC server for the individual VMs.

Is yours also similar? I'm just trying to figure out why iX can't reproduce the issue on their end. Maybe nginx doesn't like something about the network interface configuration on my system and yours.

I have a bridge, br0, attached to my main NIC, using 2 VLANs: 192.168.1.0/24 and 192.168.4.6/29. This DID work in SCALE Angelfish; it only broke after the upgrade.
 

Attachments

  • interfaces.jpg

NickF

Guru
Joined
Jun 12, 2014
Messages
763
Confirmed that iX's normal VNC tests passed.

I'd suggest documenting the networking setups accurately on this thread. Also document what works and what does not. I'll forward this thread to the engineering team so they can see whatever data you provide.
Hi, just for clarity, this is my config:
1671153756326.png


I hadn't noticed this before, but one of my interfaces, enp68s0f0, is down. According to my switch, it's been down about 10 days, which is about when I was running RC1:
1671153981992.png


I shut/no-shut the port and it's back. I have no idea why that interface was down.
1671154020938.png


1671154033787.png


But this seems wholly unrelated to the problem; I am just documenting it for the sake of full transparency in my troubleshooting.

In any case, I restarted nginx manually in the CLI, via
"systemctl reload nginx"
1671154105365.png


Which DID give me access to some of my VMs.

1671154190721.png



But in others, I am getting 404 errors.

1671154222615.png


In the timeline of what I have seen, this is similar to the behavior I saw in BETA 2 before Engineering made a change, as documented in my original ticket. This is the pull request by the engineer assigned to the ticket: https://github.com/truenas/middleware/pull/10084

Since RC 1 and now 22.12.0, I was only seeing 500 errors, not 404s.

Some further troubleshooting I did was to set the VNC address to something other than 0.0.0.0:
1671154575524.png

1671154592417.png


But that does not appear to have made any impact.
 

Attachments

  • 1671154545906.png

NickF

Guru
Joined
Jun 12, 2014
Messages
763
I can only attach 10 pictures at a time. The next thing I noticed while going down this rabbit hole is that on one of the affected VMs, the "Device Order" values for the "Display" device and "Network" device were duplicates: they had the same value. I'm not sure why:
1671154849125.png


But changing that also didn't have any effect.

1671155005133.png


I then tried restarting NGINX, and still got the same result with that change.
1671155026653.png


@morganL I hope that helps; I wish I had more information.

Also, for the record my NIC is an Intel X710-DA2. The ticket is also updated with my current information and new debug logs.

From a networking standpoint, to further clarify,
1671156321284.png


Switch:
1671156369771.png



The IP Alias terminates directly on bond0, and I can see the MAC address of the NICs in that VLAN.
1671156577574.png

1671156535773.png
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
restarting nginx solved this for me...now if it survives a reboot, that's a different story, but at least it works now =)
You can probably add that as an Init/Shutdown script
1671156659029.png


Which is under System Settings -> Advanced.

Just run
systemctl reload nginx

on POSTINIT.

I still have the problem outlined above, though.
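For the POSTINIT command field, a slightly more defensive variant of the workaround might look like the sketch below (this assumes the service name is nginx, as in the command above; it is an illustration, not an official fix):

```shell
# POSTINIT workaround sketch: reload nginx so the NoVNC proxy picks up
# the VM display sockets created at boot. Reload keeps existing
# connections; fall back to a full restart only if the reload fails.
systemctl reload nginx || systemctl restart nginx
```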
 

tprelog

Patron
Joined
Mar 2, 2016
Messages
297
Confirmed that iX's normal VNC tests passed.

I'd suggest documenting the networking setups accurately on this thread. Also document what works and what does not. I'll forward this thread to the engineering team so they can see whatever data you provide.

I've also noticed that I can connect to VNC using a client (KRDC) from my desktop even while the web-based access gives a 502 error. My configuration, for the record... nothing complicated, just the required bridge to allow VMs to access the NAS.
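A 502 from the web UI while a direct VNC client succeeds points at the nginx-to-display proxy hop rather than the VNC server itself. A small probe that reports the HTTP status code without raising can help log which error (404, 500, 502) you get across reboots; the URL below is hypothetical:

```python
from urllib.request import urlopen
from urllib.error import HTTPError, URLError

def http_status(url: str, timeout: float = 5.0):
    """Return the HTTP status code for url, or None if unreachable."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        # The server answered, but with a 4xx/5xx status.
        return err.code
    except URLError:
        # Connection-level failure (refused, DNS, timeout).
        return None

# Hypothetical example: probe a NoVNC display URL on the NAS.
# print(http_status("http://truenas.local/vm/display/1"))
```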

1671198872363.png


VM device list
1671198939089.png


And the VM setting for Network and display
1671199011909.png
1671199051658.png
 

Attachments

  • 1671199038337.png

TimB

Cadet
Joined
Dec 20, 2022
Messages
3
I am seeing the same behavior. My setup is quite basic as well, basically the same as tprelog's. Nothing fancy. Restarting the nginx service does fix it, but only until a reboot, as others have noted.
 

tprelog

Patron
Joined
Mar 2, 2016
Messages
297
If restarting nginx resolves the issue until reboot, there's some good news...

I did a TeamViewer session with an iXsystems employee, and they were able to fix the issue on my NAS.

Another fix will be coming in 22.12.1.
 

TimB

Cadet
Joined
Dec 20, 2022
Messages
3
If restarting nginx resolves the issue until reboot, there's some good news...

I did a TeamViewer session with an iXsystems employee, and they were able to fix the issue on my NAS.

Another fix will be coming in 22.12.1.

Awesome.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
So progress seems to have been made for some, but not for me specifically.



Per engineering, this change will be made in the next release, and you can "hotfix" it yourself by modifying a single line in a Python script, like below:
Modified this file:
root@prod[/]# cd ./usr/lib/python3/dist-packages/middlewared/plugins/vm/
root@prod[.../dist-packages/middlewared/plugins/vm]# nano vm_display_info.py

The last line now reads:
return '127.0.0.1:700' instead of localhost.

You are then supposed to restart your server, which should resolve the issue. The pull request itself even states that the devs were confused about why this resolves the problem, because it doesn't seem to make sense.
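One plausible reason a literal 127.0.0.1 behaves differently from localhost (this is my reading, not engineering's stated explanation): on a dual-stack host, localhost goes through the resolver and can yield ::1 (IPv6) as well as, or even before, 127.0.0.1, so a proxy dialing localhost may try IPv6 while the display server listens only on IPv4. A quick way to see the difference:

```python
import socket

# A literal IPv4 address bypasses name resolution entirely:
# getaddrinfo can only ever return AF_INET entries for it.
v4 = {info[0] for info in socket.getaddrinfo("127.0.0.1", 5900)}

# "localhost" goes through the resolver; on many dual-stack systems
# it returns ::1 (AF_INET6) in addition to 127.0.0.1, and the
# ordering determines which address a client tries first.
loopback = {info[0] for info in socket.getaddrinfo("localhost", 5900)}

print(v4)
print(loopback)
```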

However, for me, the problem persists...

5d0bfa20-49fa-42e3-b6b8-90716b55c303.jpg


Anyone else??
 

tprelog

Patron
Joined
Mar 2, 2016
Messages
297
At this point, I'm guessing they must be different issues. It seems like the fix you are referring to may only address the 502 Bad Gateway error.
 