NoVNC Broken Scale 22.12 BlueFin Beta 2?

Sparx

Contributor
Joined
Apr 18, 2017
Messages
107
I also have an issue with VNC on the VMs. But when I removed the Nvidia GPU passthrough device, VNC worked.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
@Sparx That for sure has to be the problem. I am still having the issue, and I have been able to reproduce it on 3 separate additional systems...and each of those systems has an NVidia card passed through to a VM. But it doesn't just break that VM. It breaks some of the other VMs also.

@morganL The ticket is closed. Can you have someone open it back up?
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Now that we know it's related to NVIDIA pass-thru, I think a separate ticket is worthwhile. It may be that a different engineer will be assigned, and we certainly need a different test system.
 

Sparx

Contributor
Joined
Apr 18, 2017
Messages
107
And on my system the Nvidia driver doesn't work at all. Maybe it's related to this: whatever is stopping the driver from working may also be what breaks VNC.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
I’m using my GPUs to pass through to a VM. Do you mean you are having a problem whereby you can’t use it in a Kubernetes app on SCALE itself? I’ll do more testing on my end if that’s the case; I hadn’t tried to use it for an app.

The driver in the guest OS has nothing to do with SCALE if you are talking about a VM, which is why I am asking.
 

Sparx

Contributor
Joined
Apr 18, 2017
Messages
107
Yeah, I have tested different ways. Maybe I shouldn't hijack this thread with my Nvidia issues now. But nvidia-smi doesn't find any card and states it can't communicate with the driver, both with the GPU isolated and not. When I try to start the VM with GPU passthrough, I can't really tell if anything happens or if the VM loads at all, since VNC doesn't work. But the IP the VM should have never comes up, so I guess it never starts. If I unbind the GPU passthrough, VNC works and I get the VM up and running, but then without the GPU, naturally. I can't get the GPU to work in any app either, since it doesn't get a proper driver, I guess.
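One way to narrow down the "can't communicate with the driver" message is to look at which kernel driver the host has bound to the card. The sketch below is only an illustration, assuming a Linux host that exposes PCI devices under `/sys/bus/pci/devices`; the function name `bound_drivers` is made up for this example. A GPU isolated for passthrough is normally expected to be bound to `vfio-pci` rather than `nvidia`, in which case nvidia-smi on the host would not be expected to see it.

```python
import os

NVIDIA_VENDOR_ID = "0x10de"  # PCI vendor ID assigned to NVIDIA


def bound_drivers(sysfs_root="/sys/bus/pci/devices", vendor=NVIDIA_VENDOR_ID):
    """Map each PCI address with the given vendor ID to its bound driver name.

    The value is None when no kernel driver is bound to the device.
    `bound_drivers` is a hypothetical helper, not part of any TrueNAS tooling.
    """
    result = {}
    if not os.path.isdir(sysfs_root):
        return result
    for addr in sorted(os.listdir(sysfs_root)):
        dev = os.path.join(sysfs_root, addr)
        try:
            with open(os.path.join(dev, "vendor")) as fh:
                if fh.read().strip().lower() != vendor:
                    continue
        except OSError:
            continue  # not a readable device entry
        link = os.path.join(dev, "driver")
        # "driver" is a symlink to the bound driver's directory; absent if unbound
        if os.path.islink(link):
            result[addr] = os.path.basename(os.readlink(link))
        else:
            result[addr] = None
    return result


if __name__ == "__main__":
    for addr, drv in bound_drivers().items():
        print(f"{addr}: {drv or 'no driver bound'}")
```

This shows the same information as the "Kernel driver in use" line of `lspci -nnk`. It only reports what is bound; it does not explain why a driver failed to load.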
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
You should be able to get into the VM using a standalone VNC client. It's just the built-in web-based one that's broken. I do not recommend using VNC after setup is completed; you should transition to RDP, SSH or something else more secure.

But anyway, I was able to get in using this: https://www.realvnc.com/en/connect/download/viewer/

The VM status page on SCALE will tell you what port VNC is listening on for a particular VM.
1677357981888.png


Then add it to VNC, with the management IP of your SCALE host and that port number:
1677358024296.png
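Before blaming the viewer, it can help to confirm that something is actually answering on that host and port. A VNC (RFB) server opens the conversation by sending a 12-byte version banner such as `RFB 003.008`, so a plain TCP connect is enough to check. This is a minimal sketch, not anything SCALE ships: `probe_vnc` is a made-up helper, and the address and port in the usage comment are placeholders to swap for your own.

```python
import socket


def probe_vnc(host, port, timeout=5.0):
    """Return the RFB version banner sent by the server, or None.

    None means the port was unreachable, timed out, or did not speak RFB.
    `probe_vnc` is a hypothetical helper written for this example.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            banner = b""
            # The server speaks first: 12 bytes, e.g. b"RFB 003.008\n"
            while len(banner) < 12:
                chunk = sock.recv(12 - len(banner))
                if not chunk:
                    break
                banner += chunk
    except OSError:
        return None
    text = banner.decode("ascii", errors="replace")
    return text.strip() if text.startswith("RFB ") else None


# Example usage (placeholder address and port, replace with your own):
#   print(probe_vnc("192.0.2.10", 5900))
```

A banner only shows that the VNC server is listening. It says nothing about whether the guest has initialized its display, so a VM can pass this check and still show a blank console.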


It should work for initial setup now. For me, even this will not work unless I have the VNC console open BEFORE I start the VM. If I try to use it after the VM is already started it looks like this and never loads:

1677358150821.png


Before I open the new ticket, let me know if you can reproduce the same behavior.

But after I set up my VM and installed the NVIDIA drivers, I am able to use RDP reliably to access my VM at any time after the initial setup. If you are using Windows, make sure you install the latest Virtio drivers AND the guest agent (both installers are in the same ISO): https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/archive-virtio/?C=M;O=D

1677358285357.png



What's weird is that sometimes the VNC console does work for me. In the example above it's working right now. It may continue to work, but it also may stop working if I restart EITHER the host OR the guest:
1677358525112.png


But on another host, with the EXACT same hardware configuration and a CLONE (ZFS send of the hard drive's ZVOL to this host) of the same VM, with the only difference being its hostname, I get this error. I cannot explain the inconsistency, and I have been racking my brain on this for weeks. I didn't even put 2 and 2 together with the NVidia passthru until I saw your post, so you have definitely helped me make progress here in understanding what's going on...

1677358642103.png


The Hardware tabs for both VMs are exactly the same:
1677358683709.png


Same device IDs are passed:
1677358743946.png

1677358715223.png


And both of them have the GPU isolation set:
1677358782417.png
 

Sparx

Contributor
Joined
Apr 18, 2017
Messages
107
Yep. Gave that a go now. And it looks the same. Without the GPU the VM boots. And with GPU passthrough it shows "Guest has not initialized the display yet".
 

Attachments

  • asdf.png

NickF

Guru
Joined
Jun 12, 2014
Messages
763
What card are you trying to use? It may be a combination of the problems I am having as well as hardware support on your specific card.

In VMware ESXi, Proxmox and SCALE I have had some cards which don't work properly. One of the main behaviors I have seen is that the cards don't always reset properly if the guest OS they are attached to is restarted and the HOST virtualization server is not. I think I remember seeing a prompt in SCALE warning about this now, but I could be misremembering.

What that means is that when the host server is physically powered on, the passthrough and card work the first time the guest OS is started, but if the guest updates and restarts, it breaks. I saw this behavior with some GeForce and Radeon cards. I have a GeForce 1650 in my main server and it does not have this issue, and none of the Quadro cards I have tested have it either.
 

Sparx

Contributor
Joined
Apr 18, 2017
Messages
107
It's an Nvidia Tesla V100. It worked on 22.02 without any major issues as I recall. I think it's more suited for VM workloads than most "normal" GPUs.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763

Isma

Contributor
Joined
Apr 29, 2020
Messages
100
I can add to the problem: after a clean reinstallation of TrueNAS-SCALE-22.12.2, I get errors.

But when I upgraded from a lower version to this new one, the VMs worked correctly. The odd thing is that if you don't add a GPU to a VM when you create it, you can't add one afterwards from the options: when you save, the GPU doesn't stay selected, and you have to add the PCI ID manually from Devices.
 