NVIDIA GPU drivers aren't loading

silkky

After I upgraded to the Cobia beta, the NVIDIA GPU drivers are no longer being loaded; the system is using the nouveau driver instead.

From the logs it looks like nouveau isn't blacklisted. Is there a fix I can apply for this, or will I just have to wait for the next release?

$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.


$ lspci -v

01:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1) (prog-if 00 [VGA controller])
Subsystem: ASUSTeK Computer Inc. GP106 [GeForce GTX 1060 6GB]
Flags: bus master, fast devsel, latency 0, IRQ 137, IOMMU group 2
Memory at de000000 (32-bit, non-prefetchable) [size=16M]
Memory at c0000000 (64-bit, prefetchable) [size=256M]
Memory at d0000000 (64-bit, prefetchable) [size=32M]
I/O ports at e000 [size=128]
Expansion ROM at df000000 [disabled] [size=512K]
Capabilities: <access denied>
Kernel driver in use: nouveau
Kernel modules: nouveau, nvidia_current_drm, nvidia_current



$ modprobe nvidia_current_drm
modprobe: ERROR: could not insert 'nvidia_current_drm': No such device


Some relevant entries in /var/log/messages:
$ cat /var/log/messages
kernel: nouveau 0000:01:00.0: enabling device (0000 -> 0003)
kernel: nouveau 0000:01:00.0: NVIDIA GP106 (136000a1)
...

kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 238
kernel: NVRM: This can occur when a driver such as:
NVRM: nouveau, rivafb, nvidiafb or rivatv
NVRM: was loaded and obtained ownership of the NVIDIA device(s).
kernel: NVRM: Try unloading the conflicting kernel module (and/or
NVRM: reconfigure your kernel without the conflicting
NVRM: driver(s)), then try loading the NVIDIA kernel module
NVRM: again.
kernel: NVRM: No NVIDIA devices probed.
kernel: nvidia-nvlink: Unregistered Nvlink Core, major device number 238
kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 238
kernel: NVRM: This can occur when a driver such as:
NVRM: nouveau, rivafb, nvidiafb or rivatv
NVRM: was loaded and obtained ownership of the NVIDIA device(s).
kernel: NVRM: Try unloading the conflicting kernel module (and/or
NVRM: reconfigure your kernel without the conflicting
NVRM: driver(s)), then try loading the NVIDIA kernel module
NVRM: again.
kernel: NVRM: No NVIDIA devices probed.
kernel: nvidia-nvlink: Unregistered Nvlink Core, major device number 238
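
For anyone wanting to confirm the same symptom, here is a minimal sketch of the two checks implied above: which kernel driver is bound to the NVIDIA device, and whether any modprobe.d-style file blacklists nouveau. The function names nvidia_bound_driver and nouveau_blacklisted are made up for illustration, and the directories to search are an assumption about where blacklist files would live. This is diagnostic only; it is not the workaround from the ticket.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of any TrueNAS tooling).

# Read `lspci -k` or `lspci -v` style text on stdin and print the kernel
# driver in use for every device whose header line mentions NVIDIA.
nvidia_bound_driver() {
    awk '
        # A device header line starts with a PCI address such as 01:00.0
        /^([0-9a-f]+:)?[0-9a-f]+:[0-9a-f]+\.[0-7] / { is_nvidia = ($0 ~ /NVIDIA/) }
        # Within an NVIDIA device block, report the bound driver name
        is_nvidia && /Kernel driver in use:/ { print $NF }
    '
}

# Succeed (exit 0) if any file under the given directories contains a
# "blacklist nouveau" line. Only modprobe.d-style files are checked.
nouveau_blacklisted() {
    grep -rqsE '^[[:space:]]*blacklist[[:space:]]+nouveau([[:space:]]|$)' "$@"
}
```

Possible usage, assuming the usual modprobe.d locations: `lspci -k | nvidia_bound_driver` should print nvidia rather than nouveau on a working system, and `nouveau_blacklisted /etc/modprobe.d /usr/lib/modprobe.d && echo blacklisted` reports whether a blacklist entry exists. A blacklist applied some other way, for example on the kernel command line, would not be detected by this check.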
 

ABain (Bug Conductor, iXsystems)

A fix will be available in RC.1. In the meantime, take a look at the ticket linked below; its comments describe a workaround to get the drivers working.
NAS-123554
 