Update Nvidia driver

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Well, it didn't immediately set my system on fire:

Code:
admin@cobia01dev[~]$ nvidia-smi
Thu Oct 26 14:19:27 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P4                       Off | 00000000:13:00.0 Off |                  Off |
| N/A   36C    P0              24W /  75W |    206MiB /  8192MiB |      3%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A     30773      C   /usr/lib/jellyfin-ffmpeg/ffmpeg             204MiB |
+---------------------------------------------------------------------------------------+


However, it's basically untested right now, and I can't tell you what would happen during an official upgrade to a hypothetical 23.10.1 that includes this new driver. This would definitely be a Here Be Dragons kind of setup.
 

mm0nst3r

Dabbler
Joined
Sep 5, 2021
Messages
33
Isn't 535.54.03 the old version?

The latest one is 535.113.01

https://www.nvidia.com/Download/driverResults.aspx/211711/en-us/

Please see the Supported Products section.
The important part (the current GPU line) for the consumer driver is:
NVIDIA RTX Series:
NVIDIA RTX 6000 Ada Generation, NVIDIA RTX 5000 Ada Generation, NVIDIA RTX 4500 Ada Generation, NVIDIA RTX 4000 Ada Generation, NVIDIA RTX 4000 SFF Ada Generation, NVIDIA RTX A6000, NVIDIA RTX A5500, NVIDIA RTX A5000, NVIDIA RTX A4500, NVIDIA RTX A4000H, NVIDIA RTX A4000, NVIDIA RTX A2000 12GB, NVIDIA RTX A2000, NVIDIA T1000 8GB, NVIDIA T1000, NVIDIA T600, NVIDIA T400 4GB, NVIDIA T400

However, support for your P4 is dropped.
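On the version question above: the two strings are easy to misread because each dotted field is a number, so 113 is newer than 54 even though "54" sorts after "113" as text. A minimal sketch of the comparison, assuming the driver version is always dot-separated integers (the function name is just for illustration):

```python
def parse_driver_version(version: str) -> tuple[int, ...]:
    """Split a dotted driver version such as '535.113.01' into integers."""
    return tuple(int(part) for part in version.split("."))


old = "535.54.03"
new = "535.113.01"

# Plain string comparison goes character by character,
# so '5' > '1' makes the older release look newer.
print(old > new)  # True

# Comparing the parsed fields numerically gives the intended order.
print(parse_driver_version(old) < parse_driver_version(new))  # True
```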
 

HoneyBadger

mm0nst3r said: "Isn't 535.54.03 the old version? The latest one is 535.113.01"

Well, that's a bit of a miss. I'll have to give the upgrade another pass and see if it can go to the latest.

mm0nst3r said: "However, support for your P4 is dropped."

That would surprise me, since the consumer line goes all the way back to the Maxwell series (GTX750) - I guess I'll have to try it.
 

mm0nst3r

No, they really only kept the P4 in the DC LTS driver:

But it supports neither consumer GPUs nor the PRO line.
 

HoneyBadger

NVIDIA might not list it as supported, but Honey Badger don't care. ;)

Code:
root@cobia01dev[/home/admin]# nvidia-smi
Thu Oct 26 20:26:17 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.113.01             Driver Version: 535.113.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P4                       Off | 00000000:13:00.0 Off |                  Off |
| N/A   40C    P0              23W /  75W |    117MiB /  8192MiB |      2%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A     13000      C   /usr/lib/jellyfin-ffmpeg/ffmpeg             115MiB |
+---------------------------------------------------------------------------------------+

A lot more manual modification was needed, but it does work. Upgrades will likely break things from here though, so if this is done now, once the driver is included in a future version, you'd want to export your pool, back up your config, and then clean-install the new version to get back to an "officially supported" state.
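For anyone repeating this, one way to confirm which driver is actually loaded is to read the version out of the `nvidia-smi` banner shown above. A rough sketch, assuming the banner keeps the `Driver Version: x.y.z` layout seen in this thread (the function name and the sample string are illustrative only):

```python
import re
from typing import Optional


def driver_version_from_banner(smi_output: str) -> Optional[str]:
    """Return the driver version found in nvidia-smi text output, or None."""
    match = re.search(r"Driver Version:\s*(\d+(?:\.\d+)+)", smi_output)
    return match.group(1) if match else None


# Banner line copied from the output above.
sample = "| NVIDIA-SMI 535.113.01             Driver Version: 535.113.01   CUDA Version: 12.2     |"
print(driver_version_from_banner(sample))  # 535.113.01
```

On a live system the text could come from something like `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout` instead of the pasted sample.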
 

mm0nst3r

Any chance it will still make its way into .1?

Will there be a nightly to test it?
 

HoneyBadger

Can't speak to inclusion or a release date, but I've made sure to share my findings with our Engineering team. It appears to be a minor release in the same 535 family, and it doesn't sunset any old hardware, so there are no worries about someone's old Maxwell GPU getting EOL'd. (Although on a personal level, I'm not sure how much longer NVIDIA will support that chip family.)

I imagine that the nightly images will contain the new driver if the decision is made to include it.
 

Saoshen

Dabbler
Joined
Oct 13, 2023
Messages
47
Thanks for everyone's efforts and investigations. It also seemed to me a rather strange situation to have a newer driver made available that still doesn't support the newer Ada hardware.
 

mm0nst3r

It is not strange.
NVIDIA's long-term support drivers exist to support the same hardware for a long time; they are not a conservative driver branch that is maintained for a long time. In other words, they cover the lifecycle of the old hardware without adding new features or new hardware support. They're not meant to be used with new hardware at all. I don't know why iX picks NVIDIA drivers from the LTS branch.
 

bryan_v

Cadet
Joined
Mar 30, 2023
Messages
3
+1 on the driver update, or at least the ability to use the non-DC drivers. I'm speccing out a hard rotation to TrueNAS Scale to underpin all our dev clusters because of the rock-solid ZFS/storage management, the reasonably fair VM support, and the fact that it's the only storage/virtualisation distro that isn't outwardly aggressive to running our own docker/portainer/k8s stack on metal. Ampere and Ada support is really the major blocker right now, and every way I've tried to remove the stock NVIDIA drivers and roll the latest ones borks the entire TrueNAS install; the core `truenas` package for some reason depends on `nvidia-support`, which must be removed prior to installing new drivers.

I understand the instinct to use the upstream Debian DC LTS drivers in the default shipped product, but having the `truenas` package depend on them is a little crazy when they aren't required for any TrueNAS functionality and are only there to provide out-of-box driver support for NVIDIA hardware.

I'd even be open to having a TrueNAS `community` repo with community-supported packages, for edge/non-core cases that don't make financial sense for iXsystems to support 100% internally.
 

csjjpm

Contributor
Joined
Feb 16, 2015
Messages
126
Hi, if someone doesn't mind advising: is the Linux driver in TrueNAS Scale Cobia the consumer desktop version or the datacenter server version?

I don't have an NVIDIA card yet and want to try out a cheap eBay one for a PoC.
 

HoneyBadger

Hey @csjjpm

It's the standard Linux x86_64 driver - you should be able to use anything in the Supported Products list here:


I wouldn't go any further back than Pascal, and be mindful that the really low-end cards (e.g. the GT 1030) have no NVENC engine, which matters if you're hoping to use the card for video transcoding.
 