How to make GPU allocatable

odoyle

Explorer
Joined
Sep 2, 2014
Messages
62
Hi,
I'm having a similar issue here, but I'm on TrueNAS-SCALE-22.02.2, so the bug above should be fixed, right?
I can see the GPU in SCALE, but when I try to allocate it to an app (in this case not Plex but Frigate, in a Docker Compose setup), I get:

Code:
Error: failed to start container "frigate-gpu-docker-compose": Error response from daemon: OCI runtime create
 failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused:
Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: ldcache error: process
 /sbin/ldconfig failed with error code: 2: unknown






I'm passing these into the app through the GUI:
- NVIDIA_VISIBLE_DEVICES=all
- NVIDIA_DRIVER_CAPABILITIES=all

Here is the GPU in SCALE:

Code:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03    Driver Version: 460.91.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P400         Off  | 00000000:01:00.0 Off |                  N/A |
| 34%   36C    P8    N/A /  N/A |      0MiB /  2000MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                              
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+


I also tried running this via SSH, but it didn't help:
Code:
modprobe nvidia-current-uvm && /usr/bin/nvidia-modprobe -c0 -u
 

Saberwolf

Explorer
Joined
Feb 7, 2021
Messages
63
The error `nvidia-container-cli: ldcache error: process /sbin/ldconfig failed with error code: 2: unknown` is an issue with link libraries. Open the shell in the web UI (System Settings > Shell) and run `modprobe nvidia-current-uvm && /usr/bin/nvidia-modprobe -c0 -u`, then check that you have these device nodes in TrueNAS SCALE:

Code:
root@truenas:~# ls /dev/nvidia*
/dev/nvidia-modeset  /dev/nvidia-uvm  /dev/nvidia-uvm-tools  /dev/nvidia0  /dev/nvidiactl

/dev/nvidia-caps:
nvidia-cap1  nvidia-cap2
root@truenas:~# 
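
The same check can be scripted. Below is a minimal Python sketch, assuming a single GPU at index 0 as in the listing above; the helper name `missing_nvidia_nodes` and the set of expected node names are made up for this example and simply mirror that listing.

```python
import glob
from pathlib import PurePosixPath

# Assumption: these are the device nodes from the listing above (one GPU, index 0).
# The /dev/nvidia-caps directory is deliberately left out of the check.
EXPECTED_NODES = {
    "nvidia0",
    "nvidiactl",
    "nvidia-uvm",
    "nvidia-uvm-tools",
    "nvidia-modeset",
}


def missing_nvidia_nodes(paths):
    """Return the expected device node names that do not appear in `paths`."""
    present = {PurePosixPath(p).name for p in paths}
    return sorted(EXPECTED_NODES - present)


if __name__ == "__main__":
    # On the TrueNAS host (or inside the container) this lists the real nodes.
    missing = missing_nvidia_nodes(glob.glob("/dev/nvidia*"))
    print("missing:", missing if missing else "none")
```

Run it on the host and again inside the app's shell; if the second run reports missing nodes that the first does not, the GPU is probably not being passed through to the container.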


Make sure
NVIDIA_VISIBLE_DEVICES=ALL
NVIDIA_DRIVER_CAPABILITIES=ALL
are defined in the environment variables.

Start the app, open a shell in the app, and check that you can run nvidia-smi. Also list the directories above and check that they look about the same. Don't worry if you're missing the nvidia-caps directory; it is not needed for hardware acceleration.

Oh, and if you have a GPU allocated to another app, you will get the above issue.
 

odoyle

I ran that command before I posted and verified those dirs exist. It is something with the container start, because it never even finishes deploying (so I can't enter the shell of the app to check anything).
 

Saberwolf

What video card are you using?
 

odoyle


Saberwolf

OK, the way you added the app is unsupported. You're using Frigate (`docker pull blakeblackshear/frigate`), which is a preconfigured Docker image and not a compose file. Add it to your apps by clicking the "Launch Docker Image" button at the top right of Apps.


OK, never mind; the people who made Frigate are doing things backwards. NVIDIA's official documentation states the following:

NVIDIA_DRIVER_CAPABILITIES

This option controls which driver libraries/binaries will be mounted inside the container.

Possible values

  • `compute,video`, `graphics,utility` …: a comma-separated list of driver features the container needs.
  • all: enable all available driver capabilities.
  • empty or unset: use default driver capability: utility,compute.

Supported driver capabilities

  • compute: required for CUDA and OpenCL applications.
  • compat32: required for running 32-bit applications.
  • graphics: required for running OpenGL and Vulkan applications.
  • utility: required for using nvidia-smi and NVML.
  • video: required for using the Video Codec SDK.
  • display: required for leveraging X11 display.
The above info was pulled from https://github.com/NVIDIA/nvidia-container-runtime#nvidia_driver_capabilities
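
As a rough illustration of the rules quoted above, here is a minimal Python sketch of how a value of NVIDIA_DRIVER_CAPABILITIES could be resolved into a set of capabilities. The function name `resolve_capabilities` is made up for this example, and matching names case-sensitively is an assumption; this is not the NVIDIA runtime's actual code.

```python
# Capability names and the default, as listed in the documentation quoted above.
SUPPORTED = {"compute", "compat32", "graphics", "utility", "video", "display"}
DEFAULT = {"utility", "compute"}


def resolve_capabilities(value):
    """Resolve an NVIDIA_DRIVER_CAPABILITIES value to a set of capability names.

    Assumption: names are matched case-sensitively, so "ALL" is not "all".
    """
    if value is None or value.strip() == "":
        # Empty or unset: fall back to the documented default.
        return set(DEFAULT)
    if value.strip() == "all":
        return set(SUPPORTED)
    requested = {item.strip() for item in value.split(",") if item.strip()}
    unknown = requested - SUPPORTED
    if unknown:
        raise ValueError(f"unsupported capabilities: {sorted(unknown)}")
    return requested
```

Under these assumptions, `resolve_capabilities("compute,video")` yields the two named capabilities, an unset value yields utility and compute, and an uppercase `"ALL"` is rejected as an unknown name.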

If I pass ALL in the capabilities variable, it flips out and complains that it is not allowed. Just use the default, which is empty or unset and resolves to utility,compute.

I don't know why you want to install Portainer when you can just launch a prebuilt Docker image from docker.io like I did; it was up and running in less time using the SCALE UI. Below are my app logs, showing what it was complaining about and how I fixed it.

The simple fix is to undefine the environment variables. Don't put anything in unless you need it, and you're fine and up and running.


2022-06-27 8:01:33 Started container ix-chart
2022-06-27 8:01:33 Created container ix-chart
2022-06-27 8:01:31 Container image "blakeblackshear/frigate:0.11.0-1d45b0b" already present on machine
2022-06-27 8:01:31 Add eth0 [172.15.0.28/16] from ix-net
Successfully assigned ix-frigate/frigate-ix-chart-74464d8cbf-z5kht to ix-truenas
2022-06-27 8:01:28 Created pod: frigate-ix-chart-74464d8cbf-z5kht
2022-06-27 8:01:28 Scaled up replica set frigate-ix-chart-74464d8cbf to 1
2022-06-27 8:01:11 Deleted pod: frigate-ix-chart-5b489cc446-5ntgb
2022-06-27 8:01:11 Scaled down replica set frigate-ix-chart-5b489cc446 to 0
2022-06-27 8:00:58 Error: failed to start container "ix-chart": Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: unsupported capabilities found in 'ALL' (allowed ''): unknown
2022-06-27 8:00:58 Created container ix-chart
2022-06-27 8:00:55 Container image "blakeblackshear/frigate:0.11.0-1d45b0b" already present on machine
2022-06-27 8:00:55 Add eth0 [172.15.0.27/16] from ix-net
Successfully assigned ix-frigate/frigate-ix-chart-5b489cc446-5ntgb to ix-truenas
2022-06-27 8:00:52 Created pod: frigate-ix-chart-5b489cc446-5ntgb
2022-06-27 8:00:52 Scaled up replica set frigate-ix-chart-5b489cc446 to 1
2022-06-27 7:59:42 Deleted pod: frigate-ix-chart-586b6d9d8d-26khm
2022-06-27 7:59:42 Scaled down replica set frigate-ix-chart-586b6d9d8d to 0
2022-06-27 7:56:35 Back-off restarting failed container
2022-06-27 7:56:14 Error: failed to start container "ix-chart": Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: unsupported capabilities found in 'ALL' (allowed ''): unknown
2022-06-27 7:56:13 Created container ix-chart
2022-06-27 7:56:14 Container image "blakeblackshear/frigate:0.11.0-1d45b0b" already present on machine
2022-06-27 7:56:11 Successfully pulled image "blakeblackshear/frigate:0.11.0-1d45b0b" in 30.913119097s
2022-06-27 7:55:40 Pulling image "blakeblackshear/frigate:0.11.0-1d45b0b"
2022-06-27 7:55:40 Add eth0 [172.15.0.26/16] from ix-net
Successfully assigned ix-frigate/frigate-ix-chart-586b6d9d8d-26khm to ix-truenas
2022-06-27 7:55:37 Created pod: frigate-ix-chart-586b6d9d8d-26khm
2022-06-27 7:55:37 Scaled up replica set frigate-ix-chart-586b6d9d8d to 1
2022-06-27 7:55:18 Deleted pod: frigate-ix-chart-766b4b897-pqd4c
2022-06-27 7:55:18 Scaled down replica set frigate-ix-chart-766b4b897 to 0
2022-06-27 7:54:20 Error: ErrImagePull
2022-06-27 7:54:20 Failed to pull image "blakeblackshear/frigate:latest": rpc error: code = Unknown desc = Error response from daemon: manifest for blakeblackshear/frigate:latest not found: manifest unknown: manifest unknown
2022-06-27 7:54:18 Pulling image "blakeblackshear/frigate:latest"
2022-06-27 7:54:23 Error: ImagePullBackOff
2022-06-27 7:54:23 Back-off pulling image "blakeblackshear/frigate:latest"
2022-06-27 7:54:23 Add eth0 [172.15.0.25/16] from ix-net
2022-06-27 7:54:20 Pod sandbox changed, it will be killed and re-created.
2022-06-27 7:54:18 Add eth0 [172.15.0.24/16] from ix-net
Successfully assigned ix-frigate/frigate-ix-chart-766b4b897-pqd4c to ix-truenas
2022-06-27 7:54:14 Created pod: frigate-ix-chart-766b4b897-pqd4c
2022-06-27 7:54:14 Scaled up replica set frigate-ix-chart-766b4b897 to 1
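
When an event log like this gets pasted as one long run of text, a few lines of Python can pull the failures out. This is only a sketch: the helper names `split_events` and `error_events` are made up here, and it assumes every event starts with a timestamp in the format shown in the log. Text that follows an event without its own timestamp stays attached to that event.

```python
import re

# Assumption: each event starts with a timestamp such as "2022-06-27 8:01:33".
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2} ")


def split_events(text):
    """Split a run-together log into one string per timestamped event."""
    starts = [match.start() for match in TIMESTAMP.finditer(text)]
    if not starts:
        return []
    bounds = starts + [len(text)]
    # Anything before the first timestamp is dropped.
    return [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]


def error_events(text):
    """Return only the events whose message contains "Error:"."""
    return [event for event in split_events(text) if "Error:" in event]
```

Calling `error_events` on the pasted log text would return only the container start failures and the image pull errors, which is usually where to begin reading.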
 

Saberwolf

You may need to do some more digging by asking the makers of Frigate what they support; give them that list and see if they can help any further. For me it works. You might have to add an env variable to get some things to work, but since they are doing things their own way, that is an unknown at this time.

Oh, and DO NOT ADD THE DIRS yourself; they won't work. They are link-library folders for apps. It sounds like it should work, but it will not unless the ldconfig program creates them.
 

Saberwolf

Any news on whether it worked? I'm pretty sure it did.
 

odoyle

Sorry I was traveling and wasn't able to test until today.
I'm having trouble translating my compose settings to Docker container settings. (You were using the blue "Launch Docker Image" button, right?)
Try using this image: blakeblackshear/frigate:0.11.0-beta4. I noticed you were getting an "image not found" error for the "latest" tag.
I don't know where to put several key things I would need for a basic Frigate setup (never mind GPU settings), for example:
- How do I specify port 1935?
- Where do I specify other params such as `--mount type=tmpfs,target=/tmp/cache,tmpfs-size=1000000000` and `--shm-size=64m`?

I tried both the container entrypoint CMD field and the arg field, but got: `no such file or directory: unknown`
 

odoyle

OK, I figured this out. I didn't need to set any container parameters, but I did need to use "privileged mode".

EDIT: since that Docker GUI doesn't allow any container args to be passed, this won't work, because runtime: nvidia would need to be passed :(
 

Saberwolf

They do allow container args.
They are called environment vars.
 

stavros-k

Patron
Joined
Dec 26, 2020
Messages
231
Saberwolf said: "They do allow container args. They are called environment vars."

Environment Variables are NOT the same as args.
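
To illustrate the distinction, here is a minimal Python sketch that is not specific to TrueNAS or Docker: a child process receives arguments and environment variables through two separate channels, and it has to look in the right one. The helper `run_child` and the flag `--example-flag` are hypothetical names used only for this demonstration.

```python
import os
import subprocess
import sys

# The child prints its arguments first, then one environment variable.
CHILD = (
    "import os, sys; "
    "print(sys.argv[1:]); "
    "print(os.environ.get('NVIDIA_VISIBLE_DEVICES'))"
)


def run_child(args, extra_env):
    """Start a child Python process with the given args and extra env vars.

    Returns the two lines the child prints: its argument list and the value
    it sees for NVIDIA_VISIBLE_DEVICES.
    """
    env = dict(os.environ, **extra_env)
    result = subprocess.run(
        [sys.executable, "-c", CHILD, *args],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    argv_line, env_line = result.stdout.splitlines()
    return argv_line, env_line


if __name__ == "__main__":
    # The flag arrives in sys.argv; the variable arrives in os.environ.
    print(run_child(["--example-flag"], {"NVIDIA_VISIBLE_DEVICES": "all"}))
```

Setting an environment variable never adds anything to the argument list, which is why a GUI that only exposes environment variables cannot stand in for a field that passes arguments.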

odoyle said: "OK, I figured this out. I didn't need to set any container parameters, but I did need to use "privileged mode". EDIT: since that Docker GUI doesn't allow any container args to be passed, this won't work, because runtime: nvidia would need to be passed :("

You can use the "custom-app" app from TrueCharts, which lets you pick your GPU from a dropdown; the runtime, the environment variables for NVIDIA, etc. are handled automatically.

And for future reference, "custom-app" supports args and command overrides.
 

odoyle

Thanks, I didn't know about the custom app option; it's much better than the iX default :)
Apologies if there is documentation somewhere, but where would I add the equivalent of part of a container command, such as `--shm-size=256m`? Would it be under "Configure Command" or "Configure Extra Args"?
 