Scale requires GPU for host ... request: allow single GPU for passthrough

Status
Not open for further replies.

jahf

Cadet
Joined
Apr 8, 2022
Messages
3
Caveat: I'm new to these parts, investigating a switch from Proxmox to Scale (from Unraid, then a stock Debian build for a while).

I want to isolate my GPU for use with VMs, but when I do, the UI tells me "At least 1 GPU is required by the host for it's functions. With your selection, no GPU is available for the host to consume."

However, removing the GPU shows that the system runs perfectly fine headless.

I see that there are significant advantages to having a fallback output in case of network failure, but a headless system wouldn't have this anyway.

I see how this might not be something TrueNAS has supported in the past as it's more of a VFIO setup than a traditional NAS setup.

But ... if I could do this, TrueNAS Scale really works for my converged system.

So 2 parts:

1) Is there a decent way to work around this without breaking the concept of not trying to do much to the host OS? If not, I'm likely locked out of Scale for my uses, and that would make me sad as I really like the rest of what I'm seeing.

2) Feature request for a future release: support this in the UI.

Brainstorming a bit further purely for the feature request:

In an ideal environment we might have an option to set a timeout on the boot menu that, if not interrupted, will boot with GPU on the host so that we can recover (systems with IPMI or other OOB management wouldn't need this, so it would be nice to have a config setup for the boot menu to switch the default behavior). Also in an ideal environment we'd be able to attach/detach the GPU from the host on demand (for instance, to have some part-time services via containers that use the GPU that can be turned off to then attach to a VM).

But even without those 2 ideal options, since the server runs fine headless, we should have the option to force headless mode even when there is a GPU present.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
At least 1 GPU is required by the host for it's functions. With your selection, no GPU is available for the host to consume.
That seems bizarre. It's probably fair to say that most servers around here have zero GPUs, with only basic video adapters.
 

jahf

Cadet
Joined
Apr 8, 2022
Messages
3
I took a shot at unbinding the GPU from the host to see if the UI would then let me isolate the single GPU.

However, while detaching was successful in freeing the GPU via `driverctl` (which, yes, I installed using `apt` from the main Debian repo and so immediately went off the official TNS image), the UI still throws the errors when trying to use the GPU for a VM. Which, honestly, I expected. But it was worth a shot.
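For anyone wanting to confirm that a detach like this actually took effect, here is a minimal sketch in Python that reads the standard Linux sysfs `driver` symlink for a PCI device. The function name `bound_driver`, the `sysfs_root` parameter, and the example address `0000:01:00.0` are assumptions made for illustration; they are not part of TrueNAS or `driverctl`.

```python
import os
from typing import Optional


def bound_driver(pci_address: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Return the kernel driver bound to a PCI device, or None if unbound.

    pci_address is the full address including the domain, for example
    "0000:01:00.0" (a hypothetical address; substitute your own).
    """
    # Linux exposes the bound driver as a symlink named "driver" inside
    # the device's sysfs directory. The link is absent when no driver
    # is bound to the device.
    link = os.path.join(sysfs_root, "bus", "pci", "devices", pci_address, "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # Hypothetical address; prints None if the device does not exist
    # or has no driver bound.
    print(bound_driver("0000:01:00.0"))
```

A result of `vfio-pci` would suggest the device is held for passthrough, while `None` would suggest nothing is bound. Either way, this only reports the kernel's view and does not change what the TrueNAS UI allows.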

I wrote everything I looked up on a github file with more details on what I was hoping to see. If someone from iX reviews this, the info is here:

 

jahf

Cadet
Joined
Apr 8, 2022
Messages
3
I did manage to hack the TrueNAS Scale UI to allow a single GPU to be isolated from the host and attached to a VM.

I've updated the GitHub page from my prior comment with detailed information on where I'm at.

At this point I have a VM with the GPU attached, the driver installed, and no Device Manager errors. However, I get a black screen instead of video output if I keep the VNC display attached to the VM, and the VM fails to boot when the VNC display is removed. I think this is going to come down to needing a romfile added.
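For reference, in plain libvirt a ROM file is attached to a passed-through PCI device with a `<rom file='...'/>` element inside the `<hostdev>` definition. The sketch below builds such a fragment in Python. It is only an illustration under stated assumptions: the helper name `hostdev_with_rom`, the PCI address, and the ROM path are hypothetical, and whether TrueNAS Scale would keep a hand-edited VM definition is not something this thread confirms.

```python
import xml.etree.ElementTree as ET


def hostdev_with_rom(bus: int, slot: int, function: int,
                     rom_path: str, domain: int = 0) -> str:
    """Build a libvirt <hostdev> fragment for a PCI device with a ROM file."""
    hostdev = ET.Element(
        "hostdev", {"mode": "subsystem", "type": "pci", "managed": "yes"}
    )
    source = ET.SubElement(hostdev, "source")
    # The host-side PCI address of the device being passed through.
    ET.SubElement(source, "address", {
        "domain": f"0x{domain:04x}",
        "bus": f"0x{bus:02x}",
        "slot": f"0x{slot:02x}",
        "function": f"0x{function:x}",
    })
    # The ROM image the guest should see in place of the card's own ROM.
    ET.SubElement(hostdev, "rom", {"file": rom_path})
    return ET.tostring(hostdev, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical address and path, for illustration only.
    print(hostdev_with_rom(0x01, 0x00, 0x0, "/path/to/vbios.rom"))
```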

Which means CLI edits to the VM config, at which point I'm asking myself how much I'm really helping myself using TNS if I've needed to hack the UI and then do shell edits on files to get everything working.

This isn't meant as a negative note about TNS; it's young. I'm just pointing out the areas where improvements might help.

If I get the video output working I'll post back, but if I don't it may mean I've moved the system to a different OS for now.
 

Migsi

Dabbler
Joined
Mar 3, 2021
Messages
40

raudraido

Cadet
Joined
May 25, 2021
Messages
8
I have tested Scale since the alpha stage, and at one point I had headless TrueNAS running without any tweaking. It was quite a negative surprise that it was blocked in the UI at a later stage.
 

mikegleasonjr

Dabbler
Joined
Sep 14, 2020
Messages
10
I'd like to upgrade my CPU to one that isn't an APU. I don't want my discrete GPU to just sit there as the TrueNAS primary display while I have no monitors attached.

Can it be considered please?
 

raudraido

Cadet
Joined
May 25, 2021
Messages
8
I have tested Scale since the alpha stage, and at one point I had headless TrueNAS running without any tweaking. It was quite a negative surprise that it was blocked in the UI
I'd like to upgrade my CPU to one that isn't an APU. I don't want my discrete GPU to just sit there as the TrueNAS primary display while I have no monitors attached.

Can it be considered please?
As I remember, Scale works fine without any GPU, but if you add one, it is reserved for the system and cannot be used by VMs. Quite silly.
 

Deeda

Explorer
Joined
Feb 16, 2021
Messages
65
Just hit the same issue, even though I have an onboard ASPEED 2400 integrated IPMI that TrueNAS Scale listed as a GPU option before I installed a GeForce 1030. Now that the 1030 card is in there, only one GPU is seen, and it won't let me pass it through to the VM.
 

radomirpolach

Explorer
Joined
Feb 13, 2022
Messages
71
That seems bizarre. It's probably fair to say that most servers around here have zero GPUs, with only basic video adapters.
It is even more bizarre for me: I actually have two GPUs, an integrated AMD and a dedicated NVIDIA, and I can't pass the NVIDIA to my VM.

IOMMU Group 0:
00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge [1022:1632]
00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge [1022:1633]
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:24b0] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation GA104 High Definition Audio Controller [10de:228b] (rev a1)
IOMMU Group 1:
00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge [1022:1632]
00:02.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge [1022:1634]
00:02.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge [1022:1634]
02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:43ee]
02:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] Device [1022:43eb]
02:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43e9]
03:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43ea]
03:08.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43ea]
04:00.0 Non-Volatile memory controller [0108]: Intel Corporation SSD Pro 7600p/760p/E 6100p Series [8086:f1a6] (rev 03)
05:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
06:00.0 Non-Volatile memory controller [0108]: Micron/Crucial Technology P1 NVMe PCIe SSD [c0a9:2263] (rev 03)
IOMMU Group 2:
00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge [1022:1632]
00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir Internal PCIe GPP Bridge to Bus [1022:1635]
07:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636] (rev d9)
07:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1637]
07:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor [1022:15df]
07:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Renoir USB 3.1 [1022:1639]
07:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Renoir USB 3.1 [1022:1639]
IOMMU Group 3:
00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 51)
00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU Group 4:
00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 0 [1022:1448]
00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 1 [1022:1449]
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 2 [1022:144a]
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 3 [1022:144b]
00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 4 [1022:144c]
00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 5 [1022:144d]
00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 6 [1022:144e]
00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 7 [1022:144f]
I have the same error message.
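A listing like the one above (where the NVIDIA card at 01:00.0 and 01:00.1 sits in group 0 and the integrated AMD GPU at 07:00.0 sits in group 2) can also be collected programmatically. Below is a minimal sketch in Python that reads the standard Linux `/sys/kernel/iommu_groups` tree. The function names `iommu_groups` and `group_of` and the `sysfs_root` parameter are assumptions made for illustration, not anything provided by TrueNAS.

```python
import os
from typing import Dict, List, Optional


def iommu_groups(sysfs_root: str = "/sys") -> Dict[int, List[str]]:
    """Map each IOMMU group number to the PCI addresses it contains."""
    base = os.path.join(sysfs_root, "kernel", "iommu_groups")
    groups: Dict[int, List[str]] = {}
    if not os.path.isdir(base):
        # No groups exposed, e.g. the IOMMU is disabled in firmware or kernel.
        return groups
    for name in os.listdir(base):
        if not name.isdigit():
            continue
        # Each group directory has a "devices" subdirectory with one
        # entry per member device, named by its full PCI address.
        devices_dir = os.path.join(base, name, "devices")
        groups[int(name)] = sorted(os.listdir(devices_dir))
    return groups


def group_of(pci_address: str, groups: Dict[int, List[str]]) -> Optional[int]:
    """Return the group number containing pci_address, or None."""
    for group_id, members in groups.items():
        if pci_address in members:
            return group_id
    return None


if __name__ == "__main__":
    for group_id, members in sorted(iommu_groups().items()):
        print(f"IOMMU Group {group_id}: {', '.join(members)}")
```

Two GPUs landing in different groups, as in the listing above, is what you would normally want for passing one of them through, which is why the UI refusal here looks like a policy check rather than a hardware limitation.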
 

Deeda

Explorer
Joined
Feb 16, 2021
Messages
65
Disable the Nvidia card (PCIE) as your primary card in your BIOS if you can. That fixed it for me.
 

Deeda

Explorer
Joined
Feb 16, 2021
Messages
65
It is a shame. I do notice that a few so-called "gurus" on the TrueNAS forums launch into lectures or condescension rather than helping people.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Yeah, don't be surprised if people tell you to leave if your opening line is to insult them.

<Moderator note>
I will take this opportunity to remind everyone of the forum rules, as well as common courtesy. The matter was resolved on Github and we expect all users to adhere to the forum rules.
Constructive criticism is good, insults are not.

Please refrain from making vague, hand-wavy accusations of improper conduct, such as the preceding post. If you consider a certain post to be inappropriate, use the report button.
</Moderator note>
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
It is a shame. I do notice that a few so-called "gurus" on the TrueNAS forums launch into lectures or condescension rather than helping people.

A "lecture" is defined by Oxford English to be "an educational talk to an audience, especially to students in a university or college."

The forum membership is certainly an audience. Some of us do indeed attempt to provide lecture-grade briefs on the complex topics that often confuse ZFS newcomers. If you do not want in-depth discussions of your problems, you are free to not post them. Educational discussions aimed at being useful to community members searching for solutions are deemed to be on-topic and appropriate to these forums.

Please be aware that "condescension" is often in the eye of the beholder, and also be aware that many participants here may not be native English speakers. It would be prudent to apply Postel's Law to communications on this forum, "be conservative in what you do, be liberal in what you accept from others". This loosely translates to "try not to find a way to take offense at what someone says, if there's a chance that no offense was meant."

This forum is the result of what everyone makes of it. Be excellent, friendly, helpful, and courteous. It will make for a better forum experience.
 