bcat
I have a NAS built on an ASRock Rack E3C246D4M-4L (full specs in my signature) with a Xeon E-2276G processor. I'm running the latest stable build of Cobia (23.10.0.1). This configuration leaves me with two GPUs: the one in the AST2500 BMC and the processor's iGPU.
The TrueNAS console displays on the BMC's GPU, and the VGA port on the motherboard mirrors that output. The iGPU itself has no video outputs. It's also in its own IOMMU group (Group 0) as verified via this script:
Code:
IOMMU Group 0:
    00:02.0 Display controller [0380]: Intel Corporation CoffeeLake-S GT2 [UHD Graphics P630] [8086:3e96]
IOMMU Group 1:
    00:00.0 Host bridge [0600]: Intel Corporation 8th Gen Core Processor Host Bridge/DRAM Registers [8086:3ec6] (rev 07)
IOMMU Group 2:
    00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
    00:01.1 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) [8086:1905] (rev 07)
    00:01.2 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x4) [8086:1909] (rev 07)
    01:00.0 Non-Volatile memory controller [0108]: Phison Electronics Corporation PS5013 E13 NVMe Controller [1987:5013] (rev 01)
    02:00.0 Non-Volatile memory controller [0108]: Seagate Technology PLC FireCuda 510 SSD [1bb1:5012] (rev 01)
    03:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
IOMMU Group 3:
    00:08.0 System peripheral [0880]: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model [8086:1911]
IOMMU Group 4:
    00:12.0 Signal processing controller [1180]: Intel Corporation Cannon Lake PCH Thermal Controller [8086:a379] (rev 10)
IOMMU Group 5:
    00:14.0 USB controller [0c03]: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller [8086:a36d] (rev 10)
    00:14.2 RAM memory [0500]: Intel Corporation Cannon Lake PCH Shared SRAM [8086:a36f] (rev 10)
IOMMU Group 6:
    00:15.0 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH Serial IO I2C Controller #0 [8086:a368] (rev 10)
    00:15.1 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH Serial IO I2C Controller #1 [8086:a369] (rev 10)
IOMMU Group 7:
    00:16.0 Communication controller [0780]: Intel Corporation Cannon Lake PCH HECI Controller [8086:a360] (rev 10)
    00:16.1 Communication controller [0780]: Intel Corporation Device [8086:a361] (rev 10)
    00:16.4 Communication controller [0780]: Intel Corporation Cannon Lake PCH HECI Controller #2 [8086:a364] (rev 10)
IOMMU Group 8:
    00:17.0 SATA controller [0106]: Intel Corporation Cannon Lake PCH SATA AHCI Controller [8086:a352] (rev 10)
IOMMU Group 9:
    00:1b.0 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #17 [8086:a340] (rev f0)
IOMMU Group 10:
    00:1b.4 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #21 [8086:a32c] (rev f0)
IOMMU Group 11:
    00:1c.0 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #1 [8086:a338] (rev f0)
IOMMU Group 12:
    00:1d.0 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #9 [8086:a330] (rev f0)
IOMMU Group 13:
    00:1d.1 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #10 [8086:a331] (rev f0)
IOMMU Group 14:
    00:1d.2 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #11 [8086:a332] (rev f0)
IOMMU Group 15:
    00:1d.3 PCI bridge [0604]: Intel Corporation Cannon Lake PCH PCI Express Root Port #12 [8086:a333] (rev f0)
IOMMU Group 16:
    00:1e.0 Communication controller [0780]: Intel Corporation Cannon Lake PCH Serial IO UART Host Controller [8086:a328] (rev 10)
IOMMU Group 17:
    00:1f.0 ISA bridge [0601]: Intel Corporation Cannon Point-LP LPC Controller [8086:a309] (rev 10)
    00:1f.4 SMBus [0c05]: Intel Corporation Cannon Lake PCH SMBus Controller [8086:a323] (rev 10)
    00:1f.5 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH SPI Controller [8086:a324] (rev 10)
    00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (7) I219-LM [8086:15bb] (rev 10)
IOMMU Group 18:
    05:00.0 Ethernet controller [0200]: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:1003]
IOMMU Group 19:
    07:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
IOMMU Group 20:
    08:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
IOMMU Group 21:
    09:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
IOMMU Group 22:
    0a:00.0 PCI bridge [0604]: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge [1a03:1150] (rev 04)
    0b:00.0 VGA compatible controller [0300]: ASPEED Technology, Inc. ASPEED Graphics Family [1a03:2000] (rev 41)
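For anyone who wants to cross-check a listing like this, here is a minimal Python sketch that reads the group membership straight from sysfs. It is not the script used above. The function names `list_iommu_groups` and `group_of` are made up for illustration, and it assumes a Linux host that exposes groups under /sys/kernel/iommu_groups (it prints bare PCI addresses only, without the lspci descriptions).

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [pci_address, ...]} as found under `root`."""
    groups = {}
    root_path = Path(root)
    if not root_path.is_dir():
        # IOMMU disabled, or not a Linux host: nothing to report.
        return groups
    for group_dir in root_path.iterdir():
        devices_dir = group_dir / "devices"
        if not group_dir.name.isdigit() or not devices_dir.is_dir():
            continue
        # Each entry in devices/ is named after a PCI address, e.g. 0000:00:02.0.
        groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


def group_of(pci_address, groups):
    """Return the group number containing `pci_address`, or None if absent."""
    for number, devices in groups.items():
        if pci_address in devices:
            return number
    return None


if __name__ == "__main__":
    groups = list_iommu_groups()
    for number in sorted(groups):
        print(f"IOMMU Group {number}:")
        for address in groups[number]:
            print(f"    {address}")
```

A device is alone in its group when the list returned for `group_of("0000:00:02.0", groups)` has exactly one entry, which is the situation described for the iGPU here.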
The iGPU works fine for hardware transcoding on the TrueNAS host (e.g., with the Plex app), but now I'd like to pass it through to a VM (as I'm trying to migrate away from SCALE's k3s apps to a setup that's easier for me to maintain). As far as I can tell, this should work swimmingly, since the iGPU is the only device in its IOMMU group (which should be the ideal case).
However, when I try to "isolate" the iGPU in SCALE's advanced settings, I get this error: "0000:00:02.0 GPU pci slot(s) consists of devices which cannot be isolated from host."
Unfortunately, SCALE doesn't tell me what devices it thinks cannot be isolated, so this is quite hard to debug further. Any tips would be appreciated! (I'll likely file a bug in the Jira in a bit, but it's quite possible I'm making some silly mistake on my end, so I wanted to check first.)
Edit: Weirdly, the middleware code that ultimately throws this error doesn't seem to look at IOMMU groups at all! I'm admittedly quite new to hardware passthrough use cases, but that doesn't seem right to me. It seems to disallow isolation of GPU devices that have certain PCIe siblings, without regard to which IOMMU groups those siblings are actually in. Maybe that's just a bug?
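To make the distinction concrete, here is a small hypothetical Python sketch. It is NOT the middleware's code. It only contrasts two ways of deciding which neighbours count against a GPU, using a few addresses from the listing in this post. The `HOST_CRITICAL` set and all function names are assumptions made up for illustration, and "sibling" is read here as "on the same PCI bus", which may not be what the middleware actually does.

```python
# Hypothetical illustration only; not TrueNAS middleware logic.
# A few devices from the listing in this post: address -> (IOMMU group, class).
DEVICES = {
    "00:00.0": (1, "Host bridge"),
    "00:02.0": (0, "Display controller"),
    "00:1f.0": (17, "ISA bridge"),
    "0b:00.0": (22, "VGA compatible controller"),
}

# Assumption: device classes the host is taken to need for itself.
HOST_CRITICAL = {"Host bridge", "ISA bridge"}


def bus_of(address):
    """Return the bus part of a bus:device.function address."""
    return address.split(":")[0]


def bus_siblings(address):
    """Every known device on the same PCI bus, including the device itself."""
    return sorted(a for a in DEVICES if bus_of(a) == bus_of(address))


def group_members(address):
    """Every known device in the same IOMMU group, including the device itself."""
    group = DEVICES[address][0]
    return sorted(a for a, (g, _) in DEVICES.items() if g == group)


def blockers(addresses):
    """The subset of `addresses` whose class is in HOST_CRITICAL."""
    return [a for a in addresses if DEVICES[a][1] in HOST_CRITICAL]


if __name__ == "__main__":
    gpu = "00:02.0"
    print("blocked when judged by bus siblings:", blockers(bus_siblings(gpu)))
    print("blocked when judged by IOMMU group: ", blockers(group_members(gpu)))
```

Under these assumptions the bus-based view finds the host bridge and ISA bridge next to the iGPU and would refuse isolation, while the group-based view finds the iGPU alone in Group 0 and would allow it.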