SOLVED 13th generation iGPU passthrough

deniax

Cadet
Joined
Mar 5, 2023
Messages
8
Hi!

Whilst I know 13th generation Intel CPUs are very new, they share the same iGPU driver as Alder Lake. TrueNAS has support for the Alder Lake driver ( https://www.truenas.com/docs/scale/scale22.12/ ), but I can't get it to work on a Raptor Lake iGPU.

I did as was in the docs:
Code:
midclt call system.advanced.update '{"kernel_extra_options": "i915.force_probe=a780" }'


Where a780 is the output of:
Code:
root@truenas[~]# lspci | grep -i vga
00:02.0 VGA compatible controller: Intel Corporation Device a780 (rev 04)
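
The value passed to i915.force_probe is the PCI device ID of the iGPU, i.e. the four hex digits after "Device" in the lspci line above. As a minimal sketch (not part of TrueNAS; the function name force_probe_payload is made up for illustration), the JSON argument for midclt could be derived from that line like this. It also accepts the bracketed [8086:xxxx] form printed by lspci -nn, which helps when lspci shows a device name instead of the raw ID:

```python
import json
import re

# Fallback form lspci prints when its ID database has no name for the device,
# e.g. "Intel Corporation Device a780 (rev 04)".
UNNAMED_DEVICE = re.compile(r"\bDevice ([0-9a-fA-F]{4})\b")
# Numeric form printed by `lspci -nn`, e.g. "[8086:a780]" (8086 = Intel vendor ID).
NUMERIC_ID = re.compile(r"\[8086:([0-9a-fA-F]{4})\]")


def force_probe_payload(lspci_line: str) -> str:
    """Build the JSON argument for `midclt call system.advanced.update`.

    Hypothetical helper for illustration only.
    """
    match = NUMERIC_ID.search(lspci_line) or UNNAMED_DEVICE.search(lspci_line)
    if match is None:
        raise ValueError("no Intel PCI device ID found; try `lspci -nn`")
    device_id = match.group(1).lower()
    return json.dumps({"kernel_extra_options": f"i915.force_probe={device_id}"})


if __name__ == "__main__":
    line = "00:02.0 VGA compatible controller: Intel Corporation Device a780 (rev 04)"
    print(force_probe_payload(line))
```

The printed string can be pasted between single quotes as the last argument of the midclt command. Whether the kernel shipped with a given TrueNAS release actually drives the device once probing is forced is a separate question.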


But unfortunately no drivers are loaded (yes, a reboot was done :) ):
Code:
ls -alh /dev/dri            
total 0
drwxr-xr-x  2 root root   40 Mar 12 08:24 .
drwxr-xr-x 21 root root 4.5K Mar 12 08:24 ..


Am I overlooking something?
 

MrYoshii

Cadet
Joined
Mar 19, 2023
Messages
9
Hi, I've got the same issue. Did you find a solution?
 

crownrai

Dabbler
Joined
Mar 12, 2023
Messages
11
I have the same issue (Intel i3-13100 with Intel B760 chipset). VGA/Display device is listed as "4692 (rev 0c)"

I tried the extra kernel driver steps mentioned in this post for 12th gen CPUs, but no luck.

It also caused a problem where my console screen would stop working partway through the boot process, although the server would finish loading all services just fine. Another issue with these changes was that I could not reboot the server: it would hang during the shutdown sequence. I could still ping it, but I could not connect to any other services, so I had to perform a forced power-off.

I reverted all changes one by one, and after removing the i915.force_probe change, the console/shutdown issues went away.

I'm hoping the next release, with a kernel update to 6+, will fix the issue.
 

deniax

Cadet
Joined
Mar 5, 2023
Messages
8
The only way I got it "working" was not to run TrueNAS on bare metal.
I installed Proxmox, updated its kernel to 6.2, installed TrueNAS as a VM, and ran LXC containers, which worked almost out of the box with the 13th gen Intel iGPU. All is good now.

Note that using TrueNAS this way is unsupported.
 

hrayrwannis

Dabbler
Joined
Mar 18, 2023
Messages
11
Same issue with an i7-13700K: I can't get the iGPU to show up for apps:

admin@truenas[~]$ lspci | grep -i vga
00:02.0 VGA compatible controller: Intel Corporation Raptor Lake-S GT1 [UHD Graphics 770] (rev 04)

admin@truenas[~]$ sudo dmesg | grep i915
[0.000000] Command line: BOOT_IMAGE=/ROOT/22.12.1@/boot/vmlinuz-5.15.79+truenas root=ZFS=boot-pool/ROOT/22.12.1 ro libata.allow_tpm=1 amd_iommu=on iommu=pt kvm_amd.npt=1 kvm_amd.avic=1 intel_iommu=on zfsforce=1 nvme_core.multipath=N i915.force_probe=4690
[0.015639] Kernel command line: BOOT_IMAGE=/ROOT/22.12.1@/boot/vmlinuz-5.15.79+truenas root=ZFS=boot-pool/ROOT/22.12.1 ro libata.allow_tpm=1 amd_iommu=on iommu=pt kvm_amd.npt=1 kvm_amd.avic=1 intel_iommu=on zfsforce=1 nvme_core.multipath=N i915.force_probe=4690

admin@truenas[~]$ midclt call system.advanced.update '{"kernel_extra_options": "i915.force_probe=4690"}'
{"id": 1, "consolemenu": true, "serialconsole": false, "serialport": "ttyS0", "serialspeed": "9600", "powerdaemon": false, "swapondrive": 2, "overprovision": null, "traceback": true, "advancedmode": false, "autotune": false, "debugkernel": false, "uploadcrash": true, "anonstats": true, "anonstats_token": "", "motd": "Welcome to TrueNAS", "boot_scrub": 7, "fqdn_syslog": false, "sed_user": "USER", "sysloglevel": "F_INFO", "syslogserver": "", "syslog_transport": "UDP", "kdump_enabled": false, "isolated_gpu_pci_ids": [], "kernel_extra_options": "i915.force_probe=4690", "syslog_tls_certificate": null, "syslog_tls_certificate_authority": null, "consolemsg": true}

admin@truenas[~]$ ls -alh /dev/dri
total 0
drwxr-xr-x 2 root root 40 Mar 24 14:47 .
drwxr-xr-x 20 root root 3.7K Mar 24 14:47 ..
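
When debugging output like the above, two things are worth checking mechanically: which device ID the running kernel was actually told to force-probe, and whether any DRM nodes exist. The following is a rough sketch, not a TrueNAS tool; the function names are invented for illustration. It assumes the kernel command line is available as a string (on a live system it can be read from /proc/cmdline) and that i915.force_probe may carry a comma-separated list of IDs:

```python
from pathlib import Path


def force_probe_ids(cmdline: str) -> list[str]:
    """Return the device IDs named by i915.force_probe on a kernel command line."""
    ids: list[str] = []
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "i915.force_probe" and sep:
            ids.extend(v.lower() for v in value.split(",") if v)
    return ids


def dri_nodes(dri_dir: str = "/dev/dri") -> list[str]:
    """List card*/renderD* nodes; an empty list means no DRM driver bound."""
    path = Path(dri_dir)
    if not path.is_dir():
        return []
    return sorted(
        p.name for p in path.iterdir() if p.name.startswith(("card", "renderD"))
    )


if __name__ == "__main__":
    cmdline = Path("/proc/cmdline").read_text()
    print("force_probe IDs:", force_probe_ids(cmdline))
    print("DRM nodes:", dri_nodes())
```

If the ID printed here differs from the device ID that lspci -nn reports for the iGPU, the option cannot match the device, so that is one thing to rule out before concluding the kernel lacks support.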



Any ideas?
 

deniax

Cadet
Joined
Mar 5, 2023
Messages
8
Any updates?
For this to work natively, you will probably have to wait several months to a year until TrueNAS updates to a Linux kernel that has native support for this.
The only way to get it to work is the unsupported way I describe here.
 
Joined
May 1, 2023
Messages
2
Whilst I know the 13th generation Intel CPU is very very new, it shares the same iGPU driver as Alder Lake, and as Truenas has support for the Alder Lake driver ( https://www.truenas.com/docs/scale/scale22.12/ ) , I can't get it to work on Raptor Lake iGPU

I did as was in the docs:
Code:
midclt call system.advanced.update '{"kernel_extra_options": "i915.force_probe=a780" }'
Just adding, in case anyone is as confused as I am: the mentioned section has been removed from the documentation / release notes for 22.12, seemingly with this commit and seemingly without explanation. So, not supported at all?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Generally it is a bad idea to run FreeBSD or Linux on the latest generation CPUs. It often takes a few months for underlying OS support to be written, debugged, and settled in the upstream operating systems, then it takes several more months for an official release of that operating system, at which point the TrueNAS team may take that and start using it as a base for TrueNAS, which then takes a few more months. You should not expect this to work until actual support for a CPU generation is known to exist. Past experience is that it can take about a year from the time a CPU generation is released.

Is there any reason you need 13th generation CPU support rather than 12th? Other than that you already bought the 13th gen CPU?
 
Joined
May 1, 2023
Messages
2
Is there any reason you need 13th generation CPU support rather than 12th? Other than that you already bought the 13th gen CPU?
For me, no. There's no reason other than that price-to-performance at my local prices favoured 13th gen. I was hopeful that 13th gen could piggyback off 12th gen since they are so similar. But I completely understand that it is a risky timeframe and a bad idea, no argument here.

But the release notes only ever mentioned Alder Lake (12th gen), and the mention is now gone. So is this unsupported even for 12th gen (released in Nov. 2021)?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I'm not a developer and I haven't followed this, especially since I neither run bare metal nor SCALE. However, if you are interested, most of the work on TrueNAS is based on Jira, and you may be able to find a ticket in Jira that would give you insights as to how this happened. You could also try reporting it as a bug to see if you can taunt a developer into cluing you in. :smile:
 

Alex1

Cadet
Joined
Jun 1, 2023
Messages
1
Hello,

I managed to get it to work with a somewhat hacky solution using the alpha version of Cobia and decided to share for anyone that is interested and wants to use it before the official release.

The Cobia nightly builds have recently been updated to use Linux kernel 6.1, which has support for 13th gen, so you should be able to use the iGPU for apps (though I haven't tested that because I want to use it in a VM). You can find nightly builds here - TrueNAS Scale Cobia nightly builds.

I have tried some from around 25 May, but they had issues with creating the initial user for Web GUI login, so I began trying at random and used the first one which was working for me, which is TrueNAS-SCALE-23.10-MASTER-20230509-024310.iso.

From there on you need to make potentially 2 or 3 "fixes" depending on your situation. I have an ASUS W680 IPMI, so the IPMI was registering as a second GPU and I didn't have the errors for needing at least 1 GPU for the system (there is another forum thread for those errors which should contain relevant information if that is your case, but I won't discuss it, since I haven't used it).

First you need to edit the Python middleware that TrueNAS uses so that the iGPU can be isolated. There seems to be a check on GPUs and other PCI devices to see whether they are consuming critical resources; if they are, they aren't allowed to be isolated. The check itself doesn't seem very logical to me. It looks at the surrounding PCI devices (children of the parent of the validated device) and checks whether any of them have certain keywords in their names. In fact, the PCI device could be in a separate IOMMU group (and not contain the keywords in its name), which would allow it to be used, and that seems like a more reasonable check. I didn't have time to go through all the code to write the correct check, so I just made it mark all devices as not system critical (since the check doesn't seem to be used in many places and shouldn't cause problems if used responsibly).

The commands, in order, are: edit the file, delete the Python cache, and restart the middlewared service:

Code:
sudo nano /usr/lib/python3/dist-packages/middlewared/utils/gpu.py
sudo rm /usr/lib/python3/dist-packages/middlewared/utils/__pycache__/gpu.cpython-311.pyc
sudo service middlewared restart

In the gpu.py file, around line 60, after the for loop, you should add:
critical = False
which makes the device not critical and allows for isolation (or you could comment out the for loop, since the variable is initialized as False)
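
The IOMMU-group idea mentioned above can be sketched as follows. This is not the middleware's code and not a drop-in replacement for the check in gpu.py; the function names are hypothetical. The only thing it relies on is the standard Linux sysfs layout, where each PCI device lists the members of its IOMMU group under its iommu_group/devices directory. It only reports whether a device is alone in its group, which is one plausible criterion for whether isolating it would drag other devices along:

```python
from pathlib import Path


def iommu_group_members(pci_addr: str, sysfs_root: str = "/sys") -> list[str]:
    """Return the PCI addresses sharing an IOMMU group with pci_addr.

    Returns an empty list if the device is missing or has no IOMMU group
    (for example when the IOMMU is disabled).
    """
    group_dir = (
        Path(sysfs_root) / "bus" / "pci" / "devices" / pci_addr
        / "iommu_group" / "devices"
    )
    if not group_dir.is_dir():
        return []
    return sorted(entry.name for entry in group_dir.iterdir())


def is_alone_in_iommu_group(pci_addr: str, sysfs_root: str = "/sys") -> bool:
    """True only if the device exists and is the sole member of its group."""
    return iommu_group_members(pci_addr, sysfs_root) == [pci_addr]


if __name__ == "__main__":
    # 0000:00:02.0 is the iGPU address shown by lspci earlier in the thread.
    addr = "0000:00:02.0"
    print(addr, "group members:", iommu_group_members(addr))
    print("alone in group:", is_alone_in_iommu_group(addr))
```

The sysfs_root parameter exists only so the functions can be pointed at a fake directory tree for testing.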

The second "fix" is needed because the code that generates the list of available PCI devices for a VM seems to have been changed and is in a broken state: one of the values in the map is a generator instead of a bool, so Python cannot serialize it into a JSON response for the frontend.

Code:
sudo nano /usr/lib/python3/dist-packages/middlewared/plugins/vm/pci.py
sudo rm /usr/lib/python3/dist-packages/middlewared/plugins/vm/__pycache__/pci.cpython-311.pyc
sudo service middlewared restart

Around line 100 we want to add the word 'any' so that the Generator is evaluated to a bool:
Before:
'critical': (k.lower() in controller_type.lower() for k in SENSITIVE_PCI_DEVICE_TYPES),
After:
'critical': any(k.lower() in controller_type.lower() for k in SENSITIVE_PCI_DEVICE_TYPES),
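
To see why the one-word change matters, here is a small standalone reproduction using only the standard library. It is a sketch, not the middleware code: the contents of SENSITIVE_PCI_DEVICE_TYPES below are placeholder values chosen for illustration, and the helper names are made up. A bare parenthesised comprehension is a generator object, which the json module refuses to encode, while any(...) consumes it and yields a plain bool:

```python
import json

# Placeholder values for illustration; the real list lives in the middleware.
SENSITIVE_PCI_DEVICE_TYPES = ("host bridge", "bridge", "usb")


def describe(controller_type: str, use_any: bool) -> dict:
    """Build a device entry with or without the any() fix."""
    matches = (
        k.lower() in controller_type.lower() for k in SENSITIVE_PCI_DEVICE_TYPES
    )
    return {"critical": any(matches) if use_any else matches}


def serializes(value) -> bool:
    """Report whether json.dumps accepts the value."""
    try:
        json.dumps(value)
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    for use_any in (False, True):
        entry = describe("VGA compatible controller", use_any)
        print("use_any =", use_any, "-> JSON serializable:", serializes(entry))
```

The middleware may use a different serializer than json.dumps, but any encoder that does not special-case generators will hit the same problem.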

After that, and possibly a restart of the server, you should be able to isolate the iGPU and create a VM. Don't add the iGPU to the VM as a GPU; add it as a PCI device instead, and everything should work.

I have tested this with an Ubuntu 22 image and installed Jellyfin on it to validate that QuickSync is working for transcoding.
 

jace92

Dabbler
Joined
Dec 14, 2021
Messages
46
Is there any reason you need 13th generation CPU support rather than 12th? Other than that you already bought the 13th gen CPU?

Yea... for me I just noticed this same issue on my i5-13400 and my reasoning for 13th gen is because I already bought it... :frown: Crap... That's a big problem...
 

jace92

Dabbler
Joined
Dec 14, 2021
Messages
46
So... I decided to use one of my breaks at work to do this today and I can confirm that I have HW transcoding again!!

...

However...

I can no longer manage my datasets from the GUI. They all still show via SMB and from using
Code:
zfs list -r
, but they aren't in the GUI.

 

jace92

Dabbler
Joined
Dec 14, 2021
Messages
46
Just updating here in case anyone else has the same issue I posted about a couple days ago.

To fix the dataset issue (for me at least) I had to remove my cloud sync task and then they showed right up (I first saw that nugget of info here). Hopefully that helps someone.
 

help!

Explorer
Joined
Aug 3, 2023
Messages
57
Do you not find this beta to be unstable? I've got the same chip as you, if it's the 10-core one. It uses 45 watts from the wall, so I'm looking to reduce that soon. However, I went to Cobia, mate, and VPS broke, snapshots didn't work, rsync broke, there were no backups, I couldn't set the region, and replication tasks would not run. So I'm on stable again.

I upgraded my ZFS pools. I'm not putting in the command because it bricked my install. Clearly I had no idea what I was doing, but it prevented me from downgrading: I had to copy everything to another TN system over SMB and then back to a wiped stable TNS. So be warned, and don't upgrade your pools to get features unless you truly know what you're doing.
 

jace92

Dabbler
Joined
Dec 14, 2021
Messages
46
Do you not find this beta to be unstable

I haven't had any other issues (that I know of at least). I've been stable other than the datasets not showing. For me though the cloud sync is important as it's my off site backup, so I'm just leaving it as is until it's fixed. If I ever need to modify anything with the datasets before that happens then I'll delete the sync, do what I need to do, then re-add the sync.
 