Hi,
I'm running TrueNAS SCALE 22.12.0 on a SuperMicro X10DRU board (latest BIOS, E5-2690 v3 CPU). It works fine in daily use. I have a couple of VMs on there, and I enabled 'PCI Passthrough' for one of them, passing through (as seen from lspci):
03:00.0 Non-Volatile memory controller: Sandisk Corp WD Black SN750 / PC SN730 NVMe SSD
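For reference, this is roughly how I check the device and its IOMMU group from the SCALE shell (assuming the device stays at address 0000:03:00.0; the sysfs path is standard):
# show the device and which kernel driver currently owns it
lspci -nnk -s 03:00.0
# list everything sharing its IOMMU group (ideally just this one device)
ls /sys/bus/pci/devices/0000:03:00.0/iommu_group/devices/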
This works fine at first: the Linux virtual machine can see the NVMe when it boots and can access it. Everything works great for a few days, and then on TrueNAS SCALE I see a flurry of these:
[412820.847944] __common_interrupt: 6.34 No irq handler for vector
[412827.830043] __common_interrupt: 6.34 No irq handler for vector
These get dumped to the console and to my logged-in session.
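In case anyone wants to watch for these live, something like this works (plain util-linux dmesg and grep):
# follow the kernel ring buffer and keep only the vector errors
dmesg --follow | grep --line-buffered 'No irq handler for vector'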
In the virtual machine I see the corresponding errors:
[406246.844310] nvme nvme0: I/O 703 QID 3 timeout, completion polled
[406246.844332] nvme nvme0: I/O 704 QID 3 timeout, completion polled
[406260.922879] nvme nvme0: I/O 705 QID 3 timeout, completion polled
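When this happens I can roughly gauge the controller state inside the guest like so (the sysfs attribute is standard; the second command needs nvme-cli installed):
# controller state as the guest kernel sees it ("live", "resetting", "dead", ...)
cat /sys/class/nvme/nvme0/state
# poke the device directly; this hangs or errors once the controller is gone
nvme smart-log /dev/nvme0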
At this point the virtual machine loses all access to the NVMe. If I 'force stop' it, I can see the NVMe being returned to TrueNAS:
[414221.201968] nvme nvme0: pci function 0000:03:00.0
[414221.237193] nvme nvme0: 24/0/0 default/read/poll queues
[414221.253739] nvme0n1: p1
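To double-check it really came back, the namespace should be visible as a block device on the host again (standard lsblk, device name from the log above):
lsblk /dev/nvme0n1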
And when I restart the VM, I can see it presumably being taken away again:
[414319.946571] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x19@0x300
[414319.953971] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x1e@0x900
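While the VM is running I can confirm vfio-pci owns the function (same lspci as above):
# expect "Kernel driver in use: vfio-pci" here
lspci -k -s 03:00.0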
And the VM starts and can access the NVMe again, until the cycle repeats.
Any ideas?