NVMe Optane problem and possible solution

suhl

Cadet
Joined
Sep 8, 2020
Messages
4
After installing Optane P4801X NVMe drives in passthrough mode on VMware 7.0 with TrueNAS Core 12, the system doesn't start and prints numerous "nvme0: Missing Interrupt" error messages. Apparently it's a known problem with the ena driver that has been addressed and was recently fixed in FreeBSD 12.1. It's described here:
Is there a chance that this suggested patch will be applied in the next TrueNAS release?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Please file a bug report bringing that ticket to the devs' attention, just in case. A few other users have reported this in the past.
 

mav@

iXsystems
Joined
Sep 29, 2011
Messages
1,428
The described problem with ena(4) is specific to that driver, as is the patch. Some other NIC drivers were known to have the same problem, and many were fixed in earlier versions. The upcoming TrueNAS 12.0-RC1 will include the latest state of FreeBSD 12-STABLE with all the patches, including this one, which puts it very close to the upcoming FreeBSD 12.2-RELEASE. You may try the latest nightly builds to check the latest code.

There have been some previous reports of NVMe interrupt timeouts, including in virtualized setups, but not many. The NVMe driver should be able to handle a certain number of missing interrupts, and unless there are too many of them (for example, if interrupts are not functioning at all), it should not block booting. Please try enabling verbose messages in the boot menu and attach the output in some way to the ticket you create. If the system eventually manages to boot, also attach a FreeNAS debug (System -> Advanced -> Save Debug).
 

suhl

Cadet
Joined
Sep 8, 2020
Messages
4
Thanks, guys. I'll file a bug report as you requested with all the data. I'll try the nightly builds as well. The system is not able to boot with the NVMe device connected right now, so I won't be able to get the debug report, but I will certainly be able to capture the verbose boot messages, which I'll include in the bug report.
 

suhl

Cadet
Joined
Sep 8, 2020
Messages
4
In case somebody stumbles upon this thread from Google search results, I'm posting a workaround that helped in my situation. I found it on the STH forum.
The trick was to edit /boot/loader.conf on my TrueNAS instance and add the following line:
hw.pci.honor_msi_blacklist=0
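
For anyone who would rather script the change than edit the file by hand, here is a minimal sketch. The helper name add_tunable is my own invention, not a TrueNAS or FreeBSD tool. It assumes a root shell on the TrueNAS host, keeps a .bak copy of the file, and only appends the line if the tunable is not already set. As far as I understand it, the tunable tells the PCI code to ignore its MSI/MSI-X blacklist, so treat it as a workaround rather than a fix.

```sh
#!/bin/sh
# Hypothetical helper: append the workaround tunable to a loader.conf-style
# file, but only if that file does not already set it.
add_tunable() {
    file="$1"

    # Keep a backup copy next to the original before touching it.
    if [ -f "$file" ]; then
        cp -p "$file" "$file.bak"
    fi

    # Look for an existing, uncommented assignment of the tunable.
    if grep -q '^hw\.pci\.honor_msi_blacklist=' "$file" 2>/dev/null; then
        echo "hw.pci.honor_msi_blacklist is already set in $file"
    else
        printf '%s\n' 'hw.pci.honor_msi_blacklist=0' >> "$file"
        echo "added hw.pci.honor_msi_blacklist=0 to $file"
    fi
}

# Example call (run as root, then reboot):
#   add_tunable /boot/loader.conf
```

After a reboot, running sysctl hw.pci.honor_msi_blacklist should report 0 if the loader picked the setting up. Note that a manual edit of /boot/loader.conf may not survive a TrueNAS upgrade. Adding the same variable as a loader-type tunable in the web UI is probably the more durable route, though I have not verified that on every release.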
 