Boot stuck at "Reinitializing controller" step after 12.0-U1 system update

blckhm

Dabbler
Joined
Sep 24, 2018
Messages
42
Hi guys,

Tonight I updated my MainReplica from 12.0-RELEASE to 12.0-U1. After the reboot, the system gets stuck at the boot screen.

Here are some details:
Chassis: Dell R730xd
CPU: 1x Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz
RAM: 128G
OS: ESXi 6.7.0U3
HBA: Dell HBA330 mini (Passthrough)

Code:

Adapter Selected is a Avago SAS: SAS3008(C0)

Controller Number              : 0
Controller                     : SAS3008(C0)
PCI Address                    : 00:03:00:00
SAS Address                    : 5d09466-0-92d9-8600
NVDATA Version (Default)       : 0e.00.00.36
NVDATA Version (Persistent)    : 0e.00.00.36
Firmware Product ID            : 0x2221 (IT)
Firmware Version               : 15.15.06.00
NVDATA Vendor                  : LSI
NVDATA Product ID              : Dell HBA330 Mini
BIOS Version                   : 08.35.02.00
UEFI BSD Version               : 17.07.01.00
FCODE Version                  : N/A
Board Name                     : Dell HBA330 Mini
Board Assembly                 : N/A
Board Tracer Number            : N/A
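
For anyone comparing adapter details across several hosts, here is a minimal Python sketch that turns the "Key : Value" lines of a listing like the one above into a dictionary. The function name `parse_adapter_info` is made up for illustration, and the sample text is a shortened copy of the output above. It assumes every field line uses " : " (space, colon, space) as the separator, which may not hold for other tools or versions.

```python
def parse_adapter_info(text):
    """Parse 'Key : Value' lines from an adapter listing into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first " : " only, so values that contain colons
        # (such as the PCI address) stay intact.
        key, sep, value = line.partition(" : ")
        if sep:  # lines without the separator (banners, blanks) are skipped
            info[key.strip()] = value.strip()
    return info


# Shortened copy of the listing above, used as sample input.
sample = """\
Adapter Selected is a Avago SAS: SAS3008(C0)

Controller                     : SAS3008(C0)
PCI Address                    : 00:03:00:00
Firmware Product ID            : 0x2221 (IT)
Firmware Version               : 15.15.06.00
Board Name                     : Dell HBA330 Mini
"""

info = parse_adapter_info(sample)
print(info["Board Name"], info["Firmware Version"])
```

The banner line is ignored because its colon has no space in front of it.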


During boot, it gets stuck, and the last line on the console is "mpr0: Reinitializing controller".


After some research, I found a mention of X2Apic mode in this thread: https://www.truenas.com/community/t...d-stuck-mpr0-reinitializing-controller.85999/

I'll try it ASAP, but for now I have set the active boot environment back to 12.0-RELEASE and it works.

X2Apic has been disabled on my system since it was installed.

So I could not update my TrueNAS to 12.0-U1, but 12.0 looks fine for now.

Here is some info.
 

darabontors

Cadet
Joined
Dec 19, 2020
Messages
3
Hi,

I have the same issue with LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03) device hardware passed through via XCP-ng virtualization.
I didn't try to go back to 12.0, but I found a sort of workaround. I found that if I restarted the host, the first boot of my TrueNAS would go through without freezing at the Reinitializing controller line. Any subsequent reboots of my VM however would freeze up. I could reproduce this workaround multiple times, and after a successful boot, the array works perfectly. I do have the latest firmware for the LSI card, it is flashed in IT mode and was working with FreeNAS version 11.3 U5 for multiple years without any issues.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,970
Have you tried to isolate the problem between TrueNAS and ESXi? If you run TrueNAS on bare metal and the problem still exists, then you know it's TrueNAS; if the problem goes away, then it's ESXi. If it's ESXi, then maybe you could upgrade to 7.0. Of course, it already sounds like the problem is TrueNAS, but I wanted to put that out there in case you can do some testing.
 

darabontors

Cadet
Joined
Dec 19, 2020
Messages
3
I am using XCP-ng 8.2 with hardware passthrough and TrueNAS Core 12.0-U1. Before the upgrade I had FreeNAS 11.3-U5 and ESXi 6.7 with hardware passthrough. It worked without an issue for years. Hardware passthrough should behave the same in XCP-ng as in ESXi because it is done in the CPU and it's the same technology. You are right that I didn't try to bypass the virtualization. This being a production environment, I didn't have the time or the luxury of tinkering. I do have another controller that is exactly the same, so I will try to reproduce the issue with TrueNAS Core 12.0-U1 on bare metal.

It is strange that the workaround involves restarting the host, so basically a hardware reinitialization of the LSI HBA.
 
Joined
Jan 5, 2021
Messages
1
Hi,

I have the same issue with the U1 release: the boot hangs at reinitializing the controller. I'm using XCP-ng 8.2 with PCI passthrough of an LSI 9300-8i.

While researching this I came across this post: https://www.truenas.com/community/resources/lsi-9300-xx-firmware-update.145/. My firmware was below the 16.00.12.00 mentioned in that post, so I upgraded to the patched firmware. It didn't make any difference. As the previous poster found, if the host is rebooted (i.e. a hardware re-init of the controller), I get one clean boot. Subsequent reboots hang on controller re-init.
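
As a rough illustration of the "is my firmware below 16.00.12.00" check, here is a minimal Python sketch that compares dotted firmware version strings numerically. The helper names `version_tuple` and `is_older` are made up for this example. The 15.15.06.00 value is the Dell HBA330 firmware from the first post and is used only as sample input; whether the 16.00.12.00 threshold from the LSI 9300 resource applies to Dell-branded firmware is not established in this thread.

```python
def version_tuple(version):
    """Turn a dotted version such as '16.00.12.00' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_older(installed, minimum):
    """Return True if the installed version sorts below the minimum."""
    return version_tuple(installed) < version_tuple(minimum)


# Sample input only: firmware from the first post vs. the version
# mentioned in the LSI 9300 firmware resource.
print(is_older("15.15.06.00", "16.00.12.00"))  # True
```

Comparing tuples of integers keeps multi-digit fields ordered correctly even when a tool prints them without zero padding, which a plain string comparison would get wrong.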

I've had FreeNAS 11.3 U5 with XCP-NG 7.6 for over 2 years and the same controller without any issues.

I've just rebuilt my VM to TrueNAS 12.0 so I'll continue testing and report.
 

darabontors

Cadet
Joined
Dec 19, 2020
Messages
3
Could someone try to reproduce this issue with another hypervisor? ESXi, QEMU, etc.?

If we can isolate it to being a TrueNAS problem, maybe we should file a bug report. Any thoughts on this?
 

blckhm

Dabbler
Joined
Sep 24, 2018
Messages
42
Hi @darabontors,

All three of my builds run under an ESXi host. Only one of them has this issue.
I don't think the problem is the hypervisor.
 

NmS

Cadet
Joined
Jan 11, 2021
Messages
4
Exact same issue here. I had run FreeNAS with XenServer 7.1 for years, but decided to update everything to the latest versions.

Initially upgraded XenServer to XCP-ng 8.2. I thought the upgrade broke everything. But after a fresh install of XCP-ng, with XO and a fresh install of TrueNAS 12.0-U1 it keeps hanging on reinitializing the controller when it is set to PCI-Passthrough.
  • When I remove the PCI-passthrough TrueNAS will boot.
  • When I restart XCP-NG, TrueNAS will boot with the PCI-passthrough enabled and I can use my ZFS pool just like normal.
I use the LSI9211-8i with firmware 20.00.07.00-IT

I also tested with FreeNAS 11.3-U5 and got the same hang on boot.

So is this an XCP-ng problem or a TrueNAS problem?
What steps can I take to find the cause?
 

blckhm

Dabbler
Joined
Sep 24, 2018
Messages
42
Have you tried booting TrueNAS-12.0 rather than 12.0-U1?
 

NmS

Cadet
Joined
Jan 11, 2021
Messages
4
Did some testing. I am not sure this is a TrueNAS problem, because it hangs on FreeBSD 12.2-RELEASE too.

I basically made a new VM in XO like Lawrence describes in this video: https://www.youtube.com/watch?v=gk8gHYjf7rw

The VM hangs on POST when trying to install, so I could quickly test a lot of systems.

I get the following error on POST:
Code:
mps0: IOC in fault state 0x0, resetting
mps0: Firmware: 20.00.07.00, Driver: 21.02.00.00-fsb
mps0: IOCcapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>


On Debian 10.7.0 I got the following error on POST:
Code:
mpt2sas_cm0: fault_state(0x600e)!
mpt2sas_cm0: overriding NVDATA EEDPTagMode setting


FreeBSD, FreeNAS, and TrueNAS hang on POST with the message: Reinitializing controller.
Debian managed to get into the installation menu. I could also see the disks on the LSI9211-8i.

So is it safe to say this is an XCP-ng problem? I'll see if I can install a different version of XCP-ng this weekend.
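
To make fault lines like the two above easier to spot in a longer boot log, here is a minimal Python sketch that scans log text for them. The function name `find_faults` and the patterns are assumptions based only on the two excerpts in this post; other driver versions may word these messages differently, and real logs may carry timestamp prefixes that these patterns do not handle.

```python
import re

# Patterns modeled on the two excerpts in this post only (an assumption,
# not a complete list of what the mps/mpt2sas drivers can print).
FAULT_PATTERNS = [
    re.compile(r"^(?P<dev>\w+): IOC in fault state (?P<code>0x[0-9a-fA-F]+)"),
    re.compile(r"^(?P<dev>\w+): fault_state\((?P<code>0x[0-9a-fA-F]+)\)"),
]


def find_faults(log_text):
    """Return (device, fault_code) pairs for lines that report a fault."""
    faults = []
    for line in log_text.splitlines():
        for pattern in FAULT_PATTERNS:
            match = pattern.search(line.strip())
            if match:
                faults.append((match.group("dev"), int(match.group("code"), 16)))
                break  # one fault per line is enough
    return faults


# Sample input copied from the FreeBSD and Debian excerpts above.
log = """\
mps0: IOC in fault state 0x0, resetting
mps0: Firmware: 20.00.07.00, Driver: 21.02.00.00-fsb
mpt2sas_cm0: fault_state(0x600e)!
mpt2sas_cm0: overriding NVDATA EEDPTagMode setting
"""

for dev, code in find_faults(log):
    print(f"{dev}: fault code {code:#06x}")
```

This only extracts the reported codes; it does not interpret what a given fault code means.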
 

NmS

Cadet
Joined
Jan 11, 2021
Messages
4
blckhm said: have you ever tried to boot with TrueNAS-12.0, not 12.0-U1?

Yeah, I tried TrueNAS-12.0, FreeNAS 11.3-U5, 11.2-RELEASE, 11.1-U2, FreeBSD 12.2-RELEASE and Debian 10.7.0.

All get stuck while booting to the installation menu with the error from OP (Reinitializing controller). I was only able to install Debian, but that too gave an error on boot.

(sorry for the double post, but I couldn't find an edit button in my post.)
 

blckhm

Dabbler
Joined
Sep 24, 2018
Messages
42
So I think your issue is different from mine.

All of my systems run under ESXi hosts of the same version. All of my HBAs are attached to the guests via passthrough.

Only the system with the Dell HBA330 hangs at the "mpr0: Reinitializing controller" step during boot.
 

francisaugusto

Contributor
Joined
Nov 16, 2018
Messages
153
I have a similar issue:
Running 12.0-U2.1, I had FreeBSD 11 64-bit as the guest OS under ESXi 7.0. It wouldn't boot. Weirdly, if I press Enter on the boot screen (where you can select boot options), it boots normally.
Choosing FreeBSD 12 64-bit as the guest OS prevents it from booting at all.
Even the jails are behaving a bit weirdly: they show as running in the interface, but I have to restart them to get them working.
 

phakio

Cadet
Joined
May 13, 2022
Messages
2
I hate to bring up a dead thread, but I am having this issue as well on my newly deployed Dell R440. I won't have time to test it until this weekend, but I think I found an error in my setup that, once fixed, could solve this issue.

My HBA330 card passes through and functions; however, it sometimes takes multiple reboots of my TrueNAS VM (deployed on Proxmox) to get past the "mpr0: Reinitializing controller" phase.

As part of my to-do list before launching the server, I flashed the Dell H330 Mini with HBA330 firmware, for obvious reasons. I apparently wasn't careful enough and only now realized that I probably didn't flash the proper firmware to the card: my card is a Mono Mini, and I vividly remember flashing the normal (PCIe) HBA330 firmware to it. No big deal, I just need to reflash with the proper firmware. I also noticed that in the iDRAC9 settings of the server, although the card is detected and works, numerous properties show as "not supported" or "information not available" (see attached screengrab).

I'll get to work on this in a few days and update this reply to let you know whether it remedies the issue. I know it seems like a mistake many wouldn't make, but if you have a card that was originally an H330, you might want to double-check that you flashed the firmware for the proper form factor.
 

Attachments

  • Screen Shot 2022-05-13 at 10.42.56 PM.png

phakio

Cadet
Joined
May 13, 2022
Messages
2
Almost forgot to follow up, I've been busy... sorry! Since my HBA330 was working fine, I decided to keep the firmware as is (if it ain't broke, don't fix it). I backed everything up and updated my VM to TrueNAS 13.0-RELEASE, knowing I could always restore the VM if it borked. The update fixed the "mpr0: Reinitializing controller" error, and I can now reboot my TrueNAS VM without worrying about a boot hang.
 