SOLVED TrueNAS 12.0-U5 upgrade fails - hangs on boot

Gcon

Explorer
Joined
Aug 1, 2015
Messages
59
I just tried upgrading from TrueNAS 12.0-U4 to U5, and the box locks up on boot right after it tries to bring up the network interface. See attachment. EFI boot. It's a fairly new install of TrueNAS, and I think this might have been the first upgrade of this box. I've never had an issue doing FreeNAS/TrueNAS upgrades before. It hangs at this point; I gave it 20 minutes. In the end I had to power cycle, and it came back up on the 12.0-U4 release (seemingly working fine). I have tried the update twice, with the same result both times.

The NIC BTW is a Dell Intel rNDC 2x X710 + 2x I350 combo daughter card. Whether that's the issue or not IDK. Something definitely is borked with this new release.
 

Attachments

  • TrueNAS 12.0-U5 lockup.png (68.2 KB)

jlpellet

Patron
Joined
Mar 21, 2012
Messages
287
Applied the manual update to 5 systems this morning: 4 Intel w/ Intel NICs and 1 AMD w/ a Realtek NIC (all 1 GbE), all successfully. As always, YMMV.
 

ThreeDee

Guru
Joined
Jun 13, 2013
Messages
700
All-AMD setup here (see sig) and it updated just fine, though from 12.0-U4.1, if that matters.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Please "report a bug" with all your hardware details... glad it reverted back to U4 smoothly.
 

Gcon

I tracked the issue down to a StarTech PEX2M2 2x M.2 SATA SSD card with 2x SATA SSDs. I was going to use those SSDs as L2ARC for the main zpool, but I hadn't actually added them yet, so the card was just sitting in the box. I decided to do a fresh install of 12.0-U5 and had no joy with the card still present, but after taking it out I was able to install 12.0-U5 with no problems... so I guess it was the card causing issues?! Seems like a conflict. I can't be 100% sure that -U5 introduced the issue, because I wasn't doing much with the box (still getting it ready for production use), but I can't say 100% that it didn't either.

Anyway, I have another PCI card in the box that I boot from, a StarTech PEXM2SAT32N1, which I use like a Dell BOSS (same Renesas controller, I believe), and I have 2x SATA SSDs mounted on it for a mirrored boot pool. That card is cabled (SATA cables) to the mainboard. It has room for a third SSD on the reverse, an NVMe drive, so I'll stick an NVMe SSD on that, use it as my L2ARC, and see how that goes. Hopefully no more conflicts. :)
 

morganL

Glad it's resolved, but it's still worth documenting the bug. With 12.0-U4, it did boot?
 

Gcon

Yes, the StarTech "PEX2M2" seemed to boot OK in 12.0-U4. I've since moved the card on as I no longer need it. I only had one PEX2M2 card, and now I've made other arrangements: going from 2x 128GB M.2 SATA SSDs on the PEX2M2 to 1x 2TB M.2 NVMe SSD on the existing PEXM2SAT32N1. This is for L2ARC usage. Note that the StarTech "PEXM2SAT32N1" is quite different from the "PEX2M2", and the PEXM2SAT32N1 is in the node working fine, giving me 2x SATA boot drives and the NVMe L2ARC drive.
 

Gcon

I have more on this, unfortunately. I am still getting hangs without the StarTech PEX2M2 in the node! Looks like I fell for "post hoc, ergo propter hoc". Sorry, StarTech!
What is happening is a lockup immediately after the Intel X710 NIC comes up. ixl0 is the interface; check the screenshots.
Two times in a row I booted up and it locked up like this.
The third time, I shut the interface off at the switch, and then it completed bootup. Coincidence?
After my fresh install of 12.0-U5 all was fine, with several boots that completed without trouble. But the first boot I did once I had a VM and two jails sharing ixl0 is when the lockups started.
This matches my experience with the previous build: I had two jails and a VM running fine under 12.0-U4 (or U4.1... not sure exactly which), but after the upgrade to -U5 I had the lockup issues. Unfortunately I stuck that StarTech card in the node around the same time as the upgrade, so I was incorrectly blaming it.
I wonder if anyone else is having any dramas in 12.0-U5 with the Intel X710 and the ixl driver? Any changes to that in -U5?! The plot thickens...
 

Attachments

  • GTnas01 lockup 1.png (51 KB)
  • GTnas01 lockup 2.png (59.6 KB)

morganL

I doubt the Intel X710 causes this issue in a normal configuration. It sounds like there is a bug, and it might be related to sharing the port between the VM, the jails, and maybe sharing services. Why not turn off the VMs and/or the jails and see if there is a relationship? It could be a FreeBSD networking issue...
 

Gcon

I'd have loved more time to test, but I had to get the box into service Saturday afternoon and ran out of time. I took the X710 daughtercard out and put in another daughtercard (unused), then added a Dell discrete NIC based on the Intel X520-DA2, and my problems went away. Same setup, just a different NIC. I'll order a couple more M.2 SSDs for my lab R730 (ESXi host) so I can dual boot into TrueNAS and see if I can replicate the issue.
 

morganL

Thanks... unexpected, but I assume there is a bug in the X710 driver within FreeBSD 12.x (I originally wrote Linux here by mistake).
Has anything been reported elsewhere?
 

Gcon

Thanks... unexpected, but I assume there is a bug in X710 driver within Linux.
Has anything been reported elsewhere?
With Linux? I'm only using Linux in a bhyve VM. I've ordered some M.2 SSDs for the lab. Hopefully I'll get to test this sometime in the next fortnight.
 