Trying to upgrade from CORE, middleware doesn't start

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I decided to give SCALE a try today. Big mistake. Fortunately, I installed it to its own SSD and was able to revert.

I'm currently running CORE 12.0-U8 (not U8.1), and I was trying to install SCALE 22.02.0.1--I'd apparently missed the download for 22.02.1. My plan was to download the config file from CORE (including the password seed), install SCALE to its own SSD, boot, upload the saved config, and (so I expected) be on my way. Not so fast.

My first problem came with configuring the network interface. Under CORE, I had the two ports of my Chelsio T420 in a LAG group as lagg0, with a static IP assigned. I expected I'd boot with those ports unplugged, manually set up the LAG group at the console menu, assign its IP address, and then be able to access the GUI. No dice there. The first problem is that it needs to be called bondX rather than laggX. Of course, nothing in the console menu tells you that until you try to save it with the incorrect name, but that's easy enough to fix. The second problem is that, AFAICT, there's simply no way in the SCALE console menu to assign a static IP address to an interface. It was there in CORE, but not in SCALE. All you can do in SCALE is turn on DHCP for one (and only one) interface (and you have to manually do that, as DHCP isn't enabled by default on any interface). I ended up doing this, then setting the static IP in the GUI--awkward, but it worked.

So I'm now able to log in to the web GUI, and true to their word, iX have built a completely different interface without bothering to backport it to CORE. And they want us to believe that CORE will remain a viable product into the future... Anyway, once logged in there, I'm able to find the place to upload the saved config file, I do so, and the system reboots.

...and here's the problem. Once it reboots, toward the end of the boot sequence, it gives a bunch of errors which disappear far too quickly to make note of them, then clears the screen and gives a message that the middleware isn't running, and to press Enter for a root shell. I press Enter, and that's exactly what I get: a root shell. The network is down, I see the bond0 interface doesn't exist, and as advertised the TrueNAS middleware isn't running, so there's nothing I can do without digging around with more Linux internals than I really care to on this box. Since I didn't want to continue to have the NAS down with no idea of what to do on it, I powered it down, de-racked it (again), disassembled it to access the SSD tray (again), and reinstalled the old SSD with CORE on it. It boots, and other than suddenly being unable to communicate with the UPS, it's fine.

So what's going on here? Should this have worked? Is the LAG interface likely to have caused the problem? What can I do to track it down--ideally without tearing down the server several times to swap SSDs?
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Have you tried fresh installing 12.0-U8 to your test SSD, uploading your config, and then trying the cross-update procedure to go from Core to SCALE?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I hadn't, but seems like it might be worth a try. Edit: Nope, that gives the same result. Going to try destroying the LAG, saving that configuration, then clean install of SCALE/upload config/see what happens.

Edit 2: No, still no luck. I noticed during the reboot after uploading the config file that there was an error about the ix-update service having failed, and then it gave me the same message about middleware not running. Output of journalctl -xe wasn't particularly helpful:
1655565722171.png
 
Last edited:

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
OK, I realize "install SCALE and upload the config file" isn't a "sanctioned" way to do the upgrade (though I think it should be). So back to the "official" way. Installed -U8 to what was to be the SCALE SSD, uploaded the config from my running -U8 w/o the LAG interface, rebooted. All worked fine so far. Uploaded the SCALE manual update file through the GUI, rebooted, and the same thing continues to happen. Trawling through the output of journalctl -xe, it looks like there's a problem creating symlinks for hard drives:

1655568544301.png


1655568573629.png


1655568609478.png


I'm also seeing some strange attempts to contact a LDAP server on a 10.0.0.0/8 IP, but that doesn't seem like the primary problem right now.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Two questions, though:
  • Should SCALE pick up on the LAG interface and set it up properly?
  • Should the "clean install of SCALE, upload saved config" upgrade method work? Since it's a completely different OS, it seems like starting with a clean install would be the best way to go.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Two questions, though:
  • Should SCALE pick up on the LAG interface and set it up properly?
  • Should the "clean install of SCALE, upload saved config" upgrade method work? Since it's a completely different OS, it seems like starting with a clean install would be the best way to go.
Agree on nightlies not being suitable for production.. just for testing.

The LAGG interface was another issue that I think has been found and resolved.

Lets' revisit the "clean install method" after we have resolved issues.. hopefully with SCALE 22.02.2.
The engineering focus has been on making SCALE 22.02 solid... only simple migrations done so far.
 
Top