Fresh Bluefin 22.12 install - Applications are not running

Mufa (Cadet, joined Feb 6, 2023, messages: 1)
Per https://www.truenas.com/community/threads/looking-for-bluefin-systems-where-apps-dont-start.106968/
System Specs at bottom.

This past weekend I upgraded my 22.02.4 install to 22.12. After the upgrade I got the "Applications are not running" error. I spent the day troubleshooting and reviewing the various threads on this issue but was unable to resolve it.
Having made no progress, I decided to start fresh with a 22.12 install. When installing 22.12, I elected not to wipe the boot pool, but instead to create a new boot environment.

After initial install steps, I began re-setting up the system. I made some changes compared to my original setup based on this thread (https://www.truenas.com/community/threads/bluefin-recommended-settings-and-optimizations.106034/).
Steps taken for setup
1. Imported all datasets
2. Updated Network
3. Added Users/Groups
4. Set up 3 SMB shares
5. Adjusted some datasets
6. Set up 2 Virtual Machines
7. Set up Data Protection
8. Set up 6 apps

At this point everything was running great! But I shut down the system to install the Tesla P4, which I had taken out during the troubleshooting on the original upgraded 22.12 install.

After rebooting, I was now getting one of the following Alerts:
Failed to start kubernetes cluster for Applications: Cannot connect to host 127.0.0.1:6443 ssl:default [Connect call failed ('127.0.0.1', 6443)]
Failed to start kubernetes cluster for Applications: [EFAULT] Failed to configure PV/PVCs support: Cannot connect to host 127.0.0.1:6443 ssl:default [Connect call failed ('127.0.0.1', 6443)]
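
Both alerts come down to nothing answering on 127.0.0.1:6443. The following is a minimal sketch of how that could be checked from the host itself, assuming Python 3 is available there. The function name `can_connect` is hypothetical; the host and port are taken from the alert text. It only tests whether a TCP connection can be opened, not whether the cluster behind it is healthy.

```python
import socket


def can_connect(host: str = "127.0.0.1", port: int = 6443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_connect()` with the defaults right after a reboot, and again a few minutes later, would show whether the port never opens or only opens late.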

Below are some troubleshooting steps I tried that didn't seem to make a difference.
Several reboots
Unset App Pool, reboot and set App Pool
BIOS time was set to UTC, changed to Central (since this didn't seem to help, I have reset it back to UTC, which seems to be the recommended setting)
Disabled all Virtual Machines from Auto-starting

However, I did stumble upon something that behaved differently. If I shut down the server, unplugged the cable connected to the DS4246, and then started it back up, the apps worked! Well, one worked.
The rest were stuck in deploying, which I assume was due to their Host Path mounts pointing at the Safe pool, which was no longer connected.
However, having the bulk of my storage disconnected was not ideal, and shutting down, reconnecting the DS4246, and booting again resulted in the above errors and apps not running.
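
The reasoning above can be written out as a small sketch. The controller-to-pool layout is copied from the System Specs at the bottom of this post; the app names and the helper names `missing_pools` and `blocked_apps` are hypothetical and only illustrate the assumption that an app with a Host Path on a disconnected pool cannot finish deploying.

```python
# Controller-to-pool layout, copied from the System Specs in this post.
CONTROLLER_POOLS = {
    "H221 / 9207-8e (DS4246)": {"Safe", "Vault", "Crypt"},
    "PERC H730 Mini": {"boot", "Minecart", "Freightcar"},
}

# Hypothetical apps: the pools each one uses for Host Path mounts.
# A PVC-only app has no Host Path; its storage lives under ix-applications on Minecart.
APPS = {
    "app-with-host-path": {"Safe"},
    "app-pvc-only": set(),
}


def missing_pools(disconnected_controller: str) -> set:
    """Pools that disappear when the given controller's shelf is unplugged."""
    return set(CONTROLLER_POOLS[disconnected_controller])


def blocked_apps(apps: dict, missing: set) -> list:
    """Names of apps that have at least one Host Path on a missing pool."""
    return sorted(name for name, pools in apps.items() if pools & missing)


gone = missing_pools("H221 / 9207-8e (DS4246)")
print(blocked_apps(APPS, gone))
```

Under these assumptions only the PVC-only app is unaffected by unplugging the DS4246, which matches the single app that came up.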

While going through the steps to disconnect the DS4246 again, I grabbed screenshots of the "Edit" screens of all apps and then deleted all apps except the one that worked (this one was set up with only PVC storage).
Rebooting with the DS4246 connected again resulted in the "Applications are not running" error.

I went through the steps again and deleted the last app, but this did not seem to make a difference.
At this point I just unset the pool and left it, which is where I am currently. I did grab a debug log in the middle of all of this that I can share if requested.


System Specs:
System: Dell R730
CPU: 2 x E5-2680 v4 @ 2.40 GHz
RAM: 251.8 GiB (ECC)
Pools:
  boot: Mirror
    2 x Lexar_240GB_SSD
  Minecart: Mirror (ix-applications + VM Storage), 46.1% Capacity Used
    2 x 465 GiB - Samsung_SSD_860_EVO_500GB
  Freightcar: 2 x RAIDZ1 (Planned home for VM Storage, unused), 0% Capacity Used
    8 x 931 GiB - Crucial MX500 CT1000MX500SSD1
  Safe: RAIDZ2 (Apps Host Path Storage Target, SMB Share Datasets), 12.9% Capacity Used
    5 x 2.73 TiB - WDC_WD30EFRX-68EUZN0
  Vault: RAIDZ2 (Media), 35.5% Capacity Used
    5 x 9.1 TiB - WDC_WD101EDBZ-11B1DA0
  Crypt: RAIDZ2 (Media disks from old server, unused), 0% Capacity Used
    4 x 9.1 TiB - WDC_WD100EMAZ-00WJTA0
    1 x 9.1 TiB - WDC_WD100EZAZ-11TDBA0
Storage Controllers:
  H221 / 9207-8e connected to Netapp DS4246, Pools: Safe, Vault, Crypt
  PERC H730 Mini to R730 16xSFF, Pools: boot, Minecart, Freightcar
Network Cards:
  Broadcom Gigabit Ethernet BCM5720 [Ethernet Interface] (Integrated, x4) (unused)
  Supermicro AOC-STGN-I2S [10GbE SFP+] (1xPCIe, 2 ports) (Both ports active)
GPUs:
  Matrox G200eR2 (Embedded)
  Nvidia GP104GL [Tesla P4]
  Nvidia EVGA RTX 3050
 