[SOLVED] Plain vanilla install 22.12.3.1: Reporting not working, apps intermittent. What's wrong?

Lobanz

Dabbler
Joined
Jul 2, 2023
Messages
16
This should be the most plain vanilla install possible, but lots of stuff isn't working.

Supermicro 2U server:
  • 2x Xeon X5680 CPUs (3.3 GHz, 6 cores, 12 threads each)
  • 192 GB ECC RAM
  • Boots off 2x M.2 SATA drives
  • 4x 3TB SATA III drives

Install went great. No problems. Put all 4 drives in a single raidz2 pool. SMB works great (~100 MiB/s on gigabit Ethernet). Default settings, no encryption on pool or datasets.

But the graphs under Reporting and such just don't work. I've tried changing the graph points, but nothing makes them work.

1688345756968.png



Kubernetes is very flaky. Sometimes apps will deploy. Sometimes they get stuck deploying forever.

1688345894262.png


Stuff I've tried:
  • Re-installed SCALE a few times
  • Rebooted a dozen times
  • Tried with default settings on everything
  • Tried with TrueNAS on DHCP and a static IP
  • Tried with Node IP 0.0.0.0 and also a specific IP and an alias
  • Tried with and without Enable Host Path Safety Checks
  • Deleted all apps, unset apps pool, deleted ix-applications, reboot, set pool, same thing, rinse, repeat...
Nothing seems to make it reliable. Here's some things I'm seeing with apps:
  • In general:
    • When I can get apps to deploy and become active, they often go back into deploying state after a restart of TrueNAS
    • All the apps take much longer than it seems they should -- 20 minutes maybe even for small ones
  • Nextcloud
    • Almost always gets hung deploying
    • Doesn't matter if Enable Host Path Safety Checks is on or off
    • It has actually deployed and worked once (with Enable Host Path Safety Checks off and host paths set)
  • Pihole
    • Usually deploys and I can get to admin interface.
    • But it's like its DNS is only listening on localhost:53, because it won't resolve names -- dig can't connect to it
    • It can get out to the internet though to update its ad lists
  • Nginx Proxy Manager
    • Usually deploys and I can get into admin interface
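
One way to narrow down the Pi-hole symptom above, sketched in Python: send a single DNS query over UDP to port 53 on the address the app is supposed to expose and see whether anything answers. This is a minimal illustration, not something from the original post. The function names and the address in the usage note are hypothetical, and it only checks UDP, which is what dig uses by default.

```python
import socket
import struct


def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet asking for the A record of hostname."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ended by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 1, 1)


def dns_answers(server_ip: str, hostname: str = "example.com",
                timeout: float = 2.0) -> bool:
    """Return True if server_ip replies to a UDP DNS query on port 53."""
    query = build_dns_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:
            # Timeout, port unreachable, or no route to the server
            return False
    # A genuine reply echoes the query ID in its first two bytes
    return len(reply) >= 2 and reply[:2] == query[:2]
```

Calling `dns_answers("192.168.1.50")` (address hypothetical) from another machine on the LAN and getting False would suggest that nothing is answering DNS on that address, independent of whether the admin interface loads.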

Is this par for the course for a fresh install? Is there some sort of majick I have to perform?

I'm looking forward to getting apps to run. Looks like TrueNAS SCALE would be ideal for me. Just ain't working out yet.

Thanks in advance for any help!


--- Lobanz
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Seems unusual.. no other reports. Over 14,000 users of that version.

Suggest you start with one networking setup and diagnose the first problem with a single app.
Can you verify Internet connectivity is solid...
Document the networking set-up.

Please be specific about which apps are being used - Community or TrueCharts
Document the hardware more fully
 

Lobanz

Dabbler
Joined
Jul 2, 2023
Messages
16
Yeah. Something seems off. I really want to figure it out and get past it. TrueNAS SCALE seems perfect for me. No idea why it's being such a pain. I'm not doing anything weird.

The NAS part works great. But the reports have never worked. And I can't get the apps to work right at all.

Internet is solid and reliable. Very simple: cable modem -> Ubiquiti EdgeRouter X -> gigabit switch. Using the network interface built into the Supermicro X8DT6-F motherboard. Two network cables: one for IPMI and one for the OS.

1688388300047.png


Used a bootable USB flash drive to install TrueNAS-SCALE-22.12.3.1 on a pair of M.2 SATA drives (motherboard is too old for PCIe NVMe drives -- no UEFI). The M.2 drives are mounted on boards that plug into the PCI bus, but I think they only use the bus for power. Each has a SATA cable that plugs into the motherboard. Seems to work fine. TrueNAS didn't complain. Declined the 16G swap partition during install -- the box has 192 GB RAM, way more than it needs.

The router is the DHCP server, and TrueNAS was using that at first. Worked fine. Then I changed TrueNAS to a static IP so I could add aliases. All IPs are on 192.168.47.0/24.

I was using the Official apps at first. The Official nextcloud usually hangs in deployment. The Official pihole usually installs and deploys (slowly), but I don't think its DNS port 53 is exposed on the public network interface, because external clients (dig) could not connect to it. The TrueCharts pihole installed and worked and would resolve names from the outside. Not sure what the difference was.

So, I just reset the Kubernetes stuff (again):
  • Set Kubernetes to default settings (Node IP 0.0.0.0, Enable Host Path Safety Checks on)
  • Deleted all apps
  • Unset the pool
  • Deleted the ix-applications dataset
  • Restarted box
1688389757438.png


This time I tried to install the TrueCharts nextcloud. It pauses at 75% of the install and then gives an error. It did the exact same thing when I had Kubernetes on a specific IP.

1688388755865.png


1688387844022.png


Here's the Mode info:
1688387891053.png


I don't have a problem reinstalling the whole thing from scratch (and I have a couple of times), but it doesn't seem to help.

Anyone have any ideas? Really wanting to get this to work!

Thanks!

--- Lobanz
 

Lobanz

Dabbler
Joined
Jul 2, 2023
Messages
16
I also get this sometimes when trying to stop Official nextcloud when it actually gets installed.

1688392272561.png
 

c77dk

Patron
Joined
Nov 27, 2019
Messages
468
have you enabled the "operators" train from TrueCharts ?
 

c77dk

Patron
Joined
Nov 27, 2019
Messages
468
hmmm, I'd try to install pg manually, and if that also fails, try reaching out to the truecharts people on their Discord
 

Lobanz

Dabbler
Joined
Jul 2, 2023
Messages
16
I'm really thinking about bailing on the SCALE Kubernetes stuff and running stuff on Docker in a VM. I'm really about done messing with it. I'm a software developer and a former sys/net admin, and I don't think I'm doing anything wrong or weird. As plain vanilla default as you can possibly get. And there are tons of posts on here about reporting not working and Kubernetes apps hanging in deployment just like I'm seeing. No doubt it works for most. Maybe it just doesn't like older server hardware for some reason.

I might run pihole and nginx proxy manager in SCALE because they are infrastructure-related and I can get the TrueCharts versions to work.


--- Lobanz
 

Lobanz

Dabbler
Joined
Jul 2, 2023
Messages
16
Alright! Still fighting!

I was able to reproduce this by running TrueNAS SCALE 22.12.3.1 in a VM running on TrueNAS SCALE 22.12.3.1.

VM Hardware Config
  • Boot method: Legacy BIOS (because my Supermicro Server doesn't have UEFI)
  • CPUs: 2 CPUs, 2 cores, 2 threads
  • CPU Mode: Host Passthrough
  • Memory: 32G
  • Disks (mirrors my Supermicro Server, all Mode=AHCI, all zvols precreated sparse):
    • boot1: 120GB
    • boot2: 120GB
    • disk1: 3T
    • disk2: 3T
    • disk3: 3T
    • disk4: 3T
Took all defaults during install. Everything went well.

When it was up I added the 4x 3TB disks to a raidz2 pool called main.

Added datasets for nextcloud host path storage:
main/app-data/nextcloud/nextcloud-1/​
data​
db​
db-bak​

Went into apps, chose pool, installed nextcloud as follows (everything else default):
  • Name: nextcloud-1
  • Username: admin
  • Password: ***********
  • Install ffmpeg
  • Host Path for Nextcloud Data Volume: /mnt/main/app-data/nextcloud/nextcloud-1/data
  • Host Path for Postgres Data Volume: /mnt/main/app-data/nextcloud/nextcloud-1/db
  • Host Path for Postgres Backup Volume: /mnt/main/app-data/nextcloud/nextcloud-1/db-bak
  • Checked "Enable cronjobs for nextcloud", default schedule
  • Everything else default

So, it deployed 0/1 and gets stuck like this:

1688412315349.png


Exactly what happened on the Supermicro host. If I click on it and look at the application events, it also says exactly what it did on my Supermicro host:

1688412419670.png



So this doesn't happen to anyone else?


--- Lobanz
 

Lobanz

Dabbler
Joined
Jul 2, 2023
Messages
16
OK! Progress!

TrueNAS SCALE (22.12.3.1) seems to work properly when running under VMware hosted on a different box. Reporting works. Nextcloud deployed without hanging with host paths.

But TrueNAS doesn't work right on my server either bare metal or in a VM (in TrueNAS). I tried various configurations in the VM:
  • Legacy BIOS boot with AHCI disks
  • UEFI boot with virtio disks
  • CPU Modes: Host Passthrough, Host Model, and Custom emulation of core2duo
  • Nextcloud deploy with or without host paths

On mine, reporting doesn't work and Nextcloud hangs on install.

Any ideas? I'm about out of stuff to try.

Thanks in advance for your help.

--- Lobanz
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Can you document the hardware again more fully?
Please include the NICs/chips used.

Do you know if the issues existed with 22.12.2?

Once you document, please report as a bug with debugs and hardware.
 

Lobanz

Dabbler
Joined
Jul 2, 2023
Messages
16
I *think* I started out with 22.12.2. I know I upgraded. Reporting has never worked, though. Don't remember if I tried the app deployment on 22.12.2. I can probably try.

I would think the following is good for hardware. I can export the debugs and attach various logs from /var/log.

HARDWARE:
  • Supermicro 2U Server
    • GENERAL INFO
      • Supermicro X8DT6-F motherboard
        • 2x Intel Xeon X5680 6 core, 3.3 GHz CPUs
        • 192G ECC RAM (maxed out, 12x 16GB DIMMs)
        • Dual Intel 82574 Gigabit Ethernet controllers (onboard)
      • Chassis: 2U, 8 bays, dual 920W PSUs
        • 4x Hitachi HDS723030ALA640 3TB SATA III drives on the onboard LSI SAS 2008 controller via chassis backplane
        • 4 bays empty.
      • 2x 120GB M.2 SATA boot drives mounted in little PCI boards with SATA cables to motherboard.
        • Pretty sure the little PCI boards just use the PCI bus for power since they have a SATA cable to connect drive to motherboard. Works. Cheap.
        • Couldn't use PCIe NVMe because this motherboard doesn't do UEFI and the BIOS won't boot them (TrueNAS installed to them just fine though)
      • No PCI cards except the two little boards for the M.2 SATA boot drives.
        • Used to have another Intel I350-T2 dual port NIC and a LSI SAS9200-8e controller with only an external MiniSAS port on the back.
        • Took them both out. Just using onboard stuff. Might put network card back in for more interfaces.
    • DETAILS (abbreviated from motherboard manual)
      • CPU
        • Two Intel 5500/5600 Series (LGA 1366) processors;
      • Memory
        • Twelve 240-pin DIMM sockets support up to 192 GB of DDR3 Registered ECC or up to 48 GB of Unbuffered ECC/Non ECC Memory
      • Chipset
        • Intel 5520 chipset, including: the 5520 (IOH-36D) and the ICH10R (South
          Bridge).
      • Onboard I/O
        • Intel ICH10R supports six SATA2 ports [I'm not using any RAID features]
        • LSI 2008 SATA 2 supports eight SAS ports [I'm not using any RAID features]
        • Dual Intel 82574 Gigabit Ethernet controllers support Giga-bit LAN1/LAN 2
          ports, and Realtek PHY
        • IPMI 2.0 with full KVM support
      • BIOS
        • 4 MB AMI SPI Flash ROM
        • PCI 2.2, ACPI 1.0/2.0/3.0, Plug and Play (PnP), DMI 2.3, USB Keyboard support, and APM 1.2
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
As you indicated, it looks vanilla and just a little old. No signs of networking issues?

Is the BIOS clock set at the right time?

Please scan the logs for anything that seems unusual and report a bug with the debugs.
 

Lobanz

Dabbler
Joined
Jul 2, 2023
Messages
16
Finishing up bug report now. Will attach debugs and any log files I think look useful.

BIOS time right on the money. NTP synced in IPMI. Local time.
 

Lobanz

Dabbler
Joined
Jul 2, 2023
Messages
16
Well, I said so but...

1688589409442.png



1688589443952.png


That ain't right. Investigating....


--- Lobanz
 

Lobanz

Dabbler
Joined
Jul 2, 2023
Messages
16
OH MY GOSH !!!!!!!

morganL is DA MAN !!!


After I set the BIOS time to UTC, the shell time and the time in System Settings / General / Localization agreed (both in local time, per the TrueNAS time zone setting), and then EVERYTHING started working! [Note: I had to go into the BIOS and change it. The IPMI NTP wasn't getting it for some reason.]
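
To illustrate the size of the error involved: if the BIOS clock holds local wall-clock time but the OS reads it as UTC, the system clock ends up off by exactly the zone's UTC offset. The Python sketch below works through that arithmetic. It assumes the OS treats the hardware clock as UTC (which the fix described above implies). The function name, the fixed sample instant, and the UTC-5 offset in the usage note are hypothetical, since the thread does not state the time zone.

```python
from datetime import datetime, timedelta, timezone


def clock_error_from_local_rtc(utc_offset_hours: float) -> timedelta:
    """Error in system time when an RTC set to local time is read as UTC."""
    # Hypothetical true instant, chosen only for illustration
    true_utc = datetime(2023, 7, 5, 20, 0, tzinfo=timezone.utc)
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    # What the BIOS clock shows when it is set to local wall-clock time
    rtc_wall = true_utc.astimezone(local_zone).replace(tzinfo=None)
    # The OS assumes the RTC holds UTC, so it labels that wall time as UTC
    assumed_utc = rtc_wall.replace(tzinfo=timezone.utc)
    return assumed_utc - true_utc
```

With a zone at UTC-5, `clock_error_from_local_rtc(-5)` returns a timedelta of minus five hours, meaning the system would believe it is five hours earlier than it really is.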

Reporting works and Nextcloud deployed perfectly with host paths and everything -- and it deployed FAST (about 1 minute). And it works!

I KNEW they were related somehow! It also explains why it didn't work on the VMs in TrueNAS and it DID work on the VMware VM. I bet that's what tipped you off to the solution.

Thank you morganL! Seriously, man: awesome work. I'm very happy and relieved that it's working now.

<embarrassingly profuse thanks and praise removed>

Abandoning bug report.


--- Lobanz
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
I was just lucky.. another user reported issues with his local clock and I remembered that the problems it causes are non-obvious. @Kris Moore Perhaps we could add a system clock check to our Internet check plans???
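
A system clock check along the lines suggested here could look roughly like the following Python sketch. It sends one SNTP request, reads the server's transmit timestamp, and compares it with the local clock. The server name (pool.ntp.org), the 60-second tolerance, and all function names are assumptions for illustration and are not taken from TrueNAS.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01)
NTP_UNIX_EPOCH_DELTA = 2208988800


def parse_ntp_transmit_time(packet: bytes) -> float:
    """Extract the server transmit timestamp from an NTP reply as Unix seconds."""
    if len(packet) < 48:
        raise ValueError("NTP reply too short")
    # Transmit timestamp: 32-bit seconds and 32-bit fraction at bytes 40..47
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_UNIX_EPOCH_DELTA + fraction / 2**32


def clock_is_sane(local_unix: float, reference_unix: float,
                  max_skew: float = 60.0) -> bool:
    """True if the two clocks differ by at most max_skew seconds (assumed limit)."""
    return abs(local_unix - reference_unix) <= max_skew


def check_system_clock(server: str = "pool.ntp.org",
                       timeout: float = 3.0) -> bool:
    """Query one NTP server and compare it with the local clock.

    Raises OSError if the server cannot be reached.
    """
    # First byte 0x1B: leap indicator 0, version 3, mode 3 (client)
    request = b"\x1b" + 47 * b"\x00"
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (server, 123))
        reply, _ = sock.recvfrom(512)
    return clock_is_sane(time.time(), parse_ntp_transmit_time(reply))
```

A clock that is hours off, as in this thread, would fail `clock_is_sane` by a wide margin, so the exact tolerance matters little for catching this class of problem.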
 