Major performance issue after upgrade to 12.0-U6

Klontje

Dabbler
Joined
Feb 7, 2016
Messages
47
Hello all,

I just upgraded my TrueNAS system from 12.0-U4.1 to 12.0-U6. After the reboot it was terribly slow: opening a folder over SMB takes 4-5 seconds, and opening a web app hosted in a jail is almost impossible. The GUI is also very slow. Obviously I tried reverting to the previous boot environment (U4.1), but to my surprise that now has the same problem.

My system is an Intel Xeon E3-1245 v6 on a Supermicro X11SSH-CTF with 64GB of RAM. The boot drives are 2x 128GB SSDs, mirrored, on the on-board AHCI SATA controller. I have 2 storage pools: a mirror of 2x 512GB SSDs (24% used) and a 6-disk RAIDZ2 of 8TB WD Red Plus drives with a spare. All pools are healthy and all disks pass long SMART tests.

I also used the IPMI console to check local performance and run some diagnostics. An ls on any directory takes 2-3 seconds, which is not normal. The top command shows 53G of RAM available and 95% CPU idle. Starting top also takes a few seconds, which is slower than usual.

Going by the checks and statistics, the system should perform normally. Any hints or clues as to where to look for the issue are much appreciated.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Sounds like thermal throttling. Has your CPU cooler come loose? Has your thermal compound dried out? Did the CPU fan die?
 

Klontje

Hi Samuel, that would mean a CPU (or other) temperature reading at or close to its limit. According to IPMI that is not the case:

Sensor           Status  Reading
CPU Temp         Normal  28 degrees C
PCH Temp         Normal  34 degrees C
System Temp      Normal  32 degrees C
Peripheral Temp  Normal  35 degrees C
MB_10G Temp      Normal  43 degrees C
VcpuVRM Temp     Normal  34 degrees C
DIMMA1 Temp      Normal  29 degrees C
DIMMA2 Temp      Normal  29 degrees C
DIMMB1 Temp      Normal  28 degrees C
DIMMB2 Temp      Normal  27 degrees C
 

Klontje

OK, an update: the system ground to a halt and stopped responding entirely. After pulling the power plug and re-inserting it (resetting the entire system, including the IPMI/BMC), the system came back online and I set it to U2.1 (as I wasn't sure whether I was running U2.1 or U4.1 before the issues started). It came back flying like it did before.

Then I tried to move back to U6, so I set U6 as active and rebooted. Same issues again. I am now retrying to see if I can get into U4.1 without performance issues. It seems like something is changed on the system that is only reverted by a complete power cycle and not by a reboot.

UPDATE: The behavior came back even after another complete shutdown, power cycle, and boot directly into U2.1. This looks like a hardware issue, but the question is which hardware. I can't figure out what the culprit is, as everything looks as it should.
 

Klontje

I upgraded the BMC and BIOS firmware to the latest versions, as they hadn't been updated since 2016. After this I reconfigured my BIOS and BMC settings and booted into U6 without issues. I am not going to attempt another reboot tonight as I really need some sleep, but any ideas as to where I could look would be helpful.

UPDATE: Before I could even shut down, the issues were back. Please let me know if anybody has any clue as to what might be happening here.
 

Klontje

Final post: always check your jails...

The culprit is a jail with a 'temporary' influxdb that was still collecting time-series data at 5-second intervals and, on boot, was trying to load its database (4 months' worth of data), which bogged down the whole system. It baffles me that one process in a jail can make an entire FreeBSD system unresponsive, which IMHO makes jails not really suitable for running things in. I guess I will be moving my jails to proper VMs sometime.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
You can set resource limits for jails. Just need to do it :wink:

Probably on the CLI - try iocage get all <jailname> to get an overview of what is available. We use it in production.
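
As a rough sketch of what such limits could look like at the FreeBSD level: rctl(8) takes rules of the form subject:subject-id:resource:action=amount. Everything specific below is an assumption for illustration, not a value from this thread: the jail name influx, the resources chosen, and the amounts. It also assumes iocage names the running jail ioc-<jailname> (check with jls) and that resource accounting is enabled with the loader tunable kern.racct.enable=1. By default the script only prints the commands; set APPLY=1 to actually run them.

```shell
#!/bin/sh
# Sketch: build rctl(8) rules for an iocage jail.
# All names and amounts are illustrative assumptions, not values from this thread.

# Compose one rule: jail:ioc-<jail>:<resource>:<action>=<amount>
jail_rule() {
    # $1 = iocage jail name, $2 = resource, $3 = action, $4 = amount
    printf 'jail:ioc-%s:%s:%s=%s\n' "$1" "$2" "$3" "$4"
}

# Print the rctl command; execute it only when APPLY=1 is set.
apply_rule() {
    rule=$(jail_rule "$@")
    echo "rctl -a $rule"
    if [ "${APPLY:-0}" = "1" ]; then
        rctl -a "$rule"
    fi
}

JAIL="influx"                              # hypothetical jail name
apply_rule "$JAIL" writebps throttle 20m   # throttle disk writes to about 20 MB/s
apply_rule "$JAIL" readbps throttle 50m    # throttle disk reads to about 50 MB/s
apply_rule "$JAIL" pcpu deny 200           # cap CPU at 200% (two cores' worth)
```

Rules added with rctl -a do not survive a reboot, so they would need to be reapplied at startup. iocage exposes comparable limits as jail properties; the output of iocage get all <jailname> shows which ones your version supports.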
 

Ralms

Dabbler
Joined
Jan 28, 2019
Messages
29
Out of curiosity, did you see either high CPU or storage usage during that initial loading from influxdb?
 

Klontje

That's the funny thing: I was focusing on CPU and memory, as those are usually what make a system unresponsive. I didn't see any massive disk activity in terms of throughput, but I did notice a slightly raised disk queue length. Another problem was that the system was so slow it wasn't letting me dive into it very deeply before completely killing itself.

I have had this issue with FreeBSD before, where a single process hogs so much I/O that it kills the entire system, to the point where you start to think it's hardware related.
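
For anyone hitting something similar: when CPU and memory look idle, per-disk and per-pool I/O counters are the next place to look. A minimal sketch, assuming shell access (for example over the IPMI console) on a FreeBSD-based host. The pool name tank is a placeholder, and any tool that is not installed is skipped rather than treated as an error.

```shell
#!/bin/sh
# Sketch: take a quick I/O snapshot on a FreeBSD/TrueNAS CORE host.
# POOL is a placeholder; tools that are not installed are skipped.

POOL="${POOL:-tank}"   # hypothetical pool name

# Run a command only if its binary exists, so the script degrades gracefully.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@"
    else
        echo "== skipped: $1 not found"
    fi
}

run_if_present zpool iostat -v "$POOL" 1 3   # per-vdev operations and bandwidth, 3 samples
run_if_present gstat -p -b                   # per-disk queue length and %busy, one batch sample
run_if_present jls                           # running jails, to map a busy process to a jail
```

A high %busy or queue length in gstat with low throughput points to many small I/Os, which matches the "slightly raised queue, no big throughput" picture above. Interactively, top -m io -o total should sort processes by I/O operations, which helps find the process responsible.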
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Yeah, if you don't provide limits, it'll happily let you use it all. :smile:
 