Drastic performance drop every 5 minutes

icecoke

Cadet
Joined
Sep 6, 2012
Messages
3
Hello everyone!

We are using 4 FreeNAS-11.3-U1 nodes (each with 128GB RAM, 20 xeon cores, 16 Intel Enterprise SSDs, 10Gbit/s NICs), each with raidz3 and mirror logs as NFS v4 storage for a Xen Cluster.

The overall performance is quite good, but ever since setting up the FreeNAS nodes we have seen the following effect:

[Attached screenshot: 2020-04-09_17-34-44.png]


The complete traffic from and to the nodes drops exactly every 300 seconds. The drops are not aligned to full */5 minute marks the way a cron job would be, but occur exactly 5 minutes apart (as seen in the screenshot: here 17:27:50, next drop 17:32:50, and so on).

We have no corresponding cron jobs on the Xen cluster, nor can we see such a job in FreeNAS. The effect is that all VMs nearly stop working during these seconds, and on the FreeNAS dashboard the bandwidth of the main NIC drops each time as well (from ~300-500 MB/s to kilobytes). After a few seconds everything is back to normal.



We can't believe that this is normal and we would really, really appreciate any help and input regarding this!

Many thanks in advance!
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
So you're doing block storage on RAIDZ? Well that's not great.

https://www.ixsystems.com/community/threads/the-path-to-success-for-block-storage.81165/

Anyways, you haven't really provided any information about your environment and that makes it more difficult. Please take a gander up to the Forum Rules, conveniently linked at the top of every page in red, for some specific advice in formulating a helpful problem report.

You've failed to make any mention of the hardware in use except in very vague terms. We want to see something more like

I've got a Supermicro X10DRL-LN4F in a SC846BE16 chassis, with 8 x 16GB sticks of DDR 2400 and two E5-2637 v3's, connected with a retail LSI 9240-8i crossflashed to IT mode and running 20.00.07.00 via a single SFF8087 cable, with an Intel X520-SR2 hooked to Cisco switchgear, and the pool is built out of two dozen 3.84TB D3-S4610's. We've set up NFSv4 multipathing and (bla bla bla).

Information such as what happens under lower load, especially if you can isolate a workload that might be causing the behaviour, could be interesting.
 

icecoke

Cadet
Joined
Sep 6, 2012
Messages
3
Sorry for not providing enough hardware information.

Our Hardware:
Supermicro X11DPi-N(T) with dual Intel Xeon Silver 4210 CPUs in a Chenbro RM23616 chassis, 4x 32GB DDR4-2666 Samsung M393A4K40CB2-CTD RDIMM ECC, a Broadcom 9440-8i (AVAGO MegaRAID SAS, FreeBSD mrsas driver version 07.709.04.00-fbsd) with dual SFF8087 cables into the chassis backplane, two onboard Intel X722 ports connected to Cisco 10Gbit/s switches, and 16 Intel SSDSC2KB038T8 (D3-S4510) SSDs.

Code:
More info on the 9440-8i:
                    Versions
                ================
Product Name    : AVAGO MegaRAID SAS 9440-8i
Serial No       : SP91541732
FW Package Build: 50.5.0-1121

                    Mfg. Data
                ================
Mfg. Date       : 04/21/19
Rework Date     : 00/00/00
Revision No     : 17005
Battery FRU     : N/A

                Image Versions in Flash:
                ================
Boot Block Version : 7.02.00.00-0017
BIOS Version       : 7.05.02.0_0x07050400
NVDATA Version     : 5.0500.01-0009
FW Version         : 5.050.01-1292 


I'm aware of the block storage / RAIDZ caveats, but that is probably not the root of this problem. Thank you for the hint anyway.
I was able to track the problem down to middlewared, which spawns parallel smartctl processes to collect the drive temperatures. This happens about every 300 seconds, and I can confirm live that the performance drops exactly while these calls are running:

Code:
root           1    0.0  0.0   5392   1036  -  ILs  28Feb20       0:00.90 - /sbin/init --
root          78    0.0  0.0   6348   2024  -  Is   28Feb20       0:00.00 |-- daemon: /usr/local/bin/middlewared[80] (daemon)
root          80    2.0  0.6 881196 782848  -  S    28Feb20     489:30.18 | `-- python3.7: middlewared (python3.7)
root       25523    2.3  0.0   8864   5040  -  DL   22:57         0:00.05 |   |-- smartctl -n never /dev/da6 -a
root       25522    1.9  0.0   8864   5040  -  D    22:57         0:00.04 |   |-- smartctl -n never /dev/da7 -a
root       25521    1.8  0.0   8864   5040  -  DL   22:57         0:00.05 |   |-- smartctl -n never /dev/da4 -a
root       25520    1.5  0.0   8864   5040  -  DL   22:57         0:00.05 |   |-- smartctl -n never /dev/da3 -a
root       25519    1.4  0.0   8864   5040  -  DL   22:57         0:00.05 |   |-- smartctl -n never /dev/da5 -a
root       25518    1.1  0.0   8864   5040  -  D    22:57         0:00.05 |   |-- smartctl -n never /dev/da2 -a
root       25517    1.0  0.0   8864   5040  -  D    22:57         0:00.05 |   |-- smartctl -n never /dev/da0 -a
root       25516    0.8  0.0   8864   5040  -  DL   22:57         0:00.04 |   |-- smartctl -n never /dev/da1 -a


In addition, I'm aware that the 9440-8i is not a real HBA even in JBOD mode, but is this massive performance hit a known effect with non-HBA controllers during parallel smartctl calls? (The calls themselves succeed and report temperature data for all drives.)

This effect does not depend on the load; it is just more obvious when there is more load on the node.
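If anyone wants to try reproducing it by hand, the middlewared burst boils down to something like this (untested sketch; device names taken from the process listing above), run while watching gstat and the dashboard:

Code:
# Fire the same smartctl calls middlewared issues, all in parallel
for d in da0 da1 da2 da3 da4 da5 da6 da7; do
    smartctl -n never -a /dev/${d} > /dev/null &
done
wait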
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Yeah, okay, well, now that makes more sense.

Turn off the SMART monitoring and I bet your problem clears up.

There are two things to consider here.

We want you to use a real HBA because the real HBA just passes stuff back and forth to the drives without trying to muck with it. The mrsas driver stack is known to be a bit quirky. I don't know that it IS the problem in this case, but it could well be. I have vague recollections of something like this in the past.

https://www.ixsystems.com/community...bas-and-why-cant-i-use-a-raid-controller.139/

The short form is that using the LSI HBA with IT firmware as specified isn't really an elective option.

The other possibility is that your SSD's are freezing when the SMART request is made. Since it's SATA, what I think you might try is to attach a single disk to a mainboard SATA port and see if it (on its own, in its own pool) exhibits the issue when directly connected. If so, the only remediations available would be to turn off SMART or to contact Intel and find out WTF, hoping for a firmware update in response.
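Very roughly, something like this (only a sketch; ada0 is assumed to be the directly attached scratch SSD, and compression is turned off so the dd actually hits the disk):

Code:
# Throwaway single-disk pool on the directly attached SSD (assumed ada0)
zpool create testpool ada0
zfs set compression=off testpool
# Generate sustained writes in the background
dd if=/dev/zero of=/testpool/junk bs=1M count=10000 &
# Watch gstat in another session, then see whether this stalls the I/O
time smartctl -n never -a /dev/ada0 > /dev/null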

I'm torn as to which thing is more likely to be broken. I know that's not much help, and I'm sorry.
 

icecoke

Cadet
Joined
Sep 6, 2012
Messages
3
Your help is quite fine - no problem!
I already tried to stop the SMART monitoring, but after a short look into the middlewared plugin code it looks like the temperature collection is independent of any SMART settings I can change in the GUI; it doesn't seem to be stoppable. Any hint for me other than my own crude solution of renaming the smartctl binary? ;)

If I do this (rename smartctl), will FreeNAS still be able to inform me about drive problems at all? Failures at the zpool/zfs level should still be picked up by the usual ZFS tools, right?

BTW: I ordered some new LSI HBA controllers on Thursday. They are LSI 9211-8i replicas - my next mistake, or should they be fine? And the firmware version you mentioned: is that a minimum, or does it have to be exactly that version?

Thanks again!
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
The 9211-8i's have been the most popular choice around here for many years. SATA has topped out at 6Gbps with no obvious future. It is MUCH faster than HDD's can manage, and SSD is headed towards NVMe (PCIe). This is one of those strange things where you get a "perfected" technology, much like gigabit ethernet, and there's nowhere to go with it, because all the newer options don't make as much sense. Unfortunately there will come a point at which LSI declares the 6Gbps stuff "obsolete" which is going to leave us in a bad position without a good replacement.

I wasn't aware that there were multiple paths in the middleware, so you already know a bit more than I do here. If it is polling for temperatures and you can't shut that off, I'd submit a bug report linking this thread, and I think your interim solution is correct enough. If that doesn't work, replace the smartctl executable with a small shell script that returns plausible output (i.e. a bunch of echo statements with "good" output).
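Something along these lines (untested sketch, just enough echo statements to keep a temperature poller happy; I haven't checked exactly which lines the middleware parses):

Code:
#!/bin/sh
# Stand-in for smartctl: rename the real binary to smartctl.real first,
# then install this in its place so the temperature polling returns instantly.
echo "smartctl 7.0 (local build)"
echo "SMART overall-health self-assessment test result: PASSED"
echo "194 Temperature_Celsius  0x0022  100  100  000  Old_age  Always  -  30"
exit 0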

Damaging SMART monitoring in this manner isn't great. FreeNAS is belt-and-suspenders, because scrubs will find data problems and this will often also shake out disk issues, but SMART can be used to run routine short and long tests, which are probably more comprehensive. The world won't end if you temporarily null out the SMART monitoring, but long term it is undesirable.
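The usual pool-level checks keep working either way, e.g. (pool name assumed):

Code:
# Quick health summary; prints "all pools are healthy" if nothing is wrong
zpool status -x
# Keep scrubbing on a schedule (or kick one off by hand)
zpool scrub tank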

You DEFINITELY want the 20.00.07.00 firmware for the 9211-8i.
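Once the cards arrive, sas2flash (if it's available on the box) will tell you what they shipped with, e.g.:

Code:
# List all LSI SAS2 controllers with their firmware and BIOS versions
sas2flash -listall
# More detail on the first controller
sas2flash -list -c 0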

You actually seem to have a good handle on this. Don't take that as encouragement to do anything dumb, but I think you're on a good path and I hope this all works out.
 