Persistent High CPU Usage / Disk Usage

Status
Not open for further replies.

jkingaround

Dabbler
Joined
Sep 12, 2016
Messages
13
Hi all,

I'm running FreeNAS 11.1-U6 and woke up in the middle of the night to a horrible beeping coming from my system. All the disk activity lights were lit solid, with very high CPU and disk usage. I checked "top" and the kernel was using 170% CPU or something ridiculous. I rebooted, but it came right back up to high CPU. I checked the debug log and saw a few messages regarding alert.py. I also saw various things using a lot of CPU: python 3.6, which looked like it was middlewared or something. I attempted a Save Debug (based on googling and seeing people recommend it to help find the error), but the web UI just sat there forever while my system happily beeped its ass off at me. I eventually shut everything down because the system was getting hot, so I'm wondering what I can provide or where I can look to help figure this out. Please help; my system is shut off until I can figure this out.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
If the system was hot to the point where your case was warm to the touch, I would suspect hardware failure. I would check to ensure that all fans are running, including in the power supply, and there is no dust/other buildup.

As far as what made it fire up the disks, a scheduled scrub is a possibility.

Can you please post complete hardware specifications? More detail is better, so include things that seem mundane like case and power supply models.
 

jkingaround

Dabbler
Joined
Sep 12, 2016
Messages
13
Case: SUPERMICRO 4U 846E16-R1200B
Mobo: X8DTE-F
RAM: 32 GB ECC
CPU: Dual Intel XEON L5520
Storage: 16 x 4TB WD Red in RAIDZ2 (2 pools of 8 drives each), 1 x 256GB Samsung 840 EVO SSD for boot
PSU: Corsair TX750

The fans have inline resistors added to quiet them down, but I haven't had any issues with the system getting warm or with the CPUs. Normally CPU usage is around 20% max, so it's very concerning to see it so high.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
How significant was the fan speed drop? I was under the impression that the inline-resistor method was only acceptable on a 2- or 3-pin direct DC-driven fan, as it doesn't play nicely with PWM. Software fan speed control should be used with PWM fans.

I'm assuming that a scrub was kicked off, which caused all of your drives to "light up" - in turn, this generated a lot of heat, and your system tried to spool up the fans, but the resistor may have limited the current and not allowed them to reach the necessary speed.

You may want to check the IPMI event log, which you should be able to do with the motherboard just connected to "standby power" rather than booting the system.
 

jkingaround

Dabbler
Joined
Sep 12, 2016
Messages
13
Thanks for the reply; however, I don't believe it to be an issue with the fans. I've had this setup for a year or two and run multiple scrubs, and nothing like this has ever happened. I believe it to be a software issue causing FreeNAS to spin its wheels endlessly trying to do something and eating all the CPU power. It occurs as soon as FreeNAS boots up, and no scrub seems to be currently running. The boot pauses for a very long time at "Starting Consul" and shoots the CPU up very high, but it doesn't seem to be tied to a specific process; htop lists it as just "kernel".
 

jkingaround

Dabbler
Joined
Sep 12, 2016
Messages
13
Took a closer look, and it turns out two scrubs (one on each pool) were running at the exact same time, which was too much for the system to handle. I killed them both and started just one, and it seems to be much better. I've updated my scrub schedule to space them far enough apart, so hopefully that fixes the issue. It looks like the scrubs were initiated before bootup finished and were causing the lag.

annoying.
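
For anyone landing here with the same symptoms, here is a minimal sketch of how overlapping scrubs could be spotted from the shell. It assumes the usual `zpool status` text layout, where each pool section has a `pool:` line followed by a `scan:` line that reads "scrub in progress" while a scrub is running. The pool names `tank1` and `tank2`, the sample text, and both function names are made up for illustration; they are not taken from the system in this thread.

```python
import re
import subprocess


def pools_scrubbing(status_text):
    """Return the names of pools whose section reports a scrub in progress."""
    scrubbing = []
    current_pool = None
    for line in status_text.splitlines():
        stripped = line.strip()
        match = re.match(r"pool:\s*(\S+)", stripped)
        if match:
            # A new pool section starts here.
            current_pool = match.group(1)
        elif stripped.startswith("scan:") and "scrub in progress" in stripped:
            if current_pool is not None:
                scrubbing.append(current_pool)
    return scrubbing


def read_zpool_status():
    """Run `zpool status` and return its text output (needs ZFS installed)."""
    result = subprocess.run(
        ["zpool", "status"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


# Illustrative sample only: hypothetical pool names and timestamps.
SAMPLE = """\
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:01:12 with 0 errors on Sun Nov 25 03:46:12 2018

  pool: tank1
 state: ONLINE
  scan: scrub in progress since Sun Dec  2 00:00:03 2018

  pool: tank2
 state: ONLINE
  scan: scrub in progress since Sun Dec  2 00:00:04 2018
"""

active = pools_scrubbing(SAMPLE)
if len(active) > 1:
    print("Scrubs overlapping on:", ", ".join(active))
```

On the NAS itself, `pools_scrubbing(read_zpool_status())` would check the live pools instead of the sample text. A running scrub can be stopped with `zpool scrub -s <pool>`, and the scrub schedule can then be adjusted so the pools don't start on the same day.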
 