Fan Scripts for Supermicro Boards Using PID Logic

Fan Scripts for Supermicro Boards Using PID Logic 2020-08-20, previous one was missing a file

lmannyr

Contributor
Joined
Oct 11, 2015
Messages
198
Noticed I have 2 instances of spinpid2 even though none were there before running the script once.
Code:
PID TT  STAT	TIME COMMAND																									 
19717 v0  Is+  0:02.04 /usr/local/bin/python /etc/netcli (python3.6)																
19718 v1  Is+  0:00.00 /usr/libexec/getty Pc ttyv1																				 
19719 v2  Is+  0:00.00 /usr/libexec/getty Pc ttyv2																				 
19720 v3  Is+  0:00.00 /usr/libexec/getty Pc ttyv3																				 
19721 v4  Is+  0:00.00 /usr/libexec/getty Pc ttyv4																				 
19722 v5  Is+  0:00.00 /usr/libexec/getty Pc ttyv5																				 
19723 v6  Is+  0:00.00 /usr/libexec/getty Pc ttyv6																				 
19724 v7  Is+  0:00.00 /usr/libexec/getty Pc ttyv7																				 
 6946  0- S+   0:00.12 bash /mnt/drive/Scripts/spinpid2.sh																		   
 6947  0- I+   0:00.00 bash /mnt/drive/Scripts/spinpid2.sh																		   
 6948  0- I+   0:00.00 tee -i -a /mnt/drive/Scripts/spinpid2.log																	 
 8657  2  Ss   0:00.01 bash																										 
 8865  2  R+   0:00.00 ps	
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,367
Of course, it doesn't make much sense to try to cool a CPU to 30C.

Perhaps you should aim for something more realistic, like 45 or 50C? Any temperature at which the CPU can still reach maximum turbo is fine.

IIRC, I actually aim for <75C on my Xeon D.

And yes, I need the ability to spin up the HD fans to help with CPU cooling in my systems if I'm *really* hammering the CPU (i.e., mprime torture tests).
 

lmannyr

Contributor
Joined
Oct 11, 2015
Messages
198
The script is resetting on the peripheral fans, though. Do you think my 35C CPU is the cause of the script restarting?
 

lmannyr

Contributor
Joined
Oct 11, 2015
Messages
198
Update to my BMC loop reset from above.

I reverted to v2017-01-08. That version worked perfectly for about a year until I updated to the current version. I also tried the two versions before the current one, and they behaved the same way (an endless BMC reset loop). Maybe the reset is too sensitive(?) compared to v2017-01-08. I liked the new version until it started resetting the BMC. I'm willing to do some testing again with you if you want to figure this out. I guess I'm the only one having this issue? Is anyone else with the X10SL7-F motherboard using the current script without issue?

I think it may be caused by a 100 rpm variance in one of the three peripheral fans. All three fans are the same brand and model, but one tends to run about 100 rpm faster than the other two. I'm not sure why, but maybe the script senses that and resets the BMC in an attempt to correct it.

Anyhow, willing to test if you want.
 

Bennyhaha68

Cadet
Joined
Jan 4, 2017
Messages
2
Thank you for this great script!

All 3 scripts on the forum seem to query the 10th item in line 221 in spinpid2.sh.

I am using HGST Ultrastar 7K6000 4TB drives and needed to change line 221 for the script to work properly. I commented out the original and added mine below it:

Code:
#TEMP=$( grep "Temperature_Celsius" /var/tempfile | awk '{print $10}')
TEMP=$( grep "Current Drive Temperature" /var/tempfile | awk '{print $4}')


I wanted to share in case anyone else wanted to use the script with these drives.

Thank you.

Log output:

Code:
****** SETTINGS ******
CPU zone 1; Peripheral zone 0
CPU fans min/max duty cycle: 30/100
PER fans min/max duty cycle: 30/100
CPU fans - measured RPMs at 30 0x1.45p+10nd 100 1900uty cycle: /
PER fans - measured RPMs at 30 0x1.c2p+9nd 100 2800uty cycle: /
Drive temperature setpoint (C): 35
Kp=4, Ki=0, Kd=40
Drive check interval (main cycle; minutes): 5
CPU check interval (seconds): 2
CPU reference temperature (C): 30
CPU scalar: 6

Key to drive status symbols:  * spinning;  _ standby;  ? unknown							  Version 2018-01-01 

Sunday, Feb 11																 CPU		 New_Fan%  New_RPM_____________________ 
		  da0  da1  da2  da3  da4  cd0  Tmax Tmean   ERRc	  P	 I	  D TEMP MODE	CPU PER   FANA  FAN1  FAN2  FAN3  FAN4
12:25:16  *39  *39  *39  *39  *39  _0   ^39  39.00   4.00  16.00  0.00  32.00   31 Full	 50  98   1700  2800  2800  2800  2100
12:31:22  *37  *36  *36  *36  *36  _0   ^37  36.20   1.20   4.80  0.00 -22.40   29 Full	 30  80   1300  2300  2300  2300  2100
12:37:24  *36  *35  *35  *35  *35  _0   ^36  35.20   0.20   0.80  0.00  -8.00   32 Full	 42  73   1600  2100  2100  2200  2100
12:43:28  *36  *34  *35  *35  *35  _0   ^36  35.00   0.00   0.00  0.00  -1.60   32 Full	 42  71   1500  2100  2100  2100  2100
12:49:33  *35  *34  *34  *34  *34  _0   ^35  34.20  -0.80  -3.20  0.00  -6.40   32 Full	 42  61   1500  1800  1800  1800  2100
12:55:36  *36  *34  *34  *34  *34  _0   ^36  34.40  -0.60  -2.40  0.00   1.60   31 Full	 36  60   1400  1800  1800  1800  2100
13:01:39  *35  *34  *34  *34  *34  _0   ^35  34.20  -0.80  -3.20  0.00  -1.60   30 Full	 30  55   1300  1700  1700  1700  2100


My fans on FAN4 are two 80mm fans at the exhaust of my case, and they are not PWM.
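
The P and D columns in the log above can be reproduced by hand, which may help when tuning Kp and Kd. The numbers are consistent with P = Kp * error, D = Kd * (error - previous error) / interval, and a new duty cycle equal to the old one plus P + D, clamped to the 30/100 limits. The sketch below is a reconstruction from the log, not code taken from spinpid2.sh. The function name `pid_step` is hypothetical, and the I term is left out because Ki=0 in these settings.

```bash
# Hypothetical helper reconstructed from the log columns; not part of spinpid2.sh.
# Usage: pid_step <error> <previous_error> <previous_duty>
# Prints: P D new_duty
pid_step() {
    awk -v e="$1" -v pe="$2" -v duty="$3" \
        -v kp=4 -v kd=40 -v dt=5 -v dmin=30 -v dmax=100 'BEGIN {
        p = kp * e                 # proportional term
        d = kd * (e - pe) / dt     # derivative term, dt = drive check interval (minutes)
        new = duty + p + d         # incremental update of the previous duty cycle
        if (new > dmax) new = dmax # clamp to the configured min/max duty cycle
        if (new < dmin) new = dmin
        printf "%.2f %.2f %.0f\n", p, d, new
    }'
}
```

For example, `pid_step 1.20 4.00 98` gives the P, D and PER values of the 12:31:22 row (4.80, -22.40 and 80).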
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
Interesting, thanks for sharing. If someone had a mix of drives including those HGSTs, it would be a real pain. You would need to choose the appropriate way to retrieve the temperature based on detecting drive model.
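
One way to handle a mix of drives without detecting the model is to try both output formats and take whichever matches. This is only a sketch: the function name `parse_drive_temp` is hypothetical, and it assumes the two smartctl output formats quoted in the post above are the only ones that matter.

```bash
# Hypothetical helper: read smartctl attribute output on stdin and print
# the temperature, whichever of the two known formats is present.
parse_drive_temp() {
    awk '
        /Temperature_Celsius/       { print $10; found = 1; exit }  # typical SATA attribute row
        /Current Drive Temperature/ { print $4;  found = 1; exit }  # format reported for the HGST drives
        END { if (!found) exit 1 }                                  # non-zero status if neither matched
    '
}
```

In the script this would be used as something like `TEMP=$(parse_drive_temp < /var/tempfile)`.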

By the way, if you want to fix that error in lines 4-5 of the Settings printout, just add another '%' after 30 and 100 in lines 313 and 314 of the script, so "30%% and 100%%". That will be fixed in next version.
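
For anyone wondering why the doubled '%' is needed: in a printf format string a bare '%' starts a conversion, so "30% and" is read as a %a conversion and "100% duty" as a %d conversion, which is where the garbled Settings lines come from. A minimal illustration follows; the function name and the exact wording of the format string are assumptions, not the script's actual lines 313 and 314.

```bash
# Hypothetical stand-in for the Settings line; "%%" prints a literal percent sign.
settings_line() {
    printf "CPU fans - measured RPMs at 30%% and 100%% duty cycle: %d/%d\n" "$1" "$2"
}
```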
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
I never heard if you got the fan header thresholds all set correctly. I would still think that's the problem. Also make certain which fans are connected to which headers so you're not setting thresholds for the wrong header. If it's still resetting, make the thresholds more liberal, so higher max and lower min, for the zone that is causing the reset.
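
As a rough sketch of what setting more liberal thresholds looks like with ipmitool: `sensor thresh` takes the three lower values (non-recoverable, critical, non-critical) and the three upper values (non-critical, critical, non-recoverable). The wrapper name, the `IPMITOOL` override for dry runs, and the RPM values in the usage line are all assumptions; pick values that suit your own fans.

```bash
# Hypothetical wrapper around "ipmitool sensor thresh".
# Usage: set_fan_thresholds <header> <lnr> <lcr> <lnc> <unc> <ucr> <unr>
# Set IPMITOOL=echo to print the commands instead of running them.
set_fan_thresholds() {
    local fan=$1 lnr=$2 lcr=$3 lnc=$4 unc=$5 ucr=$6 unr=$7
    ${IPMITOOL:-ipmitool} sensor thresh "$fan" lower "$lnr" "$lcr" "$lnc"
    ${IPMITOOL:-ipmitool} sensor thresh "$fan" upper "$unc" "$ucr" "$unr"
}
```

For example, `set_fan_thresholds FAN1 100 200 300 2900 3000 3100` (illustrative numbers only), repeated for each header in the zone that triggers the reset.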

I can look into differences between 2017-01-08 and current version in the reset logic. But I don't really have any ideas to base testing on at this point. I haven't heard of this being a widespread problem.
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
@lmannyr, I looked at the 2017-01-08 version of spinpid2.sh. It looks like the BMC reset was aspirational at that point, and that code was commented out. It was developed further later. The BMC reset in the latest version is lines 423-436. If you feel it is safe (at your own risk), you could comment that out or adjust the values used. Just monitor temps and fans for a while after doing that and make sure everything seems right.
 

Grinchy

Explorer
Joined
Aug 5, 2017
Messages
78
First of all, thank you for this great script!

How do you start this script? If I set it as a postinit task, it stops my NAS from booting because the script never ends, so the boot cannot continue.
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
That is odd. I do start spinpid.sh as a postinit task, and it works fine. Have you tried it?
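
If a post-init task does block the boot on a particular system, a common shell workaround is to start the script detached in the background so the task returns immediately. This is a generic sketch, not something the script requires; the function name `start_detached` is hypothetical.

```bash
# Hypothetical helper: run a command in the background, immune to hangups,
# with its terminal output discarded, and return to the caller right away.
start_detached() {
    nohup "$@" >/dev/null 2>&1 &
}
```

The post-init command would then be something like `start_detached /path/to/spinpid2.sh` (path is a placeholder). The script's own log file is still written, since only terminal output is discarded.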
 

cadamwil

Explorer
Joined
Sep 6, 2013
Messages
60
I have got the scripts to run, but I am often seeing
Code:
Unable to send RAW command (channel=0x0 netfn=0x30 lun=0x0 cmd=0x70 rsp=0xcc): Invalid data field in request
and the fan speeds do not seem to be changing.
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
Which script is this?
I'm not enough of a server-head to tell what the items in your signature are. Which is the motherboard? Is it a server-grade board? Does it have IPMI?
 

cadamwil

Explorer
Joined
Sep 6, 2013
Messages
60
My motherboard is a Supermicro X9DRi-LN4F+. It has IPMI. I receive that alert when running spintest.sh and spincheck.sh. Also, keep in mind that I have limited Unix/Linux knowledge. The script doesn't seem to be creating or filling logs, so I may be running it with the wrong permissions.
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
Thanks. And you're running it as root or with sudo I guess?
spintest.sh generates no log, it just outputs to the terminal.
However spincheck.sh should generate a log in the directory specified near the top of the file, which you would need to edit:
Code:
# Creates logfile and sends all stdout and stderr to the log,
# leaving the previous contents in place. If you want to append to existing log,
# add '-a' to the tee command.
# Change to your desired log location/name:
LOG=/mnt/MyPool/MyDataSet/MyDirectory/spincheck.log

I think the execute permissions survive upload and download. But not totally sure. The permissions should look like this:
Code:
-rwxr-xr-x  1 <your username> wheel   8.9K Dec 29 15:52 spincheck.sh

If you get it to run and produce a log, please post it so I can have a look.
We may have to research if that board has different raw commands than most of them.
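
A quick way to rule out the two usual suspects (a script that is not executable, or a log location that cannot be written) is a small pre-flight check like the one below. It is only a sketch; the function name `check_script_setup` is hypothetical.

```bash
# Hypothetical pre-flight check: is the script executable, and can the
# directory that will hold the log be written to?
check_script_setup() {
    local script=$1 log=$2 logdir
    logdir=$(dirname "$log")
    if [ ! -x "$script" ]; then
        echo "not executable (try chmod 755): $script"
        return 1
    fi
    if [ ! -w "$logdir" ]; then
        echo "log directory not writable: $logdir"
        return 1
    fi
    echo "ok"
}
```

For example, `check_script_setup ./spincheck.sh /mnt/MyPool/MyDataSet/MyDirectory/spincheck.log`, run as the same user that will run the script.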
 

cadamwil

Explorer
Joined
Sep 6, 2013
Messages
60
So, I corrected the log issue. I think it was either the permissions on the .sh file or my path to the log file location, which I changed.
I am wondering if part of my issue is that I have more than 4 fans: I have FAN1-FAN6 and then FANA-B. The fans I currently use are FAN1, 2, 3, 5 and 6. Of course this may not be an issue, but either way I would like to control those fans too. Thanks for your help.
 

Attachments

  • spincheck.txt
    1.5 KB · Views: 371

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
FWIW, I asked them... and according to Supermicro technical support, the X9 series motherboards do not support the IPMI raw command used in these scripts to set the zone duty cycle.
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
Wow, @Spearfoot really came through. Don't know how you get such a quick and helpful response from Supermicro!

So then the question is, does it support alternative raw commands to read and set the duty cycle (spincheck does no setting, only reading)? @cadamwil, if you want to research that, and the answer is yes, it shouldn't be too hard to modify the scripts accordingly.

FYI, it shouldn't matter how many fans there are in a zone. They should all be controlled together.
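
For researching this, a small probe that runs a candidate raw command and reports whether the BMC accepted it can save some time. This is a sketch: the function name `probe_ipmi` is hypothetical, and the byte sequence in the comment is an assumption based on the duty-cycle read commonly cited for X10/X11 boards, not something confirmed for X9.

```bash
# Hypothetical probe: run the given command quietly and report whether it
# succeeded. ipmitool exits non-zero on errors such as
# "Invalid data field in request".
probe_ipmi() {
    if "$@" >/dev/null 2>&1; then
        echo "supported"
    else
        echo "not supported"
    fi
}

# Example (assumed byte sequence for reading the zone 0 duty cycle):
#   probe_ipmi ipmitool raw 0x30 0x70 0x66 0x00 0x00
```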
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Supermicro tech support will answer polite questions, it just takes them a day or two or three to respond... I asked about this last April.

You can indeed set the fan mode on the X9-series motherboards: full-speed, standard, optimal, heavy-IO.

I wrote a simple script that I run as a cron job to poll the CPU and HDD temps and set the mode appropriately. Here are my notes on the fan modes:
Code:
# Normally, one sets the fan mode on a Supermicro system to either STANDARD,
# OPTIMAL, or HEAVY I/O and the BMC will then adjust the fan duty cycle according
# to CPU temperature only, without taking hard drive temperatures into account.
#
# In FULL SPEED mode, all fans run at 100%. CPU (and hard drive) temperature
# monitoring are irrelevant as there is nothing more to be done when it comes
# to cooling the system. We're spinning as fast as we can, Cap'n!
#
# In STANDARD, OPTIMAL, and HEAVY I/O modes the CPU and peripheral zone fans
# run at the target rates shown below, with the CPU zone rate varying from the
# target rate up to 100%, depending on CPU temperature. The peripheral zone rate
# is fixed in all cases.
#
#                  Target rates:
# Modes:           CPU zone  Peripheral zone
# -------------    --------  ---------------
# 00  Standard       50%          50%
# 01  Full          100%         100%
# 02  Optimal        30%          30%
# 04  Heavy I/O      50%          75%
#
# IPMI raw commands:
#
# Get fan mode: raw 0x30 0x45 0x00
# Set fan mode: raw 0x30 0x45 0x01 [x] 
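
A minimal sketch of how such a polling script might map temperatures to one of the modes in the notes above. It is not the actual cron script: the function names and every temperature threshold are assumptions chosen for illustration, and only the `raw 0x30 0x45 0x01 [x]` command and the mode numbers come from the notes.

```bash
# Hypothetical mode selection; all thresholds are illustrative only.
# Usage: choose_fan_mode <cpu_temp_C> <hdd_temp_C>   (integers)
choose_fan_mode() {
    local cpu_temp=$1 hdd_temp=$2
    if [ "$cpu_temp" -ge 70 ] || [ "$hdd_temp" -ge 42 ]; then
        echo 1    # Full
    elif [ "$hdd_temp" -ge 38 ]; then
        echo 4    # Heavy I/O (higher peripheral zone rate)
    elif [ "$cpu_temp" -ge 50 ]; then
        echo 0    # Standard
    else
        echo 2    # Optimal
    fi
}

# Send the "set fan mode" raw command from the notes.
# Set IPMITOOL=echo to print the command instead of running it.
set_fan_mode() {
    ${IPMITOOL:-ipmitool} raw 0x30 0x45 0x01 "$(printf '0x%02x' "$1")"
}
```

A cron job would then call something like `set_fan_mode "$(choose_fan_mode "$cpu" "$hdd")"` after reading the two temperatures.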
 

cadamwil

Explorer
Joined
Sep 6, 2013
Messages
60
How did you get the raw commands to tweak the fan PWM rates? Would emailing Supermicro help? I have also asked for the whitepaper on the Winbond WPCM450, which I think is where the fan control lives on my motherboard.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Well, you're not going to be able to set the PWM (pulse-width modulation) duty cycle on X9 motherboards; they don't support this functionality. I wanted to do so, which is why I verified with Supermicro.

All you can do is set the basic 'fan mode', as I described above, using the standard IPMI commands described and used by @Glorious1 and many others here on the forum.

This all takes place at the IPMI level, meaning that it won't help you much to delve into the gory technical details of your mobo's Winbond WPCM450 chip.
 