Fan Scripts for Supermicro Boards Using PID Logic

Version 2020-08-20 (the previous one was missing a file)

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
Glorious1 submitted a new resource:

Fan Scripts for Supermicro Boards Using PID Logic - Scripts for fan fans and fanboys

Since motherboards have no way to access drive temperatures, they can’t really regulate them. The fan control scripts presented here read and respond to both drive and CPU temperatures. Such scripts are MUCH better than the control built into the boards. Here we also apply the magic of PID control. Mean drive temperature normally stays within 0.3 C of setpoint unless there is a disturbance, then within 0.5 C. The scripts ensure that your fans spin only as fast as needed to regulate...

 

nojohnny101

Wizard
Joined
Dec 3, 2015
Messages
1,478
Great resource! Has this been tested on anything other than Supermicro boards? I ask because I am wondering whether it is possible but just not tested, or whether something about the script is unique to Supermicro.
 

Glorious1

Thanks. To my knowledge they have never been tried on other boards, and I very much doubt they would work as is. Fan modes are likely named and numbered differently. The bigger issue is finding out what the ipmitool raw commands are for controlling and reading mode and duty cycle. I once looked for that info for another manufacturer (can't remember which) but couldn't find it. If you can find that info for your ASRock board, we could probably get it to work. Also, you should try running ipmitool sdr and see if you get a list of sensor outputs. If not I would need to know how to get that.

Just in case the raw commands happen to be the same as Supermicro's, try
ipmitool raw 0x30 0x45 0 to get the mode, and
ipmitool raw 0x30 0x70 0x66 0 0 to get the duty cycle.
If you get a response, it will be in hex (that doesn't matter for the mode, since it's a low number).
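
As a minimal sketch of decoding such a response: ipmitool raw typically prints each byte as two hex digits, so a duty cycle of 100% would come back as 64. The helper name hex_to_dec is made up for illustration, and the sample value is invented rather than read from a real board.

```shell
# Hypothetical helper: convert one hex byte, as printed by `ipmitool raw`,
# to decimal. Accepts "64", " 64", or "0x64".
hex_to_dec() {
    # Strip whitespace and an optional 0x prefix, then let printf parse the hex.
    h=$(printf '%s' "$1" | tr -d '[:space:]')
    h=${h#0x}
    printf '%d\n' "0x$h"
}

# Invented sample response. On a real system you would pass the command output:
#   hex_to_dec "$(ipmitool raw 0x30 0x70 0x66 0 0)"
hex_to_dec " 64"   # prints 100
```

The mode byte needs no conversion, since single-digit values read the same in hex and decimal.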
 

nojohnny101

Got it. I'll look more into it!
 

Glorious1

I just found this thread where someone asked ASRock for the raw commands. See the May 5 2015 post for the reply.
They only gave him commands for setting "PWM value" which I think is duty cycle, not for reading. One command works for all fans.

Code:
ipmitool raw 0x3a 0x01 0xAA 0xBB 0xCC 0xDD 0xEE 0xFF 0xGG 0xHH

AA: CPU_FAN1(PWM Value)
BB: Reserved(Set to 0)
CC: REAR_FAN1(PWM Value)
DD: Reserved(Set to 0)
EE: FRNT_FAN1(PWM Value)
FF: Reserved(Set to 0)
GG: Reserved(Set to 0)
HH: Reserved(Set to 0)

PWM Value:
00h -> smart fan mode
01h~64h -> manual fan mode (1%~100%)

So I think if you do
ipmitool raw 0x3a 0x01 0x64 0x00 0x64 0x00 0x64 0x00 0x00 0x00
it should run the fans to full speed. If you then change the 64's to 00's it should reset them to "smart fan mode". Whatever that is.
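
A small sketch of how that command could be assembled for any duty cycle, assuming the byte layout quoted above from ASRock (CPU_FAN1, REAR_FAN1, and FRNT_FAN1 in the 1st, 3rd, and 5th positions, reserved bytes set to 0). The function name asrock_fan_cmd is hypothetical, and it only prints the command rather than running it, since this is untested on real hardware.

```shell
# Hypothetical helper: print (do not run) the ASRock raw command that sets
# all three documented fan bytes to the same duty cycle.
# $1 = duty cycle in percent; 0 means "smart fan mode" per the ASRock reply.
asrock_fan_cmd() {
    pct="$1"
    if [ "$pct" -lt 0 ] || [ "$pct" -gt 100 ]; then
        echo "duty cycle must be 0-100" >&2
        return 1
    fi
    byte=$(printf '0x%02x' "$pct")   # e.g. 100 -> 0x64
    echo "ipmitool raw 0x3a 0x01 $byte 0x00 $byte 0x00 $byte 0x00 0x00 0x00"
}

asrock_fan_cmd 100   # prints the full-speed command shown above
asrock_fan_cmd 0     # prints the command that returns to smart fan mode
```

Printing the command first makes it easy to inspect before sending anything to the BMC.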

Here's a nice page showing it a bit differently. The system shown there has more fan headers, so the bytes "reserved" above are used to control additional fans. Unfortunately, again there is only control, no reading. Actually, you don't absolutely need to read the duty cycle, just set it.

Which one is your system like?

This page suggests you can read RPMs with the command ipmi-sensors. Can you verify that?

If these commands work for you, it should be possible to adapt one of the scripts. If I'm understanding correctly, each header is basically its own zone and can be controlled independently, which is interesting. It's quite a different setup.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I moved this to the hardware category, since it's more fitting.

Your resource has the honor of being the first to use the automagic discussion thread feature. I like how the resources plugin takes care of a lot of things; now I just have to figure out how to move the old resources to the new format.
 

Glorious1

Yes that was a nice surprise, thank you. The other surprise was that the discussion also showed up in the "New to FreeNAS?" forum, which may not be the most appropriate place. I guess that's because I gave it the 'Fundamentals' category tag; now it's in a more appropriate forum.
 

Ericloewe

Yeah, fundamentals is more basic, fundamental if you will. :p
 

Glorious1

I have a question for people knowledgeable about drives and temperature: is it better to let cool drives warm very slowly, or to reduce cooling to a minimum and let them warm to the setpoint more quickly?

In the single-zone script, we have to increase fans for whichever cooling demand is highest - CPU or drives. When CPU is used intensely for a long time, drives can get much cooler than the setpoint, because we are letting CPU dictate fan speeds.
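
To illustrate the single-zone constraint, here is a minimal sketch, not taken from the script itself: with only one fan zone, the duty cycle has to follow whichever demand is larger. The function name and the two demand values are invented for the example.

```shell
# Hypothetical helper: return the larger of two integer duty-cycle demands.
max_demand() {
    if [ "$1" -ge "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

# Invented values: a busy CPU asks for 80%, the drives only need 35%.
cpu_demand=80
drive_demand=35
fan_duty=$(max_demand "$cpu_demand" "$drive_demand")
echo "fan duty: ${fan_duty}%"   # prints: fan duty: 80%
```

With these numbers the drives get far more airflow than they asked for, which is how they end up below the setpoint.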

When the CPU goes back to a resting state, it cools very quickly. I won't get into the weeds with all the details, but the way the script is now, fan speed decreases very slowly, taking 5 or more drive cycles (maybe 25-30 minutes) to come down to a minimum where real demand holds it up again.

I have worked on a method to let the fans slow down much faster in this situation, so that they drop within a few minutes to the point where CPU demand will keep them up again. But just because I can do it doesn't mean I should.

Should I keep it the way it is now, and slow the fans very gradually?
 

moon

Dabbler
Joined
Jul 17, 2014
Messages
32
Glorious1

Thank you for your code. I'm looking at it and will test it soon.
I've been using Stux's and Kevin's ones for a while.

One of your comments says "Checking temperatures and state will reset the standby timer, but it won't spin up drives in standby. If you want your drives to spin down, you will need a drive check interval longer than, and maybe twice as long as, the standby time". I can confirm this.

Do you think there's a way to avoid standby timer reset?
My system is not accessed for most of its uptime, and I think drive spin-down is not a bad idea (spin up/down once a day on average). A reasonable standby time in my scenario is 30 to 60 minutes, and I don't think a drive check interval of 60 minutes or more is adequate.
 

Glorious1

No, I don't think there is a way to avoid the conflict between reading temperature and allowing standby. About all you can do is use 5 minutes as standby time and set the scripts to check temperature every 9-10 minutes. Much more than 10 minutes is getting too long of an interval for controlling drive temperature.

Or, don't use standby. I was all over it before, but eventually gave up. Even my pool that has no system dataset or logging and no jail storage will randomly spin up from time to time. There doesn't seem to be any rhyme or reason to it. You can see if the same thing happens in your system with spincheck.sh. It doesn't control fans, but it does read temperatures, so do the 5/10 thing while monitoring. I bet you'll see your drives spin up multiple times in a 24-hour period without you accessing them (especially at 2-3 am). FreeNAS is designed by and for people who leave their drives spinning all the time, so I think no thought is given to avoiding spinup in the system code.
 

zvans18

Dabbler
Joined
Sep 6, 2016
Messages
23
I'm having some trouble with the dual zone. It has a tendency to try to spin fans down to 6% every 50 minutes or so, even though I set the minimum to 30% and lowering the fan speed dramatically is unnecessary in the first place. Also, unless I'm missing something, the CPU and PER are controlled separately but display the same New_Fan%. And I'm not sure why SCSI stuff is showing up, but it doesn't seem to affect anything, so I haven't hunted it down or tried to filter it out.

Right now it's running as root via ssh on a smb share owned by a personal account,
Code:
[root@freenas] ~# /mnt/Z1/z/spinscripts/spinpid2.sh
/mnt/Z1/z/spinscripts/spinpid2.sh: line 17: /dev/fd/5: Operation not supported

Key to drive status symbols:  * spinning;  _ standby;  ? unknown                                  Version 2017-03-03

Monday, Apr 03                                                       CPU      New_Fan%  New_RPM_____________________
          ada2 ada3 ada4 ses0 Tmax Tmean   ERRc      P     I      D TEMP MODE  CPU PER  FANA  FAN1  FAN2  FAN3  FAN4
01:05:13  *34  *34  *36  _0   ^36  34.67  -0.33  -1.32  0.00  -2.64   31 Full   30  30   400   ---   600   900   600
01:12:04  *35  *35  *37  _0   ^37  35.67   0.67   2.68  0.00   8.00   34 Full   41  41   400   ---   800   900   800
01:18:55  *36  *36  *37  _0   ^37  36.33   1.33   5.32  0.00   5.28   34 Full   52  52   400   ---  1000   900  1100
01:25:46  *36  *36  *37  _0   ^37  36.33   1.33   5.32  0.00   0.00   33 Full   57  57   400   ---  1100   900  1100
01:32:37  *36  *36  *37  _0   ^37  36.33   1.33   5.32  0.00   0.00   37 Full   62  62   400   ---  1200   900  1200
01:39:29  *36  *35  *37  _0   ^37  36.00   1.00   4.00  0.00  -2.64   36 Full   63  63   400   ---  1200   900  1300
01:46:21  *36  *35  *37  _0   ^37  36.00   1.00   4.00  0.00   0.00   34 Full   67  67   400   ---  1300   900  1300
01:53:12  *35  *35  *37  _0   ^37  35.67   0.67   2.68  0.00  -2.64   32 Full    6   6   400   ---  1300   900  1300
02:00:03  *35  *35  *37  _0   ^37  35.67   0.67   2.68  0.00   0.00   34 Full   30  30   400   ---   600   900   700
02:06:54  *36  *36  *37  _0   ^37  36.33   1.33   5.32  0.00   5.28   33 Full   41  41   400   ---   800   900   800
02:13:46  *37  *36  *38  _0   ^38  37.00   2.00   8.00  0.00   5.36   33 Full   54  54   400   ---  1000   900  1100
02:20:37  *37  *36  *38  _0   ^38  37.00   2.00   8.00  0.00   0.00   33 Full   62  62   400   ---  1200   900  1200
02:27:29  *36  *36  *38  _0   ^38  36.67   1.67   6.68  0.00  -2.64   36 Full   66  66   400   ---  1200   900  1300
02:34:20  *36  *35  *37  _0   ^37  36.00   1.00   4.00  0.00  -5.36   34 Full   65  65   400   ---  1200   900  1300
02:41:11  *35  *35  *37  _0   ^37  35.67   0.67   2.68  0.00  -2.64   33 Full    6   6   400   ---  1200   900  1300
02:48:03  *35  *35  *37  _0   ^37  35.67   0.67   2.68  0.00   0.00   33 Full   30  30   400   ---   600   900   600
02:54:54  *36  *36  *37  _0   ^37  36.33   1.33   5.32  0.00   5.28   35 Full   41  41   400   ---   800   900   800
03:01:46  *37  *36  *38  _0   ^38  37.00   2.00   8.00  0.00   5.36   33 Full   54  54   400   ---  1000   900  1100
03:08:37  *36  *36  *38  _0   ^38  36.67   1.67   6.68  0.00  -2.64   33 Full   58  58   400   ---  1100   900  1200
03:15:32  *36  *36  *37  _0   ^37  36.33   1.33   5.32  0.00  -2.72   33 Full   61  61   400   ---  1200   900  1200
03:22:23  *36  *35  *37  _0   ^37  36.00   1.00   4.00  0.00  -2.64   32 Full   62  62   400   ---  1200   900  1200
03:29:15  *35  *35  *37  _0   ^37  35.67   0.67   2.68  0.00  -2.64   34 Full    6   6   400   ---  1200   900  1200
03:36:06  *35  *35  *37  _0   ^37  35.67   0.67   2.68  0.00   0.00   34 Full   30  30   400   ---   600   900   600
03:42:58  *36  *36  *37  _0   ^37  36.33   1.33   5.32  0.00   5.28   33 Full   41  41   400   ---   800   900   800
03:49:49  *37  *36  *38  _0   ^38  37.00   2.00   8.00  0.00   5.36   33 Full   54  54   400   ---  1000   900  1100

More details:
Code:
ipmitool sensor list all
CPU Temp         | 33.000     | degrees C  | ok    | 0.000     | 0.000     | 0.000     | 95.000    | 100.000   | 100.000
System Temp      | 32.000     | degrees C  | ok    | -9.000    | -7.000    | -5.000    | 80.000    | 85.000    | 90.000
Peripheral Temp  | 42.000     | degrees C  | ok    | -9.000    | -7.000    | -5.000    | 80.000    | 85.000    | 90.000
PCH Temp         | 49.000     | degrees C  | ok    | -11.000   | -8.000    | -5.000    | 90.000    | 95.000    | 100.000
VRM Temp         | 38.000     | degrees C  | ok    | -9.000    | -7.000    | -5.000    | 95.000    | 100.000   | 105.000
DIMMA1 Temp      | 31.000     | degrees C  | ok    | 1.000     | 2.000     | 4.000     | 80.000    | 85.000    | 90.000
DIMMA2 Temp      | 32.000     | degrees C  | ok    | 1.000     | 2.000     | 4.000     | 80.000    | 85.000    | 90.000
DIMMB1 Temp      | 31.000     | degrees C  | ok    | 1.000     | 2.000     | 4.000     | 80.000    | 85.000    | 90.000
DIMMB2 Temp      | 30.000     | degrees C  | ok    | 1.000     | 2.000     | 4.000     | 80.000    | 85.000    | 90.000
FAN1             | na         |            | na    | na        | na        | na        | na        | na        | na
FAN2             | 1100.000   | RPM        | ok    | 200.000   | 300.000   | 400.000   | 2000.000  | 2100.000  | 2200.000
FAN3             | 900.000    | RPM        | ok    | 200.000   | 300.000   | 400.000   | 1100.000  | 1200.000  | 1300.000
FAN4             | 1200.000   | RPM        | ok    | 200.000   | 300.000   | 400.000   | 2000.000  | 2100.000  | 2200.000
FANA             | 400.000    | RPM        | nc    | 200.000   | 300.000   | 400.000   | 1600.000  | 1700.000  | 1800.000
Vcpu             | 1.692      | Volts      | ok    | 1.242     | 1.260     | 1.395     | 1.899     | 2.088     | 2.106
VDIMM            | 1.453      | Volts      | ok    | 1.096     | 1.124     | 1.201     | 1.642     | 1.719     | 1.747
12V              | 12.051     | Volts      | ok    | 10.164    | 10.521    | 10.776    | 12.918    | 13.224    | 13.224
5VCC             | 4.969      | Volts      | ok    | 4.225     | 4.380     | 4.473     | 5.372     | 5.527     | 5.589
3.3VCC           | 3.359      | Volts      | ok    | 2.804     | 2.894     | 2.969     | 3.554     | 3.659     | 3.689
VBAT             | 2.985      | Volts      | ok    | 2.400     | 2.490     | 2.595     | 3.495     | 3.600     | 3.690
AVCC             | 3.344      | Volts      | ok    | 2.399     | 2.489     | 2.594     | 3.494     | 3.599     | 3.689
VSB              | 3.284      | Volts      | ok    | 2.399     | 2.489     | 2.594     | 3.494     | 3.599     | 3.689
Chassis Intru    | 0x0        | discrete   | 0x0000| na        | na        | na        | na        | na        | na

-FANA is stock CPU fan on a Cryorig H7 http://www.cryorig.com/h7_us.php#spec
-FAN2 is upper front intake A14 iPPC 2000 over 2 drives
-FAN3 is stock 3 pin Fractal (eventually another iPPC) on lower front intake over third, hot drive (should move it down a bay to even out the temps)
-FAN4 is rear exhaust A14 iPPC 2000
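
For reading the spinpid2.sh log above, the P and D columns appear consistent with P = 4 × ERRc and D = 8 × (change in ERRc since the previous row), with ERRc = Tmean minus a setpoint of 35 (the target mentioned later in the thread). The gains 4 and 8 are inferred from the logged numbers, not read from the script, and the function name pid_terms is made up. A minimal sketch for checking a row by hand:

```shell
# Assumed values: gains inferred from the log, setpoint as stated by the poster.
KP=4
KD=8
SETPOINT=35

# Hypothetical helper: $1 = current mean drive temperature, $2 = previous ERRc.
# Prints "ERRc P D", each to two decimals, the way the log shows them.
pid_terms() {
    awk -v t="$1" -v prev="$2" -v sp="$SETPOINT" -v kp="$KP" -v kd="$KD" 'BEGIN {
        err = t - sp
        printf "%.2f %.2f %.2f\n", err, kp * err, kd * (err - prev)
    }'
}

# Second log row (01:12:04): Tmean 35.67, previous ERRc -0.33.
pid_terms 35.67 -0.33   # prints: 0.67 2.68 8.00
```

The I column is 0.00 in every logged row, so it is left out of the sketch.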
 

Glorious1

I'm not able to check the code at the moment. But as I recall, the BMC reset code assumes there is a fan at header FAN1. Please try moving one of the fans there. You may have to set the thresholds for that header.

I assume ses4 is your 'SCSI stuff'? Please see the post for information on removing devices from the camcontrol list.

You posted an error. Does it appear and then the script carries on? Please explain.
 

zvans18

Ah, I'm assuming you're reading that from email, I've heavily edited the post and replaced the code section with what represents my current situation. BMC resets and other explicit errors have stopped so far after another power cycle, but I'll swap one to FAN1 anyway to rule it out and rerun the script. But when it was happening the script would continue, yes. Also will take another look at getting rid of ses0 (I just have 3 SATA drives).

My main complaint is that it sets the fan speed to the lowest every 50 minutes, causing drive temperature fluctuation and never actually reaching my target (currently 35). An additional complaint is that New_Fan% for CPU and PER are listed as the same, despite the CPU fan being correctly at its lowest speed (30%) because of the low CPU temperature.
 

Glorious1

Rather than me fully tracing through the code, let's see what happens when you hook a fan to FAN1. I think the whole script assumes there is one, as I assumed that would be the first header used.

Make sure the header thresholds match that fan. If that doesn't work, please post the settings part of the script so I can see your settings. And you are on a Supermicro board I assume.
 

boynep

Dabbler
Joined
Jan 9, 2012
Messages
29
I would just like to update everyone here in case someone is using the same board as I am. FAN3 and FAN4 do not seem to be controllable using the script. Therefore I have connected all my fans to FAN1 and FAN2 using splitters. It is working beautifully now.

For some reason both FAN3 and FAN4 were going full speed.
 

Glorious1

According to Supermicro, your board only has three 4-pin headers. I guess those are A, 1, and 2. If there are a 3 and 4, those must be 3-pin headers (no PWM).
 

Glorious1

@zvans18 , did switching to FAN1 fix your problems?
 

boynep

Amazon had X10SDV-2C-TLN2F-O as the model number. I thought the final "-O" was a typo from the seller, as I couldn't find any reference to it on the Supermicro website. It definitely has 4 PWM fan headers labelled 1-4. The top one is FAN1, the one at the back is FAN2, and next to the PCIe slot are FAN3 and FAN4. My CPU is passively cooled, so I installed a Noctua fan with rubber feet that just sits on top of it, connected to FAN4, which runs at full speed and cools the CPU great.
