The PID script (or a variant of it) is available for use with a Corsair Commander Pro (and maybe some of their other fan controllers, now that OpenCorsairLink is more or less complete).
Does anyone know if a script like this one has been created for consumer boards without IPMI?
I don't know why, but I can never get cpan to install anything without hours of messing around chasing upstream dependencies, so if you can do it without the dependencies, that would be excellent...
I fixed a few bugs in the script already. It (still) seems to work for me. I should mention that I added dependencies on two Perl modules, Proc::Daemon and IPC::Run. If people find this particularly onerous, I can probably get rid of the dependencies.
I forked Kevin's repo as well and committed all my changes to the new repo:
GitHub - roburban/nas_fan_control: collection of scripts to control fan speed on NAS boxes
There is definitely more work to be done. An incomplete list:
- get rid of all the chaff (functions no longer used)
- revamp the config file format:
  - use standard config-file format ("key = value")
  - possibly use different "profiles" with one marked active as a substitute for zillions of different .ini files
- variable-name cleanup -- use standard Perl naming convention
- use hashes for passing groups of parameters back and forth instead of individual values
-rob
I just put a version of it out there: https://www.ixsystems.com/community...k-in-a-jail-to-control-fans.71873/post-601335 for anyone interested to think about.
I'm working on a collective variant which should support OpenCorsairLink, ASRock and Supermicro fan control modes and will log to InfluxDB in addition to the logfile.
egrep '^[a]*da[0-9]+\$'
to egrep '^[a]*da[0-9]'
my $command = "/usr/local/sbin/smartctl -A $disk_dev | grep Temperature_Celsius";
to my $command = `/usr/local/sbin/smartctl -A $disk_dev | grep "Drive Temperature"`;
my $output = `$command`;
to my $output = $command;
to avoid double quoting.
my $temp = "$vals[9]"
to my $temp = "$vals[3]"
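The two field indexes above follow from how the two kinds of smartctl lines split on whitespace. Here is a minimal Python sketch of that (the script itself is Perl; `parse_drive_temp` is a name made up for illustration). The SATA sample line is taken from the smartctl output posted later in this thread. The SAS sample line is an assumption about what `grep "Drive Temperature"` typically returns, so check it against your own drives.

```python
def parse_drive_temp(line):
    """Return the temperature in deg C from one grep'ed `smartctl -A` line."""
    fields = line.split()
    if "Temperature_Celsius" in line:
        # SATA attribute row: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED
        # WHEN_FAILED RAW_VALUE, so the raw temperature is field 9.
        return int(fields[9])
    if "Drive Temperature" in line:
        # SAS form (assumed): "Current Drive Temperature:     34 C",
        # so the temperature is field 3.
        return int(fields[3])
    raise ValueError("no temperature found in: " + line)


sata_line = "194 Temperature_Celsius 0x0002 180 180 000 Old_age Always - 36 (Min/Max 18/47)"
sas_line = "Current Drive Temperature:     34 C"  # assumed sample, not from this thread

print(parse_drive_temp(sata_line), parse_drive_temp(sas_line))
```

Handling both forms in one place would avoid having to edit the index by hand when mixing SATA and SAS disks.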
PID Fan Controller Log --- Target 4 Disk HD Temperature = 38.00 deg C --- PID Control Gains: Kp = 5.333, Ki = 0.000, Kd = 48.0
Max Ave Temp Fan Fan Fan % CPU P I D Fan
2021-08-08 da0 da1 da2 da3 da4 da5 da6 da7 da8 da9 da10 Temp Temp Err Mode RPM Old/New Temp Corr Corr Corr Duty
22:01:27 34 34 33 34 36 33 34 34 32 37 34 ^37 35.25 -2.75 Full 900 20/20 39 -22.00 0.00 16.00 20.00%
22:02:57 34 34 33 34 34 33 34 34 32 37 34 ^37 34.75 -3.25 Full 900 20/20 40 -26.00 -0.00 -16.00 20.00%
22:04:28 34 34 33 34 34 33 34 34 32 37 34 ^37 34.75 -3.25 Full 900 20/20 37 -26.00 -0.00 0.00 20.00%
22:05:57 33 34 33 34 34 33 34 34 32 37 34 ^37 34.75 -3.25 Full 900 20/20 40 -26.00 -0.00 0.00 20.00%
22:07:28 34 34 33 34 34 33 34 34 32 37 34 ^37 34.75 -3.25 Full 900 20/20 38 -26.00 -0.00 0.00 20.00%
22:08:58 34 34 33 34 34 33 34 34 32 37 34 ^37 34.75 -3.25 Full 900 20/20 38 -26.00 -0.00 0.00 20.00%
22:10:27 34 34 33 34 34 33 34 34 32 37 34 ^37 34.75 -3.25 Full 900 20/20 42 -26.00 -0.00 0.00 20.00%
22:11:57 34 34 33 34 36 33 34 34 32 37 34 ^37 35.25 -2.75 Full 900 20/20 40 -22.00 -0.00 16.00 20.00%
22:13:27 34 34 33 34 34 33 34 34 32 37 34 ^37 34.75 -3.25 Full 900 20/20 41 -26.00 -0.00 -16.00 20.00%
22:14:57 34 34 33 34 34 33 34 34 32 37 34 ^37 34.75 -3.25 Full 900 20/20 40 -26.00 -0.00 0.00 20.00%
22:16:27 34 34 33 34 36 33 34 34 32 37 34 ^37 35.25 -2.75 Full 900 20/20 39 -22.00 -0.00 16.00 20.00%
22:17:57 34 34 33 34 34 33 34 34 32 37 34 ^37 34.75 -3.25 Full 900 20/20 39 -26.00 -0.00 -16.00 20.00%
PID Fan Controller Log --- Target 6 Disk HD Temperature = 39.00 deg C --- PID Control Gains: Kp = 5.333, Ki = 0.000, Kd = 48.0
Max Ave Temp Fan Fan Fan % CPU P I D Fan
2021-08-13 da0 da1 da2 da3 da4 da5 da6 da7 da8 da9 da10 Temp Temp Err Mode RPM Old/New Temp Corr Corr Corr Duty
20:00:09 38 37 36 38 39 37 38 38 34 38 37 ^39 38.17 -0.83 Full 2200 10/10 40 -6.67 -0.00 0.00 10.00%
20:01:39 38 38 36 38 39 37 38 38 34 38 37 ^39 38.17 -0.83 Full 2200 10/10 43 -6.67 -0.00 0.00 10.00%
20:03:10 38 37 36 38 39 37 38 38 34 38 37 ^39 38.17 -0.83 Full 2200 10/10 43 -6.67 -0.00 0.00 10.00%
20:04:39 38 38 36 38 39 37 38 38 34 39 37 ^39 38.33 -0.67 Full 2200 10/10 42 -5.33 -0.00 5.33 10.00%
20:06:09 38 38 36 38 39 37 38 38 34 38 37 ^39 38.17 -0.83 Full 2200 10/10 44 -6.67 -0.00 -5.33 10.00%
20:07:39 38 38 36 38 39 37 38 38 34 38 37 ^39 38.17 -0.83 Full 2200 10/10 40 -6.67 -0.00 0.00 10.00%
20:09:09 38 38 36 38 39 37 38 38 34 39 37 ^39 38.33 -0.67 Full 2200 10/10 43 -5.33 -0.00 5.33 10.00%
20:10:39 38 37 36 38 39 37 38 38 34 38 37 ^39 38.17 -0.83 Full 2200 10/10 42 -6.67 -0.00 -5.33 10.00%
20:12:09 38 37 36 38 39 37 38 38 34 38 37 ^39 38.17 -0.83 Full 2200 10/10 42 -6.67 -0.00 0.00 10.00%
20:13:39 38 38 36 38 39 37 38 38 34 38 37 ^39 38.17 -0.83 Full 2200 10/10 43 -6.67 -0.00 0.00 10.00%
20:15:09 38 37 36 38 39 37 38 38 34 38 37 ^39 38.17 -0.83 Full 2200 10/10 41 -6.67 -0.00 0.00 10.00%
20:16:39 38 37 36 38 39 37 38 38 34 38 37 ^39 38.17 -0.83 Full 2200 10/10 41 -6.67 -0.00 0.00 10.00%
# set_fan_mode("full");
$hd_fan_duty_high = 100; # percentage on, ie 100% is full speed.
$hd_fan_duty_med_high = 74;
$hd_fan_duty_med_low = 48;
$hd_fan_duty_low = 22; # some 120mm fans stall below 30.
"Temp Err" is the difference between the target HD temperature and the average HD temperature. The last line of your log has "-2.50", which means the average HD temperature is 2.5 degrees colder than the target of 36.5.
I've read the thread linked in the script and most of the posts on the thread from Stux.
But I still have some questions. $debug is set to 4 and I want to understand the log:
PID Fan Controller Log --- Target HD Temperature = 36.50 deg C --- PID Control Gains: Kp = 5.333, Ki = 0.000, Kd = 120.0
Max Ave Temp Fan Fan Fan % CPU P I D Fan
2022-03-19ada0 ada1 ada2 ada3 Temp Temp Err Mode RPM Old/New Temp Corr Corr Corr Duty
18:00:21 33 33 36 34 ^36 34.00 -2.50 Opt 200 22/22 37 -40.00 0.00 0.00 22.00%
18:03:20 33 33 36 34 ^36 34.00 -2.50 Opt 200 22/22 35 -40.00 -0.00 0.00 22.00%
18:06:20 33 33 36 34 ^36 34.00 -2.50 Opt 200 22/22 36 -40.00 0.00 0.00 22.00%
18:09:20 33 34 36 34 ^36 34.25 -2.25 Opt 200 22/22 35 -36.00 -0.00 10.00 22.00%
18:12:20 33 33 36 34 ^36 34.00 -2.50 Opt 200 22/22 36 -40.00 -0.00 -10.00 22.00%
18:15:20 33 33 36 34 ^36 34.00 -2.50 Opt 200 22/22 37 -40.00 -0.00 0.00 22.00%
18:18:20 33 33 36 34 ^36 34.00 -2.50 Opt 200 22/22 36 -40.00 -0.00 0.00 22.00%
18:21:20 33 33 36 34 ^36 34.00 -2.50 Opt 200 22/22 37 -40.00 -0.00 0.00 22.00%
18:24:20 33 34 36 34 ^36 34.25 -2.25 Opt 200 22/22 36 -36.00 -0.00 10.00 22.00%
18:27:20 33 34 36 34 ^36 34.25 -2.25 Opt 200 22/22 36 -36.00 -0.00 0.00 22.00%
18:30:20 33 33 36 34 ^36 34.00 -2.50 Opt 200 22/22 36 -40.00 -0.00 -10.00 22.00%
18:33:20 33 33 36 34 ^36 34.00 -2.50 Opt 200 22/22 36 -40.00 -0.00 0.00 22.00%
18:36:20 33 33 36 34 ^36 34.00 -2.50 Opt 200 22/22 36 -40.00 -0.00 0.00 22.00%
18:39:20 33 34 36 34 ^36 34.25 -2.25 Opt 200 22/22 38 -36.00 -0.00 10.00 22.00%
18:42:21 33 33 36 34 ^36 34.00 -2.50 Opt 200 22/22 37 -40.00 0.00 -10.00 22.00%
18:45:21 33 33 36 34 ^36 34.00 -2.50 Opt 200 22/22 35 -40.00 -0.00 0.00 22.00%
What exactly are the columns "Temp Err", "P", "I" and "D" telling me?
Yes, "P", "I" and "D" stand for proportional, integral and derivative, but what is the consequence for the fan control?
There are also gain values in the script ("Kp", "Ki", "Kd"), and I don't know if or when I need to change them.
One last thing: the "Fan RPM" column shows the wrong value. The fans are actually spinning at 500-600 RPM but are shown as 200.
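On the P and D columns: the numbers in the logs posted in this thread are consistent with P = Kp x Temp Err x T and D = Kd x (change in Temp Err) / T, where T is the polling interval in minutes, and with the new duty being the old duty plus P + I + D, limited to the configured minimum and maximum. The Python sketch below reproduces a few logged values under that reading. It is inferred from the log output only, not copied from the Perl script, and the function names and the rounding are assumptions.

```python
def pid_terms(kp, kd, err, prev_err, interval_min):
    """P and D corrections as they appear in the log (Ki = 0, so I is 0)."""
    p = kp * err * interval_min
    d = kd * (err - prev_err) / interval_min
    return p, d


def next_duty(old_duty, p, i, d, duty_min, duty_max):
    """Apply the summed correction and clamp to the allowed duty range."""
    return max(duty_min, min(duty_max, round(old_duty + p + i + d)))


# 18:09:20 line: Kp = 5.333, Kd = 120, readings 3 minutes apart,
# Temp Err moved from -2.50 to -2.25.
p, d = pid_terms(5.333, 120.0, -2.25, -2.50, 3.0)
print(f"P = {p:.2f}, D = {d:.2f}")

# The disks are cooler than the target, so the correction is negative,
# but the duty is already at the 22% floor and stays there.
print(next_duty(22, p, 0.0, d, 22, 100))
```

In other words, a negative "Temp Err" keeps pushing the duty cycle down until it reaches the minimum, and a positive one would raise it. Larger Kp makes the response to a given error stronger, and larger Kd makes it react more to how fast the error is changing.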
PID Fan Controller Log --- Target HD Temperature = 36.50 deg C --- PID Control Gains: Kp = 5.333, Ki = 0.000, Kd = 120.0
Max Ave Temp Fan Fan Fan % CPU P I D Fan
2022-03-20ada0 ada1 ada2 ada3 Temp Temp Diff Mode RPM Old/New Temp Corr Corr Corr Duty
15:26:04 33 33 36 34 ^36 34.00 -2.50 Opt 500 100/60 46 -40.00 -0.00 0.00 60.00%
15:29:04 33 33 36 34 ^36 34.00 -2.50 Opt 1900 60/22 35 -40.00 0.00 0.00 22.00%
15:32:04 33 32 36 34 ^36 33.75 -2.75 Opt 600 22/22 35 -44.00 -0.00 -10.00 22.00%
15:35:04 33 33 36 34 ^36 34.00 -2.50 Opt 600 22/22 37 -40.00 -0.00 10.00 22.00%
15:38:04 33 33 36 34 ^36 34.00 -2.50 Opt 600 22/22 34 -40.00 -0.00 0.00 22.00%
I tested the script and it worked initially. The fans started low, and after I stressed the CPU the fans were set to high. But after I stopped stressing the CPU, the fans didn't reset to low. I also noticed that the CPU temps are displayed wrong in the log file.
I saw this in stdout after I stopped stressing the CPU:
`Unable to send RAW command (channel=0x0 netfn=0x30 lun=0x0 cmd=0x70 rsp=0xcc): Invalid data field in request`
I'm not sure which ipmitool command it is that fails. Unfortunately, it isn't logged.
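One way to find out which command fails would be to log the full ipmitool invocation just before it runs. Below is a minimal Python sketch of that idea (the script itself is Perl). It assumes the duty cycle is set with `raw 0x30 0x70 0x66 0x01 <zone> <duty>`, which is the form commonly used on Supermicro X10-class boards but is not quoted anywhere in this thread; only the read form with `0x00` appears here. The function names are made up for illustration.

```python
import logging
import subprocess

log = logging.getLogger("fan_control")


def build_set_duty_command(zone, duty):
    """Build the (assumed) Supermicro raw command that sets a zone's duty cycle."""
    if zone not in (0, 1):
        raise ValueError(f"zone must be 0 or 1, got {zone}")
    if not 0 <= duty <= 100:
        # A byte outside the accepted range is one possible cause of
        # "Invalid data field in request", so reject it before sending.
        raise ValueError(f"duty must be 0..100, got {duty}")
    return ["ipmitool", "raw", "0x30", "0x70", "0x66", "0x01",
            f"0x{zone:02x}", f"0x{duty:02x}"]


def set_duty(zone, duty):
    """Log the exact command, run it, and log stderr if it fails."""
    cmd = build_set_duty_command(zone, duty)
    log.info("running: %s", " ".join(cmd))
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        log.error("command failed (%d): %s", result.returncode, result.stderr.strip())
    return result.returncode == 0


print(" ".join(build_set_duty_command(1, 22)))
```

With the command in the log next to the error, it should be possible to see which byte the BMC rejected.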
This is my PID_fan_control.log with the wrong CPU temp numbers.
```
PID Fan Controller Log --- Target 4 Disk HD Temperature = 38.00 deg C --- PID Control Gains: Kp = 2.667, Ki = 0.000, Kd = 30.0
Max Ave Temp Fan Fan Fan % CPU P I D Fan
2022-04-15ada0 ada1 ada2 ada3 ada4 da0 Temp Temp Err Mode RPM Old/New Temp Corr Corr Corr Duty
15:37:28 36 38 37 35 37 ^38 37.00 -1.00 Full 1250 36/56 37 -4.00 -0.00 0.00 56.00%
15:39:00 35 38 37 35 37 ^38 36.75 -1.25 Full 975 56/46 53 -5.00 -0.00 -5.00 46.00%
15:40:30 35 37 37 35 37 ^37 36.50 -1.50 Full 725 46/35 61 -6.00 -0.00 -5.00 35.00%
15:42:00 35 37 37 35 37 ^37 36.50 -1.50 Full 650 35/29 59 -6.00 -0.00 0.00 29.00%
15:43:28 35 37 37 35 37 ^37 36.50 -1.50 Full 1275 29/23 42 -6.00 -0.00 0.00 23.00%
15:44:58 35 37 36 34 36 ^37 36.00 -2.00 Full 1775 23/16 37 -8.00 -0.00 -10.00 16.00%
15:46:28 35 36 36 34 36 ^36 35.75 -2.25 Full 1775 16/16 37 -9.00 -0.00 -5.00 16.00%
15:47:58 35 36 35 34 35 ^36 35.25 -2.75 Full 1775 16/16 37 -11.00 -0.00 -10.00 16.00%
15:49:28 34 35 35 33 35 ^35 34.75 -3.25 Full 1775 16/16 35 -13.00 -0.00 -10.00 16.00%
15:50:58 34 35 35 33 34 ^35 34.50 -3.50 Full 1775 16/16 37 -14.00 -0.00 -5.00 16.00%
15:52:28 34 35 34 33 34 ^35 34.25 -3.75 Full 1775 16/16 35 -15.00 -0.00 -5.00 16.00%
15:53:58 33 34 34 32 34 ^34 33.75 -4.25 Full 1775 16/16 34 -17.00 -0.00 -10.00 16.00%
15:55:29 33 34 34 32 34 ^34 33.75 -4.25 Full 1775 16/16 34 -17.00 -0.00 0.00 16.00%
15:56:58 33 34 34 32 33 ^34 33.50 -4.50 Full 1775 16/16 35 -18.00 -0.00 -5.00 16.00%
15:58:28 33 33 33 32 33 ^33 33.00 -5.00 Full 750 16/16 36 -20.00 -0.00 -10.00 16.00%
15:59:59 33 34 33 32 33 ^34 33.25 -4.75 Full 725 16/16 36 -19.00 -0.00 5.00 16.00%
```
In DEBUG_PID_fan_control.log the CPU temp numbers are all right:
```
32.0
32.0
32.0
33.0
32.0
32.0
35.0
35.0
2022-04-15 16:19:52: core_temp = 32.0 C
2022-04-15 16:19:52: core_temp = 32.0 C
2022-04-15 16:19:52: core_temp = 32.0 C
2022-04-15 16:19:52: core_temp = 33.0 C
2022-04-15 16:19:52: core_temp = 32.0 C
2022-04-15 16:19:52: core_temp = 32.0 C
2022-04-15 16:19:52: core_temp = 35.0 C
2022-04-15 16:19:52: core_temp = 35.0 C
2022-04-15 16:19:52: CPU Temp: 35.0
2022-04-15 16:19:52: CPU Fan: low
2022-04-15 16:19:53: core_temps:
31.0
31.0
34.0
34.0
32.0
32.0
35.0
35.0
2022-04-15 16:19:53: core_temp = 31.0 C
2022-04-15 16:19:53: core_temp = 31.0 C
2022-04-15 16:19:53: core_temp = 34.0 C
2022-04-15 16:19:53: core_temp = 34.0 C
2022-04-15 16:19:53: core_temp = 32.0 C
2022-04-15 16:19:53: core_temp = 32.0 C
2022-04-15 16:19:53: core_temp = 35.0 C
2022-04-15 16:19:53: core_temp = 35.0 C
2022-04-15 16:19:53: CPU Temp: 35.0
2022-04-15 16:19:53: CPU Fan: low
```
Can someone help me out here?
Unfortunately the extracts from the log and the debug log cover different times, so it is not possible to compare them. The log extract shows CPU temps from 34 to 37 for most of the time, with a short-duration spike up to 61. Reviewing the code, the CPU temps in the log have the same source as the CPU temps in the debug log, so it is not obvious why they should differ.
I think the bigger issue is to figure out why the RAW command failed, as that is likely why the fan control stopped working. Which motherboard does your system have?
I have an X10SLL-F.
Hmm. That board should work. One more puzzle which might shed some light on the problem: the log shows headers for six HDs (ada0..ada4 and da0), but I only see five HD temperatures in each line. How many HDs does your system have?
Sorry for the late response. ada0 - 4 are my hard drives. da0 is my boot flash drive.
And which FreeNAS or TrueNAS version are you running? What is the make and model of the HDs?
camcontrol devlist
for n in 0 1 2 3 4; do /usr/local/sbin/smartctl -A /dev/ada$n | grep Temperature_Celsius; done
ipmitool raw 0x30 0x70 0x66 0x00 0x00
ipmitool raw 0x30 0x70 0x66 0x00 0x01
<WDC WD80EFAX-68LHPN0 83.H0A83> at scbus0 target 0 lun 0 (pass0,ada0)
<WDC WD80EZAZ-11TDBA0 83.H0A83> at scbus2 target 0 lun 0 (pass1,ada1)
<WDC WD80EZAZ-11TDBA0 83.H0A83> at scbus3 target 0 lun 0 (pass2,ada2)
<WDC WD80EZAZ-11TDBA0 83.H0A83> at scbus4 target 0 lun 0 (pass3,ada3)
<WDC WD80EZAZ-11TDBA0 83.H0A83> at scbus5 target 0 lun 0 (pass4,ada4)
<AHCI SGPIO Enclosure 2.00 0001> at scbus6 target 0 lun 0 (ses0,pass5)
<SanDisk Extreme 1.00> at scbus8 target 0 lun 0 (da0,pass6)
# for n in 0 1 2 3 4; do /usr/local/sbin/smartctl -A /dev/ada$n | grep Temperature_Celsius; done
194 Temperature_Celsius 0x0002 180 180 000 Old_age Always - 36 (Min/Max 18/47)
194 Temperature_Celsius 0x0002 166 166 000 Old_age Always - 39 (Min/Max 17/47)
194 Temperature_Celsius 0x0002 175 175 000 Old_age Always - 37 (Min/Max 18/46)
194 Temperature_Celsius 0x0002 185 185 000 Old_age Always - 35 (Min/Max 17/46)
194 Temperature_Celsius 0x0002 171 171 000 Old_age Always - 38 (Min/Max 17/47)
# ipmitool raw 0x30 0x70 0x66 0x00 0x00
24
# ipmitool raw 0x30 0x70 0x66 0x00 0x01
16
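A note on reading the two replies: ipmitool prints raw response bytes in hexadecimal, so they are not 24 and 16 in decimal. Here is a tiny Python sketch of the conversion. That these bytes are the current duty cycle in percent for zones 0 and 1 is an assumption based on how this raw command is usually described for Supermicro boards, and `decode_raw_byte` is a made-up helper name.

```python
def decode_raw_byte(output):
    """Convert the first byte of `ipmitool raw` output from hex text to an int."""
    return int(output.strip().split()[0], 16)


# The two replies shown above, for the requests ending in 0x00 and 0x01.
print(decode_raw_byte(" 24"), decode_raw_byte(" 16"))
```

Read that way, the replies correspond to 36 and 22 in decimal.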