Jacopx
Patron
- Joined
- Feb 19, 2016
- Messages
- 367
No clue what you're doing wrong. It works for me. /srhug.
I have solved the problem! How can I calculate the average temperature of my 8-core CPU? Can someone help me?
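For the averaging question, here is a minimal sketch. It assumes `sysctl dev.cpu` prints one line per core in the form `dev.cpu.N.temperature: 45.0C`. The function name `average_cpu_temp` is made up for illustration; it reads the sysctl lines on stdin so it can be tried with sample data first.

```shell
#!/bin/sh
# average_cpu_temp: hypothetical helper, not part of any script posted here.
# Reads sysctl-style lines on stdin and prints the mean core temperature.
average_cpu_temp () {
    awk -F': ' '
        /^dev\.cpu\.[0-9]+\.temperature/ {
            sub(/C$/, "", $2)   # strip the trailing unit: "45.0C" -> "45.0"
            sum += $2
            n++
        }
        END { if (n > 0) printf "%.1f\n", sum / n }
    '
}

# Assumed usage on the NAS itself:
#   sysctl dev.cpu | average_cpu_temp
```

Because it only counts lines that match the temperature pattern, it should not matter whether the machine has 4, 8 or more cores.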
#! /usr/local/bin/bash

# Write email header to temp file
(
echo "Subject: System Temperatures INFO"
echo " "
) > /var/temp_report

# Define adastat function, which writes drive activity to temp file
adastat () {
    CM=$(camcontrol cmd $1 -a "E5 00 00 00 00 00 00 00 00 00 00 00" -r - | awk '{print $10}')
    if [ "$CM" = "FF" ] ; then
        echo " SPINNING" >> /var/temp_report
    elif [ "$CM" = "00" ] ; then
        echo " IDLE" >> /var/temp_report
    else
        echo " UNKNOWN ($CM)" >> /var/temp_report
    fi
}

# Write some general information
echo System Temperatures - `date` >> /var/temp_report
cat /etc/version >> /var/temp_report
uptime | awk '{ print "\nSystem Load:",$10,$11,$12,"\n" }' >> /var/temp_report

# Write CPU temperatures
echo "CPU Temperature:" >> /var/temp_report
sysctl -a | egrep -E "cpu\.[0-9]+\.temp" >> /var/temp_report
echo >> /var/temp_report

# Write HDD temperatures and status
echo "HDD Temperature:" >> /var/temp_report
for i in $(sysctl -n kern.disks | awk '{for (i=NF; i!=0 ; i--) if(match($i, '/ada/')) print $i }' )
do
    echo -n $i: `smartctl -a /dev/$i | awk '/Temperature_Celsius/{DevTemp=$10;} /Serial Number:/{DevSerNum=$3}; /Device Model:/{DevVendor=$3; DevName=$4} \
    END {printf "%s C - %s %s (%s) - ", DevTemp,DevVendor,DevName,DevSerNum }'` >> /var/temp_report;
    adastat $i;
done

# Send status email
sendmail my_email_address@gmail.com < /var/temp_report
rm /var/temp_report
exit 0
smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.0-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Hitachi/HGST Travelstar Z7K500
Device Model:     HGST HTS725050A7E630
Serial Number:    RCF50ACE27JNPM
LU WWN Device Id: 5 000cca 85edf9c2c
Firmware Version: GS2OA3C0
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Jul 6 16:31:26 2017 EEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (  45) seconds.
Offline data collection
capabilities:                    (0x51) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  99) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   062    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0025   100   100   040    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0023   240   100   033    Pre-fail  Always       -       1
  4 Start_Stop_Count        0x0032   094   094   000    Old_age   Always       -       10377
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002f   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0025   100   100   040    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       831
 10 Spin_Retry_Count        0x0033   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       67
183 Runtime_Bad_Block       0x0032   100   100   001    Old_age   Always       -       0
184 End-to-End_Error        0x0033   100   100   097    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   070   055   045    Old_age   Always       -       30 (Min/Max 29/42)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       3
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       655370
193 Load_Cycle_Count        0x0032   089   089   000    Old_age   Always       -       111196
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   100   100   000    Old_age   Always       -       0
223 Load_Retry_Count        0x002a   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%         4         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
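Note that this drive lists attribute 190 (`Airflow_Temperature_Cel`) and has no `Temperature_Celsius` line, so an awk pattern that only matches `Temperature_Celsius` would come back empty for it. A possible workaround, as a sketch only: match on the attribute ID instead, preferring 194 and falling back to 190. The function name `drive_temp` is hypothetical, and it assumes the raw value is in the tenth column, as in the attribute table above.

```shell
#!/bin/sh
# drive_temp: hypothetical helper, reads `smartctl -A` output on stdin.
# Prints the raw value of attribute 194 if present, otherwise attribute 190.
drive_temp () {
    awk '
        $1 == 194 { t194 = $10 }
        $1 == 190 { t190 = $10 }
        END {
            if (t194 != "")      print t194
            else if (t190 != "") print t190
        }
    '
}

# Assumed usage:
#   smartctl -A /dev/ada0 | drive_temp
```

For the output above this should give 30, which is the raw value on the `Airflow_Temperature_Cel` line.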
#!/bin/bash
# Optimized for FreeBSD (FreeNAS etc.)
# Save it on your share (not on the system drive!) and run as a regular task (cron) from FreeNAS GUI (as /bin/bash /path/to/script.sh), e.g. every 30 minutes.
# Email will be sent only if a threshold is met, otherwise there is no output.
# The script was verified against https://www.shellcheck.net/ to ensure correct syntax.
# To-do: Display more about the drive, such as model and serial number, so it is easier to identify it!
# Original source that was heavily edited: https://www.reddit.com/r/freenas/comments/6pmewl/temps_and_fan_speeds_in_freenas/

HDD_ALERT_LEVEL="35"
CPU_ALERT_LEVEL="43"
NOTIFY_EMAIL_ADDRESS="abc@efg.com"

# Pre-defining variables
cpu_alert_trigger=0
hdd_alert_trigger=0

# Write email header to temp file
(
echo "To: $NOTIFY_EMAIL_ADDRESS"
echo "Subject: CPU/HDD Temperature Warning"
echo " "
echo "Warning! The following components in your system are above the set temperature threshold!"
echo " "
) > /tmp/cpu_hdd_temp_check.log

# Check CPU temperature
COUNT_CPU=$(sysctl -a | grep -c "dev.cpu.[0-9].\temperature")

# Save CPU's temp to a variable array starting from 1 (not zero), so the count is increased by 1
for ((cput=1;cput<"$COUNT_CPU+1";cput++))
do
    CPU_TEMP[$cput]=$(sysctl dev.cpu | grep temperature | awk '{print $2}' | awk -F'[^0-9]*' '$0=$1' | awk "NR==$cput")
    #echo "CPU no.$cput has temperature of ${CPU_TEMP[$cput]}"
    CPU_DETAILED_REPORT="$CPU_DETAILED_REPORT CPU no.$cput = ${CPU_TEMP[$cput]} Celsius"
    if [ "${CPU_TEMP[$cput]}" -ge "$CPU_ALERT_LEVEL" ]; then
        #echo "CPU number $cput is at or over the limit!"
        cpu_alert_trigger="1"
    fi
done

# Check HDD temperature
for disk in $(sysctl -n kern.disks)
do
    HDTEMP=$(smartctl -A /dev/"$disk" | grep -i temperature | awk '{print $10}')
    echo "Temp of $disk is $HDTEMP."
    if [[ "$HDTEMP" -ge 1 ]]; then
        DETAILED_REPORT="$DETAILED_REPORT Temperature of $disk is $HDTEMP."
        if [[ "$HDTEMP" -ge "$HDD_ALERT_LEVEL" ]]; then
            DRIVES_OVER_LIMIT="$DRIVES_OVER_LIMIT $disk"
            hdd_alert_trigger="1"
        fi
    fi
done

# Set alert text if one of the CPU temperatures was reached.
if [ $cpu_alert_trigger -eq "1" ]; then
    echo "CPU temperature is at or over the limit of $CPU_ALERT_LEVEL:$CPU_DETAILED_REPORT" >> /tmp/cpu_hdd_temp_check.log
    echo " " >> /tmp/cpu_hdd_temp_check.log
fi

# Set alert text if one of the HDDs temperature was reached.
if [ $hdd_alert_trigger -eq "1" ]; then
    echo "These HDDs are over the temperature limit of $HDD_ALERT_LEVEL:$DETAILED_REPORT" >> /tmp/cpu_hdd_temp_check.log
    echo " " >> /tmp/cpu_hdd_temp_check.log
fi

# Send out an email if one of the checked parameters were reached.
if [ $cpu_alert_trigger -eq "1" ] || [ $hdd_alert_trigger -eq "1" ]; then
    #echo "Sending out email..."
    sendmail -t < /tmp/cpu_hdd_temp_check.log
fi

# Clean-up
rm /tmp/cpu_hdd_temp_check.log
exit
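A side note on the `for ((...))` loop in that script: it is a bash extension. If the file ends up being run by a plain `sh` (for example as `sh script.sh`), that line is the kind of thing that typically produces `Syntax error: Bad for loop variable`. Apart from making sure the script really runs under bash, one option is a loop that works in any POSIX shell. A rough sketch follows; the function name `cpus_over_limit` and the stdin-based interface are my own choices for illustration, and the line format `dev.cpu.N.temperature: 45.0C` is assumed.

```shell
#!/bin/sh
# cpus_over_limit: hypothetical helper using only POSIX sh features.
# Reads sysctl-style lines on stdin; $1 is the alert level in whole degrees.
# Prints one line for every core at or above that level.
cpus_over_limit () {
    limit=$1
    while IFS=': ' read -r name value; do
        case $name in
            dev.cpu.*.temperature) ;;   # only look at temperature lines
            *) continue ;;
        esac
        temp=${value%%.*}   # drop the decimals: "45.0C" -> "45"
        temp=${temp%C}      # drop a bare unit:  "45C"   -> "45"
        if [ "$temp" -ge "$limit" ]; then
            echo "$name = $temp Celsius"
        fi
    done
}

# Assumed usage:
#   sysctl dev.cpu | cpus_over_limit 43
```

An empty result means no core reached the limit, so the caller can simply test whether the output is non-empty before sending mail.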
I implemented the script that craniu3000bis posted above, and it works fine. However, I have two questions:
1. Regarding the temperature reading from the SATA SSD (ada2 in the list below): is this really correct, or is it just a dummy value returned for SSDs?
2. Is there a logical explanation for why one disk (ada1) stands out with a temperature 3 degrees above the rest? This seems to be constant, as it was the same yesterday when I ran the script.
HDD Temperature:
ada0: 38 C - HGST HUS724040ALE640 (PK1334PCK2X89S) - SPINNING
ada1: 41 C - HGST HUS724040ALE640 (PK1334PCK2WGBS) - SPINNING
ada2: 99 C - SATA SSD (67F407431F2400011724) - SPINNING
ada3: 38 C - HGST HUS724020ALA640 (PN2134P6KL6M1X) - SPINNING
ada4: 37 C - HGST HUS724020ALA640 (PN2134P6KK60EX) - SPINNING
Hi,
I have modified the script above to be set up as a cron job that only sends emails if the temperature goes above a set threshold. I prefer to be notified only when something is going on, rather than receiving regular emails that require no action.
Thank you for this. However, I'm getting a
Code:
# Check CPU temperature
COUNT_CPU=$(sysctl -a | grep -c "dev.cpu.[0-9].\temperature")
# Save CPU's temp to a variable array starting from 1 (not zero), so the count is increase by 1
for ((cput=1;cput<"$COUNT_CPU+1";cput++))
CPU_Temp-monitor.sh: 31: Syntax error: Bad for loop variable
error when trying to run it. I'm on 11.2-U3 and my processor is a Xeon E3-1270 v3. Suggestions?
Sorry to necro this thread, but I'm having a hard time editing the script to show temperatures of NVMe disks.
Do they report smartctl output?
Yes, they do.
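If the NVMe disks do show up in smartctl, their output has no attribute table. As far as I know, the health section has a line like `Temperature:    36 Celsius` instead, so the `$10` column trick from the scripts above will not work there. A small sketch under that assumption; `nvme_temp` is a made-up name and the device path in the usage comment is only an example.

```shell
#!/bin/sh
# nvme_temp: hypothetical helper, reads `smartctl -a` output of an NVMe
# device on stdin and prints the number from the first "Temperature:" line.
nvme_temp () {
    awk '/^Temperature:/ { print $2; exit }'
}

# Assumed usage (device name is an example only):
#   smartctl -a /dev/nvme0 | nvme_temp
```

The pattern is anchored to the start of the line so that lines such as `Temperature Sensor 1:` are skipped.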