HDD Temps Abruptly Stopped Reporting

supersean

Cadet
Joined
May 7, 2020
Messages
6
Hi Guys,

I'm kind of at a standstill here, any guidance would be very helpful. Even a compass direction :)

After about 3/4/2020, my disk temps were no longer being reported on the dashboard. I read that allowing the disks to spin down and turn off could be part of the reason (and also that there's no need to do so), so I disabled that setting. I browsed through the logs and couldn't find anything that would be immediately alarming, but I'm not very familiar with FreeNAS yet.

I appreciate any help. I'm happy to post any information that would prove helpful.

Thanks.
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925

supersean

Cadet
Joined
May 7, 2020
Messages
6
Yeah, I read that thread, and NAS-103898 doesn't really address what I'm seeing... unless it does? I'll add a screenshot for clarity:
1588953170634.png
 

supersean

Cadet
Joined
May 7, 2020
Messages
6
Oops, I don't see a way to edit my post above. Basically, even a pointer to where the logs live would be nice. I'm not familiar with FreeNAS yet, so I don't know, for example, where the time series data for disk temps is stored (I assume that's how it's stored), what writes it, or what reads the temperatures from the disks. If anyone could pass along a few pointers about which logs to check, I'd be very happy to try to find my own solution.
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
OK, so you have the display for the temperatures but it's not updating ... (I didn't get that from your first post).

I don't have the answers to your questions.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Where's your system dataset? If it's on the boot zpool, you may be out of space. All the graph data is stored in /var/db/collectd/rrd/localhost.
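
A minimal sketch of how both points could be checked from the shell, assuming the RRD directory named above and that collectd keeps its graph data there in `.rrd` files. The helper name `stale_rrds` and the 60-minute default are illustrative and not part of FreeNAS:

```shell
#!/bin/sh
# Illustrative helper (hypothetical, not a FreeNAS tool): list .rrd files
# under a directory that have NOT been modified in the last N minutes,
# i.e. graph data that appears to have stopped updating.
stale_rrds() {
    dir="$1"
    minutes="${2:-60}"
    find "$dir" -type f -name '*.rrd' -mmin "+$minutes"
}

# Example usage on the NAS (path taken from the post above):
#   df -h /var/db/collectd/rrd/localhost        # free space on the backing filesystem
#   stale_rrds /var/db/collectd/rrd/localhost 60
```

If the disk temperature files show up as stale while other metrics are still fresh, the problem is more likely in how temperatures are collected than in free space.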
 

supersean

Cadet
Joined
May 7, 2020
Messages
6
Thanks guys for the advice so far! Samuel, that's a good point... I checked, and here's what I'm seeing for freenas-boot:

Code:
[root@nas /]# zpool list
NAME           SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
freenas-boot   119G  6.42G   113G        -         -      -     5%  1.00x  ONLINE  -


I'm going to poke around some more. I appreciate your input, guys. Just knowing collectd is involved gets me started...
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Also, look through /var/log/middlewared.log to see if middlewared flagged anything around the time the reporting stopped.
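
A rough sketch of one way to narrow that search, assuming middlewared tags each line with a level in parentheses such as `(WARNING)` or `(ERROR)`. The helper name `log_problems` is hypothetical:

```shell
#!/bin/sh
# Illustrative helper (hypothetical): print only WARNING/ERROR lines that
# mention a keyword, from one or more log files.
# Assumes each log line carries a level tag such as "(WARNING)".
log_problems() {
    keyword="$1"
    shift
    grep -h -E '\((WARNING|ERROR)\)' "$@" | grep -- "$keyword"
}

# Example usage, including rotated logs:
#   log_problems collectd /var/log/middlewared.log*
```

Filtering by level first cuts out routine DEBUG and INFO lines, so anything that failed around the time reporting stopped is easier to spot.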
 

supersean

Cadet
Joined
May 7, 2020
Messages
6
Thanks for the quick response... I'm also wondering -- would a hostname change mess up graphing for any reason? I did change the hostname recently...o_O

Below is a snippet of the logs:

Code:
[root@nas /var/log]# grep collectd middlewared.log* | grep "2020/03/23"
middlewared.log.1:[2020/03/23 18:00:35] (DEBUG) EtcService.generate():274 - No new changes for /etc/local/collectd.conf
middlewared.log.1:[2020/03/23 18:00:37] (DEBUG) EtcService.generate():274 - No new changes for /etc/local/collectd.conf
middlewared.log.1:[2020/03/23 18:05:57] (DEBUG) ServiceService._simplecmd():287 - Calling: restart(collectd)
middlewared.log.1:[2020/03/23 18:05:57] (WARNING) ServiceService._system():309 - Command '/usr/sbin/service collectd-daemon onestop ' failed with code 1: b'collectd_daemon not running? (check /var/run/collectd-daemon.pid).\n'
middlewared.log.1:[2020/03/23 18:05:59] (INFO) EtcService.generate_all():283 - Skipping collectd group generation
middlewared.log.1:[2020/03/23 22:03:31] (DEBUG) ServiceService._simplecmd():287 - Calling: restart(collectd)
middlewared.log.1:[2020/03/23 22:03:31] (WARNING) ServiceService._system():309 - Command '/usr/sbin/service collectd-daemon onestop ' failed with code 1: b'collectd_daemon not running? (check /var/run/collectd-daemon.pid).\n'
middlewared.log.1:[2020/03/23 22:03:33] (INFO) EtcService.generate_all():283 - Skipping collectd group generation



And as some food for thought, I checked the temperature graphs today... It looks like all of the disks started showing temperature data on 2020/05/05 at 12:30 Pacific time -- except for one! To reiterate from my initial post: the only changes I've made were on the 7th, which (to paraphrase) involved disabling the disks' power-saving settings.
 

supersean

Cadet
Joined
May 7, 2020
Messages
6
To continue from the bottom of my recent post -- the last disk that wasn't showing temperature data DID have power-saving settings turned on. I think this is because when I disabled the settings on the 7th (or was it the 5th? o_O ) I performed a bulk-edit from the WebGUI on all 7 disks. 6 are HDDs, but the disk in question is an SSD.

I went back and checked when I made changes to the disk temp settings ... I'm confident I changed them on the 5th, but you don't see any disk temp data in May in the screenshot I posted above! Here's a screenshot today:
1589483994275.png


I think the lesson I've learned (again) is to write down every change I make and when I make it. I appreciate you guys for helping me on this. I double-checked the manual too...
1589484205434.png
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Changing hostnames shouldn't affect collectd, so long as you did it through the GUI. Check /etc/hosts to see whether the new name took effect; it should appear at the bottom of the file.
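
A small sketch of that check, assuming a standard hosts file layout with `#` comments. The helper name `hosts_has_name` is hypothetical:

```shell
#!/bin/sh
# Illustrative helper (hypothetical): succeed if a name appears as a whole
# word on a non-comment line of a hosts file (default: /etc/hosts).
hosts_has_name() {
    name="$1"
    file="${2:-/etc/hosts}"
    grep -v '^[[:space:]]*#' "$file" | grep -q -w -F -- "$name"
}

# Example usage:
#   hosts_has_name "$(hostname)" && echo "hostname present in /etc/hosts"
#   tail -n 5 /etc/hosts    # eyeball the entries at the bottom of the file
```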

Bulk edits have bitten me before. I only edit individual disks now.
 