Disks temperature reporting showing the same data points

Joined
May 14, 2023
Messages
8
Hello,

I'm having a bit of a weird problem on my TrueNAS SCALE system: the disk temperature graphs are displayed in the TrueNAS web interface, but the data points never change (i.e. the max/mean/min disk temperatures are all the same and never move), despite the different readings I get through smartctl.

I've tried stopping collectd and rrdcached, removing all rrd files from /var/db/system/rrd-randomchars/localhost, and restarting the services. The graphs regenerate, but the temperature data points still don't update. I've also tried disabling and re-enabling the S.M.A.R.T. feature on each individual disk; that does produce a change in the report graph, but afterwards the data points stay the same again.
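
For anyone wanting to repeat the reset described above, here is a minimal sketch. It assumes the collectors run as systemd units named collectd and rrdcached (not confirmed in this thread), and the helper names `run` and `reset_rrd_data` are made up for illustration. By default it only prints the commands it would run.

```shell
# Print each command; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Stop the collectors, delete the .rrd files under the given directory,
# then start the services again. Unit names are assumptions.
reset_rrd_data() {
  rrd_dir=${1:-}
  [ -n "$rrd_dir" ] || { echo "usage: reset_rrd_data DIR" >&2; return 2; }
  run systemctl stop collectd rrdcached
  run find "$rrd_dir" -type f -name '*.rrd' -delete
  run systemctl start rrdcached collectd
}

# Usage: reset_rrd_data /path/to/the/rrd/localhost/directory
```

Running it with the default dry-run first makes it easy to check the directory before anything is deleted.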

This is what I get from running smartctl on all disks:

Code:
root@truenas[~]# for x in {a..g} ; do echo sb$x ; smartctl -a /dev/sd$x | grep -i temperature ; done
sba
194 Temperature_Celsius     0x0002   166   166   000    Old_age   Always       -       39 (Min/Max 19/48)
sbb
194 Temperature_Celsius     0x0002   151   151   000    Old_age   Always       -       43 (Min/Max 19/48)
sbc
194 Temperature_Celsius     0x0002   144   144   000    Old_age   Always       -       45 (Min/Max 19/51)
sbd
194 Temperature_Celsius     0x0002   147   147   000    Old_age   Always       -       44 (Min/Max 19/48)
sbe
190 Airflow_Temperature_Cel 0x0032   066   051   000    Old_age   Always       -       34
sbf
194 Temperature_Celsius     0x0000   100   100   000    Old_age   Offline      -       40 (Min/Max 14/60)
sbg
190 Airflow_Temperature_Cel 0x0032   067   052   000    Old_age   Always       -       33
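
To compare readings over time, it can help to reduce each attribute table to a single number. A minimal sketch, assuming the temperature is reported under attribute 194 or 190 and that the raw value is the tenth column, as in the output above. The function name `disk_temp` is made up for this example.

```shell
# Print the raw temperature from `smartctl -A` output read on stdin.
# Assumes attribute 194 or 190 carries the temperature and that the raw
# value is the tenth whitespace-separated field, as in the output above.
# If a drive reports both attributes, only the first match is printed.
disk_temp() {
  awk '$1 == 194 || $1 == 190 { print $10; exit }'
}

# Usage (as root): smartctl -A /dev/sda | disk_temp
```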



And this is what the disk temperatures look like in the web interface:

sda.png

sdb.png

sdc.png

sdd.png

sde.png

sdf.png

sdg.png


Looking at the disktemp-sd* directories under /var/db/system/rrd-randomchars/localhost, I can see that the rrd files are being created from time to time, but the data points are always the same.
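
One way to check whether the stored series itself is flat (rather than just the web UI) is to dump a file with `rrdtool fetch` and count the distinct values. A rough sketch: the helper name `count_distinct` and the file path are placeholders, it assumes one data source per file, and values still held by rrdcached may not appear until they are flushed to disk.

```shell
# Count distinct non-NaN values in `rrdtool fetch` output read on stdin.
# Data rows look like "1684070400: 3.9000000000e+01"; header and blank
# lines have no ": " separator and are skipped.
count_distinct() {
  awk -F': ' 'NF == 2 && $2 !~ /nan/ { seen[$2] = 1 }
              END { n = 0; for (v in seen) n++; print n }'
}

# Usage (the path is a placeholder, adjust it to a real .rrd file):
#   rrdtool fetch /path/to/file.rrd AVERAGE --start -1d | count_distinct
```

A result of 1 over a full day of data would mean every stored point is identical, which points at collection rather than at the graphs.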


Is this a known issue? Is there anything that can be done to fix this and get the correct updated temp readings in the web interface?


Thanks!
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
As I recall, the values shown cover only the period being displayed, not the whole time since power-on. Click on the -zoom button a few times if you have some runtime on TrueNAS. But it appears to be working fine from your posting. If you would like a daily report showing all this data, check out the link in my signature.
 
Joined
May 14, 2023
Messages
8
But that would show differences in the temp data, assuming there were changes during the reporting period, wouldn't it? Zooming in or out shows no changes in the reported data.

For example, I put the system under load for an hour to make the fans spin faster. smartctl shows a significant temperature drop, and yet the reports still show the same disk temps:
Code:
root@truenas[~]# for x in {a..g} ; do echo sb$x ; smartctl -a /dev/sd$x | grep -i temperature ; done
sba
194 Temperature_Celsius     0x0002   185   185   000    Old_age   Always       -       35 (Min/Max 19/48)
sbb
194 Temperature_Celsius     0x0002   171   171   000    Old_age   Always       -       38 (Min/Max 19/48)
sbc
194 Temperature_Celsius     0x0002   166   166   000    Old_age   Always       -       39 (Min/Max 19/51)
sbd
194 Temperature_Celsius     0x0002   166   166   000    Old_age   Always       -       39 (Min/Max 19/48)
sbe
190 Airflow_Temperature_Cel 0x0032   067   051   000    Old_age   Always       -       33
sbf
194 Temperature_Celsius     0x0000   100   100   000    Old_age   Offline      -       40 (Min/Max 14/60)
sbg
190 Airflow_Temperature_Cel 0x0032   068   052   000    Old_age   Always       -       32


The system has been running for 3 days, and prior to the cleanup of the rrd data there was no change in the reported disk temperatures. All data points show the same temperature.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
So your SMART data does show the variation as you have said. The charts you have shown are for a 1 hour period of time. You need to change that to Days for example, see my screen shot attached. Click on the -zoom button a few times. You should be able to display several days of data in the window.
Screenshot 2023-05-14 094454.jpg
 
Joined
May 14, 2023
Messages
8
I did what you suggested (my rrd graphs were reset a few hours ago using the above procedure), but there is still no change: the temperature shows a constant line regardless of the period selected (months, weeks, days, hours).

sda.png

sda_week.png
 

awasb

Patron
Joined
Jan 11, 2021
Messages
415
A bit on the cool side, isn't it?

The best average from the famous google paper was from 37 to 43° C. :wink:
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
A bit on the cool side, isn't it?

The best average from the famous google paper was from 37 to 43° C. :wink:
Thanks for that reference. I've never read that before, good info.

I suspect or hope that the OP has a cool room to run the computer equipment in, I've been in cold rooms like that. The nominal temperature looks good.

I did what you suggested (my rdd graphs were reset a few hours ago using the above procedure), but still no change, temperature shows a constant line regardless of the period selected (months, weeks, days, hours).

View attachment 66621
View attachment 66622
If your data was reset, then you need to collect more data. Give it a few days then look again at the chart. Something else you can do to validate the data is to power off the NAS, wait 20 seconds, power the NAS back on. Do not wait for the drives to cool down or you will spoil the experiment. This will reset the Drive Min/Max temps and you can then watch to see how they track the temp changes. I think this will prove if the drive temp data is working correctly. But you need to give it several days to track if that is what you desire. You can also start a SCRUB to generate a higher drive temp, but it may not go up much if your drives are constantly spinning.
 
Joined
May 14, 2023
Messages
8
A bit on the cool side, isn't it?

The best average from the famous google paper was from 37 to 43° C. :wink:
This is due to the disk temp data points not being updated, my averages are warmer:

Code:
sda
194 Temperature_Celsius     0x0002   166   166   000    Old_age   Always       -       39 (Min/Max 19/48)
sdb
194 Temperature_Celsius     0x0002   154   154   000    Old_age   Always       -       42 (Min/Max 19/48)
sdc
194 Temperature_Celsius     0x0002   144   144   000    Old_age   Always       -       45 (Min/Max 19/51)
sdd
194 Temperature_Celsius     0x0002   147   147   000    Old_age   Always       -       44 (Min/Max 19/48)
sde
190 Airflow_Temperature_Cel 0x0032   067   052   000    Old_age   Always       -       33
sdf
190 Airflow_Temperature_Cel 0x0032   066   051   000    Old_age   Always       -       34
sdg
194 Temperature_Celsius     0x0000   100   100   000    Old_age   Offline      -       40 (Min/Max 14/60)


I don't live in a fridge or a DC, but the outside temperatures aren't particularly warm for this time of year, so that also helps (and the two Noctua fans I have pointing at the disks) :)
Thanks for that reference. I've never read that before, good info.

I suspect or hope that the OP has a cool room to run the computer equipment in, I've been in cold rooms like that. The nominal temperature looks good.


If your data was reset, then you need to collect more data. Give it a few days then look again at the chart. Something else you can do to validate the data is to power off the NAS, wait 20 seconds, power the NAS back on. Do not wait for the drives to cool down or you will spoil the experiment. This will reset the Drive Min/Max temps and you can then watch to see how they track the temp changes. I think this will prove if the drive temp data is working correctly. But you need to give it several days to track if that is what you desire. You can also start a SCRUB to generate a higher drive temp, but it may not go up much if your drives are constantly spinning.


I did what you suggested and I see the temps have moved a bit. I'm going to wait a few more days to see if the temps change, and I'll keep this thread updated.

Thanks!
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
A SCRUB will cause a heat spike, so you can run a SCRUB and the drive temps should go up during the SCRUB duration and then return to normal after the SCRUB. The two fans blowing air across the drives is a good thing. The thing to know here is in most situations, it only needs to be a little bit of air flow, not high speed fans. Low speed fans can dissipate most of the drive heat. Why do I mention it? In case you have a system that runs a little loud, there is an option later when the fans need to be replaced to use quiet fans that run slower.

Keep us updated.
 
Joined
May 14, 2023
Messages
8
It's been a week, and the data point on the temperature graphs hasn't moved an inch, even though the temperatures have fluctuated throughout the week:

sda.png

sdb.png

sdc.png

sdd.png

sde.png

sdf.png

sdg.png


I think it is safe to assume that there is something wrong with the way the data points are handled, because smartctl reports different numbers:

Code:
for x in {a..g} ; do echo sd$x ; smartctl -a /dev/sd$x | grep -i temperature ; done
sda
194 Temperature_Celsius     0x0002   175   175   000    Old_age   Always       -       37 (Min/Max 19/48)
sdb
194 Temperature_Celsius     0x0002   166   166   000    Old_age   Always       -       39 (Min/Max 19/48)
sdc
194 Temperature_Celsius     0x0002   158   158   000    Old_age   Always       -       41 (Min/Max 19/51)
sdd
194 Temperature_Celsius     0x0002   158   158   000    Old_age   Always       -       41 (Min/Max 19/48)
sde
190 Airflow_Temperature_Cel 0x0032   068   052   000    Old_age   Always       -       32
sdf
190 Airflow_Temperature_Cel 0x0032   067   051   000    Old_age   Always       -       33
sdg
194 Temperature_Celsius     0x0000   100   100   000    Old_age   Offline      -       40 (Min/Max 14/60)


Does anyone have any idea what can be done, if anything, to correct this?
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Did you clear the browser caches?
Also iirc there is an option to restart the service in the settings.
 
Joined
May 14, 2023
Messages
8
I did that, tried it in several browsers, same result.
Restarting rrdcached and collectd didn't change anything either.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Joined
May 14, 2023
Messages
8
I haven't filed a bug yet because I don't have a registered account (a little reluctant to do so, to be honest).
Anyway, I noticed that I'm able to move the chart datapoints and synchronize them with the real temps whenever S.M.A.R.T. is disabled and re-enabled on each disk, or the S.M.A.R.T. service is restarted in the system services.

Unfortunately, this only updates the 5-minute window in which the service was restarted; after that the datapoints stop moving again and show the same temps all the time.


sda.png

sdb.png

sdc.png

sdd.png

sde.png

sdf.png

sdg.png



Does anyone have any idea what might be causing this?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
That is definitely some kind of bug. Odds are no one has even noticed it up until now. Scale is still not that mature so little bugs like this are bound to exist. It would be nice to hear from someone else using Scale to report if they also have seen this issue.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
As a temporary workaround you can restart the service every hour by setting up a cronjob.
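
A sketch of what such an hourly job might run. The restart command used here (`midclt call service.restart smartd`) is an assumption about how the S.M.A.R.T. service can be restarted from the shell on this release, so substitute whatever actually works on your system. The function name and the script path in the crontab comment are hypothetical.

```shell
# Restart the S.M.A.R.T. service and print a timestamped status line.
# RESTART_CMD is an assumption; override it if your system differs.
RESTART_CMD=${RESTART_CMD:-"midclt call service.restart smartd"}

restart_smart() {
  if $RESTART_CMD >/dev/null 2>&1; then
    echo "$(date '+%Y-%m-%d %H:%M:%S') S.M.A.R.T. service restarted"
  else
    echo "$(date '+%Y-%m-%d %H:%M:%S') S.M.A.R.T. service restart failed" >&2
    return 1
  fi
}

# Example crontab entry (hourly, on the hour) for a script that defines
# and calls restart_smart, saved at a hypothetical path:
#   0 * * * * /root/restart_smart.sh
```

Keeping the command in a variable makes it easy to swap in a different restart method without touching the schedule.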
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
I have the same issue, did anyone find a way to resolve it?
Either switching to CORE or periodically restarting the service.

If no one files a bug report, the chances of it getting fixed on its own are low.
 