SOLVED HDD Entering Standby When It's Not Supposed To

Status
Not open for further replies.

storage-junkie

Dabbler
Joined
Jan 17, 2018
Messages
44
Hey folks. Last night I was experimenting with disk power management features and set HDD Standby and APM in the View Disks menu. I ended up setting everything back to the defaults, but my disks still appear to be entering standby (at least according to the Load Cycle Count reported by smartctl).

This is FreeNAS 11.1, running on an ASRock 2550 board, with two WD Reds in a mirror and a couple of flash drives for boot.

Here are my current settings in the Storage tab:

Screen Shot 2018-01-26 at 9.04.21 AM.png


Here's the info on my drives:

Screen Shot 2018-01-26 at 9.05.22 AM.png


And here are the stats on them:

Screen Shot 2018-01-26 at 9.04.37 AM.png


Screen Shot 2018-01-26 at 9.04.56 AM.png


Before I went to bed last night, the LCCs on the two drives were around 335 and 332, respectively.

I rebooted after setting everything back to the defaults, but the settings don't appear to be sticking. To be fair, I'm not sure how fast the LCC was incrementing before I made my changes; unfortunately, I wasn't paying attention to it before I started messing around with the power management settings.

Any advice?
 

BigDave

FreeNAS Enthusiast
Joined
Oct 6, 2013
Messages
2,479

storage-junkie

I'm starting to think that too :-/

This morning, I've tried the following:
  • Setting the timeout via the CLI: ataidle -I 300 /dev/ada0
  • Changing the HDD Standby value in the GUI, saving, then setting it back to Always On
  • Moving the system dataset to the boot pool
  • Shutting down and restarting
The count keeps ticking up after each of those changes.

Thing is, these drives were manufactured in Oct 2017 and have firmware 82.00A82. All the threads I've seen on here about tweaking the drive parameters concern drives from around 2013-2014 with earlier firmware versions.
 

BigDave

My 4TB models have the same firmware, and their heads park only once per hour, so I have no need to change the timer. I have not done any math on your data, but as fast as the LCC seems to be climbing, your drives appear to be "set" to 8 seconds :eek:
Code:
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-deleted
LU WWN Device Id: 5 0014ee 20d3b4ae0
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Jan 26 10:21:43 2018 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

 

storage-junkie

I've been watching this since yesterday, and the thing I don't understand is that every time I check, the disks appear to be spinning:

Code:
root@freenas:~ # camcontrol cmd ada0 -a "E5 00 00 00 00 00 00 00 00 00 00 00" -r -
50 00 00 00 00 00 00 00 00 FF 00
root@freenas:~ # camcontrol cmd ada1 -a "E5 00 00 00 00 00 00 00 00 00 00 00" -r -
50 00 00 00 00 00 00 00 00 FF 00



From what I've read in other posts on this forum, the FF in the second-to-last position indicates that the drives are spinning; 00 in that position would indicate standby.
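
For anyone who wants to check this without eyeballing hex, here is a minimal Python sketch that decodes one line of that camcontrol output. The function name and the lookup table are illustrative assumptions, not part of any FreeNAS tool. The FF and 00 meanings come from the posts mentioned above; the 0x80 ("idle") entry is my reading of the ATA CHECK POWER MODE definition and should be treated as an assumption.

```python
# Sector-count values returned by ATA CHECK POWER MODE (command 0xE5).
# 0xFF and 0x00 are the values discussed in this thread; 0x80 is an
# assumption taken from my reading of the ATA specification.
POWER_MODES = {
    0x00: "standby",
    0x80: "idle",
    0xFF: "active or idle",
}


def decode_check_power_mode(line: str) -> str:
    """Decode the register dump printed by `camcontrol cmd ... -r -`."""
    fields = [int(token, 16) for token in line.split()]
    if len(fields) != 11:
        raise ValueError(f"expected 11 register bytes, got {len(fields)}")
    sector_count = fields[9]  # second-to-last byte, as described above
    return POWER_MODES.get(sector_count, f"unknown (0x{sector_count:02X})")


print(decode_check_power_mode("50 00 00 00 00 00 00 00 00 FF 00"))
```

Note that "active or idle" only says the platters are spinning. As far as I can tell, it says nothing about whether the heads are parked.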
 

storage-junkie

Here are the latest smart values from this morning:

ada0:

Code:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   253   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   175   172   021    Pre-fail  Always       -       4225
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       15
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       205
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       15
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       1
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       1032
194 Temperature_Celsius     0x0022   118   112   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0


ada1:

Code:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   253   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   174   173   021    Pre-fail  Always       -       4266
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       12
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       205
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       12
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       0
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       1030
194 Temperature_Celsius     0x0022   117   111   000    Old_age   Always       -       30
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0
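
To track these counters over time rather than comparing pasted tables by eye, a small Python sketch along these lines could pull the raw values out of the attribute table. The function name `parse_smart_raw_values` is made up for illustration, and the sketch assumes the ten-column layout shown above with a plain integer in the RAW_VALUE column. The sample rows are copied from the ada0 table.

```python
def parse_smart_raw_values(table: str) -> dict:
    """Map ATTRIBUTE_NAME -> integer RAW_VALUE for rows of a SMART attribute table."""
    values = {}
    for line in table.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row starts with "ID#" and is skipped by the isdigit() check.
        if len(parts) >= 10 and parts[0].isdigit() and parts[9].isdigit():
            values[parts[1]] = int(parts[9])
    return values


# Rows copied from the ada0 table above.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       205
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       1032
"""

raw = parse_smart_raw_values(SAMPLE)
print(raw["Power_On_Hours"], raw["Load_Cycle_Count"])
```

Feeding it the text of two readings taken some hours apart would show how fast Load_Cycle_Count is actually climbing.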
 

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
Please read the thread BigDave linked to. Head parking has nothing to do with power management. It happens even in fully powered drives whenever no read or write operation occurs for a certain time, possibly 8 seconds with your drives.

Ataidle seems to be a utility for setting power management. Wdidle3 (or something like that; see the thread) is a WD utility that sets the head parking time, and it is what you want. It isn't part of FreeNAS, so you will either have to reboot your FreeNAS machine into DOS and run the utility there, or possibly put the drives in another machine temporarily.
 

storage-junkie

Should I just not worry about this, then? It seems like I'm going out of my way to adjust a value that the manufacturer has intentionally set to work this way.

Furthermore, how would I even reboot my FreeNAS machine into DOS? I assume I'd need some kind of WinPE image loaded with the wdidle3 utility, which I suppose I'd have to create manually. (I've already tried attaching the drive to a Windows machine, and it didn't recognize the drive, of course.)

And there doesn't seem to be a version of the wdidle3 utility I can run natively from within FreeNAS. There is the open source idle3-tools project, but it requires building the executable, and make doesn't work on my FreeNAS box.

It just seems like a lot of work to go to, to change something that WD has obviously configured that way for a reason.
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925

storage-junkie

Thanks @Redcoat - in the forum thread above, I saw someone mention that Ultimate Boot CD comes with the utility pre-loaded. So I just ran it, and the utility reported back the following:

Capture.PNG
 

rogerh

This is probably satisfactory. If the load cycle count has not suddenly started going up but has been climbing steadily over the life of your drives, then it represents only about 5 parking episodes per hour. WD suggests it should be kept below 300,000 (though some users on the thread mentioned seem to have figures in the millions with no failures), and by my (fallible) arithmetic it should take about seven years for your drives to reach that value. My FreeNAS machine, which is constantly receiving data, has reached only about 400 load cycles after 21,000 hours. Your machine seems to spend more of its time idle for five minutes or longer. Perhaps you are not storing the system dataset on the relevant pool. But I don't think you have a problem.
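
The arithmetic can be sketched in a few lines of Python, using the ada0 figures posted earlier (1032 load cycles in 205 power-on hours) and the 300,000 figure quoted above. The helper name `years_to_reach` is made up for illustration, and the projection assumes the average rate so far simply continues, which may well not hold.

```python
HOURS_PER_YEAR = 24 * 365.25


def years_to_reach(limit: int, load_cycles: int, power_on_hours: int) -> float:
    """Years of continuous operation until load_cycles reaches limit,
    assuming the average rate observed so far stays constant."""
    rate_per_hour = load_cycles / power_on_hours
    remaining_hours = (limit - load_cycles) / rate_per_hour
    return remaining_hours / HOURS_PER_YEAR


# ada0 values from the SMART output posted earlier in the thread.
rate = 1032 / 205
print(f"{rate:.1f} load cycles per hour")
print(f"{years_to_reach(300_000, 1032, 205):.1f} years to reach 300,000")
```

That comes out at roughly 5 cycles per hour and a little under seven years, assuming the machine runs around the clock.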
 

storage-junkie

So I may have arrived at a solution for this. I again moved the system dataset to the freenas-boot pool (a pair of mirrored 32 GB SanDisk thumb drives). Doing that caused the LCC on my storage pool drives to stop climbing. What seemed to be happening is that the heads would park after 5 minutes, per the drive firmware, and then every 15-20 minutes or so FreeNAS would write a few kilobytes to the system dataset, which would cause them to un-park.
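
As a rough model of that behaviour (a sketch under simplifying assumptions, not a description of what the firmware really does): if writes arrive at a fixed interval that is longer than the park timeout, every write finds the heads parked and costs one load cycle. The function name and the fixed-interval assumption are mine.

```python
def load_cycles_per_hour(park_timeout_s: float, write_interval_s: float) -> float:
    """Estimate load cycles per hour for evenly spaced writes.

    Assumes the heads park after park_timeout_s of inactivity and that
    each write arriving while parked causes exactly one load cycle.
    """
    if write_interval_s <= park_timeout_s:
        return 0.0  # writes arrive before the timer expires, so the heads never park
    return 3600.0 / write_interval_s


# 5-minute park timer, with writes every 15 and every 20 minutes.
print(load_cycles_per_hour(300, 15 * 60))
print(load_cycles_per_hour(300, 20 * 60))
```

This gives 3 to 4 cycles per hour, in the same ballpark as the roughly 5 per hour implied by the SMART data, so other occasional disk activity presumably accounted for the rest.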

I'm not running any jails or anything special on this box - the only services I have running are SMB, SMART, and SSH. I have read that you can wear out thumb drives by keeping the system dataset on them, so I'm going to monitor the disk activity on them for a few days to get a sense of how much data is being written. I may just end up getting a dedicated SSD for that, and also configuring the syslog and reporting data to be saved there as well.

This server will be an offsite standby for me. Its only purpose is to be a replication target for my primary server. Since it's going to be idle 99% of the time, I set the storage pool drives to spin down after 30 minutes of inactivity. I did that through the HDD Standby setting in the GUI, and it seems to work: I heard the drives spin down and verified they were stopped using the camcontrol command. I then let the server sit for about 5 hours without being accessed and verified that the drives were still spun down and that the LCCs hadn't moved.

The title of this thread was a bit misleading: I thought the climbing LCC was because the drives were going in and out of standby, when in fact it was just the heads parking and un-parking. Thanks @rogerh for the clarification on that point, and @BigDave for the link to the other thread, which had some great information on this. I'm also linking to the Ultimate Boot CD in case others stumble across this thread, because it has a ton of useful stuff on it, including the wdidle3 utility.
 

Redcoat

I see an SSD in your future.
 