Unsure of SATA drive spindown

Status
Not open for further replies.

Milhouse

Guru
Joined
Jun 1, 2011
Messages
564
I may be missing the obvious, but that output looks OK to me - there is no activity in the "music" pool so the script has stopped (or attempted to stop) the disks associated with the music pool. The "tank" pool continues to report activity, so no attempt is made to stop those disks.

I can't explain your power consumption observations, but certainly the script *seems* to be working as expected and as such I would expect you to see a reduction in power consumption once the two disks have been spun down. Can you confirm if you are able to hear the da5/da6 disks spinning down (and then spinning back up when you access them again)? If the disks aren't actually spinning down, then that would suggest a problem with camcontrol.

At the command line, try running the following commands - assuming there are no errors you should hear the disks spin down:
Code:
camcontrol stop da5
camcontrol stop da6


If you then access the pool associated with those disks, you should be able to hear them spinning up again.
 

arryo

Dabbler
Joined
May 5, 2012
Messages
42
So I monitored closely today and here is what I found:

- If the two pools are active at the same time, they both spin down at the same time (45W).
- After both pools have spun down, I access one pool (music), which wakes up that pool; the other pool is still sleeping. I can hear 2 disks spinning up (56W).
- After 30 minutes the log said the pool stopped successfully, but the power reading is unchanged. I then accessed that pool (I didn't hear any spinning up, just the sound of disks reading), and power stayed the same.

Code:
Jun  5 08:34:47 freenas sasidle[1553]: ** Stopping devices in pool "music" **
Jun  5 08:34:47 freenas sasidle[1553]: camcontrol stop da1
Jun  5 08:34:47 freenas sasidle[1553]: camcontrol stop da2
Jun  5 08:34:49 freenas sasidle[1579]: 000: tank        5.03T  5.84T      0      0      0      0
Jun  5 08:34:56 freenas sasidle[1553]: Unit stopped successfully
Jun  5 08:34:56 freenas sasidle[1553]: Unit stopped successfully
Jun  5 08:35:47 freenas sasidle[1553]: 000: music       10.5G  2.71T      0      0      0      0
Jun  5 08:35:49 freenas sasidle[1579]: 000: tank        5.03T  5.84T      0      0      0      0
Jun  5 08:36:47 freenas sasidle[1553]: 000: music       10.5G  2.71T      0      0      0      0
Jun  5 08:36:49 freenas sasidle[1579]: 000: tank        5.03T  5.84T      0      0      0      0
Jun  5 08:37:47 freenas sasidle[1553]: ** Restarting devices in pool "music" due to activity **
Jun  5 08:37:47 freenas sasidle[1553]: 030: music       10.5G  2.71T      0      1  87.8K  7.87K
Jun  5 08:37:49 freenas sasidle[1579]: 000: tank        5.03T  5.84T      0      0      0      0
Jun  5 08:38:47 freenas sasidle[1553]: 029: music       10.5G  2.71T      0      0      0      0
Jun  5 08:38:49 freenas sasidle[1579]: 000: tank        5.03T  5.84T      0      0      0      0


- I then issued two commands

Code:
camcontrol stop da1
camcontrol stop da2


But I just hear the disks reading and no spinning down, even though the log is still showing zero values.

- Now I access both pools at the same time, and power goes up (72W)

Code:
Jun  5 08:48:49 freenas sasidle[1579]: ** Restarting devices in pool "tank" due to activity **
Jun  5 08:48:49 freenas sasidle[1579]: 030: tank        5.03T  5.84T      0      2    136  17.5K
Jun  5 08:49:47 freenas sasidle[1553]: 030: music       10.5G  2.71T      0      1      0  7.87K
Jun  5 08:49:49 freenas sasidle[1579]: 030: tank        5.03T  5.84T      0      1      0  6.33K


- After 30 min, both pools are correctly spun down (power draw 45W)

Code:
Jun  5 09:19:49 freenas sasidle[1579]: ** Stopping devices in pool "tank" **
Jun  5 09:19:49 freenas sasidle[1579]: camcontrol stop da3
Jun  5 09:19:49 freenas sasidle[1579]: camcontrol stop da4
Jun  5 09:19:49 freenas sasidle[1579]: camcontrol stop da5
Jun  5 09:19:49 freenas sasidle[1579]: camcontrol stop da6
Jun  5 09:19:50 freenas sasidle[1579]: Unit stopped successfully
Jun  5 09:19:50 freenas last message repeated 3 times
Jun  5 09:20:47 freenas sasidle[1553]: 000: music       10.5G  2.71T      0      0      0      0
Jun  5 09:20:47 freenas sasidle[1553]: ** Stopping devices in pool "music" **
Jun  5 09:20:47 freenas sasidle[1553]: camcontrol stop da1
Jun  5 09:20:47 freenas sasidle[1553]: camcontrol stop da2
Jun  5 09:20:48 freenas sasidle[1553]: Unit stopped successfully
Jun  5 09:20:48 freenas sasidle[1553]: Unit stopped successfully
Jun  5 09:20:49 freenas sasidle[1579]: 000: tank        5.03T  5.84T      0      0      0      0
Jun  5 09:21:47 freenas sasidle[1553]: 000: music       10.5G  2.71T      0      0      0      0
Jun  5 09:21:49 freenas sasidle[1579]: 000: tank        5.03T  5.84T      0      0      0      0



So this makes me think there's some sort of synchronization problem with spinning down disks on the same HBA: it seems it can only spin them down all at the same time.
 

Milhouse

Guru
Joined
Jun 1, 2011
Messages
564
Doesn't really make much sense, as each disk should be able to spin down independently of the others, no matter which pool they belong to (the HBA obviously has no concept of ZFS pools and should be treating each disk as a completely independent device). Maybe there is a bug in the HBA firmware? One thing you could try is to add the --sync parameter, which will cause disks to be stopped synchronously, one after the other, rather than all at once (although if the disks are not stopping when you run the camcontrol command manually then it's unlikely to have any effect).

I'm assuming your HBA is not operating in some sort of RAID mode? My LSI 9211-8i with IT (Initiator Target, and not Integrated RAID, IR) firmware will happily spin down disks individually or in combinations.
 

arryo

Dabbler
Joined
May 5, 2012
Messages
42
Yeah, I'm not sure what's going on with my system. Right after my earlier post, I accessed the tank pool (music pool still sleeping), and 30 min later the tank pool spun down correctly. So there must be something wrong with my music pool, which is a mirror. Let me replace this pool with other hard drives and check again.
 

arryo

Dabbler
Joined
May 5, 2012
Messages
42
Actually I'm not sure how spindown is working. It seems that the LSI 9211-8i in IT mode with the latest firmware (P13) also supports spindown on its own, separately from your spindown script. I just turned off your script to see how the card handles spindown, and it did spin down some of my hard drives. So I'm not sure how the hardware and software controls interact with each other.
 

Milhouse

Guru
Joined
Jun 1, 2011
Messages
564
I've never had the drives spin down automatically, though I'm on the pre-P13 firmware (and my N36L cannot boot into the LSI BIOS anyway, so configuring the spindown timeouts is a complete impossibility!)
 

arryo

Dabbler
Joined
May 5, 2012
Messages
42
Even in the LSI BIOS, there's no option to adjust the timeouts. But I know it did spin down some of my hard drives, since my power draw drops, though not all the way to the bottom.
 

arryo

Dabbler
Joined
May 5, 2012
Messages
42
I forgot to mention, I had this error once before and I had it again today, so I'm not sure whether this is what's causing my HDDs' strange spindown behavior with your sasidled script.

3CEF_4FCF8A0F.jpg
 

Milhouse

Guru
Joined
Jun 1, 2011
Messages
564
That's not an error I've seen before on my setup (N36L, 9211-8i, 4x2TB Samsung HD204UI, 4x500GB Samsung HM500JI). Presumably the drives were spun down when you got that error?
 

arryo

Dabbler
Joined
May 5, 2012
Messages
42
Yes, all the drives are spun down, and right after that my box freezes; I can't access SSH or the web GUI.
 

arryo

Dabbler
Joined
May 5, 2012
Messages
42
I have made a new FreeNAS USB and used your updated script. It's all working well so far.
 

Milhouse

Guru
Joined
Jun 1, 2011
Messages
564
Many thanks for the positive feedback, just glad it can help some people! :)
 

arryo

Dabbler
Joined
May 5, 2012
Messages
42
I'm running into another problem. I've started adding 4 more HDDs to my system, making a total of 12: 6 on the HBA card and 6 on onboard SATA. When I run the script, it sees the pools on both the SATA controller and the HBA. I understand that this script is for HDDs on an HBA, so I just want to check whether there is any way to make the script exclude the pools attached to the SATA controller. For now, I tried the arg "--devices '/dev/da[1-6]'", but it doesn't seem to work at all. In /var/log/messages, it just shows the usage of sasidle.

my /conf/base/etc/rc.conf :

Code:
sasidled_enable=YES
sasidled_cmdpath=/root
sasidled_args="--devices '/dev/da[1-6]'"



Code:
[root@freenas] ~# zpool status
  pool: music
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        music       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            ada1p1  ONLINE       0     0     0
            ada0p1  ONLINE       0     0     0

errors: No known data errors

  pool: storage
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz2    ONLINE       0     0     0
            da3p2   ONLINE       0     0     0
            da2p2   ONLINE       0     0     0
            da4p2   ONLINE       0     0     0
            da1p2   ONLINE       0     0     0
            da5p2   ONLINE       0     0     0
            da6p2   ONLINE       0     0     0

errors: No known data errors 


Note: you see only 2 pools since I haven't imported the 4 new HDDs yet.

Code:
<ATA ST3000DM001-9YN1 CC9E>        at scbus3 target 0 lun 0 (da1,pass1)
<ATA ST3000DM001-9YN1 CC4C>        at scbus3 target 1 lun 0 (da2,pass2)
<ATA ST3000DM001-9YN1 CC9D>        at scbus3 target 2 lun 0 (da3,pass3)
<ATA ST3000DM001-9YN1 CC9D>        at scbus3 target 3 lun 0 (da4,pass4)
<ATA ST33000651AS CC45>            at scbus3 target 4 lun 0 (da5,pass5)
<ATA ST3000DM001-9YN1 CC9E>        at scbus3 target 5 lun 0 (da6,pass6)
<WDC WD30EZRX-00MMMB0 80.00A80>    at scbus4 target 0 lun 0 (ada0,pass7)
<WDC WD30EZRX-00MMMB0 80.00A80>    at scbus5 target 0 lun 0 (ada1,pass8)
<ST32000542AS CC34>                at scbus6 target 0 lun 0 (ada2,pass9)
<ST32000542AS CC94>                at scbus7 target 0 lun 0 (ada3,pass10)
<ST32000542AS CC34>                at scbus8 target 0 lun 0 (ada4,pass11)
<ST32000542AS CC34>                at scbus9 target 0 lun 0 (ada5,pass12)

 

Milhouse

Guru
Joined
Jun 1, 2011
Messages
564
OK, it does appear that there was a small regression in sasidle when parsing --devices when specified without pool identifiers (ie. just a device list, as you are specifying) though it wasn't the cause of your problem. The regression meant that it was ignoring the devices being specified and always retrieving from the config database.

The problem you are having is that the devices argument (the one you are specifying) is being expanded as it is passed into the background job (so that "/dev/da[0-1]" becomes "/dev/da0" "/dev/da1"), and the job is interpreting "/dev/da1" as an additional and unrecognised parameter (I've added extra log output which should make this a little clearer in future).
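The expansion problem can be reproduced in isolation. What follows is a minimal sketch, not sasidle's actual code: the helper `count_args` and the temporary files standing in for /dev/da0 and /dev/da1 are hypothetical, made up purely for illustration.

```shell
#!/bin/sh
# Illustration only: count_args and the temp files are hypothetical stand-ins.
count_args() { echo "$#"; }

tmp=$(mktemp -d)
touch "$tmp/da0" "$tmp/da1"      # stand-ins for /dev/da0 and /dev/da1
pattern="$tmp/da[0-1]"

# Unquoted: the shell expands the pattern before the call, so the callee
# receives "--devices", ".../da0" and ".../da1" (3 arguments).
unquoted=$(count_args --devices $pattern)

# Quoted: the pattern is passed through untouched as one argument
# (2 arguments in total).
quoted=$(count_args --devices "$pattern")

echo "unquoted: $unquoted arguments"
echo "quoted:   $quoted arguments"
rm -rf "$tmp"
```

In the unquoted case the second device path arrives as a separate word, which is how a job can end up treating it as an extra, unrecognised parameter.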

I've now updated the script in post #79 with a fix for the aforementioned regression and this new argument expansion issue, and it should now work for you.

One observation I would make about how you are running sasidle, is that with your two pools, you don't want to specify devices that are not specific to a pool, as you will get two sasidle monitoring jobs - one for each pool - but with both jobs stopping the same disks (ie. both the music and storage jobs stopping disks /dev/da[1-6]).

You need to add "--pool storage" to sasidled_args (add it as the first argument - this is necessary due to the way I hacked around the argument expansion) so that you only monitor that pool, and not music. It's only necessary to include this argument when you have multiple pools but want to stop a subset of disks, since by default the daemon will try to monitor all available pools - by overriding the device detection, you may end up monitoring one pool and stopping disks that belong to another!

Once you are ready to monitor all of your pools, you can remove the "--pool" argument, as this will then be determined automatically, but you will then need to make your devices pool-specific (ie. --devices 'storage:/dev/da[1-6] music:/dev/ada[0-5]').

Essentially, if you're going to specify a subset of disks with --devices rather than allow everything to be handled automatically (assuming that works!), then you also need to restrict (ie. only monitor) the pools to which those disks correspond... The following sasidled_args should work for you with the amended code in post #79, monitoring just your storage pool:

Code:
sasidled_args="--pool storage --devices '/dev/da[1-6]'"


Once you are monitoring both your pools, you should change to something like this:

Code:
sasidled_args="--devices 'storage:/dev/da[1-6] music:/dev/ada[0-5]'"


If you were to add any additional pools, they will automatically have their devices determined from the config database (which may or may not be correct - fingers crossed it is!)
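For anyone wondering how a pool-specific device list like the one above might be split up, here is a rough sketch. It assumes the format is simply space-separated pool:pattern entries; `devices_for_pool` is a hypothetical helper written for this example and is not the parsing code used in sasidle.

```shell
#!/bin/sh
# Sketch only: devices_for_pool is a hypothetical helper, not sasidle code.
# Usage: devices_for_pool SPEC POOL
# Prints the device pattern(s) that SPEC assigns to POOL, one per line.
devices_for_pool() {
    spec=$1
    pool=$2
    set -f                       # keep [1-6] etc. literal while splitting
    for entry in $spec; do
        case $entry in
            "$pool":*) printf '%s\n' "${entry#*:}" ;;   # strip "pool:" prefix
        esac
    done
    set +f
}

spec='storage:/dev/da[1-6] music:/dev/ada[0-5]'
devices_for_pool "$spec" storage   # prints /dev/da[1-6]
devices_for_pool "$spec" music     # prints /dev/ada[0-5]
```

The patterns are deliberately kept unexpanded here, so whatever consumes them can decide when to match them against real device nodes.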
 

arryo

Dabbler
Joined
May 5, 2012
Messages
42
I just tried your latest version and somehow sasidle is not running at all. Here's the warning message:

Code:
[root@freenas] ~# /etc/local/rc.d/sasidled status
sasidled is not running.
[root@freenas] ~# /etc/local/rc.d/sasidled start
/etc/local/rc.d/sasidled: WARNING: run_rc_command: cannot run /root/sasidle
 

Milhouse

Guru
Joined
Jun 1, 2011
Messages
564
Is sasidle in /root? Has it got execute permission (probably needs chmod +x /root/sasidle)? What happens if you type "/root/sasidle --help"?
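Those two checks can be rolled into a few lines of sh. This is just a sketch: `check_cmd` is a hypothetical helper name, not something provided by sasidle or the rc.d script.

```shell
#!/bin/sh
# Sketch only: check_cmd is a hypothetical helper, not part of sasidle.
# Reports whether a script exists and has execute permission.
check_cmd() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ ! -x "$1" ]; then
        echo "not executable"    # fix with: chmod +x <path>
    else
        echo "ok"
    fi
}

check_cmd /root/sasidle
```

Anything other than "ok" would be consistent with the "cannot run /root/sasidle" warning from run_rc_command.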
 

arryo

Dabbler
Joined
May 5, 2012
Messages
42
Oops, I missed one important step. I used the arg "--pool storage --devices '/dev/da[1-6]'" with your latest script and it seems to be running well now:

Code:
Jun 17 21:09:46 freenas sasidle[1408]: /root/sasidle starting
Jun 17 21:09:46 freenas sasidle[1408]: ---------------------------------------------------------
Jun 17 21:09:46 freenas sasidle[1408]: Monitored Pool:    storage
Jun 17 21:09:46 freenas sasidle[1408]: Monitored Devices: da1 da2 da3 da4 da5 da6
Jun 17 21:09:46 freenas sasidle[1408]: Polling Interval:  60 seconds
Jun 17 21:09:46 freenas sasidle[1408]: Idle Timeout:      30 * 60 seconds
Jun 17 21:09:46 freenas sasidle[1408]: ASync Enabled:     Yes
Jun 17 21:09:46 freenas sasidle[1408]: Simulated Stop:    No
Jun 17 21:09:46 freenas sasidle[1408]: Log Disk Start:    Yes
Jun 17 21:09:46 freenas sasidle[1408]: ---------------------------------------------------------



And one more thing: it's not related to sasidle, but since I saw your ticket in the support section, I'll ask here. I have 1 pool whose serial shows as unknown in the web GUI. I tried export then import from both the web GUI and the CLI, but it's still the same. Not sure if it will cause any problems later.
 

Milhouse

Guru
Joined
Jun 1, 2011
Messages
564
Ticket #860? Well, exporting then re-importing the pool fixed that particular problem, but that was in 8.0.1 and should have been fixed a while back... although there are a number of disk serial # tickets for later releases, with suggestions of regressions. Which version of FreeNAS are you using exactly? I've no idea what significance the lack of serial numbers will have, but maybe that would go some way to explain why when you move your disks from one controller to the other that the FreeNAS config database is unable to keep track of them...
 

arryo

Dabbler
Joined
May 5, 2012
Messages
42
I'm using 8.0.4-p2. Yeah, I'm not sure why it happens; let me check more tickets to see if there's a possible fix. Thank you very much, Milhouse, for your help.
 

TimeBandit

Cadet
Joined
Jun 7, 2012
Messages
9
Quite simply I have something to say on this topic...
Most of us out here just want our drives to simply spin down after x minutes (or hours) of inactivity. The rude awakening is that this issue seems to mystify both software developers and hard drive manufacturers alike - with the finger pointing at the manufacturers. So this really isn't directed at FreeNAS, as they are simply at the mercy of FreeBSD, who is at the mercy of Western Digital, Seagate, blah blah blah....

I'm no expert, and still rudely awakening myself, but after reading countless posts and doing some experimentation on my own - the verdict is utter disappointment. For those just getting their feet wet, here's what I know.

1st, a small slam at FreeNAS. The "View Disks --> Edit Disk" screen is confusing because it has "HDD Standby" drop-down field, AND has "Advanced Power Management" Drop-down field. Confusing and not intuitive - just ask the three separate threads in the N00bs section by three different folks with no replies so it seems everyone else is confused, or doesn't care. FWIW, and if I had to guess, these might correspond to the "-S" and "-P" switches of the ataidle CLI utility. If so, then it would be an either/or - not both.

Next, setting my ada0 & ada1 pool drives with the ataidle -S command doesn't work as expected/documented. I can hear my SATA drives clacking at the right time, but no spinny down down. It's like it tries, but immediately comes alive before it even goes down. So whose fault is this ... FreeBSD's or Western Digital's?

Okay, so on to APM since these drives support it (least ataidle reports so). The problem .... APM codes are vendor-specific ... and to rub a little salt in the wound, THERE IS NO DOCUMENTATION. At least I gave up looking. FreeNAS seems to offer: 1, 64, 127, 128, 254, blah blah. But what do these mean, and for what drives. You are better off going to Vegas. On "my" system, 1 is way too aggressive - drives spun down in 8 seconds after every re-access!! Argggg! Ok, let's try the 64 option. Same crap - 8 seconds (yes I rebooted and even queried the drives to be sure - it was set to 64)! Speaking of craps, let's roll the dice one more time... 127 is my last shot because anything over says it surely won't spin-down at all. Well.... at least for me and these drives, it doesn't seem 127 spins down at all! And neither did 96 (yeah - pulled that # out of my butt). So, in summary, it seems like my drives just want 8-seconds, or never at all !!

The last gripe I have is that both camcontrol and ataidle issue a spin-down action immediately when just setting the timeout values. So it offers the false assurance that the drive will spin down again on its own - ah, not the case. Many others have echoed the same sentiments. But maybe we can leverage this "feature"...

So what can one do?? My answer, and to agree with previous posters, is we need "daemon-control" - aka, some sort of script or process that baby-sits the drives and simply issues (on all drives) the ataidle -s (or camcontrol standby) sleep signal after a period of monitored no-activity. Although crude, the iostat sh script would do just nicely. FreeNAS developers, knowing of these auto-spin-down issues, should just take matters into their own hands and write a simple yet effective daemon to govern it themselves. I'll try to experiment a bit more myself, but I hope this has helped shed some light on a subject that has much more maturing to go.
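The baby-sitting logic described above can be sketched roughly as follows. This is only a sketch under assumptions: it treats a pool as idle when the four activity columns of a `zpool iostat` data row are all zero, and counts polling intervals down to zero before stopping the disks. The function names, the 30-interval timeout and the da1/da2 device names are illustrative, the sample rows are fed in by hand rather than read from a live `zpool iostat`, and the stop command is only echoed, never run.

```shell
#!/bin/sh
# Sketch only: function names and values are illustrative assumptions.

# pool_is_idle LINE: succeeds when read/write ops and bandwidth are all zero.
# LINE is one data row of `zpool iostat`, e.g. "music 10.5G 2.71T 0 0 0 0".
pool_is_idle() {
    set -f              # no globbing while splitting the row into fields
    set -- $1
    set +f
    [ "$4" = 0 ] && [ "$5" = 0 ] && [ "$6" = 0 ] && [ "$7" = 0 ]
}

# next_countdown CURRENT TIMEOUT LINE: prints the new countdown value.
# Activity resets the countdown to TIMEOUT; an idle poll decrements it.
next_countdown() {
    if pool_is_idle "$3"; then
        if [ "$1" -gt 0 ]; then echo $(( $1 - 1 )); else echo 0; fi
    else
        echo "$2"
    fi
}

# Example: one poll with activity followed by two idle polls.
timeout=30
count=$timeout
for line in "music 10.5G 2.71T 0 1 87.8K 7.87K" \
            "music 10.5G 2.71T 0 0 0 0" \
            "music 10.5G 2.71T 0 0 0 0"; do
    count=$(next_countdown "$count" "$timeout" "$line")
    echo "countdown: $count"
    if [ "$count" -eq 0 ]; then
        # Dry run: a real daemon would execute the stop command here.
        echo "would run: camcontrol stop da1; camcontrol stop da2"
    fi
done
```

A real daemon would read the rows from a periodic `zpool iostat` and sleep between polls; keeping the decision logic in small functions like these makes it easy to check without touching any disks.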
 