NVMe Autonomous Power State Transition (APST) - Not Working in CORE, Works in SCALE

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
So I'm playing around with APST (I had a slow day, thankfully). I noticed that when I'm using SCALE, the power states shift around depending on what the drive (1TB EX900 Plus M.2) needs to do. Do not get distracted by the fact that I have only one M.2 drive; it is purely for testing. My NVMe has 5 power states (0-4) and under SCALE will typically remain in power state 4 (0.0090W). It appears that APST is well supported in Debian.

Now for CORE 13. It appears that APST is not functional here, at least by default. The NVMe starts off in power state 0 (3.000W) and remains there all the time.

I am able to use the command nvmecontrol power -p 4 nvme0 to force the NVMe into power state 4, and it will remain there until the drive is asked to do some real work. Then it is back in power state 0 and stays there; it will not drop to a lower power state automatically.
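
For reference, these are the commands I have been using on CORE (your power state numbers will differ by drive):
Code:
nvmecontrol power -l nvme0      # list the power states the drive supports
nvmecontrol power nvme0         # show the current power state
nvmecontrol power -p 4 nvme0    # force the drive into power state 4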

In searching the internet for half a day I was unable to find any real mention of APST with FreeBSD or TrueNAS. The few things I could find were vague and suggested the AHCI driver could be the issue, but that was more of a discussion about AHCI that happened to mention APST.

Here comes the question, and I'm sorry I'm even asking it, but I've exhausted my brain and my internet searching:
Is there any way to get TrueNAS CORE to make the NVMe APST work properly? I was hoping for a tunable that would work but I have not found that either.

I also understand that I could force the NVMe to a lower maximum power state, but that is not what I'm looking for. I do plan to use this feature and to test the NVMe speed with various settings. But APST is what I'm after. Maybe the answer is that FreeBSD does not support it. If that is it, then I can stop plucking the rest of my hair out and move on.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I'm a bit lost when it comes to NVMe on FreeBSD, given the new driver and/or model for the driver and its interaction with geom... That is to say, I don't have any answers, but I'll be following this.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I will be researching it, there MUST be something on this topic. FreeBSD 14 is coming out, maybe it will have better nvme support. I know that will not fix the current TrueNAS CORE issue but maybe in the future. I just downloaded FreeBSD 14-RC4 and I will give it a try tomorrow. I'm not holding my breath. It may be a long time before APST is supported well in FreeBSD. But if the new version works, I'm apt to copy the driver to TrueNAS CORE to see if I can patch it to work. This is all an exercise in knowledge at this point in time.

APST automatically changes the power state of the NVMe, so it could be running at full speed using 3 Watts of power (or as high as 28 Watts or more), or half speed using 2.7 Watts, or in a non-operational state using 0.0090 Watts (just barely alive). Some NVMe drives go down to 0.0040 Watts, possibly even lower. This is mainly beneficial for portable computers to shave off a little power.
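
On the SCALE/Debian side, nvme-cli can also tell you whether a drive even claims APST support; something like this should do it, though I have only checked it on my one drive:
Code:
nvme id-ctrl /dev/nvme0 | grep -i apsta    # 0x1 = APST supported, 0 = not supported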
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
@Ericloewe Do you have any NVMe drives on a CORE system? I just completed the bulk of a script and I like the way it functions. I have two more minor things to clean up and then it needs to be tested extremely well, but the script will only run on CORE (since APST is already supported on SCALE). I should have it complete sometime tomorrow, but it's midnight now and I've been up for 21.5 hours, Zzzzzz. I did download FreeBSD 14 but have been working a lot, so tomorrow I will be able to see if it has APST support; since I haven't seen anything in the FreeBSD forums or in the documentation, I think it may not be supported. I will find out. Anyway, if you do have an NVMe drive in a CORE system, you can try this script out once I'm satisfied with it.

And I'm not the nvme whisperer. I'm learning as I go as well.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
@joeschmuck I have a couple of CORE systems with NVMe. When I found out about the power states I created a startup task that sets the devices to a lower state. I had no idea this thing is dynamic. Also, e.g. nvmecontrol power -l nvme0 does not show the current state, it only lists the available ones.
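
In case it is useful to anyone, that startup task is nothing more than a post-init command along these lines (adjust the power state and the device list to your own drives):
Code:
for d in nvme0 nvme1 nvme2; do nvmecontrol power -p 2 $d; done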

Have you contacted Warner Losh about this?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
@Patrick M. Hausen Yes, nvmecontrol power -l nvme0 does list the power states. And nvmecontrol power nvme0 will list the current power state. As for dynamic, I don't want to say that most nvme drives support it but I think it was part of the NVME 1.3 standard (don't quote me here). It is not supported under FreeBSD 13 but is under Debian. I plan to install FreeBSD 14 today just to see if that version supports it. I am not holding my breath.

The script I wrote does a lot of little things. I will post it here once I'm done and let some folks take it for a test drive. I have two features I still want to incorporate which will make the script feel like it's paying off, or it could prove it isn't worth the effort, but I think the information could prove to be valuable to some folks, especially if they have a lot of NVMe devices.

EDIT: I only have a single nvme drive so I can't test if it cycles through all the drives or not, someone will have to tell me.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
And nvmecontrol power nvme0 will list the current power state.
:rolleyes: OMG! And nowhere does the usage information tell you that:
Code:
Usage:
    nvmecontrol power <args> controller-id|namespace-id

Manage power states for the drive
Options:
 -l, --list                    - List the valid power states
 -p, --power=<NUM>             - Set the power state
 -w, --workload=<NUM>          - Set the workload

So I thought -l, -p, -w were the only functions available.

But then that puzzles me quite a bit because my Samsung SSDs have these power states:
Code:
root@freenas[~]# nvmecontrol power -l nvme1

Power States Supported: 5

 #   Max pwr  Enter Lat  Exit Lat RT RL WT WL Idle Pwr  Act Pwr Workloadd
--  --------  --------- --------- -- -- -- -- -------- -------- --
 0:  7.8000W    0.000ms   0.000ms  0  0  0  0  0.0000W  0.0000W 0
 1:  6.0000W    0.000ms   0.000ms  1  1  1  1  0.0000W  0.0000W 0
 2:  3.4000W    0.000ms   0.000ms  2  2  2  2  0.0000W  0.0000W 0
 3:  0.0700W*   0.210ms   1.200ms  3  3  3  3  0.0000W  0.0000W 0
 4:  0.0100W*   2.000ms   8.000ms  4  4  4  4  0.0000W  0.0000W 0


And I set them all to state 2 at boot time and this is where they are still at:
Code:
root@freenas[~]# nvmecontrol power nvme0   
Current Power Mode is 2
root@freenas[~]# nvmecontrol power nvme1
Current Power Mode is 2
root@freenas[~]# nvmecontrol power nvme2
Current Power Mode is 2
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
As I understand it, if you set a specific power state then the NVMe will remain there. If the power state is too low for the workload (defined by the -w option) then the power state will increase (lower number) to satisfy the requirements, and it will not return to the previous power state. As I understand it, the power levels with the asterisk are considered non-operational states, meaning the drive is sleeping and will be woken up if needed; the wake (exit latency) times are listed as well. Minor events will not force a power state change.
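
I have not experimented with the workload hint yet; per the usage output above it is set together with the power state, so presumably something like the following. The meaning of the hint values is defined in the NVMe spec and I have not verified how this particular drive reacts to them:
Code:
nvmecontrol power -p 0 -w 2 nvme0    # set power state 0 with workload hint 2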

I've spent a few minutes looking into the NVMe drive type. I've even updated my TrueNAS (CORE and SCALE) to smartmontools 7.4 to take advantage of the better NVMe support. Of course the average person would not be doing that, but it's a very easy process, and of course it does not survive a TrueNAS update. Same with 7zip, which I have my little script automatically install on SCALE until it is included.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Attached is a little script (CORE users only). To use it: there are two switches, '-h' for help (pipe it to 'more' due to its length, about 2 pages), and '-d', which will delete the log file when the script starts. It has a basic setup now, however you may desire to change one or more of the 9 user-definable parameters within the script.

There is no email reporting, however there is a log file which can be used to analyze if/when your NVMe drives are changing states.

If someone runs this, I'd be curious how it works, especially on multiple nvme drives since I only have one myself to test with. I did not spell check anything, and I have one more feature to incorporate likely tomorrow to denote your power savings. If you run it on SCALE, it will just hiss at you and tell you it does not run on SCALE.
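
If you want to give it a spin, the invocation is simply this (rename the attached .txt to .sh and make it executable first):
Code:
chmod +x nvme_power_state_v0.3_2023_11_10.sh
./nvme_power_state_v0.3_2023_11_10.sh -h | more    # read the help first
./nvme_power_state_v0.3_2023_11_10.sh -d           # run, deleting the previous log file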
 

Attachments

  • nvme_power_state_v0.3_2023_11_10.txt
    13.4 KB · Views: 66

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
So if any state above 2 means the drive is inaccessible and needs to wake up first, I don't see much of a point in setting it that way. I run jails and VMs on my SSD pool - there will be a continuous stream of write operations 24x7 ...
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
So if any state above 2 means the drive is inaccessible and needs to wake up first, I don't see much of a point in setting it that way. I run jails and VMs on my SSD pool - there will be a continuous stream of write operations 24x7 ...
For the drive data you posted, as I understand it, if you placed it in power state 3 it would take 1.2ms before it could provide data again (the Exit Latency); for power state 4 it would be 8ms. That is significantly faster than a hard drive spinning up. These "non-operational" states just mean that the NVMe drive needs to come out of idle. Notice that the latency values for power states 0, 1, and 2 are all 0ms, which I suspect is why they are considered operational. Maybe a better way to say it is "non-operational" = "idle"; it does not mean the drive is unusable.

You said you set your NVMe to power state 2 on bootup; what do you think would happen if you set power state 3? Since state 3 is an idle state, I suspect the drive would migrate back to power state 2 as soon as it is accessed and then remain there.

Since you have some SCALE systems, I'd be curious what power state they are in most of the time. Debian supports APST, so the power states should change depending on the workload. Mine are almost always in power state 4 because I don't have much going on. When I'm doing a scrub of the NVMe pool, it changes to power state 2 and does not go any higher. I think the command on SCALE to read the NVMe drive's power state is nvme get-feature /dev/nvme0 -f 2 but I can't be certain right now; I have CORE running at this moment.
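
If memory serves, something like this works with nvme-cli on SCALE (feature 0x02 is Power Management and 0x0c is the APST table; -H asks for a decoded, human-readable dump), but I have not double-checked the output while writing this:
Code:
nvme get-feature /dev/nvme0 -f 0x02 -H     # current power state
nvme get-feature /dev/nvme0 -f 0x0c -H     # APST enable flag and transition table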

As for running my little script, as you said, there might not be any gains if you are writing a lot. The smallest granularity of the script is 1 second. It could be modified to 0.1 seconds but would that really be good enough and save much power? That is why I still need to finish the script to measure how long the nvme stays in a certain power state.

Hopefully what I wrote here is clear, I'm not always the best communicator.

I'm looking at purchasing up to six 4TB NVMe drives tonight or tomorrow. One will become my boot device for ESXi, the other 5 will become my raidz-2; well, that is the current idea. I might change it when I have the drives in hand. And I need to examine my current server and investigate the PCIe lanes. I know what the user manual states but I need to see what I have stuffed in the slots now. I'd rather not buy a new MB, CPU, and RAM at this moment; the NVMe drives will cost enough.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
If someone runs this, I'd be curious how it works, especially on multiple nvme drives since I only have one myself to test with.
Thanks for this script—and making me aware that CORE lacks power management for NVMe.
So here you are, on a system with an over-provisioned Optane DC P4801X SLOG and a plain 600P boot drive.

Code:
root@NASflash[~]# nvmecontrol devlist
 nvme0: INTEL SSDPEL1K100GA
    nvme0ns1 (23849MB)
 nvme1: INTEL SSDPEKKW128G7
    nvme1ns1 (122104MB)
root@NASflash[~]# nvmecontrol power -l nvme0

Power States Supported: 1

 #   Max pwr  Enter Lat  Exit Lat RT RL WT WL Idle Pwr  Act Pwr Workloadd
--  --------  --------- --------- -- -- -- -- -------- -------- --
 0: 10.0000W    0.000ms   0.000ms  0  0  0  0  0.0000W  0.0000W 0
root@NASflash[~]# nvmecontrol power -l nvme1

Power States Supported: 5

 #   Max pwr  Enter Lat  Exit Lat RT RL WT WL Idle Pwr  Act Pwr Workloadd
--  --------  --------- --------- -- -- -- -- -------- -------- --
 0:  9.0000W    0.005ms   0.005ms  0  0  0  0  0.0000W  0.0000W 0
 1:  4.6000W    0.030ms   0.030ms  1  1  1  1  0.0000W  0.0000W 0
 2:  3.8000W    0.030ms   0.030ms  2  2  2  2  0.0000W  0.0000W 0
 3:  0.0700W*  10.000ms   0.300ms  3  3  3  3  0.0000W  0.0000W 0
 4:  0.0050W*   2.000ms  10.000ms  4  4  4  4  0.0000W  0.0000W 0
root@NASflash[~]# cat /tmp/nvme_power_state_log.txt
Sat Nov 11 09:47:00 CET 2023 Recording to both stdout and /tmp/nvme_power_state_log.txt
List of drives=nvme0 nvme1
Initial power state for each nvme drive:
Drive nvme0 current power state 0
Drive nvme0 lowest power state 0
Initial power state for each nvme drive:
Drive nvme1 current power state 00
Drive nvme1 lowest power state 00
Sat Nov 11 09:49:58 CET 2023 Recording to both stdout and /tmp/nvme_power_state_log.txt
List of drives=nvme0 nvme1
Initial power state for each nvme drive:
Drive nvme0 current power state 0
Drive nvme0 lowest power state 0
Initial power state for each nvme drive:
Drive nvme1 current power state 02
Drive nvme1 lowest power state 04


Boot drive was set manually to level 2. There's not much to do about the Optane SLOG.

Here is another system with another 600P boot drive and consumer Optane 900P L2ARC+SLOG:

Code:
NASblanc# nvmecontrol devlist
 nvme0: INTEL SSDPEKKW128G7
    nvme0ns1 (122104MB)
 nvme1: INTEL SSDPED1D280GA
    nvme1ns1 (267090MB)
NASblanc# nvmecontrol power -l nvme0

Power States Supported: 5

 #   Max pwr  Enter Lat  Exit Lat RT RL WT WL Idle Pwr  Act Pwr Workloadd
--  --------  --------- --------- -- -- -- -- -------- -------- --
 0:  9.0000W    0.005ms   0.005ms  0  0  0  0  0.0000W  0.0000W 0
 1:  4.6000W    0.030ms   0.030ms  1  1  1  1  0.0000W  0.0000W 0
 2:  3.8000W    0.030ms   0.030ms  2  2  2  2  0.0000W  0.0000W 0
 3:  0.0700W*  10.000ms   0.300ms  3  3  3  3  0.0000W  0.0000W 0
 4:  0.0050W*   2.000ms  10.000ms  4  4  4  4  0.0000W  0.0000W 0
NASblanc# nvmecontrol power -l nvme1

Power States Supported: 1

 #   Max pwr  Enter Lat  Exit Lat RT RL WT WL Idle Pwr  Act Pwr Workloadd
--  --------  --------- --------- -- -- -- -- -------- -------- --
 0: 18.0000W    0.000ms   0.000ms  0  0  0  0  0.0000W  0.0000W 0
NASblanc# ./nvme_power_state_v0.3_2023_11_10.sh
Sat Nov 11 10:19:48 CET 2023 Recording to both stdout and /tmp/nvme_power_state_log.txt
List of drives=nvme0 nvme1
Initial power states for each nvme drive:
Drive nvme0 current power state 2
Drive nvme0 lowest power state 4
Initial power states for each nvme drive:
Drive nvme1 current power state 20
Drive nvme1 lowest power state 40
./nvme_power_state_v0.3_2023_11_10.sh: line 256: nvme0 nvme1 : syntax error in expression (error token is "nvme1 ")


The same error at line 256 was encountered on the first system, but it's not logged.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Code:
root@freenas[~]# nvmecontrol power -p 3 nvme1; nvmecontrol power nvme1
Current Power Mode is 2
root@freenas[~]#

No error message, nothing in dmesg output.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
The same error at line 256 was encountered on the first system, but it's not logged.
Thanks for the test. It is difficult to test for more than one drive, since I only have one at the moment. Looks like I will need to 'fake' out the script to simulate multiple nvme drives until more show up on my doorstep. I am not happy that the log file didn't state it was setting the power level, it only reported the current values. Just to be clear when you say the boot drive was set to power state 2, does that mean the script 'minimum_power_state=2' was set? If it remained at '99' then I would have expected the nvme0 to try to change to power state 4.

I see the obvious error: the script says nvme1 has power states 20 and 40, which we know is not true. Somewhere I added a zero, yet it works fine on nvme0. The problem could be in a few different locations. Well, time to see if I can recreate the issue.

The optane drive, not much a person could do with it if it only has a single power state.

EDIT: I did not set things up to record errors; I may need to see about adding that.
 
Last edited:

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Code:
root@freenas[~]# nvmecontrol power -p 3 nvme1; nvmecontrol power nvme1
Current Power Mode is 2
root@freenas[~]#

No error message, nothing in dmesg output.
I'm not clear on what you are saying here. Are you saying this is the command you send at boot time to try and place your nvme1 into power state 3, but it only sets power state 2? That command works fine on my system and reports mode set is 3. Of course a few seconds later the script changes it back to power state 4.

I'm curious if the reason it did not set the lower power state is due to the activity of the drive. You did say your drives are heavily active, so that is just a guess on my part. I also think I read something like that, maybe in the NVMe specifications: that it will not transition into a lower power state if the drive needs the current power state. Hopefully I will come across that again in the next few weeks. Now I'm curious if there is a way to identify drive activity, in an effort to not try to place a drive into a lower power state while it is active. I have lots of questions and very few answers.
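
One idea I have not tried yet: sample the drive with iostat before dropping the power state and skip the change if it is busy. Something along these lines (the nvd device name matches what kern.disks reports on CORE):
Code:
# two samples one second apart; the second line reflects the last second of activity,
# and if reads/writes are ~0 the drive is a candidate for a lower power state
iostat -x -w 1 -c 2 nvd0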
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
I am not happy that the log file didn't state it was setting the power level, it only reported the current values. Just to be clear when you say the boot drive was set to power state 2, does that mean the script 'minimum_power_state=2' was set?
The script ran unchanged. I had actually set the power level manually before running it.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
I'm not clear on what you are saying here. Are you saying this is the command you send at boot time to try and place your nvme1 into power state 3, but it only sets power state 2? That command works fine on my system and reports mode set is 3. Of course a few seconds later the script changes it back to power state 4.
I set the power state to 2 at boot time and it seems to stay that way. When I set the state to 3 with the system up and running, it immediately gets set back to 2, as you can (IMHO) deduce from that command and the output I posted. My question really is: does it make sense to set it higher than 2 at all for an SSD that is continuously used?


Now when I run your script I get a syntax error:
Code:
root@freenas[/mnt/hdd/scripts]# ./nvme_power_state_v0.3_2023_11_10.sh   
Sat Nov 11 17:40:59 CET 2023 Recording to both stdout and /tmp/nvme_power_state_log.txt
List of drives=nvme0 nvme1 nvme2
Initial power states for each nvme drive:
Drive nvme0 current power state 2
Drive nvme0 lowest power state 4
Initial power states for each nvme drive:
Drive nvme1 current power state 22
Drive nvme1 lowest power state 44
Initial power states for each nvme drive:
Drive nvme2 current power state 222
Drive nvme2 lowest power state 444
./nvme_power_state_v0.3_2023_11_10.sh: line 256: nvme0 nvme1 nvme2 : syntax error in expression (error token is "nvme1 nvme2 ")


Here's the same with more debug info:
Code:
root@freenas[/mnt/hdd/scripts]# bash -x nvme_power_state_v0.3_2023_11_10.sh
+ LANG=en_US.UTF-8
+ declare -a nvme_array
+ declare -a nvme_current_power_state
+ declare -a nvme_lowest_power_state
+ declare -i minimum_power_state
+ declare -a nvme_old_power_state
+ declare -i delay_seconds
+ declare -a nvme_maximum_current
+ declare -a nvme_total_current
+ declare -a nvme_total_time
+ minimum_power_state=99
+ check_periodicity=3
+ delay_seconds=5
+ exit_after=82800
+ output_to_stdout=true
+ output_to_file=true
+ output_on_change_only=true
+ max_output_file_size=100
+ output_file=/tmp/nvme_power_state_log.txt
+ SECONDS=0
++ uname -s
+ softver=FreeBSD
+ [[ FreeBSD != \F\r\e\e\B\S\D ]]
+ [[ '' == \-\d ]]
+ [[ '' == \-\h ]]
+ [[ '' != \-\d ]]
+ [[ '' != '' ]]
+ [[ true == \t\r\u\e ]]
+ [[ true == \t\r\u\e ]]
++ date
+ echo Sat Nov 11 17:43:01 CET '2023 Recording to both stdout and /tmp/nvme_power_state_log.txt'
Sat Nov 11 17:43:01 CET 2023 Recording to both stdout and /tmp/nvme_power_state_log.txt
++ date
+ echo Sat Nov 11 17:43:01 CET '2023 Recording to both stdout and /tmp/nvme_power_state_log.txt'
+ [[ true == \t\r\u\e ]]
+ [[ true != \t\r\u\e ]]
+ [[ true != \t\r\u\e ]]
+ purge_output_file
++ wc -l
+ line_count='      67'
+ lines_to_delete=-33
+ [[ -33 -gt 10 ]]
+ get_smartNVM_listings
++ awk '{for (i=NF; i!=0 ; i--) print $i }'
+++ sysctl -n kern.disks
++ tr ' ' '\n'
++ sort
++ tr '\n' ' '
++ for drive in $(sysctl -n kern.disks)
+++ smartctl -i /dev/ada5
+++ grep NVM
++ '[' '' ']'
++ for drive in $(sysctl -n kern.disks)
+++ smartctl -i /dev/ada4
+++ grep NVM
++ '[' '' ']'
++ for drive in $(sysctl -n kern.disks)
+++ smartctl -i /dev/ada3
+++ grep NVM
++ '[' '' ']'
++ for drive in $(sysctl -n kern.disks)
+++ smartctl -i /dev/ada2
+++ grep NVM
++ '[' '' ']'
++ for drive in $(sysctl -n kern.disks)
+++ smartctl -i /dev/ada1
+++ grep NVM
++ '[' '' ']'
++ for drive in $(sysctl -n kern.disks)
+++ smartctl -i /dev/ada0
+++ grep NVM
++ '[' '' ']'
++ for drive in $(sysctl -n kern.disks)
+++ smartctl -i /dev/nvd2
+++ grep NVM
++ '[' '/dev/nvd2: To monitor NVMe disks use /dev/nvme* device names' ']'
++ printf '%s ' nvd2
++ for drive in $(sysctl -n kern.disks)
+++ smartctl -i /dev/nvd1
+++ grep NVM
++ '[' '/dev/nvd1: To monitor NVMe disks use /dev/nvme* device names' ']'
++ printf '%s ' nvd1
++ for drive in $(sysctl -n kern.disks)
+++ smartctl -i /dev/nvd0
+++ grep NVM
++ '[' '/dev/nvd0: To monitor NVMe disks use /dev/nvme* device names' ']'
++ printf '%s ' nvd0
+ smartdrivesNVM='nvd0 nvd1 nvd2 '
++ echo 'nvd0 nvd1 nvd2 '
++ sed s/nvd/nvme/g
+ smartdrivesNVM='nvme0 nvme1 nvme2 '
+ for smartdrivesnvme in $smartdrivesNVM
+++ echo nvme0
+++ sed -r 's#^nvme##'
+++ cut -d n -f 1
++ echo 'nvme0 '
+ nvme_drive='nvme0 '
++ echo nvme0
++ sed -r 's#^nvme##'
++ cut -d n -f 1
+ nvme_number=0
+ nvme_array[$nvme_number]+=0
+ for smartdrivesnvme in $smartdrivesNVM
+++ echo nvme1
+++ sed -r 's#^nvme##'
+++ cut -d n -f 1
++ echo 'nvme1 '
+ nvme_drive='nvme0 nvme1 '
++ echo nvme1
++ sed -r 's#^nvme##'
++ cut -d n -f 1
+ nvme_number=1
+ nvme_array[$nvme_number]+=1
+ for smartdrivesnvme in $smartdrivesNVM
+++ echo nvme2
+++ sed -r 's#^nvme##'
+++ cut -d n -f 1
++ echo 'nvme2 '
+ nvme_drive='nvme0 nvme1 nvme2 '
++ echo nvme2
++ sed -r 's#^nvme##'
++ cut -d n -f 1
+ nvme_number=2
+ nvme_array[$nvme_number]+=2
+ smartdrivesNVM='nvme0 nvme1 nvme2 '
+ [[ nvme0 nvme1 nvme2  != '' ]]
+ sort_list='nvme0 nvme1 nvme2 '
+ sort_drives
++ sort -V
+++ echo nvme0 nvme1 nvme2
++ for i in `echo $sort_list`
++ echo nvme0
++ for i in `echo $sort_list`
++ echo nvme1
++ for i in `echo $sort_list`
++ echo nvme2
+ sort_list='nvme0
nvme1
nvme2'
+ smartdrivesNVM='nvme0
nvme1
nvme2'
+ [[ true == \t\r\u\e ]]
+ echo 'List of drives=nvme0' nvme1 nvme2
List of drives=nvme0 nvme1 nvme2
+ [[ true == \t\r\u\e ]]
+ echo 'List of drives=nvme0' nvme1 nvme2
+ for drive in $smartdrivesNVM
+ [[ true == \t\r\u\e ]]
+ echo 'Initial power states for each nvme drive:'
Initial power states for each nvme drive:
+ [[ true == \t\r\u\e ]]
+ echo 'Initial power state for each nvme drive:'
++ nvmecontrol power -l nvme0
++ tail -1
++ cut -c1-2
+ lowest_power_state=4
++ nvmecontrol power nvme0
++ rev
++ cut -c1-2
++ rev
+ current_power_state=2
++ echo nvme2
++ sed -r 's#^nvme##'
++ cut -d n -f 1
+ nvme_number=2
+ eval 'nvme_lowest_power_state[2]+=4'
++ nvme_lowest_power_state[2]+=4
+ eval 'nvme_current_power_state[2]+=2'
++ nvme_current_power_state[2]+=2
+ [[ true == \t\r\u\e ]]
+ echo 'Drive nvme0 current power state 2'
Drive nvme0 current power state 2
+ [[ true == \t\r\u\e ]]
+ echo 'Drive nvme0 lowest power state 4'
Drive nvme0 lowest power state 4
+ [[ true == \t\r\u\e ]]
+ echo 'Drive nvme0 current power state 2'
+ [[ true == \t\r\u\e ]]
+ echo 'Drive nvme0 lowest power state 4'
+ for drive in $smartdrivesNVM
+ [[ true == \t\r\u\e ]]
+ echo 'Initial power states for each nvme drive:'
Initial power states for each nvme drive:
+ [[ true == \t\r\u\e ]]
+ echo 'Initial power state for each nvme drive:'
++ nvmecontrol power -l nvme1
++ tail -1
++ cut -c1-2
+ lowest_power_state=4
++ nvmecontrol power nvme1
++ rev
++ cut -c1-2
++ rev
+ current_power_state=2
++ echo nvme2
++ sed -r 's#^nvme##'
++ cut -d n -f 1
+ nvme_number=2
+ eval 'nvme_lowest_power_state[2]+=4'
++ nvme_lowest_power_state[2]+=4
+ eval 'nvme_current_power_state[2]+=2'
++ nvme_current_power_state[2]+=2
+ [[ true == \t\r\u\e ]]
+ echo 'Drive nvme1 current power state 22'
Drive nvme1 current power state 22
+ [[ true == \t\r\u\e ]]
+ echo 'Drive nvme1 lowest power state 44'
Drive nvme1 lowest power state 44
+ [[ true == \t\r\u\e ]]
+ echo 'Drive nvme1 current power state 22'
+ [[ true == \t\r\u\e ]]
+ echo 'Drive nvme1 lowest power state 44'
+ for drive in $smartdrivesNVM
+ [[ true == \t\r\u\e ]]
+ echo 'Initial power states for each nvme drive:'
Initial power states for each nvme drive:
+ [[ true == \t\r\u\e ]]
+ echo 'Initial power state for each nvme drive:'
++ nvmecontrol power -l nvme2
++ tail -1
++ cut -c1-2
+ lowest_power_state=4
++ nvmecontrol power nvme2
++ rev
++ cut -c1-2
++ rev
+ current_power_state=2
++ echo nvme2
++ sed -r 's#^nvme##'
++ cut -d n -f 1
+ nvme_number=2
+ eval 'nvme_lowest_power_state[2]+=4'
++ nvme_lowest_power_state[2]+=4
+ eval 'nvme_current_power_state[2]+=2'
++ nvme_current_power_state[2]+=2
+ [[ true == \t\r\u\e ]]
+ echo 'Drive nvme2 current power state 222'
Drive nvme2 current power state 222
+ [[ true == \t\r\u\e ]]
+ echo 'Drive nvme2 lowest power state 444'
Drive nvme2 lowest power state 444
+ [[ true == \t\r\u\e ]]
+ echo 'Drive nvme2 current power state 222'
+ [[ true == \t\r\u\e ]]
+ echo 'Drive nvme2 lowest power state 444'
+ (( 1 ))
+ (( 1 ))
+ for drive in $smartdrivesNVM
++ nvmecontrol power nvme0
++ rev
++ cut -c1-2
++ rev
+ nvme_old_power_state[$nvme_number]=2
+ [[ true == \t\r\u\e ]]
+ [[ true != \t\r\u\e ]]
+ [[ true == \t\r\u\e ]]
+ [[ true != \t\r\u\e ]]
++ nvmecontrol power nvme0
++ rev
++ cut -c1-2
++ rev
+ power_state=2
+ [[ 99 != 99 ]]
nvme_power_state_v0.3_2023_11_10.sh: line 256: nvme0 nvme1 nvme2 : syntax error in expression (error token is "nvme1 nvme2 ")


HTH,
Patrick
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
@Patrick M. Hausen Thank you for the feedback. I can see an obvious problem, I just have not examined the script yet. I've been buying some computer parts.

@Etorix Thank you as well for your feedback.

Because of these problems I went ahead and just purchased a new computer; it will have six 4TB NVMe drives. Normally I would buy just the parts I wanted to upgrade, but most of my computer parts are a little old (I've been using the same case and power supply since 2013) and I've been thinking (a.k.a. drooling) about replacing my aging hard drives with NVMe drives.

Black Friday sales are available so I took advantage of them. It was not cheap but it will definitely replace my current ESXi server. Now I need to setup a firewall to any place selling computer parts :tongue:

I will examine the script to see if it is obvious what the problem is. Time to go, the dog is asking for dinner.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I have this script working significantly better now that I have four NVMe drives to test it on. The script is not 100% complete, as I still need to ensure duplicate sessions cannot run, but otherwise it's close. It is functional in the sense that it will change the NVMe power state to the power state you desire. I also need to fix the formatting of the elapsed run time. I'm thinking about adding total watts used in power state 0 vs the designated power state you chose. It's not needed but it gives you an idea. But I'm not in a hurry to make these changes; I want to make sure they work. And the holidays are around the corner, so I should have time to work on it more.

The script works on CORE (FreeBSD) only. SCALE (Debian) will automatically reduce the power state of the NVMe drives.

There are some user definable settings:
Code:
########## USER DEFINABLE VARIABLES ##########
minimum_power_state=99        # The minimum power setting script is allowed to be changed to. 99=Ignore
check_periodicity=3        # 3600 seconds = 1 hour, 900 = 15 minutes.  How often to verify the power state of the nvme's.
delay_seconds=5            # Number of seconds to delay before forcing back to minimum power state.
exit_after=82800        # 82800 seconds = 23 hours.  How long to remain in a loop before exiting.
                # - If run from CRON once every 24 hours, set to exit before the CRON executes again.
run_continuously="false"    # This when 'true' will ignore 'exit_after' and run the script and never exit.
                # - This is good if you run the script upon startup and never again.
output_to_stdout="true"        # This will list all actions occurring
output_to_file="true"        # This will output to a file any actions occurring (same as to the stdout)
output_on_change_only="true"    # Output message ONLY on a power status change.
max_output_file_size=100    # Maximum number of lines to retain in the text file before deleting the older entries.
output_file="/tmp/nvme_power_state_log.txt"   # The file to output to. Change to a drive path to retain the data through reboot.
#############################################


1. The goal is to set up a CRON job to run once every 24 hours (see the example entry below this list).
2. The script will run for 23 hours (82800 seconds) and then exit.
3. The script will check the current power state every 3 seconds ('check_periodicity').
4. Should the power state change, the script will sleep for 'delay_seconds' (5 seconds by default) so that whatever work took the drive out of the lower state has time to run.
5. The 'output_file' is a text file that by default is capped at 100 lines and is located in the /tmp/ directory. I recommend you change the path to a location of your choosing (e.g. on a pool) so it survives a reboot.
6. 'output_to_stdout' will output to the screen; 'output_to_file' will record the same changes to 'output_file'.
7. Set 'minimum_power_state' to 99 to allow the lowest power state, or set it to the lowest power state you desire, as some systems (e.g. HA setups) may experience a slight delay while the NVMe changes from a non-operational state to an operational state. The delay is in milliseconds, but if it were Google, we would all expect instantaneous information. So you could set this to '2' or '1' as you desire; '99' will allow the lowest state. Did you know that there can be up to 32 power states? Most drives only have about 4 or 5, but 32 is a lot.
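
As an example, a daily entry could look like this (on TrueNAS you would normally create it under Tasks > Cron Jobs in the GUI rather than editing crontab directly, and the path below is just a placeholder for wherever you keep the script):
Code:
# run once a day at 03:00; the script exits on its own after 'exit_after' seconds
0 3 * * * /mnt/tank/scripts/nvme_power_state.sh -d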

Attached is the current version of the script (v0.4).

Sample Output File:
Code:
Sat Dec 2 16:26:49 EST 2023 Recording to both stdout and /tmp/nvme_power_state_log.txt
Sat Dec 2 16:26:57 EST 2023 -> Drive nvme0 detected in power state 1.
        Delaying 5 seconds.
Sat Dec 2 16:27:02 EST 2023 -> Attempting to set power state 4
Sat Dec 2 16:27:02 EST 2023 -> Actual power state set is 4
-------------------------------------
Sat Dec 2 16:27:13 EST 2023 -> Drive nvme0 detected in power state 2.
        Delaying 5 seconds.
Sat Dec 2 16:27:19 EST 2023 -> Attempting to set power state 4
Sat Dec 2 16:27:19 EST 2023 -> Actual power state set is 4
-------------------------------------
Sat Dec 2 16:27:30 EST 2023 -> Drive nvme0 detected in power state 3.
        Delaying 5 seconds.
Sat Dec 2 16:27:35 EST 2023 -> Attempting to set power state 4
Sat Dec 2 16:27:35 EST 2023 -> Actual power state set is 4
-------------------------------------
Sat Dec 2 16:27:46 EST 2023 -> Drive nvme0 detected in power state 0.
        Delaying 5 seconds.
Sat Dec 2 16:27:51 EST 2023 -> Attempting to set power state 4
Sat Dec 2 16:27:51 EST 2023 -> Actual power state set is 4
-------------------------------------

This is from a sample run which lasted 60 seconds, during which I manually changed the power state of the NVMe drive. You can see the date/time the state change was detected and the date/time it was changed back to the minimum state.

You can use this to analyse when your NVMe drives are active and when they are not. It can also save "some" power. I'm not going to tell you that you will save a lot unless your NVMe drives are sucking down 25 watts, but even then the best way to measure is with a power meter as watts at the outlet is what counts.
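
As a rough back-of-the-envelope figure using my drive's numbers (3.000 W in state 0 vs 0.009 W in state 4): a drive that sits idle 20 hours a day would save roughly (3.000 - 0.009) W x 20 h, which is about 60 Wh per day or around 22 kWh per year, per drive. Measurable, but only worth chasing if you have several drives or a mostly idle system.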

If you try it and see any operational defects, please let me know. I mainly built this script for myself but do not mind sharing. I want to know when and what pulls the nvme drives out of power state 4.

What happened in the first version? I was trying to use arrays in bash, which I had never done before, and while I think I now understand them, they really were not needed; I just wanted to try them out. There are no arrays in this script.
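
For what it's worth, the failure mode visible in Patrick's bash -x trace was the per-drive index being re-derived from a leftover variable (so every drive ended up writing to index 2), plus '+=' appending strings onto the array entries, which is where the '22' and '444' values came from. If I ever go back to arrays, the pattern would look roughly like this (a sketch using the variable names from the trace, not the actual v0.3 code):
Code:
declare -a nvme_current_power_state nvme_lowest_power_state
for drive in $smartdrivesNVM; do
    i="${drive#nvme}"                    # index taken from the drive actually being processed
    nvme_lowest_power_state[$i]=$(nvmecontrol power -l "$drive" | awk 'END {print $1+0}')
    nvme_current_power_state[$i]=$(nvmecontrol power "$drive" | awk '{print $NF}')   # '=' assigns; '+=' would append
done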
 

Attachments

  • nvme_power_state_v0.4_2023_12_02.txt
    12.8 KB · Views: 50