HDD Stress Test Script

Status
Not open for further replies.

Wallybanger

Contributor
Joined
Apr 17, 2016
Messages
150
Hey Guys,

I'm running @jgreco's stress test script and these are the results so far:

Drives.jpg


Consol.jpg


A couple of questions:
1) The script says that the parallel seek-stress test should take about 909 minutes. So far it's been going for almost 3 days. Is that normal? How long does it typically take to complete a pass with this script?

2) The drive speeds are kinda all over the map and ada0 and ada2 seem to be jumping around more than the other disks. Is that normal?

Thanks :)
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
1) The script says that the parallel seek-stress test should take about 909 minutes. So far it's been going for almost 3 days. Is that normal? How long does it typically take to complete a pass with this script?

You might want to see if the dd processes are still running and what's going on. Log in with ssh from another system and take a peek with "ps", and also look at any results files happenin' over in /tmp. The script isn't suuuuuper intelligent about handling bad disks or exceptional conditions if something's wrong. If one of the disks was racking up a large number of errors, that might explain it, so also roll back the I/O graphs and see if any disks dropped down.
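As a rough sketch of the "take a peek with ps" step, the helper below counts how many dd readers are running against each disk. The function name `count_dd_readers` is made up for illustration, and it assumes the command column looks like `dd if=/dev/adaN of=/dev/null ...`, as in the ps listing posted later in this thread.

```shell
#!/bin/sh
# Count running dd readers per disk from `ps ax` style output on stdin.
# Assumption: each dd command line contains an if=/dev/<disk> argument.
count_dd_readers() {
    awk '
        {
            is_dd = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "dd") is_dd = 1
                # Only count if= arguments that follow a "dd" token.
                if (is_dd && $i ~ /^if=\/dev\//) {
                    dev = $i
                    sub(/^if=\/dev\//, "", dev)
                    n[dev]++
                }
            }
        }
        END { for (d in n) print d, n[d] }
    ' | sort
}

# Typical use on the live system:
#   ps ax | count_dd_readers
```

If every disk shows the same count, the test is probably still making progress on all of them. A disk with fewer readers than its neighbours would be worth a closer look.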

The parallel seek-stress test is, though, I think the one where the script fails to include a multiplier for the number of simultaneous tests in its time estimate. So if you roll back the I/O graphs in time and they look as pretty as the ones above, you're golden. It should take around 6 * 909 minutes (5454 minutes, or about 3.8 days) to finish. You should really tell me how badly I suck and how I ought to go fix that. :smile: Seriously.
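The corrected estimate is just the script's figure scaled by the number of simultaneous tests. A minimal sketch of that arithmetic follows. `scaled_estimate` is a hypothetical helper, not something the script provides.

```shell
#!/bin/sh
# Scale a per-test estimate (in minutes) by the number of simultaneous
# tests and print the total in minutes and in days (1440 minutes per day).
scaled_estimate() {
    minutes=$1
    multiplier=$2
    awk -v m="$minutes" -v k="$multiplier" 'BEGIN {
        total = m * k
        printf "%d minutes (%.1f days)\n", total, total / 1440
    }'
}

scaled_estimate 909 6   # prints: 5454 minutes (3.8 days)
```

This is still only an estimate, since it assumes every simultaneous test runs at the speed the original figure was based on.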

2) The drive speeds are kinda all over the map and ada0 and ada2 seem to be jumping around more than the other disks. Is that normal?

Yeah, more or less. Drives are physical devices and can have their own personalities. It could represent two drives that simply have a slightly more difficult time "settling in" on a track after a seek. It's good for you to notice the issue, and then maybe follow up with some SMART stats inspection to see if anything appears out of the ordinary. The tool is absolutely worthless for that sort of problem detection, but then again its purpose is to free up the user from the lower level stuff (which is just such a PITA to do by hand) so that you can go off and get handwring-y about this sort of thing.
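For the SMART follow-up, one low-risk approach is to print the smartctl commands for the disks of interest, review them, and only then run them. The sketch below does that. `smart_check_cmds` is a hypothetical helper, and the disk names are the two mentioned in the question.

```shell
#!/bin/sh
# Print (not run) smartctl commands for the given disks so the list can be
# reviewed first. Pipe the output to sh to actually execute the commands.
smart_check_cmds() {
    for disk in "$@"; do
        echo "smartctl -H /dev/$disk"   # overall health self-assessment
        echo "smartctl -A /dev/$disk"   # vendor attribute table
    done
}

smart_check_cmds ada0 ada2
```

Comparing the attribute tables of ada0 and ada2 against a disk that behaves steadily is probably the quickest way to spot anything out of the ordinary.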
 

Wallybanger

Contributor
Joined
Apr 17, 2016
Messages
150
Thanks for the reply :) OK, forgetting a multiplier on the time would definitely explain the problem and would mean that we're still within an acceptable time limit for this job :) Looks like it may be done soon.

When you say log in from another system with SSH, do you just mean getting into the GUI over the network? As it is, I'm using the IPMI and the web GUI from my laptop over the network. I opened the shell in the web GUI and ran some smartctl commands to check on the drives; none of them is hotter than 31°C and they all pass the SMART tests.

Looks like the dd stuff is still going....
Code:
PID TT STAT TIME COMMAND
3973 v0 Is 0:00.31 python /etc/netcli (python2.7)
4048 v0 I 0:00.00 /usr/bin/su -l root
4049 v0 I 0:00.01 -su (csh)
4263 v0 I 0:00.30 python /etc/netcli (python2.7)
4316 v0 I 0:00.00 /usr/bin/su -l root
4317 v0 I+ 0:00.04 -su (csh)
3974 v1 Is 0:00.01 login [pam] (login)
18572 v1 I+ 0:00.01 -csh (csh)
3975 v2 Is 0:00.01 login [pam] (login)
19051 v2 I 0:00.03 -csh (csh)
24032 v2 I+ 0:00.02 /bin/sh - ./solnet-array-test-v2.sh
53699 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53700 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53701 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53702 v2 DL+ 2:46.41 dd if=/dev/ada0 of=/dev/null bs=1048576
53703 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53704 v2 DL+ 3:21.09 dd if=/dev/ada1 of=/dev/null bs=1048576
53705 v2 DL+ 3:18.78 dd if=/dev/ada2 of=/dev/null bs=1048576
53706 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53707 v2 DL+ 2:59.27 dd if=/dev/ada3 of=/dev/null bs=1048576
53708 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53709 v2 DL+ 3:05.90 dd if=/dev/ada4 of=/dev/null bs=1048576
53710 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53711 v2 DL+ 2:49.82 dd if=/dev/ada5 of=/dev/null bs=1048576
53712 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53714 v2 DL+ 3:04.75 dd if=/dev/ada6 of=/dev/null bs=1048576
53715 v2 DL+ 3:13.90 dd if=/dev/ada7 of=/dev/null bs=1048576
53737 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53738 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53739 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53740 v2 DL+ 2:46.17 dd if=/dev/ada0 of=/dev/null bs=1048576
53741 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53742 v2 DL+ 3:21.26 dd if=/dev/ada1 of=/dev/null bs=1048576
53743 v2 DL+ 3:13.85 dd if=/dev/ada2 of=/dev/null bs=1048576
53744 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53745 v2 DL+ 2:55.16 dd if=/dev/ada3 of=/dev/null bs=1048576
53746 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53747 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53748 v2 DL+ 3:05.91 dd if=/dev/ada4 of=/dev/null bs=1048576
53749 v2 DL+ 2:51.07 dd if=/dev/ada5 of=/dev/null bs=1048576
53750 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53752 v2 DL+ 3:05.12 dd if=/dev/ada6 of=/dev/null bs=1048576
53753 v2 DL+ 2:43.50 dd if=/dev/ada7 of=/dev/null bs=1048576
53778 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53779 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53780 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53781 v2 DL+ 2:16.41 dd if=/dev/ada0 of=/dev/null bs=1048576
53782 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53783 v2 DL+ 3:19.18 dd if=/dev/ada1 of=/dev/null bs=1048576
53784 v2 DL+ 3:04.43 dd if=/dev/ada2 of=/dev/null bs=1048576
53785 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53786 v2 DL+ 2:13.91 dd if=/dev/ada3 of=/dev/null bs=1048576
53787 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53788 v2 DL+ 3:05.26 dd if=/dev/ada4 of=/dev/null bs=1048576
53789 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53790 v2 DL+ 2:22.56 dd if=/dev/ada5 of=/dev/null bs=1048576
53791 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53793 v2 DL+ 2:30.82 dd if=/dev/ada6 of=/dev/null bs=1048576
53794 v2 DL+ 3:14.48 dd if=/dev/ada7 of=/dev/null bs=1048576
53845 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53846 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53847 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53848 v2 DL+ 1:57.74 dd if=/dev/ada0 of=/dev/null bs=1048576
53849 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53850 v2 DL+ 2:01.22 dd if=/dev/ada1 of=/dev/null bs=1048576
53851 v2 DL+ 1:56.67 dd if=/dev/ada2 of=/dev/null bs=1048576
53852 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53853 v2 DL+ 1:58.52 dd if=/dev/ada3 of=/dev/null bs=1048576
53854 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53855 v2 DL+ 1:48.08 dd if=/dev/ada4 of=/dev/null bs=1048576
53856 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53857 v2 DL+ 2:23.24 dd if=/dev/ada5 of=/dev/null bs=1048576
53858 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53860 v2 DL+ 1:50.15 dd if=/dev/ada6 of=/dev/null bs=1048576
53861 v2 DL+ 2:05.94 dd if=/dev/ada7 of=/dev/null bs=1048576
53891 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53892 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53893 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53894 v2 DL+ 1:55.98 dd if=/dev/ada0 of=/dev/null bs=1048576
53895 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53896 v2 DL+ 2:01.18 dd if=/dev/ada1 of=/dev/null bs=1048576
53897 v2 DL+ 1:55.68 dd if=/dev/ada2 of=/dev/null bs=1048576
53898 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53899 v2 DL+ 1:55.96 dd if=/dev/ada3 of=/dev/null bs=1048576
53900 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53901 v2 DL+ 1:47.58 dd if=/dev/ada4 of=/dev/null bs=1048576
53902 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53903 v2 DL+ 2:08.50 dd if=/dev/ada5 of=/dev/null bs=1048576
53904 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53906 v2 DL+ 1:47.20 dd if=/dev/ada6 of=/dev/null bs=1048576
53907 v2 DL+ 2:03.74 dd if=/dev/ada7 of=/dev/null bs=1048576
53935 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53936 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53937 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53938 v2 DL+ 1:35.37 dd if=/dev/ada0 of=/dev/null bs=1048576
53939 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53940 v2 DL+ 1:00.16 dd if=/dev/ada1 of=/dev/null bs=1048576
53941 v2 DL+ 1:07.66 dd if=/dev/ada2 of=/dev/null bs=1048576
53942 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53943 v2 DL+ 1:33.60 dd if=/dev/ada3 of=/dev/null bs=1048576
53944 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53945 v2 DL+ 1:41.31 dd if=/dev/ada4 of=/dev/null bs=1048576
53946 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53947 v2 DL+ 1:12.96 dd if=/dev/ada5 of=/dev/null bs=1048576
53948 v2 I+ 0:00.00 /bin/sh - ./solnet-array-test-v2.sh
53949 v2 DL+ 1:35.27 dd if=/dev/ada6 of=/dev/null bs=1048576
53951 v2 DL+ 1:06.59 dd if=/dev/ada7 of=/dev/null bs=1048576
3976 v3 Is+ 0:00.00 /usr/libexec/getty Pc ttyv3
3977 v4 Is+ 0:00.00 /usr/libexec/getty Pc ttyv4
3978 v5 Is+ 0:00.00 /usr/libexec/getty Pc ttyv5
3979 v6 Is+ 0:00.00 /usr/libexec/getty Pc ttyv6
3980 v7 Is+ 0:00.00 /usr/libexec/getty Pc ttyv7
64118 0 Ss 0:00.01 bash
64779 0 R+ 0:00.00 ps
64780 0 S+ 0:00.00 more

Look about right?

This is the contents of tmp. Looks like there are some results files there.
Code:
.PBI.13716              sat.ada3.pspeed.out
.repl-result            sat.ada3.size.out
.smartalert             sat.ada3.sspeed.out
.sync_disk_done         sat.ada4.err
alert                   sat.ada4.out
dmisha.txt              sat.ada4.pspeed.out
firmware                sat.ada4.size.out
freenas_config.md5      sat.ada4.sspeed.out
ixdiagnose_boot.log     sat.ada5.err
nginx                   sat.ada5.out
pbi-repo.rpo            sat.ada5.pspeed.out
rc.conf.freenas         sat.ada5.size.out
sat.ada0.err            sat.ada5.sspeed.out
sat.ada0.out            sat.ada6.err
sat.ada0.pspeed.out     sat.ada6.out
sat.ada0.size.out       sat.ada6.pspeed.out
sat.ada0.sspeed.out     sat.ada6.size.out
sat.ada1.err            sat.ada6.sspeed.out
sat.ada1.out            sat.ada7.err
sat.ada1.pspeed.out     sat.ada7.out
sat.ada1.size.out       sat.ada7.pspeed.out
sat.ada1.sspeed.out     sat.ada7.size.out
sat.ada2.err            sat.ada7.sspeed.out
sat.ada2.out            sat.average.sspeed.out
sat.ada2.pspeed.out     sat.average.time
sat.ada2.size.out       sessionidmjd2d1zjvtit7djxr2jqxvvmz13r8egh
sat.ada2.sspeed.out     solnet-array-test-v2.sh
sat.ada3.err            vi.recover
sat.ada3.out

 

Wallybanger

Contributor
Joined
Apr 17, 2016
Messages
150
There may be a spot where a couple drives dropped out while the other ones didn't but I think that may be where one of the tests stopped and another one started.

drives1.jpg
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
There may be a spot where a couple drives dropped out while the other ones didn't but I think that may be where one of the tests stopped and another one started.

Yes, that'd be where the initial read test happened. This all looks fairly reasonable.
 

Wallybanger

Contributor
Joined
Apr 17, 2016
Messages
150
Cool! Well, if all goes according to plan this script should be done by about 9pm tonight. When it's done I'll run some SMART tests :)
 

Wallybanger

Contributor
Joined
Apr 17, 2016
Messages
150
This bloody thing is still going.... o_O:confused:

I took a look at your code but my noob eyes weren't able to find any mistakes. The code for the parallel test looks very similar to the seek-stress test.

How hard would it be to write in a % complete status update as the test is running? Once this pass completes I'll probably stop the script and run some SMART tests so I can test any changes you may like to try ;)
 

Wallybanger

Contributor
Joined
Apr 17, 2016
Messages
150
Alright, it finally finished. For those who may be wondering, this test took 7 days and 3 hours to complete a pass on eight 4 TB HDDs.

Unfortunately there isn't a good way to export drive reporting statistics from FreeNAS but, basically, for the seek-stress read test, the read speed started out at about 100-150 MB/s and slowly ramped down to about 15-20 MB/s before jumping back up to 100-150 MB/s, and then the test ended. If you see your drive speeds jump back up, the test is almost finished.

Also, I kinda figured the test would just keep running indefinitely but it does stop and return you to the command prompt after completing a pass.
 

droeders

Contributor
Joined
Mar 21, 2016
Messages
179
This bloody thing is still going.... o_O:confused:

I took a look at your code but my noob eyes weren't able to find any mistakes. The code for the parallel test looks very similar to the seek-stress test.

How hard would it be to write in a % complete status update as the test is running? Once this pass completes I'll probably stop the script and run some SMART tests so I can test any changes you may like to try ;)

I don't have access to a FreeBSD system ATM, but on Linux, you can see the progress of a dd command by issuing a USR1 signal to the PID of the dd process. Here's some example output from my Linux machine:

Code:
[root@raven ~]# dd if=/dev/sda of=/dev/zero &
[1] 18868
[root@raven ~]# kill -USR1 18868
[root@raven ~]# 2573833+0 records in
2573832+0 records out
1317801984 bytes (1.3 GB) copied, 10.055 s, 131 MB/s

[root@raven ~]# kill -USR1 18868
4944393+0 records in
4944392+0 records out
2531528704 bytes (2.5 GB) copied, 19.2544 s, 131 MB/s


If the BSDs have the same functionality, you could use this to see the progress of the dd commands. If not, you might be able to install the GNU dd package/port and use this instead of the BSD version.
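On the BSD question: FreeBSD's own dd prints the same kind of status report when it receives SIGINFO (what Ctrl-T sends from a terminal), so the GNU port should not be needed. One caution: as far as I know BSD dd does not handle USR1, so sending it there would likely terminate the dd instead of reporting. Here is a minimal sketch that picks the signal by OS. The function names are hypothetical and only Linux and FreeBSD are covered.

```shell
#!/bin/sh
# Pick the signal that asks dd for a progress report on a given OS.
# GNU dd (Linux) reports on SIGUSR1; FreeBSD dd reports on SIGINFO.
dd_progress_signal() {
    case "$1" in
        Linux)   echo USR1 ;;
        FreeBSD) echo INFO ;;
        *)       return 1 ;;   # unknown OS: refuse rather than guess
    esac
}

# Send the progress signal to every PID given as an argument.
dd_report_progress() {
    sig=$(dd_progress_signal "$(uname -s)") || {
        echo "unknown OS, not signalling" >&2
        return 1
    }
    for pid in "$@"; do
        kill -s "$sig" "$pid"
    done
}

# Typical use (assumes pgrep is available):
#   dd_report_progress $(pgrep -x dd)
```

The report is written to each dd's stderr, so it shows up wherever the script sent that stream, which is not necessarily the terminal you signalled from.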
 

Wallybanger

Contributor
Joined
Apr 17, 2016
Messages
150
Well that sounds useful. I'll give it a shot next time I run the script.
 