Now I'm looking forward to jgreco's seek-heavy stress test even more...
jgreco posted his script here, but that thread doesn't allow discussion.
Here are the results from the first pass (the second pass is currently running), apologies for the novel:
Code:
Selected disks: da2 da4
<HGST HUS724040ALS640 A1C4> at scbus0 target 2 lun 0 (da2,pass2)
<HGST HUS724040ALS640 A1C4> at scbus0 target 8 lun 0 (da4,pass4)
Is this correct? (y/N): y
Performing initial serial array read (baseline speeds)
Tue Oct 28 16:45:19 EDT 2014
Tue Oct 28 16:49:49 EDT 2014
Completed: initial serial array read (baseline speeds)
Array's average speed is 174.075 MB/sec per disk
Disk    Disk Size  MB/sec %ofAvg
------- ---------- ------ ------
da2      3815447MB    175    101
da4      3815447MB    173     99
Performing initial parallel array read
Tue Oct 28 16:49:49 EDT 2014
The disk da2 appears to be 3815447 MB.
Disk is reading at about 175 MB/sec
This suggests that this pass may take around 363 minutes
                   Serial Parall % of
Disk    Disk Size  MB/sec MB/sec Serial
------- ---------- ------ ------ ------
da2      3815447MB    175    175    100
da4      3815447MB    173    173    100
Awaiting completion: initial parallel array read
Wed Oct 29 00:29:26 EDT 2014
Completed: initial parallel array read
Disk's average time is 27364 seconds per disk
Disk    Bytes Transferred Seconds %ofAvg
------- ----------------- ------- ------
da2         4000787030016   27152     99
da4         4000787030016   27576    101
Performing initial parallel seek-stress array read
Wed Oct 29 00:29:26 EDT 2014
The disk da2 appears to be 3815447 MB.
Disk is reading at about 136 MB/sec
This suggests that this pass may take around 466 minutes
                   Serial Parall % of
Disk    Disk Size  MB/sec MB/sec Serial
------- ---------- ------ ------ ------
da2      3815447MB    175    136     77
da4      3815447MB    173    132     77
Awaiting completion: initial parallel seek-stress array read
Thu Oct 30 17:21:25 EDT 2014
Completed: initial parallel seek-stress array read
Disk's average time is 105024 seconds per disk
Disk    Bytes Transferred Seconds %ofAvg
------- ----------------- ------- ------
da2         4000787030016  109143    104
da4         4000787030016  100905     96
Performing pass 2 parallel array read
Thu Oct 30 17:21:25 EDT 2014
The disk da2 appears to be 3815447 MB.
Disk is reading at about 175 MB/sec
This suggests that this pass may take around 363 minutes
                   Serial Parall % of
Disk    Disk Size  MB/sec MB/sec Serial
------- ---------- ------ ------ ------
da2      3815447MB    175    175    100
da4      3815447MB    173    173    100
Awaiting completion: pass 2 parallel array read
I find it interesting that da2 is slightly faster at streaming (175 MB/s vs. 173, ~1%) but quite a bit slower at seeking (109143 s vs. 100905, ~8%!). Does that seem vaguely sane?
I'll be interested to see if the seek stress test disparity was a fluke once the second pass completes.
The only particular problems I've found with the script so far are:
1) As posted, it doesn't loop. I've tweaked my copy so that it keeps running indefinitely.
2) The initial sampling and time estimates are WAY off. It's not so bad for the streaming pass (456 minutes instead of 363, about 25% over), but for the seek-stress test something doesn't seem right: 1800+ minutes instead of the estimated 466, nearly four times the estimate!
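For anyone who wants the same looping behavior, a minimal wrapper along these lines does the trick (a sketch only — `array-test.sh` is a stand-in name for wherever you saved jgreco's script):

```shell
# Hypothetical wrapper: rerun the test script until it fails or is killed.
# "array-test.sh" is a placeholder name for jgreco's script.
pass=1
while true; do
    echo "=== pass ${pass} started $(date) ===" >> stress.log
    ./array-test.sh || break    # stop looping if the script itself errors out
    pass=$((pass + 1))
done
```

The `|| break` means a failed or missing script ends the loop instead of spinning forever; the log gives you a timestamp per pass.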
When I first ran the script, I had accidentally left "sysctl kern.geom.debugflags=0x10" set from running badblocks, and the estimate was quite a bit closer (1121 minutes):
Code:
Disk    Bytes Transferred Seconds %ofAvg
------- ----------------- ------- ------
da2         4000787030016   27154     99
da4         4000787030016   27578    101
Performing initial parallel seek-stress array read
Tue Oct 28 15:38:51 EDT 2014
The disk da2 appears to be 3815447 MB.
Disk is reading at about 57 MB/sec
This suggests that this pass may take around 1121 minutes
                   Serial Parall % of
Disk    Disk Size  MB/sec MB/sec Serial
------- ---------- ------ ------ ------
da2      3815447MB    175    138     79
da4      3815447MB    173    133     77
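For what it's worth, the estimates look like a straight size-divided-by-sampled-speed calculation (my inference from the numbers above, not from reading the script's source):

```python
def estimate_minutes(size_mb, sampled_mb_per_s):
    """Naive pass-time estimate: disk size divided by the sampled speed."""
    return size_mb / sampled_mb_per_s / 60

size_mb = 3815447  # reported size of da2

# Streaming pass, sampled at 175 MB/s -> the 363-minute figure
print(round(estimate_minutes(size_mb, 175)))  # 363

# Seek-stress pass, sampled at 136 MB/s -> roughly the 466-minute figure
# (the reported speeds are rounded, so this lands a minute or two off)
print(round(estimate_minutes(size_mb, 136)))

# Actual seek-stress time: 105024 s per disk, i.e. about 1750 minutes
print(round(105024 / 60))  # 1750
```

So the estimator is fine as arithmetic; it just assumes the initial sample speed holds for the whole pass.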
One interesting thing I have noticed is that throughput slowly drops towards the end of the disk. Even with badblocks, I got nearly 175 MB/s at the start but only about 100 MB/s at the end, consistently for each read and write pass, across both drives. That would account for the streaming estimate variance: the average of 175 and 100 is 137.5 MB/s, about 21% below the initial 175 MB/s sample, which stretches the pass time by roughly 27% and lines up with the observed 456 minutes vs. the 363-minute estimate.
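Sanity-checking that averaging argument (assuming throughput falls roughly linearly from the outer to the inner tracks):

```python
start, end = 175.0, 100.0  # MB/s at the start vs. end of the disk
avg = (start + end) / 2    # effective speed if the drop is linear
print(avg)                 # 137.5

# The average is ~21% below the 175 MB/s initial sample...
print(round((1 - avg / start) * 100, 1))  # 21.4

# ...which stretches the pass time by the reciprocal, ~27% --
# in the same ballpark as the observed overrun:
# 456 min actual / 363 min estimated is about 25.6% over.
print(round((start / avg - 1) * 100, 1))  # 27.3
```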
But I'm not sure why the seek-stress estimate is so wildly off. The disk usage charts show about 150 MB/s, presumably aggregated across the five processes per drive, and there's one odd spike up to 400 MB/s that settles back down over the course of about six hours. Maybe ARC is interfering? The two drives are a mirrored vdev, but they're not in use (no scrubbing or anything else).
On the plus side, there have been no kernel warnings about connectivity and no errors at the SAS PHY layer.