Back in the late '90s, I was managing a bunch of large whitebox storage servers. The largest of these was a massive storage server I had the pleasure of building and deploying: 8 shelves of 9 drives each, Seagate ST173404LW 73GB drives, a whopping 5TB ... (*grin*)
Part of the problem was burning in these systems, so I devised some shell scripty stuff that the hardware techs could use. I've become convinced that a variation on this would be helpful in the FreeNAS community, so I'm playing with a stripped-down version that (as of this writing) does some basic disk read testing. It is suitable for testing and burn-in use.
I've included just two main passes: a parallel read pass, and a parallel read pass with multiple accesses per disk. The script does some rudimentary performance analysis and points out possible issues. It needs more work, but here it is anyway. This script is expected to be safe to run on a live pool, even though that's not a good idea for performance testing purposes. As with anything you download onto your machine, you are expected to verify its safety to your own satisfaction. Note that the only commands that touch the disks are "dd" invocations, and they're all structured as "dd if=/dev/${disk}" (that is, the disks are only ever read).
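To make that concrete, here is a minimal sketch (not the script itself; the disk list here is just an example, where the real script builds it from camcontrol) of what a parallel read-only pass boils down to:

Code:
#!/bin/sh
# Sketch of a parallel read pass. The disk names are examples only.
for disk in da3 da4 da5; do
        # Read-only: each disk is opened strictly as dd's input ("if=").
        dd if=/dev/${disk} of=/dev/null bs=1048576 &
done
wait    # block until every background dd finishes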
Link to the original version of the script. (Note that Xenforo breaks the ftp: link; do the obvious fix and change the http://ftp// bit back to ftp://.)
Link to the current, SCALE-compatible version (December 2022).
To run it, download it onto a FreeNAS box and execute it as root.
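For example, if you saved it as solnet-array-test.sh (the filename here is hypothetical; use whatever you actually saved it as):

Code:
# Run as root; invoking via "sh" means you don't need to chmod +x it.
sh solnet-array-test.sh

It will give you a simple menu: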
Code:
sol.net disk array test v3

1) Use all disks (from camcontrol)
2) Use selected disks (from camcontrol|grep)
3) Specify disks
4) Show camcontrol list

Option:
You probably want to look at the disk list (option 4), then pick your target disks with option 2 and an appropriate pattern. For a Seagate ST4000DM000, for example, you could select "ST4000".
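If you want to preview what a pattern will match before handing it to the script, option 2 is essentially the same as running the following by hand on CORE/FreeBSD (the pattern here is just an example):

Code:
camcontrol devlist | grep ST4000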
The test will run a variety of things and report status. It takes a while; be patient. It never terminates on its own, as it is intended as a burn-in aid, but you do want to let it run for at least a pass or two to get an idea of how your system performs.
It is best to do this while the system is not busy, and preferably before a pool is up and running. That said, it should be safe to use even on a busy filer. I've picked on a busy filer here to give an example of how this looks. Note that da14 is a spare drive, and you'll notice that all the other drives test much slower (because they're in use). Also note that my numbers here come from a testing mode that doesn't have the script read the entire disk; real results would look a bit different and take forever.
Code:
sol.net disk array test v3

1) Use all disks (from camcontrol)
2) Use selected disks (from camcontrol|grep)
3) Specify disks
4) Show camcontrol list

Option: 2

Enter grep match pattern (e.g. ST150176): ST4

Selected disks: da3 da4 da5 da6 da7 da8 da9 da10 da11 da12 da13 da14
<ATA ST4000DM000-1F21 CC52>  at scbus3 target 44 lun 0 (da3,pass5)
<ATA ST4000DM000-1F21 CC52>  at scbus3 target 45 lun 0 (da4,pass6)
<ATA ST4000DM000-1F21 CC52>  at scbus3 target 46 lun 0 (da5,pass7)
<ATA ST4000DM000-1F21 CC51>  at scbus3 target 47 lun 0 (da6,pass8)
<ATA ST4000DM000-1F21 CC51>  at scbus3 target 48 lun 0 (da7,pass9)
<ATA ST4000DM000-1F21 CC51>  at scbus3 target 49 lun 0 (da8,pass10)
<ATA ST4000DM000-1F21 CC52>  at scbus3 target 50 lun 0 (da9,pass11)
<ATA ST4000DM000-1F21 CC51>  at scbus3 target 51 lun 0 (da10,pass12)
<ATA ST4000DM000-1F21 CC52>  at scbus3 target 52 lun 0 (da11,pass13)
<ATA ST4000DM000-1F21 CC52>  at scbus3 target 53 lun 0 (da12,pass14)
<ATA ST4000DM000-1F21 CC52>  at scbus3 target 54 lun 0 (da13,pass15)
<ATA ST4000DM000-1F21 CC52>  at scbus3 target 55 lun 0 (da14,pass16)
Is this correct? (y/N): y

Performing initial serial array read (baseline speeds)
Tue Oct 21 08:21:23 CDT 2014
Tue Oct 21 08:26:47 CDT 2014
Completed: initial serial array read (baseline speeds)

Array's average speed is 97.6883 MB/sec per disk

Disk    Disk Size  MB/sec %ofAvg
------- ---------- ------ ------
da3     3815447MB      98    100
da4     3815447MB      90     92
da5     3815447MB      98    100
da6     3815447MB      97     99
da7     3815447MB      95     97
da8     3815447MB      82     84 --SLOW--
da9     3815447MB      87     89 --SLOW--
da10    3815447MB      84     86 --SLOW--
da11    3815447MB      97     99
da12    3815447MB      92     94
da13    3815447MB     102    104
da14    3815447MB     151    155 ++FAST++

Performing initial parallel array read
Tue Oct 21 08:26:47 CDT 2014
The disk da3 appears to be 3815447 MB.
Disk is reading at about 74 MB/sec
This suggests that this pass may take around 860 minutes

                   Serial Parall % of
Disk    Disk Size  MB/sec MB/sec Serial
------- ---------- ------ ------ ------
da3     3815447MB      98     86     88 --SLOW--
da4     3815447MB      90     74     82 --SLOW--
da5     3815447MB      98     82     84 --SLOW--
da6     3815447MB      97     91     95
da7     3815447MB      95     72     76 --SLOW--
da8     3815447MB      82     80     97
da9     3815447MB      87     84     96
da10    3815447MB      84    111    133 ++FAST++
da11    3815447MB      97    120    124 ++FAST++
da12    3815447MB      92    116    126 ++FAST++
da13    3815447MB     102    123    121 ++FAST++
da14    3815447MB     151    144     95

Awaiting completion: initial parallel array read
Tue Oct 21 08:39:32 CDT 2014
Completed: initial parallel array read

Disk's average time is 741 seconds per disk

Disk    Bytes Transferred Seconds %ofAvg
------- ----------------- ------- ------
da3          104857600000     743    100
da4          104857600000     764    103
da5          104857600000     752    101
da6          104857600000     737     99
da7          104857600000     748    101
da8          104857600000     754    102
da9          104857600000     738    100
da10         104857600000     762    103
da11         104857600000     748    101
da12         104857600000     756    102
da13         104857600000     740    100
da14         104857600000     653     88 ++FAST++

Performing initial parallel seek-stress array read
Tue Oct 21 08:39:32 CDT 2014
The disk da3 appears to be 3815447 MB.
Disk is reading at about 58 MB/sec
This suggests that this pass may take around 1093 minutes

                   Serial Parall % of
Disk    Disk Size  MB/sec MB/sec Serial
------- ---------- ------ ------ ------
da3     3815447MB      98     52     53
da4     3815447MB      90     48     53
da5     3815447MB      98     50     51
da6     3815447MB      97     50     52
da7     3815447MB      95     48     50
da8     3815447MB      82     48     59
da9     3815447MB      87     54     62
da10    3815447MB      84     47     56
da11    3815447MB      97     49     50
da12    3815447MB      92     50     55
da13    3815447MB     102     49     48
da14    3815447MB     151     52     34

Awaiting completion: initial parallel seek-stress array read
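For what it's worth, the --SLOW--/++FAST++ flags are nothing magic: each disk is simply compared against the average for the group. Here's a rough sketch of that sort of analysis in awk (the 90%/110% thresholds are my assumption for illustration, not necessarily what the script uses):

Code:
# speeds.txt holds "disk MB/sec" pairs, one per line, e.g. "da3 98".
awk '{ disk[NR] = $1; speed[NR] = $2; total += $2 }
END {
    avg = total / NR
    for (i = 1; i <= NR; i++) {
        pct = speed[i] / avg * 100
        flag = (pct < 90) ? "--SLOW--" : (pct > 110) ? "++FAST++" : ""
        printf "%-7s %6d %6d %s\n", disk[i], speed[i], pct, flag
    }
}' speeds.txt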