So you're asking what schedule to set for scrubs and SMART testing. Well, here are a few tidbits of info:
1. Scrubs are your "regular" maintenance for zpools. They can take a few minutes to a few days depending on the size of your pool, the performance of your pool, your pool's data storage history, the performance of your system as a whole, and the workload placed on your pool during the scrub.
2. SMART tests are internal drive tests. There are no fixed 'criteria' for what is or isn't done on a particular test. No doubt each manufacturer has their own specifications for what a "short test" and a "long test" entail. Generally, short tests take less than 5 minutes and long tests take hours. Long tests usually read the entire disk surface to check for errors, while short tests do a very simple and quick check.
3. Don't try to schedule SMART tests at the same time as scrubs. It doesn't end well. Your disk can't do a scrub, a SMART test, and handle regular tasks at the same time very well.
4. SMART tests are non-destructive, so you can run them as often as you want. But you can only run one test at a time per disk (duh?).
5. SMART tests do not report a final result back to you on their own. If you've set up your FreeNAS box properly it will email you if a SMART test fails. So no email means everything is good.
6. Your average disk will store the last 20 or so test results. So if you do tests at a very high frequency and one test fails it may be removed from the log before you can even examine it closely.
7. You can blindly steal my schedule or come up with your own. Since this is for a home server and I'm the only user I don't worry about any performance penalty since my pool performs more than adequately even during a scrub.
8. Scrubs are pretty hard on disks. So scheduling them at a frequency that makes you comfortable with your pool is important.
9. Do not confuse SMART monitoring with SMART testing. One actively runs tests; the other only watches for errors the drive finds through regular use.
10. If you are running SSDs, these tests are almost pointless. Do them if you want, but they're not really functioning in the capacity that you'd expect. Both short and long tests typically take seconds to complete for most brands of SSD. So it's obvious that a long test doesn't actually read every memory cell looking for errors.
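To put point 6 in numbers, here is a rough Python sketch. It assumes the log holds exactly 20 entries (the "20 or so" above; the real size depends on your drive) and that every test adds one entry. The function name is made up for this example and is not part of any tool.

```python
def days_before_overwrite(log_entries: int, tests_per_day: float) -> float:
    """Rough number of days a result survives before newer tests push it out."""
    if tests_per_day <= 0:
        raise ValueError("tests_per_day must be positive")
    return log_entries / tests_per_day


LOG_ENTRIES = 20  # assumption: "20 or so" from the post; check your own drive

# One test per day: a failed result stays visible for roughly 20 days.
print(days_before_overwrite(LOG_ENTRIES, 1))

# One test per hour: the same failed result is gone in under a day.
print(days_before_overwrite(LOG_ENTRIES, 24))
```

The slower you test, the longer a failure stays around to be looked at, which is the trade-off point 6 is getting at.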
So here's my schedule:
SCRUBS: 1st and 15th of the month at 4am. Threshold is set to 10 days.
SHORT SMART TEST: Every 5th, 12th, 19th, and 26th of the month at 3am.
LONG SMART TEST: Every 8th and 22nd at 4am.
SHORT SMART TEST alternate: Some people do every odd or even day and choose a time where it would never interfere with a scrub or long test. For example, my scrubs and long tests are scheduled for 4am. So if I do 3am for short tests I could theoretically do one every single day and not have any conflict in the schedule, since the test takes 2 minutes.
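If you want to sanity-check a schedule like the one above before typing it into the GUI, something like this Python sketch would do. The dictionary layout and the function name are my own invention, not anything FreeNAS uses. It only compares days of the month and start hours, so it knows nothing about how long a scrub or long test actually runs.

```python
from itertools import combinations

# The schedule from this post: days of the month and start hour (24h clock).
SCHEDULE = {
    "scrub": {"days": {1, 15}, "hour": 4},
    "short": {"days": {5, 12, 19, 26}, "hour": 3},
    "long": {"days": {8, 22}, "hour": 4},
}


def find_clashes(schedule, same_hour_only=False):
    """Return (job_a, job_b, shared_days) for every pair of jobs sharing a day."""
    clashes = []
    for (name_a, a), (name_b, b) in combinations(schedule.items(), 2):
        if same_hour_only and a["hour"] != b["hour"]:
            continue
        shared = sorted(a["days"] & b["days"])
        if shared:
            clashes.append((name_a, name_b, shared))
    return clashes


# The schedule as posted: no two jobs share a day.
print(find_clashes(SCHEDULE))

# The "alternate" idea: a short test every day up to the 27th, still at 3am.
daily = dict(SCHEDULE, short={"days": set(range(1, 28)), "hour": 3})
print(find_clashes(daily))                       # shares days with scrub and long
print(find_clashes(daily, same_hour_only=True))  # but never the same start hour
```

A scrub that runs longer than a day can still overlap with whatever is scheduled next, so treat an empty result as "no obvious clash", not a guarantee.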
If you look at my schedule, I never schedule anything on or after the 28th. This is because months have different numbers of days: anything scheduled for the 29th, 30th, or 31st gets skipped in the months that don't have that day. So instead of trying to deal with it I simply don't schedule anything at the end of the month. Yes, this means that between the 26th of one month and the 1st of the next I don't really do any tests. But to be frank, if you are expecting things to go horribly wrong because you didn't do a test for 5 days, you've got bigger problems and should reconsider your design.
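To see which days of the month actually get skipped, here is a small Python sketch using the standard `calendar` module. The function name is just for illustration.

```python
import calendar


def months_without_day(day: int, year: int) -> list[int]:
    """Return the month numbers (1-12) of `year` that have no such day."""
    return [
        month
        for month in range(1, 13)
        # monthrange() returns (weekday of the 1st, number of days in the month)
        if calendar.monthrange(year, month)[1] < day
    ]


for day in (28, 29, 30, 31):
    print(day, months_without_day(day, 2023))
```

By this count the 28th itself exists in every month; stopping before it just leaves a little extra margin.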
There is no right or wrong schedule. If you want to do scrubs every single day you can. It's a bit excessive in my opinion. It may also cause premature failure of your disks because of the extra wear and tear.
If you want to see how a test is doing, the appropriate command is something like:
# smartctl -a /dev/da1
The output is quite long, but a few sections are useful:
Code:
Short self-test routine recommended polling time:      (   2) minutes.
Extended self-test routine recommended polling time:   ( 482) minutes.
Conveyance self-test routine recommended polling time: (   5) minutes.
That is how long a Short, Long, and Conveyance test is estimated to take if the disk is completely idle.
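If you'd rather pull those numbers out with a script than read them by eye, a Python sketch along these lines should work. It assumes you have already saved the `smartctl -a` output into a string; the regex is written against the lines shown above and tolerates the line breaks that real output puts after "routine". The function name is hypothetical.

```python
import re

# Matches e.g. "Short self-test routine recommended polling time: ( 2) minutes."
POLLING_TIME = re.compile(
    r"(Short|Extended|Conveyance) self-test routine\s+"
    r"recommended polling time:\s*\(\s*(\d+)\)\s*minutes"
)


def polling_minutes(smartctl_output: str) -> dict[str, int]:
    """Map each test type found in the output to its estimated minutes."""
    return {name: int(mins) for name, mins in POLLING_TIME.findall(smartctl_output)}


sample = (
    "Short self-test routine recommended polling time: ( 2) minutes. "
    "Extended self-test routine recommended polling time: ( 482) minutes. "
    "Conveyance self-test routine recommended polling time: ( 5) minutes."
)
print(polling_minutes(sample))
```

Not every drive reports all three test types, so a missing key just means that line wasn't in the output.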
Code:
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     32310         -
# 2  Extended offline    Completed without error       00%     32270         -
# 3  Short offline       Completed without error       00%     32262         -
# 4  Short offline       Completed without error       00%     32214         -
# 5  Short offline       Completed without error       00%     32166         -
...
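This tells you which tests have completed and when. The LifeTime(hours) column is the drive's power-on hour count when the test ran; the current power-on hours are in the previous section of the output (the SMART attributes), so you can work out how long ago each test was. If a test is in progress it may be listed above. Some brands will not actually provide an entry until the test completes or fails.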
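If you don't trust the email alerts, the log above is easy enough to check with a script. Here is a Python sketch that assumes the column layout shown above, with two or more spaces between the test description and the status. The failing entry in the sample is made up for illustration and is not from my drive, and the function names are my own.

```python
import re

# One log row: "# 1  Short offline  Completed without error  00%  32310  -"
LOG_LINE = re.compile(
    r"^#\s*(?P<num>\d+)\s+(?P<test>.+?)\s{2,}(?P<status>.+?)\s+"
    r"(?P<remaining>\d+)%\s+(?P<hours>\d+)\s+(?P<lba>\S+)",
    re.MULTILINE,
)


def parse_selftest_log(text: str) -> list[dict]:
    """Turn each '# N ...' row of the self-test log into a dict."""
    entries = []
    for m in LOG_LINE.finditer(text):
        entry = m.groupdict()
        entry["num"] = int(entry["num"])
        entry["hours"] = int(entry["hours"])
        entries.append(entry)
    return entries


def not_clean(entries: list[dict]) -> list[dict]:
    """Entries whose status is anything other than a clean pass."""
    return [e for e in entries if e["status"] != "Completed without error"]


sample = """\
# 1  Short offline       Completed without error       00%     32310         -
# 2  Extended offline    Completed: read failure       90%     32270         123456789
# 3  Short offline       Completed without error       00%     32262         -
"""  # entry 2 is a hypothetical failure, invented for this example

for entry in not_clean(parse_selftest_log(sample)):
    print(entry["num"], entry["test"], entry["status"], entry["hours"])
```

Note that an aborted or still-running test would also show up in `not_clean`, so read the status text before panicking.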
If your drive fails a test it usually qualifies for an RMA.
No doubt others will provide their configurations.
Good luck and happy storing!