Hard drive best practices (preemptively replace due to old age?)

freenas-supero

Contributor
Joined
Jul 27, 2014
Messages
128
Hello,

Every Sunday I get a weekly email report of my FreeNAS drives' SMART status. This week I realized that the drives are getting really old. The last drive replacement must have been about a year ago, as shown by the SMART status below:

Code:
+------+------------------+----+-----+-----+-----+-------+-------+--------+------+----------+------+-------+----+
|Device|Serial            |Temp|Power|Start|Spin |ReAlloc|Current|Offline |Seek  |Total     |High  |Command|Last|
|      |Number            |    |On   |Stop |Retry|Sectors|Pending|Uncorrec|Errors|Seeks     |Fly   |Timeout|Test|
|      |                  |    |Hours|Count|Count|       |Sectors|Sectors |      |          |Writes|Count  |Age |
+------+------------------+----+-----+-----+-----+-------+-------+--------+------+----------+------+-------+----+
|da0 ? |ML0221F306AUSD    | 29 |82774|  280|    0|      0|      0|       0|   N/A|       N/A|   N/A|    N/A|2735|
|da1 ? |S1E1RH1L          | 31 |66031|  102|    0|      0|      0|       0|     4| 567747618|   135|      0|2735|
|da2 ? |5YD9RJXW          | 29 |66514|  122|    0|      0|      0|       0|     0| 730738327|     0|8590065666|2735|
|da3 ? |ML0220F31B18RD    | 31 |66424|  127|    0|      0|      0|       0|   N/A|       N/A|   N/A|    N/A|2735|
|da4 ? |ML4220F318UPDK    | 29 |66643|  145|    0|      0|      0|       0|   N/A|       N/A|   N/A|    N/A|2735|
|da5 ? |PK2234P8JYHX5Y    | 35 |50763|   27|    0|      0|      0|       0|   N/A|       N/A|   N/A|    N/A|   4|
|da6 ? |PK1234P8K1GMLP    | 34 |37135|   22|    0|      0|      0|       0|   N/A|       N/A|   N/A|    N/A|   4|
|da7 ? |WD-WCC7K3KL0CV0   | 30 | 7764|   12|    0|      0|      0|       0|   N/A|       N/A|   N/A|    N/A| 321|
+------+------------------+----+-----+-----+-----+-------+-------+--------+------+----------+------+-------+----+


As you can see, most of the drives are over 5 years old, with a chunk of them almost 8 years old! Normally I'd simply wait for a drive to go bad, swap in a burned-in one I keep as a cold spare, and resilver, but with the drives all being pretty old (except da7) I'm scared several will fail during resilvering, resulting in a lost pool.
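As a rough cross-check of the ages mentioned above, power-on hours can be converted to years of continuous running. This is only a sketch: it assumes the drives were never spun down, and the `HOURS_PER_YEAR` constant and the `hours_to_years` helper are names made up for illustration. The hours are copied from the report.

```python
HOURS_PER_YEAR = 24 * 365.25  # 8766 hours, averaging in leap years

# Power-on hours copied from the SMART report above, keyed by device name.
power_on_hours = {
    "da0": 82774, "da1": 66031, "da2": 66514, "da3": 66424,
    "da4": 66643, "da5": 50763, "da6": 37135, "da7": 7764,
}


def hours_to_years(hours):
    """Convert power-on hours to years of continuous operation."""
    return hours / HOURS_PER_YEAR


for device, hours in power_on_hours.items():
    print(f"{device}: {hours_to_years(hours):.1f} years")
```

By this measure da0 works out to roughly 9.4 years and da1 through da4 to about 7.5 years each.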

I run RAID-Z3, so technically I can afford to lose 3 drives out of 8, but Murphy's law...

Do you guys simply replace your old drives with new ones when they reach such an advanced age? Of course I have many backups (onsite & offsite), but I cannot back everything up due to lack of space. If I were to lose the pool, I'd lose all my music and movies. Not the end of the world, but still...

Looking for some of your experiences!
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
I usually start looking sideways at anything older than 48k hours, but I don't allow spin-down / power save. I've simply never run a drive to 83k hours. They either fail, or get preemptively or opportunistically (black Friday sale...) replaced before then.
 

QonoS

Explorer
Joined
Apr 1, 2021
Messages
87
  1. Check what Backblaze reported on the same HDD models you use. If a certain model tends to die more often, that would be a candidate for early replacement. See here: https://www.backblaze.com/b2/hard-drive-test-data.html
  2. I would run extended SMART tests and look at the attributes of these drives, to base any decision on more data.
In my experience old drives are not necessarily more likely to fail. Backblaze's reports also showed that. Once HDDs get to a certain age they usually just run and run.
BUT be aware that turning off really old hardware (I mean 5+ years) might also mean that it won't turn on again afterwards. I had that with switches, HDDs, and other enterprise-grade stuff that was well past its estimated lifetime.
 

freenas-supero

Contributor
Joined
Jul 27, 2014
Messages
128
I also read a lot about HDD reliability a while ago, and the folks on the older FreeNAS forum discussed this at length. If I remember correctly, most deaths are either premature (infancy) or from old age... Kinda like us humans ;)

That's why I burn-in test any new drives coming in. And of course backups, backups. I'm just amazed I could be so negligent and let drives climb up to almost 90k hours when I have nightmares about losing my data....

I just ordered some WD Red Plus drives (the CMR models) and will swap the oldest drives preemptively, but really, when you think about it, maybe I will end up making things worse by swapping an old but reliable drive for a new one that will last only a short time.... After all, most of the old drives in this machine are Hitachis or HGSTs. They are tanks.

That's why I posted here, to get some real-life feedback. @rvassar says he replaces around 48k hours. What about you @QonoS ? What about the others reading this thread?

Stats from studies are nice to understand the overall picture but real life experience is much better IMO.
 

QonoS

Explorer
Joined
Apr 1, 2021
Messages
87
Well I do buy HDDs with warranty in mind and keep them for warranty time minus about 6 months. Then I buy new drives with new warranty, migrate my data and later sell the old HDDs.

So with a 5-year warranty, an HDD would run for at most about 40,000 h in my case.
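The arithmetic behind that figure can be sketched as follows. It assumes the drive runs 24/7 for the whole period; `retirement_hours` and its parameter names are made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365.25  # 8766 hours, averaging in leap years


def retirement_hours(warranty_years, margin_months=6):
    """Power-on hours reached if a drive runs 24/7 until
    `margin_months` before its warranty expires."""
    return (warranty_years - margin_months / 12) * HOURS_PER_YEAR


print(round(retirement_hours(5)))  # 4.5 years of continuous running
```

For a 5-year warranty this gives 39,447 hours, which rounds to the "about 40,000 h" quoted above.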
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
After all, most of the old drives in this machine are Hitachis or HGSTs. They are tanks.

I miss the old HGSTs; mechanically they really were tanks. Disclaimer: I was an HGST employee for a short period of time, 5+ years ago...

@rvassar says he replaces around 48k hours.

You misunderstood me. By "looking sideways" I meant I start distrusting them and I start watching their status more closely. Are they making bearing noises, do they have reallocated sectors, post seek errors, do they run hotter than they did at 30k hours, things like that.

My goal is to keep my data safe, and not spend a lot of $$ on drives. So I manage the drive age in my pools to avoid the situation you're now facing, where you have more drives in your pool above a certain age than you can have fail and maintain redundancy. By casually buying and replacing a drive here and there, taking advantage of sales, I spend less money at any single time, and have several older but "burned in" spares sitting in a drawer that I can use in a pinch. I avoid having a pool with uniform drive age, as that sets me up to replace multiple components at once.
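Part of the "watching more closely" described above can be automated. The sketch below scans the attribute table printed by `smartctl -A` for the usual trouble signs. The watched-attribute list and the 40 C limit are assumptions rather than official thresholds, and `parse_raw_values` and `find_warnings` are hypothetical helper names. The sample rows are copied from the da0 output posted in this thread.

```python
# Attributes often watched for early media trouble (an assumed short list):
# a non-zero raw value deserves a closer look.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}
TEMPERATURE_ID = 194
MAX_TEMP_C = 40  # hypothetical limit; pick one that suits your chassis

SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0002   200   200   000    Old_age   Always       -       30 (Min/Max 17/41)
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
"""


def parse_raw_values(text):
    """Map attribute ID -> leading integer of RAW_VALUE for each table row."""
    raw = {}
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows have at least 10 columns and start with a numeric ID;
        # rows whose raw value is not a plain integer are skipped.
        if len(parts) >= 10 and parts[0].isdigit() and parts[9].isdigit():
            raw[int(parts[0])] = int(parts[9])
    return raw


def find_warnings(text, max_temp_c=MAX_TEMP_C):
    """Return human-readable warnings for one drive's attribute table."""
    raw = parse_raw_values(text)
    found = [f"{name} = {raw[attr_id]}"
             for attr_id, name in WATCHED.items() if raw.get(attr_id, 0) > 0]
    if raw.get(TEMPERATURE_ID, 0) > max_temp_c:
        found.append(f"temperature {raw[TEMPERATURE_ID]} C exceeds {max_temp_c} C")
    return found


print(find_warnings(SAMPLE))  # no warnings expected for the clean da0 rows
```

In practice the text would come from running `smartctl -A /dev/daN` for each drive. Bearing noise obviously still needs ears.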
 

HarambeLives

Contributor
Joined
Jul 19, 2021
Messages
153
Out of interest, how did you get that chart emailed to you? I would love to do that

With regard to the drives: I don't prefer one brand over another, I just run drives until they die. All drives die, and the more time I get before replacing them, the cheaper they were to purchase.

If losing a drive makes you antsy, it's my opinion that your array and/or backups are not configured correctly. Losing a drive will happen to everyone, and should be expected.

I'd keep running those drives until they actually fail
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Looks like 4 of your disks haven't been tested in 2,735 days... that's over 7 years!

I urgently recommend you schedule regular SMART tests of your drives. On FreeNAS 11.2-U8 you'll find this on the "Tasks->S.M.A.R.T. Tests" menu.

I run short tests every day at 11:00PM and a long test every Sunday at 2:00AM.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Out of interest, how did you get that chart emailed to you? I would love to do that
@HarambeLives - the smart_report.sh script from my FreeNAS script GitHub repository will email you such a report. You'll need to edit the script to specify your email address, and schedule a cron job to run the script. It's located here:
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,828
BUT be aware that turning off really old hardware (i mean 5+ years) might also mean that it won't turn on after that. I had that with Switches, HDDs, and other enterprise grade stuff that was well over its estimated lifetime.
In my limited experience, such failures usually point to a bad power supply, either external or internal, usually related to a bad electrolytic capacitor.

Once the electrolyte boils off (look for domed caps), you’re done until the capacitor is replaced. I’ve done this to Apple and Netgear stuff and sometimes modify enclosures for better air flow to reduce the ambient temps inside also.
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
What about the others reading this thread?

I replace disks when they fail and keep several burned in spares on hand.

Code:
+------+------------------------+----+------+-----+-----+-------+-------+--------+------+----------+------+-----------+----+
|Device|Serial                  |Temp| Power|Start|Spin |ReAlloc|Current|Offline |Seek  |Total     |High  |    Command|Last|
|      |Number                  |    | On   |Stop |Retry|Sectors|Pending|Uncorrec|Errors|Seeks     |Fly   |    Timeout|Test|
|      |                        |    | Hours|Count|Count|       |Sectors|Sectors |      |          |Writes|    Count  |Age |
+------+------------------------+----+------+-----+-----+-------+-------+--------+------+----------+------+-----------+----+
|da0   |WD-WCC4N0853499         |27  | 58050|   60|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da1   |WD-WCC4N22A7P22         |28  | 58049|   58|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da2   |WD-WCC4N76L09H6         |28  | 58049|   58|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da3   |WD-WCC4N76L0P4U         |27  | 47612|   50|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da4   |WD-WCC4NJTT0ASX         |27  | 51361|   51|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da5   |WD-WCC4N41KS3DY         |27  | 58049|   57|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da6   |WD-WCC4N76L0KD1         |27  | 56833|   57|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da7   |WD-WMC4N0D2NU3F         |27  | 57485|   81|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da8   |WD-WMC4N2212892         |27  | 60149|  170|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da9   |WD-WMC4N0F5JMAE         |28  | 31576|   52|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da10  |WD-WCC7K1AXE2FS         |27  | 11418|   22|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da11  |WD-WCC7K7DDT77L         |27  | 19007|   28|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da12  |WD-WCC4N1NDU179         |27  | 51198|   47|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da13  |WD-WMC4N0381307         |27  | 68886|  115|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da14  |WD-WMC4N0339366         |28  | 68780|  113|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da15  |WD-WMC4N1717149         |28  | 52264|   58|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da16  |WD-WMC4N0531033         |27  | 68767|  111|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da17  |WD-WMC4N0339865         |28  | 68767|  111|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da18  |WD-WMC4N0324087         |28  | 68780|  113|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da19  |WD-WMC4N0477862         |28  | 68780|  113|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da20  |WD-WMC4N0339000         |28  | 68780|  113|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da21  |WD-WCC4NPRDD2LD         |27  | 35376|   30|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da22  |WD-WCC4N0997839         |27  | 45532|  160|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|da23  |WD-WCC4N4VPSUC2         |28  | 12980|   32|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   1|
|ada0  |S21TNXAG619602Z         |28  | 50359|     |     |      0|       |        |   N/A|       N/A|   N/A|        N/A|   1|
+------+------------------------+----+------+-----+-----+-------+-------+--------+------+----------+------+-----------+----+
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,828
I also plan to run my drives into the ground. I bought them used so they are unlikely to fail simultaneously.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,974
Looks like 4 of your disks haven't been tested in 2,735 days... that's over 7 years!
Yea, that stood out like a sore thumb immediately to me too.

I think there has been some good advice here on when you might replace a drive. I think it comes down to personal preference. You have some drives with very long running times, but they also haven't been SMART tested in a very long time; maybe this will change everything after you run a SMART test on them. I'm curious when you last ran a scrub. If it was left at default settings then you should be fine.
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
I suspect the 2735 days without logging a smart test is because there is a read write error in the drive's smart log itself. Which would prompt me to replace the drive as you wouldn't get a warning before it just flat-out dies. @freenas-supero when you have time, could you please post the smart info from one of those 2735 issue drives in code tags?
 

Alecmascot

Guru
Joined
Mar 18, 2014
Messages
1,175
Looks like 4 of your disks haven't been tested in 2,735 days... that's over 7 years!
There's a bug in the test timing when the Power On Hours is greater than 64K.
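A sketch of why 64K matters, assuming (as far as I know) that the drive's self-test log stores its LifeTime(hours) stamp in a 16-bit field, so the logged value wraps back to 0 after 65,535 hours while Power_On_Hours keeps counting. The numbers are taken from the `smartctl -a /dev/da0` output posted in this thread (83427 power-on hours, latest logged test at 17584). The function names are made up for illustration.

```python
WRAP = 1 << 16  # 65536: size of the assumed 16-bit hour counter in the log


def naive_hours_since_test(power_on_hours, logged_test_hours):
    """What a report computes if it trusts the logged stamp as-is."""
    return power_on_hours - logged_test_hours


def corrected_hours_since_test(power_on_hours, logged_test_hours):
    """Undo the wrap, assuming the test ran within the last 65536 hours."""
    return (power_on_hours - logged_test_hours) % WRAP


naive = naive_hours_since_test(83427, 17584)
fixed = corrected_hours_since_test(83427, 17584)
print(f"naive:     {naive} h = {naive / 24:.0f} days")
print(f"corrected: {fixed} h = {fixed / 24:.1f} days")
```

With the wrap undone, the most recent test on da0 would have run 307 hours (about 13 days) before the smartctl output was taken, not thousands of days. The correction cannot tell a recent test apart from one that really is more than 65,536 hours old, so it is only a plausibility check.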
 

freenas-supero

Contributor
Joined
Jul 27, 2014
Messages
128
I suspect the 2735 days without logging a smart test is because there is a read write error in the drive's smart log itself. Which would prompt me to replace the drive as you wouldn't get a warning before it just flat-out dies. @freenas-supero when you have time, could you please post the smart info from one of those 2735 issue drives in code tags?

See screenshots for the SMART tests. Of course I have been running SMART tests periodically since this server was built, around 2014. Longs are done monthly on the 1st at 2AM. Shorts are done daily at 10AM. I hope my schedules are good, because it's always a bit of a puzzle with cron... If anybody spots something wrong with my settings, please let me know! :oops:

For the SCRUBs, they're done roughly twice a month (every 2 weeks/14 days). See screenshot.

Here's the output of smartctl -a for da0:

Code:
root@freenas:~ # smartctl -a /dev/da0
smartctl 7.0 2018-12-30 r4883 [FreeBSD 11.3-RELEASE-p14 amd64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Hitachi Deskstar 5K3000
Device Model:     Hitachi HDS5C3020ALA632
Serial Number:    ML0221F306AUSD
LU WWN Device Id: 5 000cca 369c2e2e2
Firmware Version: ML6OA180
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    5940 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Aug  8 21:46:40 2021 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)    Offline data collection activity
                    was suspended by an interrupting command from host.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (23475) seconds.
Offline data collection
capabilities:              (0x5b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   1) minutes.
Extended self-test routine
recommended polling time:      ( 392) minutes.
SCT capabilities:            (0x003d)    SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   134   134   054    Pre-fail  Offline      -       100
  3 Spin_Up_Time            0x0007   135   135   024    Pre-fail  Always       -       405 (Average 407)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       281
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   148   148   020    Pre-fail  Offline      -       28
  9 Power_On_Hours          0x0012   089   089   000    Old_age   Always       -       83427
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       280
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1198
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       1198
194 Temperature_Celsius     0x0002   200   200   000    Old_age   Always       -       30 (Min/Max 17/41)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     17584         -
# 2  Extended offline    Completed without error       00%     17470         -
# 3  Short offline       Completed without error       00%     17248         -
# 4  Extended offline    Completed without error       00%     17135         -
# 5  Short offline       Completed without error       00%     16868         -
# 6  Extended offline    Completed without error       00%     16754         -
# 7  Short offline       Completed without error       00%     16532         -
# 8  Extended offline    Completed without error       00%     16419         -
# 9  Short offline       Completed without error       00%     16125         -
#10  Extended offline    Completed without error       00%     16011         -
#11  Short offline       Completed without error       00%     15790         -
#12  Short offline       Completed without error       00%     15496         -
#13  Extended offline    Completed without error       00%     15382         -
#14  Short offline       Completed without error       00%     15160         -
#15  Extended offline    Completed without error       00%     15050         -
#16  Short offline       Completed without error       00%     14756         -
#17  Extended offline    Completed without error       00%     14642         -
#18  Short offline       Completed without error       00%     14421         -
#19  Extended offline    Completed without error       00%     14307         -
#20  Short offline       Completed without error       00%     14085         -
#21  Extended offline    Completed without error       00%     13971         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
 

Attachments

  • long.png
  • short.png
  • scrub.png

rvassar

Guru
Joined
May 2, 2018
Messages
972
Longs are done monthly on 1st day at 2AM. Shorts are done daily at 10AM. See screenshots.
Code:
  9 Power_On_Hours          0x0012   089   089   000    Old_age   Always       -       83427
                                                                                       ^^^^^ - 83427 power-on hours...

Code:
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     17584         -
# 2  Extended offline    Completed without error       00%     17470         -
                                                               ^^^^^ - last extended test at 17470 hours...

They're not actually happening. :frown:
 

freenas-supero

Contributor
Joined
Jul 27, 2014
Messages
128
That's worrisome... but I'm not sure why. Yesterday I upgraded FreeNAS from 11.1 to 11.3 and changed some settings in the SMART test tasks. If all goes according to plan, the next scheduled LONG test will be on the 1st of the month.

Question 1: Should I manually run LONG tests to check the drives?

Question 2: Would it be good practice to create 2 or 3 test schedules, so that not all drives are LONG tested at the same time? I see the LONG tests as pretty intensive, and with older drives like these, wouldn't it be safer to run LONG tests on fewer drives at a time in case one fails? Then resilvering could be done between two tests...

Question 3: The scrubs are done regularly. Never got a data error from them. Is it safe (but not smart...) to assume that because the scrubs happened regularly with no errors that the drives are still "reliable" to a certain extent and still good to use?
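For question 2, one way to picture a staggered schedule is to split the drives round-robin into batches and give each batch its own day of the month. This is only a sketch: the day numbers are arbitrary, `stagger` is a made-up helper, and each batch would still have to be entered as its own S.M.A.R.T. test task.

```python
def stagger(devices, groups):
    """Distribute devices round-robin into `groups` batches."""
    batches = [[] for _ in range(groups)]
    for index, device in enumerate(devices):
        batches[index % groups].append(device)
    return batches


devices = [f"da{i}" for i in range(8)]  # da0 .. da7, as in the report
days_of_month = [1, 11, 21]             # hypothetical LONG-test days

schedule = dict(zip(days_of_month, stagger(devices, len(days_of_month))))
for day, batch in schedule.items():
    print(f"day {day:2d}: LONG test on {', '.join(batch)}")
```

Round-robin assignment also happens to spread the five oldest drives (da0 through da4) across all three batches instead of testing them together.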
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
You can manually run the tests, and I'm sure they were running under the cron job, just not being recorded. The important question is: if the drive isn't able to record the results of the test due to a bug in the test timing, would a notification of a failed test still be triggered, given that there is no record of the test being completed, pass or fail?

If it were my pool at home, I would run the drives till they died and went offline on their own, but I have several copies of the data I care about.
 