SMART error

Status
Not open for further replies.

tstorzuk

Explorer
Joined
Jun 13, 2011
Messages
92
A little help please,

I just got this error message and am not sure what the reports are telling me. Here are the errors that I was emailed;
Code:
SMART error (CurrentPendingSector) detected on host
The following warning/error was logged by the smartd daemon:
Device: /dev/da3 [SAT], 8 Currently unreadable (pending) sectors

Code:
SMART error (OfflineUncorrectableSector) detected on host
The following warning/error was logged by the smartd daemon:
Device: /dev/da3 [SAT], 8 Offline uncorrectable sectors


So I SSH into my NAS and do a zpool status;
Code:
[root@trinity] ~# zpool status
  pool: TRINITY_RAID-01
state: ONLINE
  scan: scrub repaired 0 in 9h12m with 0 errors on Thu May  1 11:12:10 2014
config:
 
        NAME                                            STATE    READ WRITE CKSUM
        TRINITY_RAID-01                                ONLINE      0    0    0
          raidz2-0                                      ONLINE      0    0    0
            gptid/934f3544-e66c-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/93cb8c02-e66c-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/944e85b1-e66c-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/94c96165-e66c-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/95474b4f-e66c-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/95c35cfe-e66c-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/9644ecaf-e66c-11e2-aea7-002590ab7843  ONLINE      0    0    0
        spares
          gptid/96e8da5f-e66c-11e2-aea7-002590ab7843    AVAIL
 
errors: No known data errors
 
  pool: TRINITY_RAID-02
state: ONLINE
  scan: scrub repaired 0 in 48h29m with 0 errors on Sun May  4 02:29:30 2014
config:
 
        NAME                                            STATE    READ WRITE CKSUM
        TRINITY_RAID-02                                ONLINE      0    0    0
          raidz2-0                                      ONLINE      0    0    0
            gptid/78a36f95-e66d-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/791e1c0e-e66d-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/79a1317d-e66d-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/7a25ab00-e66d-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/d2369baa-fbca-11e2-9a7e-002590ab7843  ONLINE      0    0    0
            gptid/7b33f2c1-e66d-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/7c377b71-e66d-11e2-aea7-002590ab7843  ONLINE      0    0    0
        spares
          gptid/7cdcdaeb-e66d-11e2-aea7-002590ab7843    AVAIL
 
errors: No known data errors
 
  pool: TRINITY_RAID-03
state: ONLINE
  scan: scrub repaired 0 in 2h40m with 0 errors on Thu Apr  3 04:40:03 2014
config:
 
        NAME                                            STATE    READ WRITE CKSUM
        TRINITY_RAID-03                                ONLINE      0    0    0
          raidz2-0                                      ONLINE      0    0    0
            gptid/fa342357-e66d-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/fab1cd5a-e66d-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/fb33787b-e66d-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/fbbd4176-e66d-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/fc427d2b-e66d-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/fcc87b9c-e66d-11e2-aea7-002590ab7843  ONLINE      0    0    0
            gptid/fd4eb108-e66d-11e2-aea7-002590ab7843  ONLINE      0    0    0
        spares
          gptid/fdf85527-e66d-11e2-aea7-002590ab7843    AVAIL
 
errors: No known data errors


Everything is looking good so far, but I still need to check the drive in question, so I run smartctl -q noserial -a /dev/da3;
Code:
smartctl 5.43 2012-06-30 r3573 [FreeBSD 8.3-RELEASE-p7 amd64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
 
=== START OF INFORMATION SECTION ===
Model Family:    Seagate Barracuda (SATA 3Gb/s, 4K Sectors)
Device Model:    ST3000DM001-1CH166
Firmware Version: CC24
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:    512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:  8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Thu May  8 22:32:02 2014 MDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
 
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
 
General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (  0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  584) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (  1) minutes.
Extended self-test routine
recommended polling time:        ( 330) minutes.
Conveyance self-test routine
recommended polling time:        (  2) minutes.
SCT capabilities:              (0x3085) SCT Status supported.
 
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate    0x000f  119  099  006    Pre-fail  Always      -      222417224
  3 Spin_Up_Time            0x0003  094  094  000    Pre-fail  Always      -      0
  4 Start_Stop_Count        0x0032  100  100  020    Old_age  Always      -      86
  5 Reallocated_Sector_Ct  0x0033  100  100  010    Pre-fail  Always      -      0
  7 Seek_Error_Rate        0x000f  074  060  030    Pre-fail  Always      -      27743685
  9 Power_On_Hours          0x0032  093  093  000    Old_age  Always      -      6633
10 Spin_Retry_Count        0x0013  100  100  097    Pre-fail  Always      -      0
12 Power_Cycle_Count      0x0032  100  100  020    Old_age  Always      -      86
183 Runtime_Bad_Block      0x0032  099  099  000    Old_age  Always      -      1
184 End-to-End_Error        0x0032  100  100  099    Old_age  Always      -      0
187 Reported_Uncorrect      0x0032  100  100  000    Old_age  Always      -      0
188 Command_Timeout        0x0032  100  100  000    Old_age  Always      -      0
189 High_Fly_Writes        0x003a  099  099  000    Old_age  Always      -      1
190 Airflow_Temperature_Cel 0x0022  068  061  045    Old_age  Always      -      32 (Min/Max 24/33)
191 G-Sense_Error_Rate      0x0032  100  100  000    Old_age  Always      -      0
192 Power-Off_Retract_Count 0x0032  100  100  000    Old_age  Always      -      85
193 Load_Cycle_Count        0x0032  057  057  000    Old_age  Always      -      87402
194 Temperature_Celsius    0x0022  032  040  000    Old_age  Always      -      32 (0 23 0 0 0)
197 Current_Pending_Sector  0x0012  100  100  000    Old_age  Always      -      8
198 Offline_Uncorrectable  0x0010  100  100  000    Old_age  Offline      -      8
199 UDMA_CRC_Error_Count    0x003e  200  200  000    Old_age  Always      -      0
240 Head_Flying_Hours      0x0000  100  253  000    Old_age  Offline      -      131477538868441
241 Total_LBAs_Written      0x0000  100  253  000    Old_age  Offline      -      42983460975
242 Total_LBAs_Read        0x0000  100  253  000    Old_age  Offline      -      152411166233
 
SMART Error Log Version: 1
No Errors Logged
 
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error      00%      6618        -
# 2  Short offline      Completed without error      00%      6515        -
# 3  Short offline      Completed without error      00%      6404        -
# 4  Short offline      Completed without error      00%      6286        -
# 5  Extended offline    Completed without error      00%      6268        -
# 6  Short offline      Completed without error      00%      6167        -
# 7  Short offline      Completed without error      00%      6071        -
# 8  Extended offline    Completed without error      00%      6053        -
# 9  Short offline      Completed without error      00%      5951        -
#10  Short offline      Completed without error      00%      5783        -
#11  Short offline      Completed without error      00%      5668        -
#12  Extended offline    Completed without error      00%      5650        -
#13  Short offline      Completed without error      00%      5595        -
#14  Short offline      Completed without error      00%      5475        -
#15  Short offline      Completed without error      00%      5381        -
#16  Extended offline    Interrupted (host reset)      00%      5358        -
#17  Short offline      Completed without error      00%      5261        -
#18  Short offline      Completed without error      00%      5165        -
#19  Short offline      Completed without error      00%      5045        -
#20  Extended offline    Completed without error      00%      5027        -
#21  Short offline      Completed without error      00%      4821        -
 
SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


So, should I be shutting down the server, find the drive in question and change it out?
 

Yatti420

Wizard
Joined
Aug 12, 2012
Messages
1,437
Maybe run some SMART tests on the drives. It could be a bad sector that was cleared or something?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
An Extended Self-test might be in order. The raw seek/read error rate numbers are scary (but they could just be a stupid encoding - it wouldn't be the first time).
 

crisman

Explorer
Joined
Feb 8, 2012
Messages
97
The load cycle count (LLC) is also high:

(193 Load_Cycle_Count 0x0032 057 057 000 Old_age Always - 87402)

How old is this drive?
Do you run it 24h a day?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
crisman said:
The LLC errors are also high:

(193 Load_Cycle_Count 0x0032 057 057 000 Old_age Always - 87402)

How old is this drive?
Do you run it 24h a day?

Given S.M.A.R.T. lists 86 power cycles, he's running it with an average of some ~4 days between reboots.

High, but not troubling. WDs are rated for 250k cycles, IIRC. I assume Seagates are similar.

...Unless there's an expectation the drives will last longer than two years. At this rate, they'll be past their design life around the two-year mark.

(Note: I'm using the age at last S.M.A.R.T. test, since the up-to-date value shows as zero for some reason)
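
The two-year estimate above can be reproduced with a little arithmetic. This is a rough sketch only: the 250k rating is the figure recalled in this post for WD drives, not a confirmed rating for this Seagate model, and the function name is made up for illustration.

```python
# Figures from the smartctl output in the first post.
LOAD_CYCLES = 87402        # attribute 193 (Load_Cycle_Count) raw value
POWER_ON_HOURS = 6618      # age at the last self-test, as used in this post

# Assumption: 250k is the rating recalled above for WD drives,
# not a confirmed figure for this Seagate model.
RATED_CYCLES = 250_000


def hours_to_rating(cycles, hours, rated):
    """Power-on hours at which the count reaches the rating at the average rate so far."""
    return rated * hours / cycles


if __name__ == "__main__":
    rate = LOAD_CYCLES / POWER_ON_HOURS
    total_hours = hours_to_rating(LOAD_CYCLES, POWER_ON_HOURS, RATED_CYCLES)
    print(f"{rate:.1f} load cycles per power-on hour")
    print(f"rating reached after about {total_hours / (24 * 365):.1f} years of power-on time")
```

The projection assumes the drive keeps parking its heads at the same average rate, which may not hold if the workload changes.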
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
In your smartctl output, the value for ID 197 is your problem. The count may not be high yet, but it could go crazy soon. Replace that hard drive. Don't let the other posts about the LLC count distract you from the real problem. IDs 5, 197, and 198 are the values that indicate a coming failure, and they should all be zero.
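
For anyone who wants to check this across several drives, here is a minimal sketch that scans saved smartctl attribute output for those three IDs. It assumes the column layout shown in the first post (RAW_VALUE as the tenth column); the function name and the sample text are only for illustration.

```python
import re

# IDs singled out above: reallocated, pending, and offline-uncorrectable sectors.
WATCHED_IDS = {5, 197, 198}


def nonzero_watched(smartctl_output):
    """Return {id: (name, raw)} for watched attributes whose raw value is not zero."""
    flagged = {}
    for line in smartctl_output.splitlines():
        # Attribute rows start with the numeric ID, the name, then a hex flag.
        m = re.match(r"\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s", line)
        if not m:
            continue
        attr_id = int(m.group(1))
        if attr_id not in WATCHED_IDS:
            continue
        # RAW_VALUE is the tenth whitespace-separated column.
        raw = int(line.split()[9])
        if raw != 0:
            flagged[attr_id] = (m.group(2), raw)
    return flagged


# Rows copied from the smartctl output for /dev/da3 in the first post.
SAMPLE = """\
  5 Reallocated_Sector_Ct  0x0033  100  100  010    Pre-fail  Always      -      0
197 Current_Pending_Sector  0x0012  100  100  000    Old_age  Always      -      8
198 Offline_Uncorrectable  0x0010  100  100  000    Old_age  Offline      -      8
"""

if __name__ == "__main__":
    for attr_id, (name, raw) in sorted(nonzero_watched(SAMPLE).items()):
        print(f"{attr_id} {name}: raw value {raw}")
```

An empty result means all three raw values are zero, which is what you want to see.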
 

tstorzuk

Explorer
Joined
Jun 13, 2011
Messages
92
First off, thanks to all for your responses. They are highly appreciated.

crisman,

The whole server is a little under 1 year old, including all of the drives. And I do typically run it 24/7, but sometimes I shut it down (like when I am away for more than 1 day), or when the temperatures of the drives start reporting that they are at the threshold of 35 degrees C.

joeschmuck,

Just to confirm what you are saying, IDs 5, 197 and 198;
Code:
ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct  0x0033  100  100  010    Pre-fail  Always      -      0
197 Current_Pending_Sector  0x0012  100  100  000    Old_age  Always      -      8
198 Offline_Uncorrectable  0x0010  100  100  000    Old_age  Offline      -      8


Are these the lines you were referring to? If so, what is a good number for line 197 (which, if I'm reading it correctly is showing a value of 8)?

And lastly, for all of you, what are some choices of good quality drives (that aren't way too expensive)? I'm going to need to swap all of them out within a year, according to the expected life mentioned in the comments. So I might as well start buying them now, to make sure that I don't get a bunch from the same batch.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Yes, those are the lines I was referring to and, as I indicated in my post above, a value of 0 (zero) is what you want to see; anything above zero is a bad thing for these three specific IDs. You could try to live with this because you do have a RAIDZ2 and can handle a drive failure without issue. If you wanted to, you could also run a scrub on the pool with that drive in it for peace of mind. I don't think you really need to, but I know I would myself just for the hell of it.

What I really like is that FreeNAS sent you emails stating exactly what the problem was. Very nice feature, to be honest.

As for drive recommendations, you see a lot of WD Red drives here, and of course I'll stand by mine, but Seagate has NAS drives as well. I think it depends on what you use your NAS for. Do you need a 7200 RPM drive? You can save some noise, power, and typically cost with a slower drive, and you won't be sacrificing speed, not with this system. I wouldn't purchase replacement drives yet, as you will start using up your warranty, and you do have a RAIDZ2, so if you have a single failure you still have time to buy and install a new drive. If you are in a work environment then yes, you should have at least one drive on hand at all times, because the cost is minimal compared to loss of work product. Personally, the warranty on some of my drives expires in Oct 2015 and I plan to purchase one replacement drive only after I have a failure, but I can afford to turn off my NAS if I want to; it's not critical in my house.

Isn't your drive still under warranty? I'd place your spare online and send the failed drive back for RMA.

Sorry for being long winded.
 

ser_rhaegar

Patron
Joined
Feb 2, 2014
Messages
358
Ericloewe said:
An Extended Self-test might be in order. The raw seek/read error rate numbers are scary (but they could just be a stupid encoding - it wouldn't be the first time)

You are correct, it is the encoding for those two attributes. Seagate encodes certain variables differently than other manufacturers.
http://www.users.on.net/~fzabkar/HDD/Seagate_SER_RRER_HEC.html

Just to be clear for the OP, this is not in dispute with what joeschmuck already posted. 5, 197 and 198 should be 0.
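
To illustrate the encoding point, here is a small sketch that splits the two raw values from the first post. It assumes the layout commonly described by the community for these Seagate attributes: a 48-bit raw value whose upper 16 bits count errors and whose lower 32 bits count operations (seeks or sectors read). That layout is reverse-engineered, not an official Seagate specification, and the function name is made up for illustration.

```python
def split_seagate_raw(raw):
    """Split a 48-bit raw value into (upper 16 bits, lower 32 bits)."""
    return (raw >> 32) & 0xFFFF, raw & 0xFFFFFFFF


# Raw values from the smartctl output in the first post.
RAW_READ_ERROR_RATE = 222417224   # attribute 1
SEEK_ERROR_RATE = 27743685        # attribute 7

if __name__ == "__main__":
    for name, raw in (("Raw_Read_Error_Rate", RAW_READ_ERROR_RATE),
                      ("Seek_Error_Rate", SEEK_ERROR_RATE)):
        upper, lower = split_seagate_raw(raw)
        # Under the assumed layout, 'upper' would be the error count.
        print(f"{name}: upper 16 bits = {upper}, lower 32 bits = {lower}")
```

Both values fit entirely in the lower 32 bits, so under this assumption the large raw numbers are operation counts rather than error counts.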
 

tstorzuk

Explorer
Joined
Jun 13, 2011
Messages
92
Thanks for clarifying. I ran the same self test on several of my other drives (da0 through to da9), and they all have zero for the 5, 197 and 198 lines. So that confirms what you said about the da3 drive being wonky. I'll replace it this weekend. But you are right too, I should scrub the volume before. Just in case. I'll start that shortly....but right now I'm encoding video files.

I've got spares, 4 I think, sitting waiting for anything like this to happen (plus each pool has a hot spare so that I don't have to swap out a drive bay right away). But if the life expectancy is only 2 years, I'll start buying drives now for the full replacement. And WD Red drives might give me more life. I'll just start buying them when they're at a good sale price.

I have 2 other NAS's waiting to be upgraded, so if I start buying drives now, they won't go to waste. They are much smaller capacity drives (in the other NAS's), and I have been waiting for quite some time to upgrade them (memory prices have been way too high, and I've been avoiding buying 6 more M1015 raid cards). I don't run them all at the same time as they make too much heat, and I currently have no good way to get rid of it all. I've tried using large fans to get the heat out of the room the servers are in, but it just moves it around a bit. Doesn't bring in enough cool air. Sorry, my brain is just all over the place. I'll wrap this up here before I go mental.

Thanks for the info about how Seagate encodes attributes. That's handy to know.

No question, having the NAS email me an error report like that is extremely handy. I thought the error was because I had 4 computers ripping DVD's/BRD's to the NAS all at the same time, plus 2 other computers encoding some of those rips to MKV's. Thanks for your help in diagnosing.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
You don't need to do a scrub; you could just replace the drive (per procedure) and let it resilver. I would have done a scrub if I had to wait for a new drive, not if I already had one. No harm done either way, but you might see another email and the error counts may go up. If you do start a scrub, either wait for it to finish before replacing the drive or stop it if you like.

Good Luck.
 

tstorzuk

Explorer
Joined
Jun 13, 2011
Messages
92
Ok, I didn't bother with scrubbing the volume, as I didn't need to in order to set the drive offline. It's resilvering as I type this. ~30.5 hours to completion.

I'll take the drive in to my vendor to see if it's still under warranty.

I would have liked to have run this new drive through some checking before putting it into the server, but that would have taken up to a month.

Thanks for all the help. This thread is RESOLVED.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
tstorzuk said:
I've got spares, 4 I think, sitting waiting for anything like this to happen (plus each pool has a hot spare so that I don't have to swap out a drive bay right away). But if the life expectancy is only 2 years, I'll start buying drives now for the full replacement. And WD Red drives might give me more life. I'll just start buying them when they're at a good sale price.

It might be worth it to check if there's an option to disable/reduce head parking, since that's the only serious wear attribute in your drives (except for the failed one). If not, I suggest something that has that option (The Reds for instance).

tstorzuk said:
I have 2 other NAS's waiting to be upgraded, so if I start buying drives now, they won't go to waste. They are much smaller capacity drives (in the other NAS's), and I have been waiting for quite some time to upgrade them (memory prices have been way too high, and I've been avoiding buying 6 more M1015 raid cards).

Instead of using many M1015s in one server, you can also use an SAS expander (like the Intel RES2SV240) - it's cheaper once you start talking about three M1015s in many cases.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I would not worry about the LLC (head parking) on the drives you have; the count isn't bad for that drive and how long it's been in use. Now, if you do purchase a WD drive, I feel checking the timer setting is important, because an 8 second timer would toss the LLC through the roof, but this is vendor and drive model specific.
 

tstorzuk

Explorer
Joined
Jun 13, 2011
Messages
92
Ummm, I don't think I'm finished unfortunately.

This morning, I got my typical emails from my NAS. I was a bit concerned when I saw this;

Code:
Removing stale files from /var/preserve:
 
Cleaning out old system announcements:
 
Backup passwd and group files:
no /var/backups/master.passwd.bak
no /var/backups/group.bak
 
Verifying group file syntax:
/etc/group is fine
 
Backing up package db directory:
 
Disk status:
Filesystem            Size    Used  Avail Capacity  Mounted on
/dev/ufs/FreeNASs1a    926M    381M    470M    45%    /
devfs                  1.0k    1.0k      0B  100%    /dev
/dev/md0              4.6M    3.3M    964k    78%    /etc
/dev/md1              823k    3.0k    755k    0%    /mnt
/dev/md2              149M    17M    119M    13%    /var
/dev/ufs/FreeNASs4      19M    2.0M    16M    11%    /data
TRINITY_RAID-01        12T    12T    218G    98%    /mnt/TRINITY_RAID-01
TRINITY_RAID-02        12T    11T    569G    96%    /mnt/TRINITY_RAID-02
TRINITY_RAID-03        12T    2.6T    9.9T    21%    /mnt/TRINITY_RAID-03
 
Last dump(s) done (Dump '>' file systems):
 
Checking status of zfs pools:
NAME              SIZE  ALLOC  FREE    CAP  DEDUP  HEALTH  ALTROOT
TRINITY_RAID-01    19T  18.4T  632G    96%  1.00x  DEGRADED  /mnt
TRINITY_RAID-02    19T  17.9T  1.13T    94%  1.00x  ONLINE  /mnt
TRINITY_RAID-03    19T  3.88T  15.1T    20%  1.00x  ONLINE  /mnt
 
all pools are healthy
 
Checking status of ATA raid partitions:
 
Checking status of gmirror(8) devices:
 
Checking status of graid3(8) devices:
 
Checking status of gstripe(8) devices:
 
Network interface status:
Name    Mtu Network      Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
em0    1500 <Link#1>      00:25:90:ab:78:43  241262    0    0  194566    0    0
em0    1500 192.168.150.0 192.168.150.117    224575    -    -  452496    -    -
usbus    0 <Link#2>                              0    0    0        0    0    0
em1*  1500 <Link#3>      00:25:90:ab:78:42        0    0    0        0    0    0
usbus    0 <Link#4>                              0    0    0        0    0    0
lo0  16384 <Link#5>                            4072    0    0    4072    0    0
lo0  16384 fe80::1%lo0  fe80::1                  0    -    -        0    -    -
lo0  16384 localhost    ::1                      0    -    -        0    -    -
lo0  16384 your-net      localhost            4072    -    -    4072    -    -
 
Security check:
    (output mailed separately)
 
Checking status of 3ware RAID controllers:
Alarms (most recent first):
+++ /var/log/3ware_raid_alarms.today    2014-05-11 03:01:01.000000000 -0600
@@ -0,0 +1 @@
+
 
-- End of daily output --


It was saying that the volume that I just resilvered was still degraded!!! So I logged into the GUI and this is what it was showing me;
[Attached screenshot: RAID-01.gif]
What concerns me is that the drive that I resilvered should be showing as part of the pool, but it isn't. Also what is bothering me is that the drive below the replaced drive has the option to detach whereas all the other drives have the option to replace. So I'm hoping that drive isn't getting ready to crater too.

What do I do? Detach the bad drive again? Let it resilver again? Try a different drive?
 

tstorzuk

Explorer
Joined
Jun 13, 2011
Messages
92
Here's the zpool status;
Code:
  pool: TRINITY_RAID-01
state: DEGRADED
  scan: resilvered 2.52T in 17h23m with 0 errors on Sat May 10 21:06:06 2014
config:
 
        NAME                                              STATE    READ WRITE CKSUM
        TRINITY_RAID-01                                  DEGRADED    0    0    0
          raidz2-0                                        DEGRADED    0    0    0
            gptid/934f3544-e66c-11e2-aea7-002590ab7843    ONLINE      0    0    0
            gptid/93cb8c02-e66c-11e2-aea7-002590ab7843    ONLINE      0    0    0
            gptid/944e85b1-e66c-11e2-aea7-002590ab7843    ONLINE      0    0    0
            replacing-3                                  OFFLINE      0    0    0
              8946676396695516782                        OFFLINE      0    0    0  was /dev/gptid/94c96165-e66c-11e2-aea7-002590ab7843
              gptid/59c9cd9f-d827-11e3-b3b8-002590ab7843  ONLINE      0    0    0
            gptid/95474b4f-e66c-11e2-aea7-002590ab7843    ONLINE      0    0    0
            gptid/95c35cfe-e66c-11e2-aea7-002590ab7843    ONLINE      0    0    0
            gptid/9644ecaf-e66c-11e2-aea7-002590ab7843    ONLINE      0    0    0
        spares
          gptid/96e8da5f-e66c-11e2-aea7-002590ab7843      AVAIL
 
errors: No known data errors
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Did you follow the instructions for replacing a drive? Looks like you need to take the drive that says it's OFFLINE and "Detach" it. Also, if you ever rebuild your pool, just a suggestion: why not take the spare and create a RAIDZ3?

EDIT: You didn't leave the old drive installed, did you?
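
To pick out which entry needs detaching, a short sketch like the following can scan saved zpool status text for OFFLINE members under a replacing vdev. The function name is made up for illustration, and the script only prints a candidate command rather than running anything; on FreeNAS the GUI Detach button is the documented route.

```python
def offline_replacing_members(zpool_status):
    """Return names of OFFLINE devices listed under a 'replacing-N' vdev."""
    found = []
    replacing_indent = None
    for line in zpool_status.splitlines():
        fields = line.split()
        if not fields:
            replacing_indent = None
            continue
        indent = len(line) - len(line.lstrip())
        if fields[0].startswith("replacing-"):
            replacing_indent = indent
            continue
        if replacing_indent is not None:
            if indent <= replacing_indent:
                replacing_indent = None          # left the replacing vdev
            elif len(fields) > 1 and fields[1] == "OFFLINE":
                found.append(fields[0])
    return found


# Lines copied from the zpool status output posted above.
SAMPLE = """\
            replacing-3                                  OFFLINE      0    0    0
              8946676396695516782                        OFFLINE      0    0    0  was /dev/gptid/94c96165-e66c-11e2-aea7-002590ab7843
              gptid/59c9cd9f-d827-11e3-b3b8-002590ab7843  ONLINE      0    0    0
            gptid/95474b4f-e66c-11e2-aea7-002590ab7843    ONLINE      0    0    0
"""

if __name__ == "__main__":
    for name in offline_replacing_members(SAMPLE):
        # Printed for review only; nothing is executed here.
        print(f"zpool detach TRINITY_RAID-01 {name}")
```

Here the only match is the numeric GUID that "was" the old gptid, which is the pulled drive.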
 

tstorzuk

Explorer
Joined
Jun 13, 2011
Messages
92
Joe,

I'm not 100% sure, but is the Offline drive showing in the GUI the old drive? All my other pools have 7 drives, while this one shows 8. So I just need to Detach it and the pool should be back to operational? And no, the old drive is not in the NAS; I physically pulled it to see if I had warranty on it.

I did follow the instructions for replacing the drive. As for the Offline drive possibly being the old drive still showing in the GUI, this is from the Replacing a Failed Drive instructions;
Code:
4. If the replaced disk continues to be listed after resilvering is complete, click its entry and use the Detach button to remove the disk from the list.


I'm scrubbing the volume right now just to make sure something else isn't going on. It's going to take 10+ hours to complete though.

To rebuild the pool, do I need to dump all of the information on the existing pool before creating a RAIDZ3? I've got pretty much everything backed up onto one of my other servers. It just needs to be updated this month to make sure all the new files have been added/replaced. But writing it all back to the new pool would take a long time.

Would I gain or lose any storage capacity with going to a RAIDZ3 using the extra disk?

Is RAIDZ3 faster than RAIDZ2?
Code:
At this time, RAIDZ2 on FreeBSD is slower than RAIDZ1.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Yes, RAIDZ2 is slower than a RAIDZ1, and I suspect a RAIDZ3 would be slightly slower than a RAIDZ2. I have no idea what you are using your system for, but there has been a lot of debate about spare drives and, in the end, that drive is pretty much a waste of resources as a spare.

You can stop the scrub and detach that drive, and you will be good. Reboot if you want to get a good feeling about whether your pool is doing well, and then check its status.
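
On the capacity question asked earlier, a back-of-the-envelope sketch: a RAIDZ vdev gives up one disk's worth of space per parity level, so a 7-disk RAIDZ2 and an 8-disk RAIDZ3 both leave five disks of data. The code ignores ZFS metadata and padding overhead and assumes every drive is 3 TB like da3; the function name is made up for illustration.

```python
def data_disks(total_disks, parity):
    """Disks' worth of usable space in a single RAIDZ vdev, ignoring overhead."""
    if total_disks <= parity:
        raise ValueError("a RAIDZ vdev needs more disks than parity devices")
    return total_disks - parity


DISK_TB = 3  # assumption: all drives are 3 TB, like da3 in the first post

if __name__ == "__main__":
    current = data_disks(7, 2)    # 7-disk RAIDZ2 plus one hot spare
    proposed = data_disks(8, 3)   # 8-disk RAIDZ3, spare folded into the vdev
    print(f"RAIDZ2 with 7 disks: {current} data disks, about {current * DISK_TB} TB")
    print(f"RAIDZ3 with 8 disks: {proposed} data disks, about {proposed * DISK_TB} TB")
```

So the trade is roughly neutral on space: the spare stops sitting idle and becomes a third parity disk.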
 

Yatti420

Wizard
Joined
Aug 12, 2012
Messages
1,437
Sorry, I looked over your OP too quickly. Indeed, that one drive is dying.
 