Failing disk and time to change RAID-z strategy?

Status
Not open for further replies.

freenas-supero

Contributor
Joined
Jul 27, 2014
Messages
128
Hey Roger & Eric,

Of course I tried selecting the drive in the list. Like I said in post #16, all I get is a blue bar at the bottom of the screen with "edit" and "wipe" buttons; no replace, offline, or anything else. I will post a screenshot when I come back home.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
You have to click the pool's entry in the view volumes screen. Then click the pool status button that appears. In that screen, you do have an OFFLINE button.

Definitely not very intuitive.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
freenas-supero said:
Hmm, OK for the errata, but that doesn't explain why I cannot follow the manual to the letter...

As per the screenshot I provided, you can see that I have no "replace", "offline" or "other" buttons, so clearly there is a discrepancy between the manual and the version of FreeNAS I run.
Tonight I will post a screenshot of the "View Volumes" page showing no such buttons as well... View Volumes shows my datasets, not the underlying disks.

EDIT: Based on my comments, would it be possible that the "Offline" or "Replace" buttons are not there because the disk has not failed yet? Technically it has not failed, as it is still responding to FreeNAS and only has bad sectors. In that case, can I pull the drive, then pop the replacement in and proceed with the instructions in the manual?

I'm surprised nobody has encountered this scenario yet!?

The problem is that you aren't following the manual to the letter. The pool status and disk listing are VERY similar in look, but they are NOT the same. The disk listing lets you edit and wipe disks. The pool status lets you do disk replacements and such.
 

freenas-supero

Contributor
Joined
Jul 27, 2014
Messages
128
OK, replacement of the drive is done and it's resilvering as we speak.

Yeah, really, I just noticed the white icon in the Volume page (volume status)... definitely not jumping in your face, but nevertheless I should have tried harder instead of bit*** that it wasn't there.

Some end user feedback: the replacement procedure is a piece of cake, and thanks to FreeNAS it was done within a few minutes. On the other hand, let's wait the 30 hrs or so it will take to resilver the array.

Will post back.

BIG THANKS to all of you so far for patience and help!!!!
 

freenas-supero

Contributor
Joined
Jul 27, 2014
Messages
128
I know 30h seems long... it's a 10TB pool. How long would you expect? Right now I am anticipating a catastrophic failure with more than one drive failing... Touching wood that it doesn't happen!

The status of the pool

[root@freenas] ~# zpool status
pool: zpool
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Tue Dec 2 18:24:37 2014
482G scanned out of 10.8T at 96.3M/s, 31h8m to go
55.3G resilvered, 4.37% done
config:

NAME STATE READ WRITE CKSUM
zpool DEGRADED 0 0 0
raidz3-0 DEGRADED 0 0 0
gptid/70057ca3-0fb8-11e4-9267-0030487f11ba ONLINE 0 0 0
gptid/7231ce76-0fb8-11e4-9267-0030487f11ba ONLINE 0 0 0
gptid/74010031-0fb8-11e4-9267-0030487f11ba ONLINE 0 0 0
replacing-3 OFFLINE 0 0 0
16870590521726310585 OFFLINE 0 0 0 was /dev/gptid/74c45142-0fb8-11e4-9267-0030487f11ba
gptid/60d7ae11-7a7a-11e4-81f9-0030487f11ba ONLINE 0 0 0 (resilvering)
gptid/7577d07e-0fb8-11e4-9267-0030487f11ba ONLINE 0 0 0
gptid/7799b692-0fb8-11e4-9267-0030487f11ba ONLINE 0 0 0
gptid/7979c1c6-0fb8-11e4-9267-0030487f11ba ONLINE 0 0 0
gptid/7ba4673f-0fb8-11e4-9267-0030487f11ba ONLINE 0 0 0

errors: No known data errors
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Can you repost that output in pastebin? That output means nothing since the formatting was trashed due to the forum software.
 

Borja Marcos

Contributor
Joined
Nov 24, 2014
Messages
125
To sort out these troubles, gstat is your friend. Log in as root via ssh and run gstat with a short interval (gstat -I1s). You can see if one of the disks is much slower than the others, which would give you a clue. In my never polished "devilator" (a FreeBSD data collector for Orca) I graph disk delay for that reason.
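
As a rough illustration of the "much slower than the others" check, the sketch below takes per-disk read latencies in milliseconds (for example, values read off gstat's ms/r column) and flags any disk well above the median. The function name, the factor of 3, and the sample numbers are all made up for illustration; this is not something gstat or FreeNAS provides.

```python
from statistics import median


def find_slow_disks(latency_ms, factor=3.0):
    """Return the names of disks whose latency exceeds factor x the median."""
    if not latency_ms:
        return []
    typical = median(latency_ms.values())
    threshold = factor * typical
    # Sort the result so the output order is stable.
    return sorted(name for name, ms in latency_ms.items() if ms > threshold)


# Hypothetical latencies (ms), NOT taken from a real system.
sample = {"da0": 4.1, "da1": 3.8, "da2": 4.4, "da3": 41.0, "da4": 4.0}
print(find_slow_disks(sample))
```

Comparing against the median rather than the mean keeps one very slow disk from dragging the baseline up and hiding itself.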

Second: for flashed LSI2008 cards there is a way to identify a disk's slot. There is a program called sas2ircu (you can download it from the LSI website) and there's even a FreeBSD port for it. Even though it's intended for cards running the Integrated RAID (IR) firmware, a simple "sas2ircu 0 display" will give you a nice listing of SAS devices _including_ the backplane slot number.

At least in my case I have found that the identification works, it doesn't get fooled by LSI's odd reordering of SAS target numbers. Sample output:

Code:
# sas2ircu 0 display
LSI Corporation SAS2 IR Configuration Utility.
Version 18.00.00.00 (2013.11.18)
Copyright (c) 2009-2013 LSI Corporation. All rights reserved.

Read configuration has been initiated for controller 0
------------------------------------------------------------------------
Controller information
------------------------------------------------------------------------
  Controller type                         : SAS2008
  BIOS version                            : 7.37.00.00
  Firmware version                        : 19.00.00.00
  Channel description                     : 1 Serial Attached SCSI
  Initiator ID                            : 0
  Maximum physical devices                : 255
  Concurrent commands supported           : 3432
  Slot                                    : Unknown
  Segment                                 : 0
  Bus                                     : 17
  Device                                  : 0
  Function                                : 0
  RAID Support                            : No
------------------------------------------------------------------------
IR Volume information
------------------------------------------------------------------------
------------------------------------------------------------------------
Physical device information
------------------------------------------------------------------------
Initiator at ID #0

Device is a Hard disk
  Enclosure #                             : 2
  Slot #                                  : 16
  SAS Address                             : 5005076-0-3e8e-81a1
  State                                   : Ready (RDY)
  Size (in MB)/(in sectors)               : 953869/1953525167
  Manufacturer                            : ATA    
  Model Number                            : Samsung SSD 840
  Firmware Revision                       : BB0Q
  Serial No                               : NI
  GUID                                    : N/A
  Protocol                                : SATA
  Drive Type                              : SATA_SSD

Device is a Hard disk
  Enclosure #                             : 2
  Slot #                                  : 17
  SAS Address                             : 5005076-0-3e8e-81a2
  State                                   : Ready (RDY)
  Size (in MB)/(in sectors)               : 953869/1953525167
  Manufacturer                            : ATA    
  Model Number                            : Samsung SSD 840
  Firmware Revision                       : BB0Q
  Serial No                               : NINI
  GUID                                    : N/A
  Protocol                                : SATA
  Drive Type                              : SATA_SSD

Device is a Hard disk
  Enclosure #                             : 2
  Slot #                                  : 18
  SAS Address                             : 5005076-0-3e8e-81a3
  State                                   : Ready (RDY)
  Size (in MB)/(in sectors)               : 953869/1953525167
  Manufacturer                            : ATA    
  Model Number                            : Samsung SSD 840
  Firmware Revision                       : BB0Q
  Serial No                               : NININI
  GUID                                    : N/A
  Protocol                                : SATA
  Drive Type                              : SATA_SSD

Device is a Hard disk
  Enclosure #                             : 2
  Slot #                                  : 19
  SAS Address                             : 5005076-0-3e8e-81a4
  State                                   : Ready (RDY)
  Size (in MB)/(in sectors)               : 953869/1953525167
  Manufacturer                            : ATA    
  Model Number                            : Samsung SSD 840
  Firmware Revision                       : BB0Q
  Serial No                               : S1D9NEADA08547X
  GUID                                    : N/A
  Protocol                                : SATA
  Drive Type                              : SATA_SSD

Device is a Hard disk
  Enclosure #                             : 2
  Slot #                                  : 20
  SAS Address                             : 5005076-0-3e8e-81a5
  State                                   : Ready (RDY)
  Size (in MB)/(in sectors)               : 953869/1953525167
  Manufacturer                            : ATA    
  Model Number                            : Samsung SSD 840
  Firmware Revision                       : BB0Q
  Serial No                               : NINININI
  GUID                                    : N/A
  Protocol                                : SATA
  Drive Type                              : SATA_SSD

Device is a Hard disk
  Enclosure #                             : 2
  Slot #                                  : 21
  SAS Address                             : 5005076-0-3e8e-81a6
  State                                   : Ready (RDY)
  Size (in MB)/(in sectors)               : 953869/1953525167
  Manufacturer                            : ATA    
  Model Number                            : Samsung SSD 840
  Firmware Revision                       : BB0Q
  Serial No                               : NININININI
  GUID                                    : N/A
  Protocol                                : SATA
  Drive Type                              : SATA_SSD

Device is a Hard disk
  Enclosure #                             : 2
  Slot #                                  : 22
  SAS Address                             : 5005076-0-3e8e-81a7
  State                                   : Ready (RDY)
  Size (in MB)/(in sectors)               : 953869/1953525167
  Manufacturer                            : ATA    
  Model Number                            : Samsung SSD 840
  Firmware Revision                       : BB0Q
  Serial No                               : S1D9NEADA08568E
  GUID                                    : N/A
  Protocol                                : SATA
  Drive Type                              : SATA_SSD

Device is a Enclosure services device
  Enclosure #                             : 2
  Slot #                                  : 255
  SAS Address                             : 5005076-0-3e8e-81b9
  State                                   : Standby (SBY)
  Manufacturer                            : IBM-ESXS
  Model Number                            : SAS EXP BP    
  Firmware Revision                       : 61A6
  Serial No                               : 00000006
  GUID                                    : N/A
  Protocol                                : SAS
  Device Type                             : Enclosure services device

Device is a Hard disk
  Enclosure #                             : 3
  Slot #                                  : 12
  SAS Address                             : 5005076-0-3e8e-86e5
  State                                   : Ready (RDY)
  Size (in MB)/(in sectors)               : 953869/1953525167
  Manufacturer                            : ATA    
  Model Number                            : Samsung SSD 840
  Firmware Revision                       : BB0Q
  Serial No                               : NINININININI
  GUID                                    : N/A
  Protocol                                : SATA
  Drive Type                              : SATA_SSD

Device is a Hard disk
  Enclosure #                             : 3
  Slot #                                  : 13
  SAS Address                             : 5005076-0-3e8e-86e6
  State                                   : Ready (RDY)
  Size (in MB)/(in sectors)               : 953869/1953525167
  Manufacturer                            : ATA    
  Model Number                            : Samsung SSD 840
  Firmware Revision                       : BB0Q
  Serial No                               : NINININININININI
  GUID                                    : N/A
  Protocol                                : SATA
  Drive Type                              : SATA_SSD

Device is a Hard disk
  Enclosure #                             : 3
  Slot #                                  : 14
  SAS Address                             : 5005076-0-3e8e-86e7
  State                                   : Ready (RDY)
  Size (in MB)/(in sectors)               : 953869/1953525167
  Manufacturer                            : ATA    
  Model Number                            : Samsung SSD 840
  Firmware Revision                       : AB0Q
  Serial No                               : NININININININININI
  GUID                                    : N/A
  Protocol                                : SATA
  Drive Type                              : SATA_SSD

Device is a Hard disk
  Enclosure #                             : 3
  Slot #                                  : 15
  SAS Address                             : 5005076-0-3e8e-86e8
  State                                   : Ready (RDY)
  Size (in MB)/(in sectors)               : 953869/1953525167
  Manufacturer                            : ATA    
  Model Number                            : Samsung SSD 840
  Firmware Revision                       : BB0Q
  Serial No                               : NINININININININININI
  GUID                                    : N/A
  Protocol                                : SATA
  Drive Type                              : SATA_SSD

Device is a Enclosure services device
  Enclosure #                             : 3
  Slot #                                  : 255
  SAS Address                             : 5005076-0-3e8e-86f9
  State                                   : Standby (SBY)
  Manufacturer                            : IBM-ESXS
  Model Number                            : SAS EXP BP    
  Firmware Revision                       : 61A6
  Serial No                               : 00000006
  GUID                                    : N/A
  Protocol                                : SAS
  Device Type                             : Enclosure services device
------------------------------------------------------------------------
Enclosure information
------------------------------------------------------------------------
  Enclosure#                              : 1
  Logical ID                              : 500605b0:07ba2100
  Numslots                                : 8
  StartSlot                               : 0
  Enclosure#                              : 2
  Logical ID                              : 50050760:3e8e81a0
  Numslots                                : 25
  StartSlot                               : 0
  Enclosure#                              : 3
  Logical ID                              : 50050760:3e8e86e0
  Numslots                                : 25
  StartSlot                               : 0
------------------------------------------------------------------------
SAS2IRCU: Command DISPLAY Completed Successfully.
SAS2IRCU: Utility Completed Successfully.



Of course, always _do_ verify that it really identifies your slots accurately. At least in my case it works with IBM and Dell backplanes. And yes, the output is unsorted but the slot number is correct.
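
Since the listing is long and unsorted, a small script can turn it into a serial-to-slot lookup. The following is only a sketch that assumes the output keeps the "Key : Value" layout shown above; the function name is made up, and the embedded sample is a shortened excerpt of that listing. In practice you would feed it the captured output of "sas2ircu 0 display".

```python
def parse_sas2ircu_display(text):
    """Map each hard disk's serial number to its (enclosure, slot) pair."""
    disks = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Device is a"):
            # New device block; ignore anything that is not a hard disk.
            current = {} if "Hard disk" in stripped else None
            continue
        if current is None or ":" not in stripped:
            continue
        key, _, value = stripped.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Enclosure #":
            current["enclosure"] = int(value)
        elif key == "Slot #":
            current["slot"] = int(value)
        elif key == "Serial No":
            disks[value] = (current.get("enclosure"), current.get("slot"))
    return disks


# Shortened excerpt of the listing above.
SAMPLE = """
Device is a Hard disk
  Enclosure #                             : 2
  Slot #                                  : 19
  SAS Address                             : 5005076-0-3e8e-81a4
  Serial No                               : S1D9NEADA08547X
  Protocol                                : SATA

Device is a Enclosure services device
  Enclosure #                             : 2
  Slot #                                  : 255
  Serial No                               : 00000006
"""

print(parse_sas2ircu_display(SAMPLE))
```

Keying on the serial number makes it easy to match a disk against the serial reported elsewhere, though the slot mapping should still be verified against the physical backplane.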
 

freenas-supero

Contributor
Joined
Jul 27, 2014
Messages
128
Thanks Borja for the precious info! When resilvering is complete I will try these; they may save me some headaches in the future!

As for the resilvering, I just checked: it's been running for 13 hours and it's 80% done, so I guess the initial estimate of 30 hrs was way off...
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
The estimate is based on current throughput. If you hit large files stored in large blocks that are not badly scattered over the media, throughput will increase. On the other hand, if you run into a billion 1KB files, throughput will drop significantly. I just wait it out, and as long as there's no indicator that something is broken I let it be.
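
To make that concrete, here is a back-of-the-envelope sketch. It assumes the estimate is simply the remaining data divided by the observed rate, and that zpool's G/T/M are binary units; the exact calculation inside ZFS may differ. The function name is made up. The figures are the ones from the zpool status output posted earlier (482G scanned out of 10.8T at 96.3M/s).

```python
GIB = 1024 ** 3
TIB = 1024 ** 4
MIB = 1024 ** 2


def resilver_eta_hours(total_bytes, scanned_bytes, rate_bytes_per_s):
    """Hours left if the scan keeps running at the given rate."""
    if rate_bytes_per_s <= 0:
        raise ValueError("rate must be positive")
    remaining = total_bytes - scanned_bytes
    return remaining / rate_bytes_per_s / 3600


# Figures from the posted zpool status output.
eta = resilver_eta_hours(10.8 * TIB, 482 * GIB, 96.3 * MIB)
print(f"at 96.3M/s: about {eta:.1f} hours to go")

# Same remaining data at twice the rate: the estimate halves.
eta_fast = resilver_eta_hours(10.8 * TIB, 482 * GIB, 2 * 96.3 * MIB)
print(f"at twice that rate: about {eta_fast:.1f} hours to go")
```

The first figure lands close to the "31h8m to go" that zpool printed, and the second shows why the estimate can shrink quickly once throughput picks up.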

I put in a ticket to request that sas2ircu be added to FreeNAS. https://bugs.freenas.org/issues/6945
 