Guidance to proceed

brownj

Cadet
Joined
Nov 2, 2019
Messages
7
My pool "data-pool" is showing as degraded. I powered down, reseated the SATA cables, and powered back up, and the status is the same. It's reporting /dev/sdc as FAULTED. If I run "smartctl -a /dev/sdc" it appears everything has PASSED, and I really don't see anything horrible when comparing it to another ONLINE disk (/dev/sdb) using "smartctl -a /dev/sdb". At this point, not being confident in my ability to interpret smartctl output:

Do I ONLINE the disk?

Also, my SPARE /dev/sde is showing UNAVAIL, which I guess means it was pulled into the pool after the FAULT?
Am I correct on this?

How should I proceed? (I purchased a WD Red replacement, which is sitting here ready to go.)
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
If i run "smartctl -a /dev/sdc" it appears everything has PASSED
The word PASSED in the output doesn't by itself mean the disk is healthy. Please share the output (minus the serial number if you prefer not to share it).
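As a rough illustration of looking past the PASSED line, here is a minimal sketch that filters the attribute table printed by "smartctl -A" for a few counters that are commonly watched on failing disks. The function name smart_warnings and the choice of attributes are assumptions for illustration, not something from this thread, and a clean result does not prove the disk is fine.

```shell
# Sketch only: smart_warnings is a hypothetical helper, not part of smartmontools.
# It reads the attribute table from `smartctl -A` on stdin and prints any of the
# listed attributes whose raw value (column 10) is greater than zero.
smart_warnings() {
    awk '
        $2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count)$/ && $10 + 0 > 0 {
            print $2 "=" $10
        }
    '
}

# Typical use (device name taken from the first post):
#   smartctl -A /dev/sdc | smart_warnings
```

Empty output only means those particular counters are zero; the full "smartctl -a" output is still what should be posted for review.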

Let's start out with a look at what happened to the spare...

zpool status -v
 

brownj

Cadet
Joined
Nov 2, 2019
Messages
7
Full disclosure: I went ahead and set /dev/sdc OFFLINE, powered down, and replaced it with a new WD Red. After bringing it back up I REPLACED the disk in the UI, and now "zpool status -v" reports:

pool: data-pool
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Tue Sep 27 06:54:32 2022
1.54T scanned at 349M/s, 756G issued at 167M/s, 1.54T total
251G resilvered, 47.93% done, 01:23:43 to go
config:

NAME STATE READ WRITE CKSUM
data-pool ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
spare-0 ONLINE 0 0 0
c67c4bdc-91dc-4c14-8f34-15504a4c0f57 ONLINE 0 0 0 (resilvering)
sde ONLINE 0 0 0
1e86f2c4-9e44-11eb-bf24-2cf05da31333 ONLINE 0 0 2
1ea5da57-9e44-11eb-bf24-2cf05da31333 ONLINE 0 0 0
spares
sde INUSE currently in use

errors: No known data errors

How do I interpret this: 'sde INUSE currently in use'? And what is spare-0 exactly?
 

brownj

Cadet
Joined
Nov 2, 2019
Messages
7
Also, when the resilver is complete, what do I do to get the spare back to AVAIL? As I recall, that's how it looked back when I added it.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
sde is part of the spare-0 group (which would be much easier to see if you had posted that in code tags).

The way that spares work is that you need to decide what to do...

You can either remove the dead drive and remove the spare from the spares list, permanently re-assigning it to the pool, or you can replace the dead drive again (since the spare already did that once) and return the spare to the spares list.

Neither action is automatic, although I note you have already taken steps to go the second way.

zpool detach data-pool sde is probably the command you'll be looking for once the resilver is done.
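To make the "once the resilver is done" part concrete, here is a minimal sketch that refuses to detach the spare while "zpool status" still reports a running resilver. The pool and device names come from the status output in this thread; the helper names resilver_running and return_spare and the DRY_RUN switch are assumptions made up for this example.

```shell
# Sketch only: names below are taken from this thread's status output.
POOL="data-pool"
SPARE="sde"

# Succeeds if the status text on stdin still reports a running resilver.
resilver_running() {
    grep -q 'resilver in progress'
}

# Detach the hot spare only when no resilver is running.
# DRY_RUN defaults to 1, which prints the command instead of running it.
return_spare() {
    if zpool status "$POOL" | resilver_running; then
        echo "resilver still running on $POOL; not detaching" >&2
        return 1
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "zpool detach $POOL $SPARE"
    else
        zpool detach "$POOL" "$SPARE"
    fi
}
```

Calling return_spare with DRY_RUN=0 would actually run the detach. For the other route (keeping the spare in the pool permanently), the argument would be the replaced device rather than the spare, as far as I understand the zpool-detach documentation, so check that before running anything.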
 

brownj

Cadet
Joined
Nov 2, 2019
Messages
7
Thanks for the info, much appreciated.
 

brownj

Cadet
Joined
Nov 2, 2019
Messages
7
@sretalla
Following this, a day or two later I had another faulted drive. I offlined that drive and replaced it with another new WD Red.
At this point my status looks like this:

Code:
 pool: data-pool
 state: ONLINE
  scan: resilvered 524G in 02:29:39 with 0 errors on Sun Oct  2 17:25:37 2022
config:

    NAME                                        STATE     READ WRITE CKSUM
    data-pool                                   ONLINE       0     0     0
      raidz1-0                                  ONLINE       0     0     0
        spare-0                                 ONLINE       0     0     0
          c67c4bdc-91dc-4c14-8f34-15504a4c0f57  ONLINE       0     0     0
          sde                                   ONLINE       0     0     0
        71800691-a44f-41e2-b5db-a04c6b3eb0c4    ONLINE       0     0     0
        1ea5da57-9e44-11eb-bf24-2cf05da31333    ONLINE       0     0     0
    spares
      sde                                       INUSE     currently in use

errors: No known data errors


At this point, do I run:
Code:
zpool detach data-pool sde 

to return sde to the spares list?
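One way to check the result either way is to compare the spares section of "zpool status" before and after the detach. The sketch below is an assumption-laden helper (spare_states is a made-up name) that prints each spare and its state from status text on stdin; on the output above it would report sde as INUSE, and the hope is that it reports AVAIL afterwards.

```shell
# Sketch only: spare_states is a hypothetical helper for reading `zpool status` text.
# It prints "<device> <state>" for every line in the "spares" section.
spare_states() {
    awk '
        $1 == "spares" { in_spares = 1; next }          # start of the spares section
        $1 == "errors:" || NF == 0 { in_spares = 0 }    # blank line or errors line ends it
        in_spares { print $1, $2 }
    '
}

# Typical use:
#   zpool status data-pool | spare_states
```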
 