Help recovering DEGRADED zpool mirror

juanjico

Dabbler
Joined
Sep 18, 2012
Messages
31
Thanks @Apollo, but after running the clear command, a new resilver was triggered.

Code:
[root@freenas] ~# zpool reset zfs1
[root@freenas] ~# zpool scrub zfs1
cannot scrub zfs1: currently resilvering
[root@freenas] ~#



Current status:

Code:
[root@freenas] ~# zpool status -v zfs1
  pool: zfs1
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sat Apr 20 18:38:02 2019
        2.6G scanned out of 3.04T at 5.31M/s, 166h15m to go
        3.2G resilvered, 0.41% done
config:

        NAME                                              STATE     READ WRITE CKSUM
        zfs1                                              DEGRADED     0     0   492
          mirror-0                                        DEGRADED     0     0 1.92K
            gptid/03cc6e09-df0f-11e5-9697-009c02a7fa32    DEGRADED     0     0 1.92K  too many errors
            replacing-1                                   DEGRADED 1.92K     0     0
              2975835688515449866                         UNAVAIL      0     0     0  was /dev/gptid/9d79ca97-47f5-11e8-b3a4-009c02a7fa32
              gptid/d07286d1-6105-11e9-a0f5-009c02a7fa32  ONLINE       0     0 1.92K  (resilvering)
            gptid/b4dd3975-4f65-11e8-ab2f-009c02a7fa32    ONLINE       0     0 1.92K  (resilvering)

errors: Permanent errors have been detected in the following files:

        /mnt/zfs1/jails/OpenVPN/tmp
        <0xffffffffffffffff>:<0x0>
[root@freenas] ~#


But now even the new WD Red disk is reporting CKSUM errors, with the same count as the other disks.

Why did the resilvering start again?
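
To make the "same numbers on every disk" observation easier to see, here is a minimal sketch that pulls the non-zero READ/WRITE/CKSUM counters out of the config table posted above. The helper name `devices_with_errors` and the regular expression are hypothetical, written only for the layout shown in this thread; this is not a ZFS tool and other `zpool status` layouts may need a different pattern.

```python
import re

# Config table copied from the `zpool status -v zfs1` output in this thread.
STATUS_CONFIG = """\
        NAME                                              STATE     READ WRITE CKSUM
        zfs1                                              DEGRADED     0     0   492
          mirror-0                                        DEGRADED     0     0 1.92K
            gptid/03cc6e09-df0f-11e5-9697-009c02a7fa32    DEGRADED     0     0 1.92K  too many errors
            replacing-1                                   DEGRADED 1.92K     0     0
              2975835688515449866                         UNAVAIL      0     0     0  was /dev/gptid/9d79ca97-47f5-11e8-b3a4-009c02a7fa32
              gptid/d07286d1-6105-11e9-a0f5-009c02a7fa32  ONLINE       0     0 1.92K  (resilvering)
            gptid/b4dd3975-4f65-11e8-ab2f-009c02a7fa32    ONLINE       0     0 1.92K  (resilvering)
"""

# name, state, three counters, then an optional trailing note.
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\S+)\s+(\S+)\s+(\S+)(?:\s+(.*))?$")


def devices_with_errors(config_text):
    """Return (name, state, non-zero counters) for each row that has errors.

    Counters are kept as the strings zpool printed (e.g. "1.92K"), so no
    assumption is made about how the abbreviated values were rounded.
    """
    rows = []
    for line in config_text.splitlines():
        m = ROW.match(line)
        if not m or m.group(1) == "NAME":  # skip blanks and the header row
            continue
        name, state, read, write, cksum, _note = m.groups()
        counters = {"READ": read, "WRITE": write, "CKSUM": cksum}
        bad = {k: v for k, v in counters.items() if v != "0"}
        if bad:
            rows.append((name, state, bad))
    return rows


if __name__ == "__main__":
    for name, state, bad in devices_with_errors(STATUS_CONFIG):
        print(name, state, bad)
```

On the table above this lists every row except the UNAVAIL placeholder, and every real disk in the mirror shows the same CKSUM value of 1.92K.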
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
You didn't run a 'clear' command but a 'reset' command which I am not at all aware of.
 

juanjico

Dabbler
Joined
Sep 18, 2012
Messages
31
Apollo said:
You didn't run a 'clear' command but a 'reset' command which I am not at all aware of.

Ouch! Sorry, I typed the command here by hand instead of copying and pasting it from the terminal. I issued the command from an SSH terminal app on my smartphone. When I got the scrub error, I went to the computer and recreated the commands here.

I checked the terminal history and it was a 'clear' command.

My bad!
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
Running SSH from a smartphone is not a good idea. So many things can go wrong.
When you work over SSH, or at the CLI in general, always triple-check your next step to prevent mistakes.
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
I don't know what the 'reset' command does. It is not listed in the man page and I can't find any information online.
The troubling thing is that it seems to have been accepted.
I just don't know what it is supposed to do. It could very well destroy the contents of the pool.
If that is not the case and the data is still there, then ZFS would have found some errors reading a block and decided to fix them, but I doubt it; that would be too much of a coincidence.
 

juanjico

Dabbler
Joined
Sep 18, 2012
Messages
31
Apollo said:
I don't know what the 'reset' command does. It is not listed in the man page and I can't find any information online.
The troubling thing is that it seems to have been accepted.
I just don't know what it is supposed to do. It could very well destroy the contents of the pool.
If that is not the case and the data is still there, then ZFS would have found some errors reading a block and decided to fix them, but I doubt it; that would be too much of a coincidence.

No, I ran the correct command. The terminal history (UP key) confirmed it. The mistake happened when I recreated the terminal commands and output here: because I cannot copy and paste from the smartphone, I typed 'reset' instead of 'clear'. The command that I actually ran was 'zpool clear zfs1'.

Sorry for the confusion, my English is not good.
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
OK. Much better, in a way.
So if a resilver is underway, then ZFS must have detected an error.

Can you do an "ls" of your pool and see if the files are there?
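
As a rough sanity check on how long that resilver will run, the scan line from the status output earlier in the thread can be re-derived with simple arithmetic. This is only a sketch: the function names are hypothetical, and it assumes the K/M/G/T suffixes are binary multiples (powers of 1024). Because zpool prints rounded figures, the result will only approximate the "166h15m to go" that was reported.

```python
import re

# Assumption: zpool's abbreviated sizes use powers of 1024.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(text):
    """Convert an abbreviated size such as '3.04T' to a byte count."""
    m = re.fullmatch(r"([\d.]+)([KMGT])", text)
    if m is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return float(m.group(1)) * UNITS[m.group(2)]


def resilver_hours_left(scan_line):
    """Estimate the remaining hours from a 'scanned out of ... at .../s' line."""
    m = re.search(r"(\S+) scanned out of (\S+) at (\S+)/s", scan_line)
    if m is None:
        raise ValueError("not a scan progress line")
    scanned, total, rate = (to_bytes(g) for g in m.groups())
    return (total - scanned) / rate / 3600


if __name__ == "__main__":
    line = "2.6G scanned out of 3.04T at 5.31M/s, 166h15m to go"
    print(f"{resilver_hours_left(line):.1f} hours left (approx.)")
```

At about 5 MB/s the remaining 3 TB works out to roughly a week, which agrees with the estimate zpool printed.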
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
I would think that once a disk has been replaced successfully, the old disk should be removed from the list. Maybe the unavailable disk has not finished being replaced yet, and the list is simply shown this way until it has. It is also possible that the 'degraded' disk is the source of the problem and is what is causing the pool to resilver.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Did this ever get figured out? I'm pretty sure I told you 15 days ago that this pool was a goner, and I would love to know what the result was.
 