Replaced bad drive, permanent errors won't go away

Status
Not open for further replies.

toblix

Cadet
Joined
Jun 4, 2011
Messages
4
I had some issues with a bad disk in a four-disk pool, and replaced the bad disk with a new one. Running zpool status -v told me some files were permanently damaged, so I deleted them, as they were not important. zpool status -v now tells me:

Code:
freenas# zpool status -v
  pool: ZFSPool0
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME                      STATE     READ WRITE CKSUM
        ZFSPool0                  DEGRADED     0     0     9
          raidz1                  DEGRADED     0     0    36
            ada2                  ONLINE       0     0     0
            replacing             ONLINE       0     0     0
              ada3                ONLINE       0     0     0
              ada0                ONLINE       0     0     0
            ada1                  ONLINE       0     0     0
            11846808060678788522  UNAVAIL      0     0     0  was /dev/ada3

errors: Permanent errors have been detected in the following files:

        ZFSPool0:<0x932d1>
        ZFSPool0:<0x932d4>
        ZFSPool0:<0x932e3>
        ZFSPool0:<0x932e6>
        ZFSPool0:<0x932ec>


Could someone please explain what's going on here? Why are two of the four disks in the volume listed under replacing? If I request a scrub, it just goes ahead and resilvers ada0, which takes about 30 hours each time and doesn't change the error messages I get. Is there any way to clear the errors (I deleted the files, after all)? Is there any way of removing 11846808060678788522, as it seems to refer to a disk that doesn't physically exist in the computer? I'm pretty sure I have four working disks in the computer now, so I don't understand why ZFS doesn't agree.
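For reference, entries such as ZFSPool0:<0x932d1> in the error list appear to be how zpool status reports a damaged object when it can no longer resolve a file path for it (for example, after the file has been deleted); the hex value is an object number, not a file name. A minimal sketch for pulling those entries out of the status output follows. The helper names list_error_entries and count_unresolved are made up for this illustration and are not part of ZFS.

```shell
#!/bin/sh
# Sketch only: both helpers read `zpool status -v` text on stdin.
# The function names are hypothetical, not ZFS commands.

# Print every entry listed after the "errors:" line, one per line,
# with the leading indentation removed.
list_error_entries() {
    awk '
        /^errors:/ { in_errors = 1; next }              # entries follow this line
        in_errors && NF { sub(/^[ \t]+/, ""); print }   # skip blank lines
    '
}

# Count entries of the form dataset:<0x...>, i.e. entries that show
# an object number instead of a file path.
count_unresolved() {
    grep -c '^[^:]*:<0x[0-9a-fA-F]*>$'
}

# Example usage on a live system (not run here):
#   zpool status -v ZFSPool0 | list_error_entries
#   zpool status -v ZFSPool0 | list_error_entries | count_unresolved
```

Applied to the output above, the first pipeline would list the five ZFSPool0:<0x...> entries, which makes it easier to compare the list before and after a scrub.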
 
Joined
May 27, 2011
Messages
566
You can remove the old device with:
zpool remove ZFSPool0 11846808060678788522

Once you are sure you want to clear the errors, use the command:
zpool clear ZFSPool0
 

toblix

Cadet
Joined
Jun 4, 2011
Messages
4
Thanks, but I think I remember trying that, and ZFS politely refusing. My current theory is that I got the disks mixed up and replaced one of the OK drives rather than the busted one. I'll try replacing it with a fifth drive in a couple of days and see if that works.
 

ixdwhite

Guest
You will have to find and delete the damaged files to make the message go away, since they're irrevocably damaged due to a multi-disk failure. If you swapped the wrong drive, it is probably too late to undo that mistake, as ZFS has already updated the zpool generation; the pulled disk will not be incorporated back into the pool because the data on it is stale.
 

toblix

Cadet
Joined
Jun 4, 2011
Messages
4
I did delete the files. At first it listed the file names, and it was only after I had deleted them that it listed the codes instead. As for replacing the wrong disk, I did physically replace the right one, but I suspect I might have run zpool replace on one of the good ones (ada0) instead of the one I physically removed (11846808060678788522). At least that would explain the status output, in my admittedly amateur opinion.
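If that theory holds, one possible way out can be sketched as a dry run. This is an assumption about the fix, not something confirmed in this thread: zpool detach is the documented way to drop one side of a replacing vdev, and zpool replace accepts a GUID for a device that is no longer present. The helper print_fix_plan is hypothetical and only prints the commands so they can be reviewed first; it does not touch the pool.

```shell
#!/bin/sh
# Dry run: prints the commands, does not execute them.
# print_fix_plan is a hypothetical helper, not a ZFS command.
print_fix_plan() {
    pool=$1        # pool name, e.g. ZFSPool0
    wrong_new=$2   # new disk that was attached to the wrong member
    missing=$3     # GUID of the member that is really gone

    # Cancel the mistaken replacement by detaching the new disk.
    echo "zpool detach $pool $wrong_new"
    # Replace the member that is actually missing with that disk.
    echo "zpool replace $pool $missing $wrong_new"
    # Check the layout afterwards.
    echo "zpool status -v $pool"
}

# Names taken from the status output earlier in the thread; which disk
# was the mistaken one is an assumption and should be verified first.
print_fix_plan ZFSPool0 ada0 11846808060678788522
```

Device names on FreeBSD can shift when disks are added or removed, so the arguments should be checked against the current zpool status output before any of the printed commands are run by hand.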
 

toblix

Cadet
Joined
Jun 4, 2011
Messages
4
Turns out I was right – I had replaced the wrong disk, and now I've fixed that. Current output is:
Code:
freenas# zpool status -v
  pool: ZFSPool0
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        ZFSPool0    ONLINE       0     0     9
          raidz1    ONLINE       0     0    36
            ada3    ONLINE       0     0     0
            ada4    ONLINE       0     0     0
            ada1    ONLINE       0     0     0
            ada2    ONLINE       0     0     0
        spares
          ada0      AVAIL

errors: Permanent errors have been detected in the following files:

        ZFSPool0:<0x932d1>
        ZFSPool0:<0x932d4>
        ZFSPool0:<0x932e3>
        ZFSPool0:<0x932e6>
        ZFSPool0:<0x932ec>


In my experience, requesting a scrub will make it resilver ada2 (the new disk) for forty hours and find the same errors again. As I said, I'm not interested in the files that were broken (which is why I deleted them, expecting the errors to go away rather than be turned into some sort of cybercodes), but zpool clear ZFSPool0 does nothing. Any tips? Am I going to have to live with these indecipherable errors for the rest of the pool's life?
 