Bad Drive

JohnFLi

Contributor
Joined
Sep 26, 2016
Messages
139
Hello all.
I have a bad drive that I need to deal with.
[Attached image: baddrive (2).jpg]


Now, can I just yank out the offending drive and slide a new one in its place?
 
Joined
Jan 7, 2015
Messages
1,155
Yeah, pretty much. Use the GUI to offline the disk, then physically remove it, install the new disk, trigger the replacement from the GUI, and wait for the resilver to finish.
 

JohnFLi

Sounds good... BUT I'm a bit unsure of a couple of things.
FreeNAS is saying:
CRITICAL: March 22, 2021, 2:38 a.m. - The volume PhotoArchive state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.

Yet when I go to that volume and look at the disks, the one listed (in my original post pic) is not part of that volume. I went and looked at the disks on all of the volumes, and that bad disk isn't part of anything.

Is there a command or something I can run to verify that it isn't part of anything?
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Run zpool status -v and camcontrol devlist, and post the output within [CODE] tags please, especially considering your number of drives.
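
With this many drives it can help to reduce the camcontrol devlist output to bare device names before comparing it with the pool layout. The helper below is only a sketch: cam_devices is a made-up name, not a FreeNAS or FreeBSD tool, and it assumes each line ends with a parenthesised pair such as (pass2,da0).

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeNAS): reads `camcontrol devlist`
# output on stdin and prints one disk device name per line.
cam_devices() {
    # Keep only the text inside the trailing parentheses, split it on
    # commas, and drop the passN passthrough nodes.
    sed -n 's/.*(\(.*\))[[:space:]]*$/\1/p' | tr ',' '\n' | grep -v '^pass'
}
```

Usage would be camcontrol devlist | cam_devices. On FreeBSD, glabel status lists each gptid label next to the partition it lives on (for example da30p2), so a disk whose partitions carry none of the gptid labels shown by zpool status should not be a member of any imported pool.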
 

JohnFLi

Thank you for your assistance with this.
Code:
zpool status -v
  pool: PhotoArchive
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 0 in 0 days 19:33:00 with 0 errors on Sun Feb  7 19:33:08 2021
config:

        NAME                                            STATE     READ WRITE CKSUM
        PhotoArchive                                    ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/e1d0a939-2bbe-11e8-8c1e-a0369fb4a0dc  ONLINE       0     0     2
            gptid/e25e697c-2bbe-11e8-8c1e-a0369fb4a0dc  ONLINE       0     0     0
            gptid/e2f06a22-2bbe-11e8-8c1e-a0369fb4a0dc  ONLINE       0     0     0
            gptid/e36d1caf-2bbe-11e8-8c1e-a0369fb4a0dc  ONLINE       0     0     0
            gptid/e3f9afed-2bbe-11e8-8c1e-a0369fb4a0dc  ONLINE       0     0     0
            gptid/e47daea6-2bbe-11e8-8c1e-a0369fb4a0dc  ONLINE       0     0     0
          raidz2-1                                      ONLINE       0     0     0
            gptid/64b1bb8a-2bc0-11e8-8c1e-a0369fb4a0dc  ONLINE       0     0     0
            gptid/65b5dbd1-2bc0-11e8-8c1e-a0369fb4a0dc  ONLINE       0     0     0
            gptid/664e579a-2bc0-11e8-8c1e-a0369fb4a0dc  ONLINE       0     0     0
            gptid/67682e7f-2bc0-11e8-8c1e-a0369fb4a0dc  ONLINE       0     0     0
            gptid/67f4f0a1-2bc0-11e8-8c1e-a0369fb4a0dc  ONLINE       0     0     0
            gptid/690c69bb-2bc0-11e8-8c1e-a0369fb4a0dc  ONLINE       0     0     0

errors: No known data errors


Code:
root@G1PPFreeNas01:~ # camcontrol devlist
<SanDisk SD8SBAT128G1122 Z2333000>  at scbus0 target 0 lun 0 (pass0,ada0)
<SanDisk SD8SBAT128G1122 Z2333000>  at scbus1 target 0 lun 0 (pass1,ada1)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 0 lun 0 (pass2,da0)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 1 lun 0 (pass3,da1)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 2 lun 0 (pass4,da2)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 3 lun 0 (pass5,da3)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 4 lun 0 (pass6,da4)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 5 lun 0 (pass7,da5)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 6 lun 0 (pass8,da6)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 7 lun 0 (pass9,da7)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 8 lun 0 (pass10,da8)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 9 lun 0 (pass11,da9)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 10 lun 0 (pass12,da10)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 11 lun 0 (pass13,da11)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 12 lun 0 (pass14,da12)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 13 lun 0 (pass15,da13)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 14 lun 0 (pass16,da14)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 15 lun 0 (pass17,da15)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 16 lun 0 (pass18,da16)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 17 lun 0 (pass19,da17)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 18 lun 0 (pass20,da18)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 19 lun 0 (pass21,da19)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 20 lun 0 (pass22,da20)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 21 lun 0 (pass23,da21)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 22 lun 0 (pass24,da22)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 23 lun 0 (pass25,da23)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 24 lun 0 (pass26,da24)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 25 lun 0 (pass27,da25)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 26 lun 0 (pass28,da26)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 27 lun 0 (pass29,da27)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 28 lun 0 (pass30,da28)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 29 lun 0 (pass31,da29)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 30 lun 0 (pass32,da30)
<ATA WDC WD101KRYZ-01 01.0>        at scbus11 target 31 lun 0 (pass33,da31)
<ATA WDC WD101KRYZ-01 01.0>        at scbus12 target 0 lun 0 (pass34,da32)
<ATA WDC WD101KRYZ-01 01.0>        at scbus12 target 1 lun 0 (pass35,da33)
<ATA WDC WD101KRYZ-01 01.0>        at scbus12 target 2 lun 0 (pass36,da34)
<ATA WDC WD101KRYZ-01 01.0>        at scbus12 target 3 lun 0 (pass37,da35)
<ATA WDC WD101KRYZ-01 01.0>        at scbus12 target 4 lun 0 (pass38,da36)
<ATA WDC WD101KRYZ-01 01.0>        at scbus12 target 5 lun 0 (pass39,da37)
<ATA WDC WD101KRYZ-01 01.0>        at scbus12 target 6 lun 0 (pass40,da38)
<Kingston DataTraveler 3.0 PMAP>   at scbus13 target 0 lun 0 (pass41,da39)
 

Etorix

So one drive (should be /dev/da30) had two checksum errors. Probably already repaired by ZFS…
The next step should be smartctl -a /dev/daXX for this drive.
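
To pick the affected member out of a long zpool status listing without reading every row, something like the following could work. It is a sketch, not an official tool: vdevs_with_errors is a hypothetical name, and it assumes the config table has the NAME STATE READ WRITE CKSUM header shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: reads `zpool status` output on stdin and prints
# every config row whose READ, WRITE or CKSUM counter is not "0".
vdevs_with_errors() {
    awk '
        # The header row marks the start of the config table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The "errors:" summary line marks the end of the table.
        /^errors:/ { in_config = 0 }
        in_config && NF >= 5 && ($3 != "0" || $4 != "0" || $5 != "0") {
            printf "%s read=%s write=%s cksum=%s\n", $1, $3, $4, $5
        }
    '
}
```

Running zpool status -v PhotoArchive | vdevs_with_errors against the listing above should print only the gptid/e1d0a939-... row. Since the daXX number is a guess at this point, glabel status | grep e1d0a939 is one way to confirm which device that label actually sits on before pointing smartctl at it.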
 
Joined
Jan 7, 2015
Messages
1,155
You might also try clearing the error with zpool clear tank, then running a fresh scrub with zpool scrub tank (substitute your pool name for tank) and seeing what you get. If there are in fact errors, this new scrub will find and repair them. I also agree that a full battery of SMART tests with smartctl -t long /dev/da## (or just reading the SMART data as @Etorix said, if you run these tests frequently) is the next immediate step.
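
Put together, the clear, scrub and SMART steps might look roughly like this. Treat it as a sketch under assumptions: the pool name is taken from the zpool status output earlier in the thread, /dev/da30 is only the guess made above and should be confirmed first, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only. POOL comes from the zpool status output in this thread;
# DISK is an assumption (the thread guesses da30) and must be confirmed.
POOL="PhotoArchive"
DISK="/dev/da30"

# Reads `zpool status` output on stdin; succeeds while a scrub is running.
scrub_in_progress() {
    grep -q 'scrub in progress'
}

# Hypothetical wrapper: clear the counters, scrub, wait, show the result.
clear_and_rescrub() {
    zpool clear "$POOL" || return 1
    zpool scrub "$POOL" || return 1
    while zpool status "$POOL" | scrub_in_progress; do
        sleep 600    # the previous scrub of this pool took about 19.5 hours
    done
    zpool status -v "$POOL"
}

# Hypothetical wrapper: queue a long SMART self-test on the suspect disk.
start_long_smart_test() {
    smartctl -t long "$DISK"
}
```

Nothing runs until the functions are called. One plausible order is clear_and_rescrub first, then start_long_smart_test, and once the self-test has finished, smartctl -a "$DISK" to read the results.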
 

JohnFLi

Thank you for your suggestions.
I ran
Code:
zpool clear PhotoArchive gptid/e1d0a939-2bbe-11e8-8c1e-a0369fb4a0dc


That cleared the error. Then I ran
Code:
zpool status -v

It now shows everything is good again.

I will try scrubbing the pool and see what happens.
 

JohnFLi

Well, the scrub completed with no reports of disk issues.

Thank you all for your assistance.
 