Replaced faulted drive but remaining drives still say DEGRADED

Fongaboo

Dabbler
Joined
Nov 12, 2013
Messages
30
2023-04-14_18-26-49.png


ada2 failed. I swapped in another drive, resilvered it successfully, and brought it online. But the other three drives still show as DEGRADED. How can I remedy that?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Please post the output of zpool status -v as text. The easiest way is to log in via SSH and copy and paste. Thanks.
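One way to grab that output as a text file, as a minimal sketch. The helper name save_pool_status and the host root@truenas.local are made up for illustration; substitute whatever you normally use to SSH into the box.

```shell
# Hypothetical helper: run `zpool status -v` on the NAS over SSH and save
# the plain-text output locally so it can be pasted into a post.
# $1 = SSH target (placeholder, e.g. root@truenas.local)
# $2 = output file (defaults to zpool-status.txt)
save_pool_status() {
    target="$1"
    outfile="${2:-zpool-status.txt}"
    ssh "$target" zpool status -v > "$outfile"
}

# Usage (host name is an assumption):
#   save_pool_status root@truenas.local
```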
 

Fongaboo

Dabbler
Joined
Nov 12, 2013
Messages
30
Code:
# zpool status -v
  pool: boot-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:01:23 with 0 errors on Fri Apr 14 03:46:23 2023
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          da0p2     ONLINE       0     0     0

errors: No known data errors

  pool: zroot_new
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: resilvered 1.28T in 14:54:39 with 11 errors on Fri Apr  7 11:36:54 2023
config:

        NAME                                            STATE     READ WRITE CKSUM
        zroot_new                                       DEGRADED     0     0     0
          raidz1-0                                      DEGRADED     0     0     0
            gptid/617b66c9-272b-11ed-80ba-d4ae52b98ddc  DEGRADED     0     0    38  too many errors
            gptid/935fa701-d4dc-11ed-9265-d4ae52b98ddc  ONLINE       0     0    40
            gptid/6170b82b-272b-11ed-80ba-d4ae52b98ddc  DEGRADED     0     0    38  too many errors
            gptid/6185bfd4-272b-11ed-80ba-d4ae52b98ddc  DEGRADED     0     0    40  too many errors

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x0>
        <0x2d0c>:<0x2f336b>
        <0x2d0c>:<0x25f5b5>
        <0x3310>:<0x3b87f2>
        <0xb59>:<0x596b8>
        /mnt/zroot_new/USERS/meg/Documents/Old Computer/Desktop/downloaded music/Into the Wild/16 The Rapids & The Canyon.mp3
        /mnt/zroot_new/USERS/Fongaboo.local.bak/AppData/Roaming/Signal/attachments.noindex/32/326be9fcb5c65091a3b823d3bbb39a5590ccacc67c6fbcb3e39460b2b2148451
        /mnt/zroot_new/USERS/Fongaboo.local.bak/AppData/Local/Google/Chrome/User Data/Default/Service Worker/CacheStorage/5a6f7e336992bc24678958dc2f1f9b9eec83593b/0cf0cdd3-67cc-4426-a691-f352c33c1931/84e6de8b9c9d8416_1
        /mnt/zroot_new/USERS/Fongaboo.local.bak/AppData/Roaming/Signal/attachments.noindex/37/37355cbcb898d2f2440441046ec3a2cd9a53d4d33c46371b269649da77861f6b
        <0xa82>:<0x2da50a>
        <0xa82>:<0x2d9fcc>
        <0x1bf3>:<0x907b>
        <0x1bf3>:<0x907d>


I thought I had removed all corrupt files, but I guess not. Would removing the remaining few do the trick?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
To my knowledge the metadata errors are not fixable, at least not in a simple way. Best to do a backup and recreate the pool, IMHO. I think I remember @Samuel Tai giving sound advice in cases like this. Maybe he can give some more help.
 
Joined
Oct 22, 2019
Messages
3,641
Are these drives connected to the same controller? When you get a bunch of checksum errors like that across all drives, it could be your HBA and/or poor connections.
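To see at a glance whether the checksum errors are spread across every member (which points at something shared, like the controller or cabling) rather than at a single disk, the CKSUM column can be tallied from saved zpool status output. This is only a rough sketch: the function name cksum_by_device is made up, and it assumes the NAME / STATE / READ / WRITE / CKSUM column layout from the output posted above.

```shell
# Hypothetical helper: read `zpool status` text on stdin and print the
# name and CKSUM value of every config row whose CKSUM column is not 0.
cksum_by_device() {
    awk '
        # The header row marks the start of a config table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # A blank line or the errors: line marks the end of the table.
        /^errors:/ || NF == 0         { in_config = 0 }
        # Field 5 is CKSUM; print rows where it is anything other than 0.
        in_config && NF >= 5 && $5 != "0" { print $1, $5 }
    '
}

# Usage:
#   zpool status -v | cksum_by_device
```

If every member of the vdev shows up with a similar count, a shared component is a more likely suspect than four disks failing at once.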
 

Fongaboo

Dabbler
Joined
Nov 12, 2013
Messages
30
Yes. But I have another PCI card with 4 more SATA ports I could migrate them to.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Fongaboo said: "Yes. But I have another PCI card with 4 more SATA ports I could migrate them to."
A PCIe SATA controller?
How are your drives connected? An improper controller could be the cause of your troubles.
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
Erg, I don't want to lose my snapshots. Get an 8TB drive and do a zfs send?
It is a possibility. I wouldn't expect that sending a broken ZFS dataset/snapshot to another drive would clear the errors.
However, it is still a valid maneuver in the process of getting to a stable state with your data.
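For reference, the zfs send route usually looks something like the sketch below. The helper only prints the commands so they can be reviewed first. The function name, the destination pool name backup8tb, and the snapshot name migrate are all placeholders, not anything from this system. Note that a send may abort when it reaches a damaged block, so this is not guaranteed to complete on a pool with permanent errors.

```shell
# Hypothetical helper: print (not run) the commands for copying a pool,
# snapshots included, to a second pool.
# $1 = source pool, $2 = destination pool, $3 = snapshot name
print_replication_plan() {
    src="$1"; dst="$2"; snap="$3"
    # Recursive snapshot so every dataset has a common point to send from.
    echo "zfs snapshot -r ${src}@${snap}"
    # -R sends descendant datasets together with their existing snapshots;
    # -u leaves the received datasets unmounted on the destination.
    echo "zfs send -R ${src}@${snap} | zfs receive -u ${dst}/${src}"
}

# Usage (all three names are placeholders):
#   print_replication_plan zroot_new backup8tb migrate
```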
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
If your data is replaceable and you can risk losing it, you can run:
zpool clear POOLNAME
That command will remove the checksum error entries and get rid of the alerts, but the checksum errors occurred for a reason.
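If you do go that way, it is probably worth following the clear with a scrub so you can see whether the errors come straight back. A minimal sketch, assuming nothing beyond the stock zpool subcommands; the wrapper name clear_and_verify and the DRY_RUN switch are made up for illustration.

```shell
# Hypothetical helper: clear the error counters, then scrub so any
# recurring checksum errors show up again.
# Prints the commands by default; set DRY_RUN=0 to actually run them.
# $1 = pool name
clear_and_verify() {
    pool="$1"
    for cmd in "clear" "scrub" "status -v"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "zpool $cmd $pool"
        else
            # $cmd is left unquoted on purpose so "status -v" splits in two.
            zpool $cmd "$pool"
        fi
    done
}

# Usage:
#   clear_and_verify zroot_new              # review the commands
#   DRY_RUN=0 clear_and_verify zroot_new    # run them
```

The scrub runs in the background, so re-run zpool status -v later to see the final counters.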

But the other folks here are most likely correct. It sounds like your controller is the issue. If you are using a crappy Amazon 4-port SATA to PCI-E card, these types of problems are always going to come up. Those cards are terrible and will destroy your data.
 