Replaced faulted drive but remaining drives still say DEGRADED

Fongaboo · Apr 14, 2023

ada2 failed. I swapped in another drive, and it resilvered and came online successfully. But the other three drives still show as DEGRADED. How can I remedy that?

Patrick M. Hausen · Apr 14, 2023

Output of zpool status -v, please. As text: it is easiest to log in via SSH and copy & paste. Thanks.

Fongaboo · Apr 14, 2023

Code:

# zpool status -v
  pool: boot-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:01:23 with 0 errors on Fri Apr 14 03:46:23 2023
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          da0p2     ONLINE       0     0     0

errors: No known data errors

  pool: zroot_new
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: resilvered 1.28T in 14:54:39 with 11 errors on Fri Apr  7 11:36:54 2023
config:

        NAME                                            STATE     READ WRITE CKSUM
        zroot_new                                       DEGRADED     0     0     0
          raidz1-0                                      DEGRADED     0     0     0
            gptid/617b66c9-272b-11ed-80ba-d4ae52b98ddc  DEGRADED     0     0    38  too many errors
            gptid/935fa701-d4dc-11ed-9265-d4ae52b98ddc  ONLINE       0     0    40
            gptid/6170b82b-272b-11ed-80ba-d4ae52b98ddc  DEGRADED     0     0    38  too many errors
            gptid/6185bfd4-272b-11ed-80ba-d4ae52b98ddc  DEGRADED     0     0    40  too many errors

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x0>
        <0x2d0c>:<0x2f336b>
        <0x2d0c>:<0x25f5b5>
        <0x3310>:<0x3b87f2>
        <0xb59>:<0x596b8>
        /mnt/zroot_new/USERS/meg/Documents/Old Computer/Desktop/downloaded music/Into the Wild/16 The Rapids & The Canyon.mp3
        /mnt/zroot_new/USERS/Fongaboo.local.bak/AppData/Roaming/Signal/attachments.noindex/32/326be9fcb5c65091a3b823d3bbb39a5590ccacc67c6fbcb3e39460b2b2148451
        /mnt/zroot_new/USERS/Fongaboo.local.bak/AppData/Local/Google/Chrome/User Data/Default/Service Worker/CacheStorage/5a6f7e336992bc24678958dc2f1f9b9eec83593b/0cf0cdd3-67cc-4426-a691-f352c33c1931/84e6de8b9c9d8416_1
        /mnt/zroot_new/USERS/Fongaboo.local.bak/AppData/Roaming/Signal/attachments.noindex/37/37355cbcb898d2f2440441046ec3a2cd9a53d4d33c46371b269649da77861f6b
        <0xa82>:<0x2da50a>
        <0xa82>:<0x2d9fcc>
        <0x1bf3>:<0x907b>
        <0x1bf3>:<0x907d>

I thought I had removed all corrupt files, but I guess not. Would removing the remaining few do the trick?
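To see which of the remaining entries are files that can still be deleted or restored, the error list can be split by type. This is a minimal sketch that reads a saved copy of the `zpool status -v` text on standard input; `list_error_entries` is a hypothetical helper name, not a ZFS command, and the sample input is a few lines copied from the report above.

```shell
#!/bin/sh
# Split the "Permanent errors" list of a saved `zpool status -v` report into
# entries that resolve to a file path and entries that show only object IDs.
# list_error_entries is a hypothetical helper name, not part of ZFS.
list_error_entries() {
    awk '
        /^ *pool:/ { in_list = 0 }                      # a new pool section starts
        /^errors: Permanent errors/ { in_list = 1; next }
        !in_list { next }
        {
            sub(/^[ \t]+/, "")                          # drop leading indentation
            if ($0 == "") next                          # skip blank lines
            first = substr($0, 1, 1)
            if (first == "/") print "path\t" $0
            else if (first == "<") print "object\t" $0
        }
    '
}

# Example with three entries copied from the report above.
list_error_entries <<'EOF'
errors: Permanent errors have been detected in the following files:

        <metadata>:<0x0>
        <0x2d0c>:<0x2f336b>
        /mnt/zroot_new/USERS/meg/Documents/Old Computer/Desktop/downloaded music/Into the Wild/16 The Rapids & The Canyon.mp3
EOF
```

Lines tagged `path` are files that can be removed or restored from backup. Lines tagged `object` are entries ZFS could not map back to a path, which commonly happens when the file has been deleted but its blocks are still referenced, for example by a snapshot. The `<metadata>:<0x0>` entry refers to pool metadata rather than a file, so deleting files is unlikely to clear it.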

Patrick M. Hausen · Apr 14, 2023

To my knowledge the metadata errors are not fixable, at least not in a simple way. Best to do a backup and recreate the pool, IMHO. I think I remember @Samuel Tai giving sound advice in cases like this. Maybe he can give some more help.

Fongaboo · Apr 14, 2023

Erg, I don't want to lose my snapshots. Get an 8TB drive and do a zpool send?

winnielinnie · Apr 14, 2023

Are these drives connected to the same controller? When you get a bunch of checksum errors like that across all drives, it could be your HBA and/or poor connections.
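One quick way to check for that pattern is to read the CKSUM column for every member of the vdev. The sketch below assumes the device lines of `zpool status` are piped in and that member names start with `gptid/`, as they do in this pool; `cksum_summary` is a hypothetical helper name.

```shell
#!/bin/sh
# Report the CKSUM counter for each pool member and say how many members have
# a non-zero count. Assumes member names begin with "gptid/" (true for the
# pool in this thread); cksum_summary is a hypothetical helper name.
cksum_summary() {
    awk '
        $1 ~ "^gptid/" && NF >= 5 {
            total++
            if ($5 != "0") bad++                # column 5 is CKSUM
            printf "%s cksum=%s\n", $1, $5
        }
        END {
            printf "%d of %d devices show checksum errors\n", bad, total
        }
    '
}

# Device lines copied from the zroot_new report earlier in the thread.
cksum_summary <<'EOF'
            gptid/617b66c9-272b-11ed-80ba-d4ae52b98ddc  DEGRADED     0     0    38  too many errors
            gptid/935fa701-d4dc-11ed-9265-d4ae52b98ddc  ONLINE       0     0    40
            gptid/6170b82b-272b-11ed-80ba-d4ae52b98ddc  DEGRADED     0     0    38  too many errors
            gptid/6185bfd4-272b-11ed-80ba-d4ae52b98ddc  DEGRADED     0     0    40  too many errors
EOF
```

Every member reporting a similar count, including the freshly replaced drive, points away from a single bad disk and toward something the drives share. The counters alone do not identify the cause.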

Fongaboo · Apr 14, 2023

Yes. But I have another PCI card with 4 more SATA ports I could migrate them to.

Etorix · Apr 15, 2023

Fongaboo said:
Yes. But I have another PCI card with 4 more SATA ports I could migrate them to.

A PCIe SATA controller?
How are your drives connected? An improper controller could be the cause of your troubles.

Resource (jgreco, www.truenas.com): Multiply your problems with SATA Port Multipliers and cheap SATA controllers - Bad technology, multiplied by more evil. In the last year or two, we've had a resurgence of users asking about SATA Port Multipliers and cheap SATA controllers. Please, do NOT use...

Dice · Apr 16, 2023

Fongaboo said:
Erg, I don't want to lose my snapshots. Get an 8TB drive and do a zpool send?

It is a possibility. I wouldn't expect that sending a broken ZFS dataset/snapshot to another drive would clear the errors.
However, it is still a valid maneuver in the process of getting to a stable state with your data.
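For reference, the replication command is `zfs send` (there is no `zpool send` subcommand), and a recursive replication stream carries the snapshots along. The sketch below only prints the commands so they can be reviewed first. `plan_replication`, the destination pool name `backup`, and the snapshot name `migrate` are placeholders, and the 8TB drive is assumed to already hold a pool of its own.

```shell
#!/bin/sh
# Print (do not run) the commands for copying a pool, snapshots included, into
# a dataset on a second pool. plan_replication and the names passed to it are
# placeholders for illustration.
plan_replication() {
    src=$1      # source pool
    dst=$2      # destination pool (must already exist)
    snap=$3     # name for the new recursive snapshot
    # -r snapshots every dataset in the source pool under the same name.
    echo "zfs snapshot -r ${src}@${snap}"
    # -R sends descendant datasets, their earlier snapshots and properties;
    # -u keeps the received file systems unmounted on the destination.
    echo "zfs send -R ${src}@${snap} | zfs receive -u ${dst}/${src}"
}

plan_replication zroot_new backup migrate
```

With permanent errors in the pool, the send may stop with an I/O error when it reaches a block it cannot read, so it is worth watching the output rather than assuming the copy completed.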

NickF · Apr 16, 2023

If your data is replaceable and you can risk losing it, you can run
zpool clear POOLNAME
That command will remove the checksum error entries and get rid of the alerts, but the checksum errors occurred for a reason.
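Because clearing only resets the counters, it is usually paired with a scrub to see whether the errors come back. A minimal sketch of that sequence follows; `clear_and_rescrub` is a hypothetical helper name, and the `ZPOOL` variable is an addition for this example that allows the command to be swapped out for a dry run.

```shell
#!/bin/sh
# Clear the error counters, start a scrub, then show the pool status.
# clear_and_rescrub is a hypothetical helper; ZPOOL defaults to the real
# zpool binary and can be set to "echo" to preview the arguments instead.
clear_and_rescrub() {
    pool=$1
    "${ZPOOL:-zpool}" clear "$pool" &&
    "${ZPOOL:-zpool}" scrub "$pool" &&
    "${ZPOOL:-zpool}" status -v "$pool"
}

# Usage on the pool from this thread (not executed here):
#   clear_and_rescrub zroot_new
```

The scrub runs in the background, so the status shown right after starting it reflects only the beginning of the pass. If the counters climb again during the scrub, the underlying cause is still present.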

But the other folks here are most likely correct. It sounds like your controller is the issue. If you are using a crappy Amazon 4-port SATA to PCI-E card, these types of problems are always going to come up. Those cards are terrible and will destroy your data.
