Looking for steps to recover pool on a 4 disk RAIDZ1

TheUsD · Oct 30, 2021

Running:
OS: TrueNAS-12.0-U2.1
CPU: i5-8500
RAM: 32GB DDR4

I'm looking for guidance, steps, an article, or a solved post that describes the process to recover a pool on a 4-disk RAIDZ1.
The current RAIDZ1 pool, which I'll call "Pool 2", has a single vdev of four 4TB disks. I have four 8TB disks being delivered today and would like to move the current data from "Pool 2" to my other pool, "Pool 1", then replace the degraded "Pool 2" with the four new 8TB drives.

Alerts are showing several critical errors for a single disk.
Errors are:

1) Pool-02 state is DEGRADED: One or more devices are faulted in response to IO failures.
The following devices are not healthy:

Disk 1<SerialNumber> is FAULTED

2) Device: /dev/da5 [SAT], 120 Currently unreadable (pending) sectors.

3) Device: /dev/da5 [SAT], 120 Offline uncorrectable sectors.

4) Device: /dev/da5 [SAT], not capable of SMART self-check.

5) Device: /dev/da5 [SAT], failed to read SMART Attribute Data.

6) Device: /dev/da5 [SAT], Read SMART Self-Test Log Failed.

7) Device: /dev/da5 [SAT], Read SMART Error Log Failed.

What confuses me is that all the logs point to disk da5; however, when I look at the pool status I see the following:

The pool contains the following disks: da4, da5, da6, and da7 for storage. Disk nvd0p1 is for logging and disk nvd1p1 is for cache.

Thanks in advance for any help.

Samuel Tai · Oct 30, 2021

Look at the numbers in the columns. Those are the counts for read errors, write errors, and checksum errors. Your pool has 2 faulty drives, one of which had so many errors it was placed in FAULTED status. This is the drive in error 1. da5 has a much lower number of read/write errors, and corresponds to errors 2-7.
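
To illustrate reading those columns, here is a minimal Python sketch that scans `zpool status`-style text and lists devices that are in a bad state or show a non-zero READ, WRITE, or CKSUM count. The sample text, the device names (`diskA` to `diskD`), and the counts are invented for illustration and are not the poster's actual output; the function name and the set of "bad" states are also assumptions.

```python
import re

# One config row: NAME, STATE, then the READ / WRITE / CKSUM columns.
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d\S*)\s+(\d\S*)\s+(\d\S*)\s*$")

# Assumption: states treated as "bad" for a leaf device in this sketch.
BAD_STATES = {"FAULTED", "UNAVAIL", "REMOVED", "OFFLINE"}

# Hypothetical sample; names and numbers are made up.
SAMPLE_STATUS = """\
  pool: Pool-02
 state: DEGRADED
config:

    NAME        STATE     READ WRITE CKSUM
    Pool-02     DEGRADED     0     0     0
      raidz1-0  DEGRADED     0     0     0
        diskA   ONLINE       0     0     0
        diskB   ONLINE      12     3     0
        diskC   FAULTED     85   210     0
        diskD   ONLINE       0     0     0

errors: No known data errors
"""

def devices_with_errors(status_text):
    """Return (name, state, read, write, cksum) for suspicious rows.

    Counts are kept as strings, because large values may be printed
    in abbreviated form; anything other than "0" counts as an error.
    """
    flagged = []
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue
        name, state, read, write, cksum = match.groups()
        has_errors = any(count != "0" for count in (read, write, cksum))
        if state in BAD_STATES or has_errors:
            flagged.append((name, state, read, write, cksum))
    return flagged

if __name__ == "__main__":
    for row in devices_with_errors(SAMPLE_STATUS):
        print(row)
```

In the made-up sample, two rows are flagged: one disk that is FAULTED with high counts, and one that is still ONLINE but has a smaller number of read/write errors, which mirrors the two-bad-drives situation described above.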

TheUsD · Oct 30, 2021

Samuel Tai said:
Look at the numbers in the columns. Those are the counts for read errors, write errors, and checksum errors. Your pool has 2 faulty drives, one of which had so many errors it was placed in FAULTED status. This is the drive in error 1. da5 has a much lower number of read/write errors, and corresponds to errors 2-7.

Thank you for the clarification. I did not notice that but I now understand. I rebooted the storage container and the pool was able to come back up. I am now exporting the data to Pool 1. Hopefully it will stay alive long enough to complete the task.
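
For anyone following along, one common way to copy a whole pool's datasets to another pool is a recursive snapshot followed by `zfs send -R` piped into `zfs receive`. The thread does not say which method was actually used, so the Python sketch below is only an assumption. It builds the command strings without running anything, and the pool, dataset, and snapshot names are hypothetical placeholders.

```python
import shlex

def build_copy_commands(src_pool, dst_dataset, snap_name):
    """Build (but do not run) shell commands for a snapshot-based copy.

    src_pool, dst_dataset and snap_name are placeholders chosen by the
    caller; nothing here touches a real pool.
    """
    snapshot = f"{src_pool}@{snap_name}"
    # Recursive snapshot of the source pool and all child datasets.
    take_snapshot = shlex.join(["zfs", "snapshot", "-r", snapshot])
    # -R sends the snapshot with its descendant datasets and properties.
    send = shlex.join(["zfs", "send", "-R", snapshot])
    # -u receives without mounting the new datasets on the destination.
    receive = shlex.join(["zfs", "receive", "-u", dst_dataset])
    return [take_snapshot, f"{send} | {receive}"]

if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for command in build_copy_commands("Pool-02", "Pool-01/pool02-copy", "migrate"):
        print(command)
```

Printing the commands first makes it easy to review them before running anything against a pool that is already degraded.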
