Looking for steps to recover pool on a 4 disk RAIDZ1

TheUsD

Contributor
Joined
May 17, 2013
Messages
116
Running:
OS: TrueNAS-12.0-U2.1
CPU: i5-8500
RAM: 32GB DDR4

Looking for guidance, steps, an article, or a solved post that describes the process to recover a pool on a 4-disk RAIDZ1.
The current RAIDZ1 pool, which I'll call "Pool 2", consists of a single vdev of four 4TB disks. I have four 8TB disks being delivered today. I would like to move the current data from "Pool 2" to my other pool, "Pool 1", and then replace the degraded "Pool 2" with the four new 8TB drives.

Alerts are showing several critical errors for a single disk.
Errors are:

1) Pool-02 state is DEGRADED: One or more devices are faulted in response to IO failures.
The following devices are not healthy:
  • Disk 1<SerialNumber> is FAULTED

2) Device: /dev/da5 [SAT], 120 Currently unreadable (pending) sectors.

3) Device: /dev/da5 [SAT], 120 Offline uncorrectable sectors.

4) Device: /dev/da5 [SAT], not capable of SMART self-check.

5) Device: /dev/da5 [SAT], failed to read SMART Attribute Data.

6) Device: /dev/da5 [SAT], Read SMART Self-Test Log Failed.

7) Device: /dev/da5 [SAT], Read SMART Error Log Failed.


What confuses me is that all of the logs point to disk da5, but when I look at the pool status I see the following:
1635611065423.png

The pool uses disks da4, da5, da6, and da7 for storage. Disk nvd0p1 is the log device and disk nvd1p1 is the cache device.

Thanks in advance for any help.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Look at the numbers in the columns. Those are the counts for read errors, write errors, and checksum errors. Your pool has 2 faulty drives, one of which had so many errors it was placed in FAULTED status. This is the drive in error 1. da5 has a much lower number of read/write errors, and corresponds to errors 2-7.
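For anyone who would rather check those columns from a script than read them off a screenshot, here is a minimal Python sketch that scans `zpool status` text and lists every row with a nonzero READ, WRITE, or CKSUM count. The sample text, the device names (`disk-a` to `disk-d`), and the counts are invented for illustration and are not the values from the screenshot above; `devices_with_errors` is a hypothetical helper name.

```python
import re

# Hypothetical sample: device names and counts are made up for illustration.
SAMPLE = """\
  pool: Pool-02
 state: DEGRADED
config:

\tNAME          STATE     READ WRITE CKSUM
\tPool-02       DEGRADED     0     0     0
\t  raidz1-0    DEGRADED     0     0     0
\t    disk-a    FAULTED     40    95     0  too many errors
\t    disk-b    ONLINE       3     2     0
\t    disk-c    ONLINE       0     0     0
\t    disk-d    ONLINE       0     0     0

errors: No known data errors
"""

# An indented row: NAME, STATE, then the READ, WRITE, and CKSUM columns.
ROW = re.compile(r"^\s+(\S+)\s+([A-Z]+)\s+(\S+)\s+(\S+)\s+(\S+)")


def devices_with_errors(status_text):
    """Return (name, state, read, write, cksum) for rows with any nonzero count."""
    flagged = []
    for line in status_text.splitlines():
        match = ROW.match(line)
        if not match or match.group(1) == "NAME":
            continue  # not a device row, or the column header itself
        name, state = match.group(1), match.group(2)
        counts = match.groups()[2:]
        # Limitation of this sketch: abbreviated counts such as "1.2K" are skipped.
        if not all(c.isdigit() for c in counts):
            continue
        read, write, cksum = (int(c) for c in counts)
        if read or write or cksum:
            flagged.append((name, state, read, write, cksum))
    return flagged


for row in devices_with_errors(SAMPLE):
    print(row)
```

In this made-up sample, two devices are flagged: one FAULTED with high counts, and one still ONLINE with a few errors. That mirrors the situation described above, where the SMART alerts for da5 belong to the second, less damaged drive.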
 

TheUsD

Contributor
Joined
May 17, 2013
Messages
116
Thank you for the clarification. I did not notice that but I now understand. I rebooted the storage container and the pool was able to come back up. I am now exporting the data to Pool 1. Hopefully it will stay alive long enough to complete the task.
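The post does not say how the data is being copied. One common approach is ZFS replication with a recursive snapshot followed by `zfs send` piped into `zfs receive`. The Python sketch below only builds and prints those two commands and does not run them. The source pool name `Pool-02` is taken from the alert text; the destination dataset `Pool-01/pool02-copy`, the snapshot name `migrate`, and the helper `migration_commands` are assumptions made for illustration. On a pool with failing disks, the send may still abort on unreadable blocks.

```python
import shlex


def migration_commands(src_pool, dst_dataset, snap="migrate"):
    """Build (but do not run) the commands for a recursive snapshot and replication."""
    snapshot = f"{src_pool}@{snap}"
    return [
        # -r: snapshot every dataset in the source pool at the same moment.
        f"zfs snapshot -r {shlex.quote(snapshot)}",
        # send -R: replicate the whole dataset tree; receive -u: do not mount on arrival.
        f"zfs send -R {shlex.quote(snapshot)} | zfs receive -u {shlex.quote(dst_dataset)}",
    ]


# Destination dataset name is hypothetical; adjust to the real layout of Pool 1.
for command in migration_commands("Pool-02", "Pool-01/pool02-copy"):
    print(command)
```

Printing the commands first makes it possible to review the pool and dataset names before anything touches the degraded pool.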
 