Identifying disk with checksum error

Patrick_3000 · Mar 18, 2024

I have an HDD pool consisting of a three-way mirror: two 10 TB Seagate Ironwolf Pro drives that I bought new approximately 1.5 years ago, and a 12 TB NAS drive that I bought used (refurbished) and rebranded around the same time.

Yesterday, during a weekly scrub, SCALE found a checksum error of 1 bit on one of the drives and now says that the pool is unhealthy due to an unrecoverable error. Unfortunately, I am unable to determine which drive had the checksum error, although I strongly suspect that it was one of the Ironwolf Pro drives. (Incidentally, I ran another scrub today, and all three drives came back with zero checksum errors.)

The problem is that "zpool status" identifies drives only by what looks to be a UUID, and the UUID shown for the drive that had the checksum error does not correspond to any UUID revealed by the "fdisk -l" command, although it is extremely close (but not identical) to the UUIDs of both Ironwolf Pro drives.

So, does anyone know how to determine which drive had a checksum error from the output of "zpool status"?

chuck32 · Mar 18, 2024

Try zpool status -LP

Patrick_3000 · Mar 19, 2024

chuck32 said:
Try zpool status -LP

That doesn't show any more information regarding the disk identification than "zpool status" does. There appears to be something wrong in that the identifier of the disk using zpool status, which appears to be a UUID, does not match the UUIDs shown with "fdisk -l."

In any case, I have run another scrub since I got the checksum error, and it showed no errors. I am in the process of running a "long" SMART test on all disks and will see if any errors are reported. If no errors are reported, I'll assume the checksum error was a fluke and not do anything further for the moment.

chuck32 · Mar 20, 2024

Sorry, then I misunderstood what you are trying to achieve.

I'm on mobile now so I can't check. The command I mentioned should help you identify the drive by its sdX / adX name.

That name can then be used in the GUI or with smartctl to identify the disk by its serial number and to see which SMART test results belong to the disk that had errors.

What information are you trying to piece together?

Doesn't fdisk list the partition tables?
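As a rough illustration of the zpool status -LP suggestion: with -L and -P the device column should show resolved full paths such as /dev/sdb2, so the member with a nonzero CKSUM count can be picked out mechanically. The sketch below is only that, a sketch. The pool name "tank", the device names, and the sample output are made up for illustration, and the function name is hypothetical.

```python
import subprocess

# Hypothetical sample of "zpool status -LP" output; pool and device
# names are invented for illustration, not taken from the thread.
SAMPLE_STATUS = """\
  pool: tank
 state: ONLINE
config:

	NAME           STATE     READ WRITE CKSUM
	tank           ONLINE       0     0     0
	  mirror-0     ONLINE       0     0     0
	    /dev/sda2  ONLINE       0     0     0
	    /dev/sdb2  ONLINE       0     0     1
	    /dev/sdc2  ONLINE       0     0     0

errors: No known data errors
"""


def devices_with_checksum_errors(status_text):
    """Return (device_path, cksum) pairs for members whose CKSUM column is not "0"."""
    hits = []
    for line in status_text.splitlines():
        fields = line.split()
        # Only leaf devices are printed as full /dev/... paths with -P.
        if len(fields) >= 5 and fields[0].startswith("/dev/"):
            name, _state, _read, _write, cksum = fields[:5]
            if cksum != "0":
                hits.append((name, cksum))
    return hits


def read_pool_status(pool):
    """Run the real command (needs ZFS installed and usually root)."""
    result = subprocess.run(
        ["zpool", "status", "-LP", pool],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


print(devices_with_checksum_errors(SAMPLE_STATUS))
```

On a live system, passing the output of read_pool_status() for your own pool name instead of SAMPLE_STATUS should work the same way. Note that the counters are reset by "zpool clear", so this only helps while the error is still shown.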

danb35 · Mar 20, 2024

Patrick_3000 said:
The problem is that "zpool status" identifies drives only by what looks to be a UUID

Why not look at the pool status page in the GUI? It will list the devices by name (e.g., sda, sdb, etc.). From there, the Disks page will list serial numbers.

Patrick_3000 said:
There appears to be something wrong in that the identifier of the disk using zpool status, which appears to be a UUID, does not match the UUIDs shown with "fdisk -l."

It's a partition UUID. lsblk -o NAME,SIZE,PARTUUID should give you the information you need if you want to do it the hard way.
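For anyone who wants to script "the hard way", here is a minimal sketch that matches a partition UUID from zpool status against lsblk output. It assumes an lsblk recent enough to support JSON output (-J) and adds the SERIAL column to the ones mentioned above. The sample data, the UUIDs, the serial numbers, and the function names are all hypothetical.

```python
import json
import subprocess

# Hypothetical data in the shape produced by
# "lsblk -J -o NAME,SIZE,SERIAL,PARTUUID"; every value is invented.
SAMPLE_LSBLK = {
    "blockdevices": [
        {"name": "sda", "size": "9.1T", "serial": "SERIAL-A", "partuuid": None,
         "children": [
             {"name": "sda1", "size": "2G", "serial": None,
              "partuuid": "11111111-aaaa-4aaa-8aaa-000000000001"},
             {"name": "sda2", "size": "9.1T", "serial": None,
              "partuuid": "11111111-aaaa-4aaa-8aaa-000000000002"},
         ]},
        {"name": "sdb", "size": "9.1T", "serial": "SERIAL-B", "partuuid": None,
         "children": [
             {"name": "sdb2", "size": "9.1T", "serial": None,
              "partuuid": "22222222-bbbb-4bbb-8bbb-000000000002"},
         ]},
    ]
}


def load_lsblk():
    """Run lsblk and return its JSON output as a dict."""
    result = subprocess.run(
        ["lsblk", "-J", "-o", "NAME,SIZE,SERIAL,PARTUUID"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def find_disk_by_partuuid(lsblk_data, partuuid):
    """Return the disk, partition and serial that own the given partition UUID."""
    wanted = partuuid.lower()
    for disk in lsblk_data.get("blockdevices", []):
        for part in disk.get("children", []):
            if (part.get("partuuid") or "").lower() == wanted:
                return {
                    "disk": disk["name"],
                    "partition": part["name"],
                    "serial": disk.get("serial"),
                }
    return None  # no partition carries that UUID


print(find_disk_by_partuuid(SAMPLE_LSBLK, "22222222-BBBB-4BBB-8BBB-000000000002"))
```

On a real system, find_disk_by_partuuid(load_lsblk(), "<uuid from zpool status>") should return the sdX name and the serial number to look for on the drive label. The comparison ignores case because different tools may print the same UUID in different case.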
