CAM status: Uncorrectable parity/CRC error on 5 drives

SwisherSweet · Nov 26, 2017

Hi,

I'm getting "CAM status: Uncorrectable parity/CRC error" on 5 drives according to my daily security output run. I'd appreciate any help and advice on how to proceed to keep my data protected.

(ada5:mvsch1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 18 4e 63 40 9a 00 00 00 00 00
(ada6:mvsch1:0:1:0): WRITE_FPDMA_QUEUED. ACB: 61 a8 98 59 63 40 9a 00 00 00 00 00
(ada8:mvsch1:0:3:0): WRITE_FPDMA_QUEUED. ACB: 61 f0 c0 4d 63 40 9a 00 00 00 00 00
(ada7:mvsch1:0:2:0): WRITE_FPDMA_QUEUED. ACB: 61 00 c0 4c 63 40 9a 00 00 01 00 00
(ada9:mvsch1:0:4:0): WRITE_FPDMA_QUEUED. ACB: 61 f0 e8 50 63 40 9a 00 00 00 00 00
(ada5:mvsch1:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada6:mvsch1:0:1:0): CAM status: Uncorrectable parity/CRC error
(ada5:(ada6:(ada7:mvsch1:0:2:0): CAM status: Uncorrectable parity/CRC error
mvsch1:0:(ada8:mvsch1:0:3:0): CAM status: Uncorrectable parity/CRC error
1:(ada7:0): mvsch1:0:Retrying command
2:(ada8:0): mvsch1:0:Retrying command
3:0): (ada9:mvsch1:0:4:0): CAM status: Uncorrectable parity/CRC error
Retrying command
(ada9:mvsch1:0:mvsch1:0:4:0:0): 0): Retrying command
Retrying command
(ada9:mvsch1:0:4:0): WRITE_FPDMA_QUEUED. ACB: 61 f8 b8 bd 12 40 9f 00 00 00 00 00
(ada7:mvsch1:0:2:0): WRITE_FPDMA_QUEUED. ACB: 61 d8 90 ba 12 40 9f 00 00 00 00 00
(ada5:mvsch1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 f0 c0 bd 12 40 9f 00 00 00 00 00
(ada6:mvsch1:0:1:0): WRITE_FPDMA_QUEUED. ACB: 61 f8 70 c2 12 40 9f 00 00 00 00 00
(ada8:mvsch1:0:3:0): WRITE_FPDMA_QUEUED. ACB: 61 f8 b8 bd 12 40 9f 00 00 00 00 00
(ada9:mvsch1:0:4:0): CAM status: Uncorrectable parity/CRC error
(ada7:mvsch1:0:2:0): CAM status: Uncorrectable parity/CRC error
(ada9:(ada7:mvsch1:0:mvsch1:0:4:2:0): 0): Retrying command
Retrying command
(ada8:mvsch1:0:3:0): CAM status: Uncorrectable parity/CRC error
(ada8:mvsch1:0:(ada6:mvsch1:0:1:0): CAM status: Uncorrectable parity/CRC error
3:(ada6:0): mvsch1:0:Retrying command
1:0): (ada5:mvsch1:0:0:0): CAM status: Uncorrectable parity/CRC error
Retrying command
(ada5:mvsch1:0:0:0): Retrying command

Here are the errors summarized:

(ada5:mvsch1:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada6:mvsch1:0:1:0): CAM status: Uncorrectable parity/CRC error
(ada7:mvsch1:0:2:0): CAM status: Uncorrectable parity/CRC error
(ada8:mvsch1:0:3:0): CAM status: Uncorrectable parity/CRC error
(ada9:mvsch1:0:4:0): CAM status: Uncorrectable parity/CRC error
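
For anyone who wants to build a summary like this from the raw console output, here is a minimal Python sketch. The regular expression is an assumption based only on the lines quoted in this post, and the function name `tally_cam_errors` is made up for illustration. It deliberately skips lines where the kernel interleaved messages from several drives, so the counts are a lower bound.

```python
import re
from collections import Counter

# Matches a well-formed CAM status line, for example:
# (ada5:mvsch1:0:0:0): CAM status: Uncorrectable parity/CRC error
# The pattern is an assumption derived from the log lines quoted above.
CAM_STATUS = re.compile(r"^\((ada\d+):(\w+):\d+:\d+:\d+\): CAM status: (.+)$")


def tally_cam_errors(lines):
    """Count CAM status messages per (device, controller, status).

    Interleaved (garbled) lines do not match the pattern and are skipped.
    """
    counts = Counter()
    for line in lines:
        match = CAM_STATUS.match(line.strip())
        if match:
            counts[match.groups()] += 1
    return counts


# The five summary lines from this post, plus one interleaved line
# copied from the raw log to show that it is ignored.
sample_lines = [
    "(ada5:mvsch1:0:0:0): CAM status: Uncorrectable parity/CRC error",
    "(ada6:mvsch1:0:1:0): CAM status: Uncorrectable parity/CRC error",
    "(ada7:mvsch1:0:2:0): CAM status: Uncorrectable parity/CRC error",
    "(ada8:mvsch1:0:3:0): CAM status: Uncorrectable parity/CRC error",
    "(ada9:mvsch1:0:4:0): CAM status: Uncorrectable parity/CRC error",
    "mvsch1:0:(ada8:mvsch1:0:3:0): CAM status: Uncorrectable parity/CRC error",
]

summary = tally_cam_errors(sample_lines)
for (device, controller, status), count in sorted(summary.items()):
    print(f"{device} on {controller}: {count} x {status}")
```

If every affected device reports the same controller (here `mvsch1`), that points at something the drives share, such as the enclosure, cable, or port, rather than at five drives failing at once.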

I have verified that all of these drives are in the same external eSATA enclosure (either backup1 or backup2... hard to know since FreeNAS doesn't tell which drives map to which volume(s)).

All volumes show healthy.

A little more setup information:

Mac Pro, 2 x 6-core 3.46 GHz Xeons, 64 GB ECC RAM
FreeNAS-9.10.2-U6 (561f0d7a1)
7 x 3TB Toshiba drives in "primary" data pool in raidz2
5 x 2TB Seagate drives in "backup1" backup pool (externally attached) in raidz1
5 x 2TB Seagate drives in "backup2" backup pool (externally attached) in raidz1

My questions are:

What do these errors mean?
Is it likely there has been data loss?
What should be my next steps to prevent (further) data loss?

Thank you.

m0nkey_ · Nov 26, 2017

It appears that one of your externally attached pools suffered a momentary disconnect. This could have been caused by a power loss, a bad cable, or the HBA.

Johnnie Black · Nov 26, 2017

If you have one, try replacing the eSATA cable. eSATA is notoriously picky about cables, and you're also using a port multiplier, which can make things even worse.

SwisherSweet · Nov 26, 2017

Thanks for the replies. I'll swap out the cables and see. I did notice that this started happening about a month ago and gets reported on each scrub of that pool.

I ran zpool status on all pools and there are no errors, so can I conclude there is only a hardware issue at this point?

Johnnie Black · Nov 26, 2017

SwisherSweet said:
so can I conclude there is a hardware issue only at this point?

Yes. It means there was a CRC error during the transfer to or from the disk(s). When this happens the command is retried, so your data is not affected. Most of the time it means a bad SATA cable. Usually SMART attribute 199 (UltraDMA CRC Error Count) will also increase by one each time there's an error.
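
To check attribute 199 without reading the whole table by eye, something like the following Python sketch could work. It assumes the usual ten-column layout of `smartctl -A` output (raw value in the last column). The sample text and its numbers are illustrative, not taken from the poster's drives, and the helper names are made up.

```python
import subprocess


def read_smart_attributes(device):
    """Return the text printed by `smartctl -A <device>`.

    Not called in this sketch; it needs smartmontools and usually root.
    smartctl can exit non-zero even when it prints a table, so the exit
    status is not checked here.
    """
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return result.stdout


def udma_crc_count(smartctl_output):
    """Return the raw value of attribute 199, or None if it is absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Assumed columns: ID, name, flag, value, worst, thresh,
        # type, updated, when_failed, raw_value.
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None


# Illustrative sample only; the values are invented for this example.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   071   071   000    Old_age   Always       -       25811
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       28
"""

print(udma_crc_count(SAMPLE))
```

Recording this value per drive before and after swapping the cable, for example with `udma_crc_count(read_smart_attributes("/dev/ada5"))`, shows whether the count is still climbing.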

joeschmuck · Nov 26, 2017

Also note that the attribute 199 count stays with the hard drive forever, so if you have a count of 28, that does not mean there is anything wrong with the drive in your situation. Just note the current count and move on. Hopefully a new eSATA cable fixes your issue.

SwisherSweet said:
I ran zpool status on all pools and there are no errors, so can I conclude there is only a hardware issue at this point?

No, you need to run a scrub. zpool status only reports the current status. Running a scrub will pass all of that data through the new eSATA cable, and if there are no issues afterwards you can call it fixed and point the finger at a bad cable.
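
As a rough illustration of checking the result once the scrub finishes, here is a Python sketch that looks at `zpool status` text for non-zero READ/WRITE/CKSUM counters and for the "No known data errors" line. The sample text is illustrative (the device name `disk0` is a placeholder), and the parsing is an assumption about the usual layout, not an official interface.

```python
# Device states that can appear in the STATE column of `zpool status`.
DEVICE_STATES = {"ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "UNAVAIL", "REMOVED"}


def pool_looks_clean(status_text):
    """Return True if every device is ONLINE with zero error counters
    and the errors line reports no known data errors."""
    saw_errors_line = False
    for line in status_text.splitlines():
        stripped = line.strip()
        fields = stripped.split()
        if stripped.startswith("errors:"):
            saw_errors_line = True
            if stripped != "errors: No known data errors":
                return False
        elif len(fields) >= 5 and fields[1] in DEVICE_STATES:
            # Columns assumed: NAME STATE READ WRITE CKSUM
            if fields[1] != "ONLINE" or any(c != "0" for c in fields[2:5]):
                return False
    return saw_errors_line


# Illustrative template; {cksum} is filled in below to simulate
# a clean pool and a pool with checksum errors on one device.
TEMPLATE = """\
  pool: backup1
 state: ONLINE
config:

    NAME        STATE     READ WRITE CKSUM
    backup1     ONLINE       0     0     0
      raidz1-0  ONLINE       0     0     0
        disk0   ONLINE       0     0     {cksum}

errors: No known data errors
"""

clean_status = TEMPLATE.format(cksum=0)
dirty_status = TEMPLATE.format(cksum=3)

print(pool_looks_clean(clean_status), pool_looks_clean(dirty_status))
```

In practice the text would come from running `zpool scrub backup1`, waiting for the scrub to complete, and then capturing `zpool status backup1`.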
