CAM status: Uncorrectable parity/CRC error on 5 drives

SwisherSweet

Contributor
Joined
May 13, 2017
Messages
139
Hi,

I'm getting "CAM status: Uncorrectable parity/CRC error" on 5 drives according to my daily security output run. I'd appreciate any help and advice on how to proceed to keep my data protected.

(ada5:mvsch1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 18 4e 63 40 9a 00 00 00 00 00
(ada6:mvsch1:0:1:0): WRITE_FPDMA_QUEUED. ACB: 61 a8 98 59 63 40 9a 00 00 00 00 00
(ada8:mvsch1:0:3:0): WRITE_FPDMA_QUEUED. ACB: 61 f0 c0 4d 63 40 9a 00 00 00 00 00
(ada7:mvsch1:0:2:0): WRITE_FPDMA_QUEUED. ACB: 61 00 c0 4c 63 40 9a 00 00 01 00 00
(ada9:mvsch1:0:4:0): WRITE_FPDMA_QUEUED. ACB: 61 f0 e8 50 63 40 9a 00 00 00 00 00
(ada5:mvsch1:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada6:mvsch1:0:1:0): CAM status: Uncorrectable parity/CRC error
(ada5:(ada6:(ada7:mvsch1:0:2:0): CAM status: Uncorrectable parity/CRC error
mvsch1:0:(ada8:mvsch1:0:3:0): CAM status: Uncorrectable parity/CRC error
1:(ada7:0): mvsch1:0:Retrying command
2:(ada8:0): mvsch1:0:Retrying command
3:0): (ada9:mvsch1:0:4:0): CAM status: Uncorrectable parity/CRC error
Retrying command
(ada9:mvsch1:0:mvsch1:0:4:0:0): 0): Retrying command
Retrying command
(ada9:mvsch1:0:4:0): WRITE_FPDMA_QUEUED. ACB: 61 f8 b8 bd 12 40 9f 00 00 00 00 00
(ada7:mvsch1:0:2:0): WRITE_FPDMA_QUEUED. ACB: 61 d8 90 ba 12 40 9f 00 00 00 00 00
(ada5:mvsch1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 f0 c0 bd 12 40 9f 00 00 00 00 00
(ada6:mvsch1:0:1:0): WRITE_FPDMA_QUEUED. ACB: 61 f8 70 c2 12 40 9f 00 00 00 00 00
(ada8:mvsch1:0:3:0): WRITE_FPDMA_QUEUED. ACB: 61 f8 b8 bd 12 40 9f 00 00 00 00 00
(ada9:mvsch1:0:4:0): CAM status: Uncorrectable parity/CRC error
(ada7:mvsch1:0:2:0): CAM status: Uncorrectable parity/CRC error
(ada9:(ada7:mvsch1:0:mvsch1:0:4:2:0): 0): Retrying command
Retrying command
(ada8:mvsch1:0:3:0): CAM status: Uncorrectable parity/CRC error
(ada8:mvsch1:0:(ada6:mvsch1:0:1:0): CAM status: Uncorrectable parity/CRC error
3:(ada6:0): mvsch1:0:Retrying command
1:0): (ada5:mvsch1:0:0:0): CAM status: Uncorrectable parity/CRC error
Retrying command
(ada5:mvsch1:0:0:0): Retrying command

Here are the errors summarized:

(ada5:mvsch1:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada6:mvsch1:0:1:0): CAM status: Uncorrectable parity/CRC error
(ada7:mvsch1:0:2:0): CAM status: Uncorrectable parity/CRC error
(ada8:mvsch1:0:3:0): CAM status: Uncorrectable parity/CRC error
(ada9:mvsch1:0:4:0): CAM status: Uncorrectable parity/CRC error
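
For anyone who wants to pull the same summary out of a longer log, here is a minimal Python sketch. The function name `count_crc_errors` is made up for illustration, and the regex assumes the device/bus format shown above (`adaN:mvschM:b:t:l`). It only counts fragments that survived the interleaving intact, so the numbers are a lower bound.

```python
import re
from collections import Counter

# Matches an intact "(adaN:mvschM:b:t:l): CAM status: ... CRC error" fragment,
# even when it sits inside a line that was interleaved with other output.
CRC_RE = re.compile(
    r"\((ada\d+):mvsch\d+:\d+:\d+:\d+\): CAM status: Uncorrectable parity/CRC error"
)

def count_crc_errors(log_text):
    """Return a Counter mapping device name -> number of CRC error fragments."""
    return Counter(CRC_RE.findall(log_text))

# A few lines copied from the log above.
sample = """\
(ada5:mvsch1:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada5:(ada6:(ada7:mvsch1:0:2:0): CAM status: Uncorrectable parity/CRC error
mvsch1:0:(ada8:mvsch1:0:3:0): CAM status: Uncorrectable parity/CRC error
(ada5:mvsch1:0:0:0): Retrying command
"""

for dev, n in sorted(count_crc_errors(sample).items()):
    print(dev, n)
```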

I have verified that all of these drives are in the same external eSATA enclosure (either backup1 or backup2... hard to know since FreeNAS doesn't tell which drives map to which volume(s)).

All volumes show healthy.

A little more setup information:
  • Mac Pro, 2 x 6-core 3.46 GHz Xeons, 64 GB ECC RAM
  • FreeNAS-9.10.2-U6 (561f0d7a1)
  • 7 x 3TB Toshiba drives in "primary" data pool in raidz2
  • 5 x 2TB Seagate drives in "backup1" backup pool (externally attached) in raidz1
  • 5 x 2TB Seagate drives in "backup2" backup pool (externally attached) in raidz1
My questions are:
  1. What do these errors mean?
  2. Is it likely there has been data loss?
  3. What should be my next steps to prevent (further) data loss?

Thank you.
 
Joined
May 10, 2017
Messages
838
If you have a spare, try replacing the eSATA cable. eSATA is notoriously picky about cables, and you're also using a port multiplier, which can make things even worse.
 

SwisherSweet

Contributor
Joined
May 13, 2017
Messages
139
Thanks for the replies. I'll swap out the cables and see. I did notice that this started happening about a month ago and gets reported on each scrub of that pool.

I ran zpool status on all pools and there are no errors, so can I conclude there is a hardware issue only at this point?
 
Joined
May 10, 2017
Messages
838
so can I conclude there is a hardware issue only at this point?

Yes, it means there was a CRC error during the transfer to or from the disk(s). When this happens the command is retried, so your data is not affected. Most of the time it means a bad SATA cable. Usually the SMART attribute 199 (UltraDMA CRC Error Count) will also increase by one each time there's an error.
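
If you want to track that counter, here is a rough Python sketch that reads attribute 199 from the text printed by `smartctl -A`. The helper name `udma_crc_count` and the sample line are made up for illustration. It assumes the usual ten-column attribute table with the raw value in the last column, which some drives may format differently.

```python
def udma_crc_count(smartctl_output):
    """Return the raw value of SMART attribute 199, or None if it is not listed."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None

# Illustrative line only, not taken from the drives in this thread.
sample = (
    "ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE\n"
    "199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       28\n"
)

print(udma_crc_count(sample))
```

Record the value for each drive before a scrub and compare afterwards. A count that keeps climbing points at the link (cable, port multiplier, enclosure) rather than the platters.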
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Also know that the count in attribute ID 199 stays with the hard drive forever and never resets, so if you have a count of 28, that does not by itself mean there is anything wrong with the drive in your situation. Just note the fact that you have those errors and move on. Hopefully a new eSATA cable fixes your issue.

I ran a zpool status and all pools and there are no errors, so can I conclude there is a hardware issue only at this point?
No, you need to run the scrub. A zpool status is only a status message. Running a scrub will pass all that data through the new eSATA cable, and if it completes with no issues you can call it fixed and point the finger at a bad cable.
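
As a sketch of that check: after `zpool scrub backup1` has finished, the `scan:` and `errors:` lines of `zpool status` tell you how it went. The Python helper below (`scrub_clean` is a hypothetical name) just parses that text. The sample output is illustrative, not from this system, and the exact wording of the `scan:` line can vary between ZFS versions, so treat the regex as an assumption.

```python
import re

def scrub_clean(zpool_status_output):
    """True if a finished scrub reports 0 errors and no known data errors,
    False if it reports problems, None if no finished scrub is shown."""
    scan = re.search(r"scrub repaired .*? with (\d+) errors", zpool_status_output)
    if scan is None:
        return None  # scrub never run, or still in progress
    no_data_errors = "errors: No known data errors" in zpool_status_output
    return int(scan.group(1)) == 0 and no_data_errors

# Illustrative output only.
sample = """\
  pool: backup1
 state: ONLINE
  scan: scrub repaired 0 in 5h12m with 0 errors on Sun Nov 12 05:12:34 2017
errors: No known data errors
"""

print(scrub_clean(sample))
```

This only looks at the summary lines. It is still worth reading the per-device READ/WRITE/CKSUM columns and the console log for new CRC messages after the scrub.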
 