CAM status: Uncorrectable parity/CRC error

Status
Not open for further replies.

solaris04

Cadet
Joined
Mar 13, 2018
Messages
4
Hi there,

My system is:
Build FreeNAS-11.1-U6

Platform Intel(R) Atom(TM) CPU C2758 @ 2.40GHz

Memory 16318MB ECC RAM


Since Wednesday I have received this error message twice:

Code:
freenas.local kernel log messages:

(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 00 88 f2 73 40 1b 00 00 01 00 00
(ada0:ahcich0:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada0:ahcich0:0:0:0): Retrying command
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 80 88 f3 73 40 1b 00 00 00 00 00
(ada0:ahcich0:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada0:ahcich0:0:0:0): Retrying command

-- End of security output --



smartmontools shows this:

Code:

root@freenas:~ # smartctl -a /dev/ada0
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:	 WD Blue PC SSD
Device Model:	 WDC WDS500G1B0A-00H9H0
Serial Number:	165115423345
LU WWN Device Id: 5 001b44 4a6b6d988
Firmware Version: X41100WD
User Capacity:	500,107,862,016 bytes [500 GB]
Sector Size:	  512 bytes logical/physical
Rotation Rate:	Solid State Device
Form Factor:	  2.5 inches
Device is:		In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:	Fri Nov  2 16:19:10 2018 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
										was never started.
										Auto Offline Data Collection: Disabled.
Self-test execution status:	  (   0) The previous self-test routine completed
										without error or no self-test has ever
										been run.
Total time to complete Offline
data collection:				(	0) seconds.
Offline data collection
capabilities:					(0x11) SMART execute Offline immediate.
										No Auto Offline data collection support.
										Suspend Offline collection upon new
										command.
										No Offline surface scan supported.
										Self-test supported.
										No Conveyance Self-test supported.
										No Selective Self-test supported.
SMART capabilities:			(0x0003) Saves SMART data before entering
										power-saving mode.
										Supports SMART auto save timer.
Error logging capability:		(0x01) Error logging supported.
										General Purpose Logging supported.
Short self-test routine
recommended polling time:		(   2) minutes.
Extended self-test routine
recommended polling time:		(  10) minutes.

SMART Attributes Data Structure revision number: 4
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   ---	Old_age   Always	   -	   0
  9 Power_On_Hours		  0x0032   100   100   ---	Old_age   Always	   -	   12785
 12 Power_Cycle_Count	   0x0032   100   100   ---	Old_age   Always	   -	   37
165 Block_Erase_Count	   0x0032   100   100   ---	Old_age   Always	   -	   77530012
166 Minimum_PE_Cycles_TLC   0x0032   100   100   ---	Old_age   Always	   -	   1
167 Max_Bad_Blocks_per_Die  0x0032   100   100   ---	Old_age   Always	   -	   18
168 Maximum_PE_Cycles_TLC   0x0032   100   100   ---	Old_age   Always	   -	   5
169 Total_Bad_Blocks		0x0032   100   100   ---	Old_age   Always	   -	   331
170 Grown_Bad_Blocks		0x0032   100   100   ---	Old_age   Always	   -	   0
171 Program_Fail_Count	  0x0032   100   100   ---	Old_age   Always	   -	   0
172 Erase_Fail_Count		0x0032   100   100   ---	Old_age   Always	   -	   0
173 Average_PE_Cycles_TLC   0x0032   100   100   ---	Old_age   Always	   -	   2
174 Unexpected_Power_Loss   0x0032   100   100   ---	Old_age   Always	   -	   5
184 End-to-End_Error		0x0032   100   100   ---	Old_age   Always	   -	   0
187 Reported_Uncorrect	  0x0032   100   100   ---	Old_age   Always	   -	   0
188 Command_Timeout		 0x0032   100   100   ---	Old_age   Always	   -	   22
194 Temperature_Celsius	 0x0022   071   071   ---	Old_age   Always	   -	   29 (Min/Max 19/71)
199 UDMA_CRC_Error_Count	0x0032   100   100   ---	Old_age   Always	   -	   0
230 Media_Wearout_Indicator 0x0032   100   100   ---	Old_age   Always	   -	   0x0240001c0240
232 Available_Reservd_Space 0x0033   100   100   004	Pre-fail  Always	   -	   100
233 NAND_GB_Written_TLC	 0x0032   100   100   ---	Old_age   Always	   -	   1366
234 NAND_GB_Written_SLC	 0x0032   100   100   ---	Old_age   Always	   -	   10041
241 Total_Host_GB_Written   0x0030   253   253   ---	Old_age   Offline	  -	   9382
242 Total_Host_GB_Read	  0x0030   253   253   ---	Old_age   Offline	  -	   55527
244 Temp_Throttle_Status	0x0032   000   100   ---	Old_age   Always	   -	   0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline	Completed without error	   00%	 12758		 -
# 2  Short offline	   Completed without error	   00%	 12758		 -

Selective Self-tests/Logging not supported



zpool status gives me this:

Code:
root@freenas:~ # zpool status -v
  pool: data
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:13:31 with 0 errors on Tue Oct  9 04:13:31 2018
config:

		NAME											STATE	 READ WRITE CKSUM
		data											ONLINE	   0	 0	 0
		  mirror-0									  ONLINE	   0	 0	 0
			gptid/96175e2b-209f-11e8-b60c-0cc47a6ad1ac  ONLINE	   0	 0	 0
			gptid/96a91bf8-209f-11e8-b60c-0cc47a6ad1ac  ONLINE	   0	 0	 0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:21 with 0 errors on Thu Nov  1 03:45:21 2018
config:

		NAME		STATE	 READ WRITE CKSUM
		freenas-boot  ONLINE	   0	 0	 0
		  ada2p2	ONLINE	   0	 0	 0

errors: No known data errors



Is this caused by a bad SATA cable or, even worse, is the SSD in the mirror about to fail?
Because of noise issues and not having a spare room for the small server, I cannot use an HDD instead of SSDs for the moment.

Looking forward to hearing from you.
Thank you!
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
First thing to do is to replace the SATA cable, that's the likely culprit.
Agreed. The SMART output doesn't seem to have any indicators that suggest it's a drive problem.
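
For anyone who wants to repeat that check on their own drive, here is a rough Python sketch that pulls the raw values out of `smartctl -A` style rows and lists any media-health attributes that are non-zero. The set of attribute IDs it watches is my own assumption for illustration, not an official list, and the sample rows are copied from the output posted above.

```python
# Attribute IDs treated here as media-health indicators. This selection is an
# assumption of the sketch: reallocated sectors, grown bad blocks, program and
# erase failures, end-to-end errors, reported uncorrectable errors.
DRIVE_HEALTH_IDS = {5, 170, 171, 172, 184, 187}


def parse_smart_attributes(text):
    """Return {id: (name, raw_value)} from smartctl attribute table rows."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the raw value is the 10th column.
        if len(fields) >= 10 and fields[0].isdigit():
            attrs[int(fields[0])] = (fields[1], fields[9])
    return attrs


def nonzero_health_attributes(attrs):
    """Names of watched attributes whose raw value is not zero."""
    return sorted(
        name
        for attr_id, (name, raw) in attrs.items()
        if attr_id in DRIVE_HEALTH_IDS and raw != "0"
    )


# A few rows taken from the smartctl output in the first post.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   ---    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   ---    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   ---    Old_age   Always       -       22
199 UDMA_CRC_Error_Count    0x0032   100   100   ---    Old_age   Always       -       0
"""

attrs = parse_smart_attributes(SAMPLE)
print("Non-zero health attributes:", nonzero_health_attributes(attrs))
print("Command_Timeout raw:", attrs[188][1])
```

On a live system the text would come from `smartctl -A /dev/ada0` rather than from the `SAMPLE` string.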

Side note: is there a 6 Gbps SATA port you can connect this SSD to? It's currently connected at 3 Gbps. I also find it neat that it's reporting SMART attributes for the data written directly to both the TLC and SLC (according to the datasheet, this SSD apparently has a separate chunk of NAND that's provisioned in "SLC mode").
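
The link speed comes from the "SATA Version is:" line of the smartctl output. A small Python sketch for reading it is below. The function name `sata_link_speeds` is made up for this example, and the regular expression assumes the line has the same shape as the one posted in this thread.

```python
import re


def sata_link_speeds(info_line):
    """Return (max_gbps, current_gbps) from a 'SATA Version is:' line, or None."""
    match = re.search(r"([\d.]+) Gb/s \(current: ([\d.]+) Gb/s\)", info_line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


# Line copied from the smartctl output in the first post.
line = "SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 3.0 Gb/s)"
speeds = sata_link_speeds(line)
if speeds is not None:
    max_gbps, current_gbps = speeds
    if current_gbps < max_gbps:
        print(f"Link negotiated at {current_gbps} Gb/s, drive supports {max_gbps} Gb/s")
```

If the two numbers differ, the drive is either on a slower port or the link negotiated down.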
 