CHKSUM errors on scrub

Status
Not open for further replies.

Chadim

Cadet
Joined
Aug 12, 2013
Messages
2
Hi all,

I am getting a few checksum errors when scrubbing a relatively new (2 months old) 8x3TB WD Red setup
with freenas.local 8.3-RELEASE-p7:

NAME                                          STATE   READ WRITE CKSUM
fv_data                                       ONLINE     0     0     0
  gptid/5035344e-dd2c-11e2-ba93-bc5ff485fca0  ONLINE     0     0     0
  gptid/5092def4-dd2c-11e2-ba93-bc5ff485fca0  ONLINE     0     0     1  (repairing)
  gptid/50f28d83-dd2c-11e2-ba93-bc5ff485fca0  ONLINE     0     0     1  (repairing)
  gptid/5154a9db-dd2c-11e2-ba93-bc5ff485fca0  ONLINE     0     0     2  (repairing)
  gptid/51b8918d-dd2c-11e2-ba93-bc5ff485fca0  ONLINE     0     0     1  (repairing)
  gptid/521bcdcc-dd2c-11e2-ba93-bc5ff485fca0  ONLINE     0     0     1  (repairing)
  gptid/52805c3b-dd2c-11e2-ba93-bc5ff485fca0  ONLINE     0     0     0
  gptid/52e4f888-dd2c-11e2-ba93-bc5ff485fca0  ONLINE     0     0     3  (repairing)


Self-tests on all drives show nothing, and operation otherwise seems perfect. I have now scrubbed
three times, and each time at least 6 drives show a CKSUM error (not always the same ones, mind you; all 8 drives have shown an error at least once). Each time zpool informs me that all errors have been corrected.
Any suggestions? I am a bit reluctant to trash 8 almost new drives.

Best,
Chadim
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
I have now scrubbed three times, and each time at least 6 drives show up a CKSUM error (not always the same, mind you. all 8 drives have shown an error at least once)
Test your RAM.
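
When checksum errors wander across nearly every disk in a pool, a shared component such as RAM is a more plausible suspect than eight failing drives. As a rough illustration, the sketch below tallies the CKSUM column from the `zpool status` output quoted above. The names `cksum_counts` and `devices_with_errors` are made up for this example, and the parser assumes plain integer counters (it would not handle abbreviated values such as `1.2K`).

```python
import re

# Matches a device row of `zpool status`: name, state, then READ/WRITE/CKSUM.
# Assumption: counters are plain integers; abbreviated values are not handled.
ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)"
)


def cksum_counts(status_text):
    """Return {name: CKSUM count} for every pool/device row in the text."""
    counts = {}
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match:
            counts[match.group(1)] = int(match.group(5))
    return counts


def devices_with_errors(status_text, pool_name):
    """Return the sorted device names (pool row excluded) with CKSUM > 0."""
    counts = cksum_counts(status_text)
    counts.pop(pool_name, None)  # the pool's own row is not a disk
    return sorted(name for name, n in counts.items() if n > 0)


# The status output from the first post, copied as posted.
SAMPLE = """\
NAME STATE READ WRITE CKSUM
fv_data ONLINE 0 0 0
gptid/5035344e-dd2c-11e2-ba93-bc5ff485fca0 ONLINE 0 0 0
gptid/5092def4-dd2c-11e2-ba93-bc5ff485fca0 ONLINE 0 0 1 (repairing)
gptid/50f28d83-dd2c-11e2-ba93-bc5ff485fca0 ONLINE 0 0 1 (repairing)
gptid/5154a9db-dd2c-11e2-ba93-bc5ff485fca0 ONLINE 0 0 2 (repairing)
gptid/51b8918d-dd2c-11e2-ba93-bc5ff485fca0 ONLINE 0 0 1 (repairing)
gptid/521bcdcc-dd2c-11e2-ba93-bc5ff485fca0 ONLINE 0 0 1 (repairing)
gptid/52805c3b-dd2c-11e2-ba93-bc5ff485fca0 ONLINE 0 0 0
gptid/52e4f888-dd2c-11e2-ba93-bc5ff485fca0 ONLINE 0 0 3 (repairing)
"""

if __name__ == "__main__":
    bad = devices_with_errors(SAMPLE, "fv_data")
    print(f"{len(bad)} of 8 disks reported checksum errors")
```

Running the same tally on the output of each scrub and comparing the sets of affected disks would show whether the errors follow particular drives or move around.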
 

Chadim

Cadet
Joined
Aug 12, 2013
Messages
2
Thank you very much sir, that was exactly the problem. Memtest86 yields errors - now to identify the bad module...
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
After you identify and replace the bad module, you should do one final memory test to ensure all is well, then do another scrub. You will likely get some errors as that scrub progresses, but that is because data that was corrupted in RAM is being repaired. (Insert plug to upgrade to an ECC-based system to protect your data in the future...) Then do a zpool clear to erase the error messages, and do one last scrub. You should get zero errors on the second scrub, indicating that all is going well again.
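
The scrub, clear, scrub sequence described above could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, it assumes the memory test has already passed, and it assumes that `zpool status` prints the phrase "scrub in progress" while a scrub is running, which is worth confirming on your own system. `zpool scrub` returns immediately, so the script polls the status until the scrub is done.

```python
import subprocess
import time


def run(cmd):
    """Run a command and return its standard output as text."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def wait_for_scrub(pool, run_cmd=run, poll_seconds=60, sleep=time.sleep):
    """Poll `zpool status` until no scrub is running; return the final status.

    Assumption: the status text contains "scrub in progress" while scrubbing.
    """
    while True:
        status = run_cmd(["zpool", "status", pool])
        if "scrub in progress" not in status:
            return status
        sleep(poll_seconds)


def scrub_clear_scrub(pool, run_cmd=run, sleep=time.sleep):
    """Scrub, clear the error counters, then scrub again to verify."""
    run_cmd(["zpool", "scrub", pool])   # first pass: repairs damaged blocks
    first = wait_for_scrub(pool, run_cmd, sleep=sleep)
    run_cmd(["zpool", "clear", pool])   # reset the READ/WRITE/CKSUM counters
    run_cmd(["zpool", "scrub", pool])   # second pass: should come back clean
    second = wait_for_scrub(pool, run_cmd, sleep=sleep)
    return first, second


if __name__ == "__main__":
    # "fv_data" is the pool name from the first post.
    first_status, second_status = scrub_clear_scrub("fv_data")
    print(second_status)
```

The command runner and the sleep function are passed in as parameters so the sequence can be dry-run without touching a real pool. The second status output is the one to inspect: every CKSUM counter should read 0.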
 