Data Corruption in OS area

Status
Not open for further replies.

keithg

Explorer
Joined
May 15, 2013
Messages
92
I have been running FreeNAS for a couple of years now on a Dell 390 workstation with 6 GB of ECC RAM, and have had no major issues to date. I have had a few minor problems, and it was suggested that I go from 6 GB to 8 GB. It is a long story how hard it has been to find RAM for this machine, but I finally received some and put it in yesterday. I powered down, pulled the plug, and swapped out 2 sticks to bring it to 8 GB. POST shows 8 GB. FreeNAS almost boots, then has a kernel panic. I tried again, with the same result. I powered down and put back the 2 sticks I had previously (back to 6 GB). It boots up! Yea! Now I have a critical error, though:
CRITICAL: The volume first_NAS (ZFS) state is ONLINE: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
Ran zpool status -v and was initially told that there was an error in file '0x0'. Ran a scrub and now it says:

~# zpool status -v
  pool: first_NAS
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 448K in 1h37m with 8 errors on Mon May 4 22:18:39 2015
config:

        NAME                                            STATE     READ WRITE CKSUM
        first_NAS                                       ONLINE       0     0     8
          raidz1-0                                      ONLINE       0     0    16
            gptid/acf0de37-9dd9-11e2-b519-001aa0149149  ONLINE       0     0     4
            gptid/adf8a5c5-9dd9-11e2-b519-001aa0149149  ONLINE       0     0     2
            gptid/aea9f082-9dd9-11e2-b519-001aa0149149  ONLINE       0     0     1

errors: Permanent errors have been detected in the following files:

        /var/db/system/cores/python2.7.core

What is the best way to repair this? It does not appear that any of my data has been affected. Backup and wipe? Can/should I reinstall FreeNAS from scratch and then use my config to restore the pool? The error appears to be in the OS area and may affect any updates (Python). There is also an update available, but I have not clicked on it and will not until I get this error resolved.

Regards,

Keith
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
I think the best way to fix the problem is to get a motherboard that was not built in 2005. :) You almost certainly bought RAM that is not compatible with this workstation, resulting in the kernel panic. A quick look at the prices from reputable vendors for this type of RAM indicates that you would be spending more on RAM for this machine than the machine itself is worth.

Maybe someone else can comment, but I think you'll be fine if you simply reinstall FreeNAS and restore your config. We don't recommend running with only 6GB. It could be a ticking time bomb.
 

keithg

Explorer
Joined
May 15, 2013
Messages
92
Actually, I poked around a bit. The only file flagged in 2 scrubs was the one listed above, which is a core file from the failed boot. I deleted the file and scrubbed again. This time everything looks normal, with no errors. I rebooted and it comes up clean with no alerts or warnings.
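For anyone repeating this, the delete-then-rescrub sequence described above can be sketched roughly as follows. The post does not show the exact commands that were typed, so the command list here is an assumption built from standard `rm` and `zpool` usage, and the helper names `cleanup_commands` and `run` are hypothetical. The sketch defaults to a dry run that only prints the commands.

```python
import subprocess


def cleanup_commands(pool, corrupt_file):
    """Return the assumed command sequence, in the order it would be run."""
    return [
        ["rm", corrupt_file],             # delete the file ZFS flagged as damaged
        ["zpool", "scrub", pool],         # start a new scrub (it runs in the background)
        ["zpool", "status", "-v", pool],  # re-check; only meaningful once the scrub is done
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Pool name and file path are the ones from the zpool status output above.
    run(cleanup_commands("first_NAS", "/var/db/system/cores/python2.7.core"))
```

Because a scrub runs asynchronously, the final status check should be repeated after the scrub has completed rather than trusted immediately.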

I'll test any memory on another machine before shutting down this server to install memory...

Thanks!

Keith
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
When installing hardware, you're supposed to burn in your system again. This specifically includes memory.
 

wpirobotbuilder

Dabbler
Joined
Apr 21, 2014
Messages
16
I just encountered this same error on my system, but with 96GB of ECC memory (valid per the QVL for the motherboard and memory), a 6TB raw dataset, and no kernel panic. I tested the memory when I installed the system (more than a year ago) and will test it again now to check. I am not experiencing any visible data corruption, but of course that doesn't mean very much.

[root@freenas] ~# zpool status -v | less
  pool: Backups
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 5h39m with 1 errors on Sun Jul 5 05:39:51 2015
config:

        NAME                                            STATE     READ WRITE CKSUM
        Backups                                         ONLINE       0     0     1
          mirror-0                                      ONLINE       0     0     2
            gptid/63ff8d8f-f805-11e3-8ff2-14dae9bf9414  ONLINE       0     0     2
            gptid/645581ec-f805-11e3-8ff2-14dae9bf9414  ONLINE       0     0     2

errors: Permanent errors have been detected in the following files:

        /var/db/system/cores/python2.7.core

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Tue Jul 21 03:45:10 2015
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2       ONLINE       0     0     0

errors: No known data errors
[root@freenas] ~#
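When several pools are reported at once, it can help to pull out just the affected files per pool. Below is a minimal Python sketch, assuming the text layout shown in the output above; `parse_permanent_errors` is a hypothetical helper, not part of FreeNAS or ZFS, and `SAMPLE` is an abbreviated copy of that output with the shell prompt lines left out.

```python
def parse_permanent_errors(status_text):
    """Map each pool name to the entries listed under its 'Permanent errors' heading."""
    result = {}
    pool = None
    in_errors = False
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("pool:"):
            # A new pool section begins; start an empty list for it.
            pool = line.split(":", 1)[1].strip()
            result[pool] = []
            in_errors = False
        elif line.startswith("errors:"):
            # Entries follow only when permanent errors are reported.
            in_errors = "Permanent errors" in line
        elif in_errors and line and pool is not None:
            # Any non-blank line here is treated as one damaged file or object.
            result[pool].append(line)
    return result


# Abbreviated from the output above (assumption: prompt lines are not passed in).
SAMPLE = """\
pool: Backups
state: ONLINE
errors: Permanent errors have been detected in the following files:

/var/db/system/cores/python2.7.core

pool: freenas-boot
state: ONLINE
errors: No known data errors
"""

if __name__ == "__main__":
    for name, files in parse_permanent_errors(SAMPLE).items():
        print(name, files)
```

For the sample, the only entry under Backups is the Python core file and freenas-boot has none, which matches the pattern in the first post: the damaged file is a crash dump, not user data.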
 

wpirobotbuilder

Dabbler
Joined
Apr 21, 2014
Messages
16
Looks like one of my DIMMs went bad; it failed within the first few tests. Fortunately my board has memory error indicators, which made it easy to find. I took it out and Memtest then passed two runs (48+ hours).
 
Joined
Oct 2, 2014
Messages
925
Good to hear you got it resolved.
 