One or more devices has experienced an error

Status
Not open for further replies.

Starpulkka

Contributor
Joined
Apr 9, 2013
Messages
179
Wow, it's amazing how fast memory goes bad. I'm sure you ran memtest and all the other hardware tests before you put any operating system on that machine. As on many other sites, a Windows bluescreen is Windows' fault, not a hardware fault, lol.
I've been going through my data (mostly movies and tv) and it all seems to be working fine.
I can't find any corrupted data.
At this point I guessed that you had an HDD go offline and come back online (that has happened to me when I used an Intel X58 board). And memory errors have only been one bit here and there. Your picture looks like the memory settings are totally wrong or the memory is totally shit/broken.

Edit: I was wondering what tipped the others off to suggest a memtest now, and BAM, my teddy-bear button eyes stopped at this:
Permanent errors in Dataset1 LOL

Code:
[root@freenas ~]# zpool status -v
  pool: vol1
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        vol1                                            ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/0bcf83d0-5baa-11e4-96d2-10c37b4efe84  ONLINE       0     0     0
            gptid/0c9f8ae9-5baa-11e4-96d2-10c37b4efe84  ONLINE       0     0     0
            gptid/0d784f1f-5baa-11e4-96d2-10c37b4efe84  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        vol1/dataset1: <0x7af3>
[root@freenas ~]#
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Wow, it's amazing how fast memory goes bad. I'm sure you ran memtest and all the other hardware tests before you put any operating system on that machine. As on many other sites, a Windows bluescreen is Windows' fault, not a hardware fault, lol.

There are 3 guys I'll never forget who have interesting stories about RAM problems:

1. A guy last year said that his RAM passed a memtest86 run that lasted a whole weekend, but by Wednesday his pool and RAM were bad. He was one of the lucky ones: he didn't lose any data because he hadn't gotten around to copying much data to the pool. It also put the fear of God in him and he went to ECC RAM. LOL!

2. A guy from 2012 had a great setup of backups with 2 FreeNAS machines and ZFS snapshots and replication. Everything was set up properly, except ECC RAM wasn't used. When the primary machine's RAM went bad it ended up killing the backups on the backup machine. If I remember correctly he didn't know how long his RAM had been bad, but he had some clues: the box crashed and froze from time to time, but he just hit the reset button on the machine and kept going.

3. One guy made a FreeNAS server and started copying photos from his camera's SD cards to the server. To verify the files weren't corrupt he happened to do MD5 checksums. Strangely, the MD5 sums of the files on the SD cards and MD5 sums created locally on the FreeNAS boxes didn't match. He posted because it didn't make sense. It turned out that his desktop's RAM was bad, so the pictures were being read from the SD card (corrupted in RAM, likely from the buffer) and then after being sent to the FreeNAS server ZFS happily stored them corrupted. :(
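
The kind of check described in the third story can be sketched roughly like this in Python. This is only an illustration, not what that user actually ran: the function names `md5_of_file` and `find_mismatches` and the idea of comparing two directory trees are assumptions made for the example. It hashes every file under a source directory and its counterpart under a destination directory and reports any that differ or are missing.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are never held in memory whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_dir, dest_dir):
    """List relative paths whose copy is missing or has a different MD5.

    Hypothetical helper: source_dir might be a mounted SD card and
    dest_dir a mounted share on the server.
    """
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    mismatches = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = dest_dir / rel
        if not dst.is_file() or md5_of_file(src) != md5_of_file(dst):
            mismatches.append(str(rel))
    return mismatches
```

Calling `find_mismatches` with two directory paths returns an empty list when every file matches. Keep in mind that a mismatch only says the two copies differ. As in the story above, it does not say which machine did the corrupting, and hashes computed on a machine with bad RAM are themselves suspect.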

The reality of it is that RAM goes bad basically instantly. When reading and writing to RAM it's either good or not. It's shocking because we might all understand this is how it is, but seeing really is believing.
 

pjc

Contributor
Joined
Aug 26, 2014
Messages
187
the box crashed and froze from time to time, but he hit the reset button on the machine and kept going.
Yikes...why bother with ZFS if you don't care that your server itself is unreliable?

Hardware errors? Ah, just reboot -- I'm sure we won't lose any data.
 
Joined
Mar 6, 2014
Messages
686
I know this thread is a few months old, but I just had to say that I almost literally fell off my chair laughing about this one! :D
...but he hit the reset button on the machine and kept going.
ROFL

Almost laughed as much about this one as I did about the guy who said he had "irreplaceable, irrecoverable and extremely important" company data to store, but couldn't care less about backups :eek:
 

checksum

Cadet
Joined
Dec 1, 2016
Messages
5
To verify the files weren't corrupt he happened to do MD5 checksums.

Do you guys recommend verifying everything with checksums on the way to FreeNAS/ZFS? Using rsync, for example.
Is there any way ECC RAM can die just like the poor fellow's RAM? Would it halt the system instead of running on?
 