ZFS pool cksum

Rolfieo

Cadet
Joined
Jul 16, 2013
Messages
5
My system board died, so I ordered a new one: a Supermicro X10SRL-F with a Xeon and 128 GB of ECC memory.

I retrieved the FreeNAS database from my old USB stick, installed FreeNAS, and restored the backup.

After a reboot the pools were there, but there were errors on the pool and the volume:

Code:
    NAME                                            STATE     READ WRITE CKSUM
    datavolume                                      DEGRADED     0     0   106
      raidz2-0                                      DEGRADED     0     0   212
        gptid/2200e743-cc9e-11e3-b527-60a44ccf9647  DEGRADED     0     0     0  too many errors
        gptid/59cd44cd-cf7e-11e6-aab1-60a44ccf9647  DEGRADED     0     0     0  too many errors
        gptid/8e7f5e12-2a5e-11e7-b0fa-60a44ccf9647  DEGRADED     0     0    17  too many errors
        gptid/241f097d-cc9e-11e3-b527-60a44ccf9647  ONLINE       0     0    32
        gptid/afd88b75-9009-11e7-9a28-60a44ccf9647  DEGRADED     0     0     0  too many errors
        gptid/98949ee9-2735-11e7-9bf4-60a44ccf9647  DEGRADED     0     0    10  too many errors


Both datavolume and the raidz2 vdev are degraded.
After I rebooted FreeNAS, the CKSUM counts on datavolume and raidz2 kept increasing every few seconds:

Code:
    NAME                                            STATE     READ WRITE CKSUM
    datavolume                                      DEGRADED     0     0   171
      raidz2-0                                      DEGRADED     0     0   342
        gptid/2200e743-cc9e-11e3-b527-60a44ccf9647  ONLINE       0     0     0 
        gptid/59cd44cd-cf7e-11e6-aab1-60a44ccf9647  DEGRADED     0     0     0  too many errors
        gptid/8e7f5e12-2a5e-11e7-b0fa-60a44ccf9647  DEGRADED     0     0     0  too many errors
        gptid/241f097d-cc9e-11e3-b527-60a44ccf9647  ONLINE       0     0     0
        gptid/afd88b75-9009-11e7-9a28-60a44ccf9647  DEGRADED     0     0     0  too many errors
        gptid/98949ee9-2735-11e7-9bf4-60a44ccf9647  DEGRADED     0     0    10  too many errors



I changed the SATA cables, which made no difference. I then used a different SATA controller (with different cables): no change. I also created a new ZFS pool with 4 SSDs, and that worked without any issues.
When the disks are connected, the CKSUM counts on the individual HDDs do not increase, but the CKSUM counts on datavolume and raidz2 increase steadily.
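
For anyone who wants to compare two zpool status outputs like the ones posted here, this is a minimal Python sketch that reports which CKSUM counters changed between two snapshots. The function names parse_cksum and cksum_delta are made up for illustration, the sample text is a shortened copy of the output in this post, and the sketch assumes the counters are printed as plain integers (large values that zpool abbreviates are not handled). Keep in mind that the counters reset on a reboot or a zpool clear, so a drop does not mean errors were repaired.

```python
import re

# One device row of `zpool status`: name, state, then READ/WRITE/CKSUM counters.
# Assumption: counters are plain integers, not abbreviated values.
ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)"
)

def parse_cksum(status_text):
    """Return {device_name: cksum_count} for every device row found."""
    counts = {}
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match:
            counts[match.group(1)] = int(match.group(5))
    return counts

def cksum_delta(before, after):
    """Return {device_name: change} for rows whose CKSUM counter changed."""
    old, new = parse_cksum(before), parse_cksum(after)
    return {
        name: new[name] - old[name]
        for name in new
        if name in old and new[name] != old[name]
    }

# Shortened copies of the two outputs posted in this thread.
first = """
NAME                                            STATE     READ WRITE CKSUM
    datavolume                                      DEGRADED     0     0   106
      raidz2-0                                      DEGRADED     0     0   212
        gptid/98949ee9-2735-11e7-9bf4-60a44ccf9647  DEGRADED     0     0    10  too many errors
"""
second = """
NAME                                            STATE     READ WRITE CKSUM
    datavolume                                      DEGRADED     0     0   171
      raidz2-0                                      DEGRADED     0     0   342
        gptid/98949ee9-2735-11e7-9bf4-60a44ccf9647  DEGRADED     0     0    10  too many errors
"""

for name, change in cksum_delta(first, second).items():
    print(f"{name}: {change:+d}")
```

With this sample only datavolume and raidz2-0 are reported, because the counter of the listed disk stayed at 10, which matches the behaviour described here.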


Conclusion: the cables and controller are working without any issues, so the problem really is in the pool itself.
I have a backup of most of my data, but not all, so I want to keep my data if possible.

What would be the best action plan to try to save my zpool data?
 
Joined
Jul 3, 2015
Messages
926
Doesn't look good, buddy.

Tell us a bit more about the crash and what mobo and RAM you were using and the system specs in general.
 

Rolfieo

It was the system board that died. I did something very stupid: I installed a PCIe card while the system was still connected to power. After that the computer never booted past the BIOS. I tried a lot of things, but eventually gave up.

I ordered this new system board because I already had the CPU and memory. Maybe in the future I will add ESXi as a hypervisor and pass PCI controllers through to FreeNAS.
That is something for later, not now. I just want to have the system operational again.

But the interesting part is that it's now working fine, and I did not do much.

I did delete 2 syslog files that were corrupted.

At a certain point, when I started the machine again, the pool was fully online and the CKSUM counts no longer increased. I don't know exactly what I did, but I do remember deleting those files. Syslog was using the corrupted files, and I think that explains the increasing CKSUM errors.

After that, I ran a scrub of the pool. It took about 8-12 hours and found a few errors on only one of the disks, but there was no data loss.

At the moment it's working, and I'm backing up the last of the data to my second FreeNAS at my parents' home.

After that I want to move the disks back to the onboard SATA controllers and see if it still works fine. I also want to run a long SMART test on all the disks and then a complete scrub again.
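
A rough sketch of that check plan in Python, in case it helps someone: it builds one smartctl long self-test command per disk plus a zpool scrub for the pool, and by default only prints them. The helper names plan_checks and run_checks and the device names /dev/ada0 and /dev/ada1 are assumptions for illustration; list the real devices first (for example with camcontrol devlist).

```python
import subprocess

def plan_checks(pool, disks):
    """Build the commands: a long SMART self-test per disk, then a pool scrub."""
    commands = [["smartctl", "-t", "long", disk] for disk in disks]
    commands.append(["zpool", "scrub", pool])
    return commands

def run_checks(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)

# Hypothetical device names; replace them with the disks in the pool.
run_checks(plan_checks("datavolume", ["/dev/ada0", "/dev/ada1"]))
```

Both the self-test and the scrub run in the background once started, so the results have to be read afterwards with smartctl -a on each disk and zpool status on the pool.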
I'm also thinking of simply destroying the pool and recreating it later, as my current FreeNAS install also has issues with the database. That could be because I restored the database on a newer version of FreeNAS; now I get all kinds of strange errors in the GUI.

But first I'll back up everything, then do the rest.

I still can't explain the issue, though. That is the strange part.

Could accessing the corrupted files increase the CKSUM count?
And if so, why were there so many errors in the first place? That is the question.
 

Rolfieo

Small update:

No issues so far. Everything is restored and working.

To be sure, I did a clean install of FreeNAS, made an extra backup of the data, then destroyed the volume and recreated it. It now works without any issues.
 