SwisherSweet
Contributor
- Joined
- May 13, 2017
- Messages
- 139
I had a drive fail, and it went OFFLINE in my backup1 pool. I replaced the drive using the procedure found in this forum, where I power down, remove old drive, power up, choose REPLACE, and the drive resilvered and the volume shows HEALTHY.
While I was waiting for my new drive to arrive in the mail, I started to see messages about "data corruption":
Code:
Device: /dev/ada4, ATA error count increased from 0 to 41
Device: /dev/ada4, 19 Currently unreadable (pending) sectors
The volume backup1 (ZFS) state is DEGRADED: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
Device: /dev/ada4, Self-Test Log error count increased from 0 to 1
Device: /dev/ada4, 19 Offline uncorrectable sectors
Device: /dev/ada4, unable to open device
After replacing the drive, I still get this error:
Code:
CRITICAL: Feb. 11, 2018, 7:33 p.m. - The volume backup1 (ZFS) state is ONLINE: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
I am running a SCRUB on backup1 now that the resilvering process is complete.
My FreeNAS server has been running great for about a year, and until now I hadn't had any data issues or messages about data loss.
But I am wondering:
- Why am I still getting an error message after replacing the failed drive?
- Is there really data corruption? If so, why?
- Since the volume it is complaining about is `backup1`, which holds only snapshots of my main data, shouldn't I be able to restore the affected data somehow?
- Shouldn't the system be able to replace a failed drive without data corruption? I don't understand what could have gone wrong. The data corruption messages appeared both just before and after I replaced the failed drive, and I run weekly scrubs of my data.
A little more setup information:
- Mac Pro, 2 x 6 Core 3.46GHz Xeons, 64GB ECC RAM
- FreeNAS-9.10.2-U6 (561f0d7a1)
- 7 x 3TB Toshiba drives in "primary" data pool in raidz2
- 5 x 2TB Seagate drives in "backup1" backup pool (externally attached) in raidz1
- 5 x 2TB Seagate drives in "backup2" backup pool (externally attached) in raidz1
I found this on github, but not sure it's relevant to my situation:
https://github.com/zfsonlinux/zfs/issues/3256
It appears that when the drive failed, data was corrupted. However, it is my understanding that the nature of ZFS is to prevent exactly this. Doesn't ZFS checksum and verify each write? Plus, with RAIDZ isn't that write effectively duplicated? I just don't see how a failed disk could cause data corruption with ZFS and FreeNAS.
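One way to see how a single failing disk can still leave unrecoverable data on a RAIDZ1 pool is to reduce parity to its simplest form. The sketch below is a toy single-parity (XOR) model, not the actual RAIDZ implementation: a stripe with one parity block can be rebuilt after exactly one erasure, but a failed drive plus a latent bad sector on a surviving drive is two erasures in the same stripe, which single parity cannot fix. That is why latent errors discovered while a pool is degraded (or resilvering) show up as permanent data errors.

```python
# Toy single-parity model of a RAIDZ1-style stripe (simplified to plain XOR).
from functools import reduce

def xor_parity(blocks):
    """Compute the parity block for a stripe of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def rebuild(stripe, missing):
    """Recover a single missing block by XOR-ing the survivors."""
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one erasure per stripe")
    surviving = [b for i, b in enumerate(stripe) if i not in missing]
    return xor_parity(surviving)

# Four data blocks plus one parity block: a 5-wide RAIDZ1 stripe, simplified.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
stripe = data + [xor_parity(data)]

# One failed drive: the lost block is fully recoverable from the other four.
assert rebuild(stripe, {2}) == b"CCCC"

# Failed drive PLUS a bad sector on a survivor: two erasures, no recovery.
try:
    rebuild(stripe, {2, 4})
except ValueError as err:
    print("unrecoverable:", err)
```

In this model, weekly scrubs matter because they surface latent bad sectors while the pool still has its full redundancy and can repair them; once a whole drive is already gone, any previously undetected bad sector on the remaining drives becomes that second, unrecoverable erasure.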