Shouldn't ZFS/FreeNAS deal with bad sectors for me?

Status
Not open for further replies.

SwisherSweet

Contributor
Joined
May 13, 2017
Messages
139
Hi,

I got two CRITICAL alerts today from FreeNAS:
  • CRITICAL: July 11, 2017, 2:03 a.m. - Device: /dev/ada17, 8 Currently unreadable (pending) sectors
  • CRITICAL: July 11, 2017, 2:33 a.m. - Device: /dev/ada17, 2 Offline uncorrectable sectors
The pool in which this drive is located shows HEALTHY.
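For anyone matching these alerts to raw SMART data: the two messages correspond to SMART attributes 197 (Current_Pending_Sector) and 198 (Offline_Uncorrectable). Below is a minimal sketch, assuming the usual column layout of `smartctl -A` output, that pulls those raw counts out of the text. The `SAMPLE` text and the `sector_counts` helper are hypothetical and only for illustration; the 8 and 2 mirror the alerts above, and the other values are invented.

```python
import re

# Illustrative sample only: laid out like `smartctl -A /dev/ada17` output.
# The 8 and 2 mirror the alerts in this post; every other value is invented.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       2
"""

# SMART attribute IDs that describe bad or suspect sectors.
WATCHED_IDS = {5, 197, 198}


def sector_counts(smart_text):
    """Return {attribute_name: raw_count} for the watched attributes."""
    counts = {}
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and start with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        if int(fields[0]) in WATCHED_IDS:
            # The raw value is the 10th column; keep only its leading integer.
            match = re.match(r"\d+", fields[9])
            if match:
                counts[fields[1]] = int(match.group())
    return counts


print(sector_counts(SAMPLE))
```

On a real system the text would come from running `smartctl -A` against the drive rather than from a hard-coded string.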

Based on my research in this forum and elsewhere, it appears the drive has some bad sectors. However, many of the posts suggest that ZFS/FreeNAS can't really deal with this, and that one must use dd to write data to the bad sectors to force the drive to reallocate them. Those posts were pretty old, so I'm hoping that is no longer the case.

Does FreeNAS 9.10.2 handle this type of problem now? Or does one have to constantly check for, and manually deal with, bad sectors each time one or more pop up? If the answer is the latter, I'll be confounded, since I never had to deal with this with my Drobo. Perhaps the Drobo just showed the drive as bad when it encountered a bad sector... I don't know. But I was never asked to locate and deal with bad sectors myself.
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
Drives automatically remap bad blocks when an attempt is made to write to them. There is not much they can do when there is a read error, because there is no way to tell what the original data in the block was.

If the read error happened when the drive was running a self-test, ZFS is not notified. Run a scrub regularly, and if a block being used by ZFS went bad, it will detect it and notify you. The pool status will change to degraded, and the drive should be replaced. Or skip the middle steps and replace the drive now.

Non-ZFS NAS systems often don't bother the user when their data has gone bad. Usually this is because they have no way of detecting that data has gone bad. At a lower level, the drive is still in charge of remapping bad blocks as above.
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
Bad sectors aren't an immediate issue. Any data on a bad sector can be assumed to be destroyed, but ZFS should protect you via redundancy. When a write is attempted to a pending sector, the drive detects the problem, marks that location as bad so it is never written to again, and writes the data somewhere else. You can force this manually, or just live with the warning and let the sector resolve on its own at some point in the future.

All that said, bad sectors can be a warning sign that the drive is about to fail. My personal experience is that it doesn't indicate anything like an immediate failure, but multiple sectors popping up consistently is not a good sign. In that case it'd be advisable to replace the drive before it fails. The thinking here is that replacing sooner increases the chances that you complete the resilver before another drive fails for some other reason.

wblock said:
If the read error happened when the drive was running a self-test, ZFS is not notified. Run a scrub regularly, and if a block being used by ZFS went bad, it will detect it and notify you. The pool status will change to degraded, and the drive should be replaced. Or skip the middle steps and replace the drive now.
A bad sector will not immediately lead to a degraded pool. If the bad sector did indeed lead to a checksum error during a scrub, the affected data would be healed from redundancy and the error noted in the "zpool status" output.
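To make "noted in the zpool status output" concrete, here is a rough sketch that scans the device table of `zpool status` text for non-zero READ, WRITE, or CKSUM counters. The `SAMPLE_STATUS` text is hand-written and abbreviated, the pool and device names are invented, and `devices_with_errors` is a hypothetical helper, not part of FreeNAS.

```python
# Hand-written, abbreviated example in the shape of `zpool status` output.
# Pool name, layout, and the CKSUM count of 3 are invented for illustration.
SAMPLE_STATUS = """\
  pool: tank
 state: ONLINE
config:

    NAME        STATE     READ WRITE CKSUM
    tank        ONLINE       0     0     0
      raidz2-0  ONLINE       0     0     0
        ada16   ONLINE       0     0     0
        ada17   ONLINE       0     0     3

errors: No known data errors
"""


def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows with any non-zero counter."""
    flagged = {}
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_table = True  # device rows start after this header
            continue
        if not in_table:
            continue
        if not fields:
            break  # a blank line ends the device table
        if len(fields) >= 5:
            name, _state, read, write, cksum = fields[:5]
            # Counters are compared as text, so abbreviated values still register.
            if (read, write, cksum) != ("0", "0", "0"):
                flagged[name] = (read, write, cksum)
    return flagged


print(devices_with_errors(SAMPLE_STATUS))
```

Note that in this invented sample the pool state is still ONLINE even though one device shows checksum errors, which matches the point that a bad sector does not by itself degrade the pool.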

I agree though, the Drobo didn't report the bad sectors, but that's because they were either ignored or treated as a drive error that indicated the need for a replacement.
 

SwisherSweet

Contributor
Joined
May 13, 2017
Messages
139
Thanks for the replies. During a scrub, does FreeNAS reconstruct the data that used to be in the bad sectors using the data in the raidz?

Also, since I bought these drives just a few months ago, shouldn't I be able to RMA due to bad sector(s)?


 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
The bad sectors may not even be in use yet.

These alerts exist so that you know about issues early, which helps you avoid data loss in the future.

This could indicate a failing drive, or it could be a one-off.

If the drive is in warranty, replace it.

If not, you can ignore it, try to do something about it, replace the drive anyway, or wait and see whether the situation gets worse.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Stux said:
The bad sectors may not even be in use yet.
This is the key point. If your system had written data to a bad sector, and found it was bad when it later tried to read that data (like during a scrub), it would fix the data using redundancy and raise an error. But the disks' SMART code often finds bad sectors that the system hasn't tried to use yet, or automatically covers them.
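As a rough illustration of what "fix the data using redundancy" means, here is a deliberately simplified single-parity sketch. This is not ZFS's actual implementation (RAID-Z uses variable-width stripes and its own on-disk checksums); it only shows the general idea that an unreadable chunk can be rebuilt from the surviving chunks plus XOR parity, and that a checksum confirms the rebuilt data is right. All names and data here are made up.

```python
import hashlib
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity_of(chunks):
    """Single parity: the XOR of all data chunks."""
    return reduce(xor_bytes, chunks)


def rebuild(chunks, parity, missing_index):
    """Recover the chunk at missing_index from the other chunks and parity."""
    survivors = [c for i, c in enumerate(chunks) if i != missing_index]
    return reduce(xor_bytes, survivors, parity)


# Made-up data: one block split across three data disks plus one parity disk.
chunks = [b"AAAA", b"BBBB", b"CCCC"]
parity = parity_of(chunks)
checksum = hashlib.sha256(b"".join(chunks)).hexdigest()

# Simulate a bad sector: the chunk on the second disk can no longer be read.
damaged = list(chunks)
damaged[1] = None

# Rebuild the lost chunk, then verify the whole block against its checksum.
damaged[1] = rebuild(damaged, parity, 1)
repaired_ok = hashlib.sha256(b"".join(damaged)).hexdigest() == checksum
print(damaged[1], repaired_ok)
```

The checksum step is the part that matters for the discussion above: it is how the system knows the data read back was wrong in the first place, and that the rebuilt copy is correct.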

I don't trust the school of thought that has you trying to force a write to the bad sector, to force the disk to remap it. OTOH, a handful of bad sectors (i.e., in the low single digits) doesn't necessarily bother me a great deal. If your ada17 were a disk that I had had for some time, I wouldn't feel compelled to immediately replace it--but if it were fairly new and still under warranty, I probably would.
 