Issues after replacing a bad drive (FreeNAS 8.0.2-RELEASE)

Status
Not open for further replies.

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
Hi all!

Some days ago my NAS fell about 50 cm to the floor (due to a wall board that wasn't mounted properly - stupid thing) while I was accessing data on it.
Luckily just one HDD died completely (click of death) and I had some spare drives left. I tried replacing the hard disk via the GUI first, but that resulted in a big error about a string which couldn't be parsed correctly.
So I moved on to the CLI and followed the unofficial FAQ and the official FAQ.

First I removed the bad drive, put the replacement in at the same SATA port, and began the resilvering process. It completed successfully after ~15 hours.
Now my pool status is HEALTHY again, but there are some read errors on two other drives:
Code:
  pool: tank1
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
	attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
	using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 15h50m with 0 errors on Sat Dec 17 04:09:18 2011
config:

	NAME                                            STATE     READ WRITE CKSUM
	tank1                                           ONLINE       8     0     0
	  raidz1                                        ONLINE       8     0     0
	    gptid/9c4114e4-9143-11e0-820f-485d6094a16a  ONLINE       0     0     0
	    gptid/9cf38c6a-9143-11e0-820f-485d6094a16a  ONLINE       4     0     0
	    ada2                                        ONLINE       0     0     0  1.17T resilvered
	    gptid/9e389f0f-9143-11e0-820f-485d6094a16a  ONLINE       0     0     0
	    gptid/9efe636a-9143-11e0-820f-485d6094a16a  ONLINE       0     0     0
	    gptid/9fc77be7-9143-11e0-820f-485d6094a16a  ONLINE       4     0     0

errors: No known data errors
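
For readers who want to pull the affected devices out of a status report like the one above, here is a minimal sketch. It assumes the column layout shown (NAME, STATE, READ, WRITE, CKSUM) with plain integer counters; the function name `devices_with_errors` and the `SAMPLE_STATUS` constant are made up for this illustration, and counters printed in abbreviated form (such as `1.2K`) would not be matched.

```python
import re

# Config table copied from the status output above (indentation simplified).
SAMPLE_STATUS = """\
    NAME                                            STATE     READ WRITE CKSUM
    tank1                                           ONLINE       8     0     0
      raidz1                                        ONLINE       8     0     0
        gptid/9c4114e4-9143-11e0-820f-485d6094a16a  ONLINE       0     0     0
        gptid/9cf38c6a-9143-11e0-820f-485d6094a16a  ONLINE       4     0     0
        ada2                                        ONLINE       0     0     0  1.17T resilvered
        gptid/9e389f0f-9143-11e0-820f-485d6094a16a  ONLINE       0     0     0
        gptid/9efe636a-9143-11e0-820f-485d6094a16a  ONLINE       0     0     0
        gptid/9fc77be7-9143-11e0-820f-485d6094a16a  ONLINE       4     0     0
"""

# Device name, state, then three integer counters; trailing notes are ignored.
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)")


def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows with any nonzero counter."""
    result = {}
    for line in status_text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header row, blank lines, and prose do not match
        counts = tuple(int(match.group(i)) for i in (3, 4, 5))
        if any(counts):
            result[match.group(1)] = counts
    return result


if __name__ == "__main__":
    for name, (read, write, cksum) in devices_with_errors(SAMPLE_STATUS).items():
        print(f"{name}: read={read} write={write} cksum={cksum}")
```

Note that the pool and raidz1 rows are reported too, since their counters sum the errors of the devices below them.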


I'm getting the following SMART errors:
Code:
Dec 17 10:41:28 freenas smartd[3181]: Device: /dev/ada1, 5 Currently unreadable (pending) sectors
Dec 17 10:41:29 freenas smartd[3181]: Device: /dev/ada6, 4 Currently unreadable (pending) sectors
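
To keep an eye on whether the pending-sector counts grow over time, the smartd messages can be tallied per device. This is only a sketch that assumes the exact message wording shown above; `pending_sectors` and `SAMPLE_LOG` are names invented for the example, and other smartd versions may phrase the message differently.

```python
import re

# The two log lines quoted above.
SAMPLE_LOG = """\
Dec 17 10:41:28 freenas smartd[3181]: Device: /dev/ada1, 5 Currently unreadable (pending) sectors
Dec 17 10:41:29 freenas smartd[3181]: Device: /dev/ada6, 4 Currently unreadable (pending) sectors
"""

# Captures the device path and the reported number of pending sectors.
PENDING = re.compile(
    r"Device: (\S+), (\d+) Currently unreadable \(pending\) sectors"
)


def pending_sectors(log_text):
    """Return {device: count}, keeping the last count seen for each device."""
    counts = {}
    for line in log_text.splitlines():
        match = PENDING.search(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


if __name__ == "__main__":
    for device, count in sorted(pending_sectors(SAMPLE_LOG).items()):
        print(f"{device}: {count} pending sectors")
```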

So those drives are probably slightly damaged as well (they correspond to the two drives with read errors in the zpool report). Can't the drives just flag those sectors as damaged and ignore them in the future? Or do I need to replace them as well? Should I just clear the errors and scrub again?
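
If the clear-and-scrub route is chosen, the sequence might look like the sketch below. It only wraps the standard `zpool clear`, `zpool scrub`, and `zpool status -v` subcommands; the helper names are hypothetical, the pool name `tank1` comes from the status output above, and the script defaults to a dry run that just prints the commands. Whether clearing is the right call for drives with pending sectors is a separate question that this sketch does not answer.

```python
import subprocess


def clear_and_scrub_commands(pool):
    """Build the command sequence: reset counters, start a scrub, show status."""
    return [
        ["zpool", "clear", pool],         # reset the READ/WRITE/CKSUM counters
        ["zpool", "scrub", pool],         # start a scrub (runs in the background)
        ["zpool", "status", "-v", pool],  # show scrub progress and any new errors
    ]


def run(pool, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in clear_and_scrub_commands(pool):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run("tank1")  # dry run: prints the three commands without executing them
```

Because the scrub runs in the background, the status command would need to be repeated later to see the final result.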

Also, the name of the new drive is ada2 instead of the unique ID (probably because I typed zpool replace tank1 <oldid> ada2 to start the resilvering). In the GUI it says {serial}... instead of {uuid}.., see attached screenshot.
freenas-smart-id.jpg

So the two main questions which bug me:
  • What to do about the id of the new drive?
  • Do I need to replace the other two disks as well?

Thanks in advance for your help! The FAQs and other threads have already helped a lot :)

[edit]
After looking around in the bug reports, I found this ticket: https://support.freenas.org/ticket/744
It addresses the ID issue, and according to the ticket it just seems to be a cosmetic issue.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,995
Glad to see you were able to replace the drive without any real problem.
 

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
I'm very glad, too :)

Nobody has a recommendation about the bad sectors?
I'll probably try to run the WD Diagnostic Tools if there is no easy way to flag those sectors from within FreeBSD/FreeNAS. So far I've had no other problems with the sectors, but if data gets written to them, ZFS will probably have to do error correction all the time (not sure if that's a big issue, though?), so it would be better to flag them as defective.

Just want to avoid the downtime and effort for another two resilvering processes if possible ;)
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,995
In the old days you could mark sectors using the manufacturer's tools or through your own code. I would think the Extended Test in the WD tool you mentioned would give the results you desire. I would also generate a report, because too many sector errors in one general location is a sure sign that trouble is right down the road.
 