Correct procedure to replace a defective HD

Status
Not open for further replies.

luisvale

Dabbler
Joined
Mar 1, 2012
Messages
32
Hi all

I've seen several threads on the subject, but some questions still linger.

I'm running FreeNAS 8.0.4 x64 (10351), and on one of my vdevs I was getting the following error:

May 24 15:10:55 Fiona smartd[1558]: Device: /dev/ada1, 1 Currently unreadable (pending) sectors


Despite this error, zpool status still reports all drives of the RAID-Z1 as online.
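The smartd message above can be picked apart mechanically, which helps when watching a log for the same warning on other drives. The following is a minimal sketch, assuming the message keeps the exact format quoted in this post; the function name `parse_pending_sectors` is hypothetical. A pending-sector count comes from the drive's own SMART data, so it can appear while zpool status still lists the drive as online.

```python
import re

# Pattern for the smartd message format quoted in the post above.
# Assumption: other smartd messages may be worded differently and will not match.
PENDING_RE = re.compile(
    r"Device: (?P<device>/dev/[^,\s]+), "
    r"(?P<count>\d+) Currently unreadable \(pending\) sectors"
)


def parse_pending_sectors(line):
    """Return (device, pending_count), or None if the line does not match."""
    match = PENDING_RE.search(line)
    if match is None:
        return None
    return match.group("device"), int(match.group("count"))


# The log line from the post, unchanged.
line = (
    "May 24 15:10:55 Fiona smartd[1558]: Device: /dev/ada1, "
    "1 Currently unreadable (pending) sectors"
)
print(parse_pending_sectors(line))
```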


I managed to identify the drive physically, and as it was still under warranty from Western Digital I asked them for an advance replacement. So WD sends out a new drive and I have 30 days to return the old one.

I went to sections 6.3.9 (Replacing a Failed Drive) and 6.3.10 (Hot Swapping a ZFS Failed Drive) of the current manual, and that's where my questions begin.

I cannot have the old drive and the new drive physically connected to the server at the same time (all SATA ports are taken), but from the manual (6.3.9) I get the impression that both must be connected simultaneously.

Shutting down the server for maintenance like this is not a problem, so I believe that in this case (swapping a drive for a new one) I must follow the procedure in 6.3.10.

Am I thinking about this wrong? Is something missing? Any tips or hints on how to proceed? Or is it simply a matter of pressing the Replace button in the GUI and following the "music"?

Thanks in advance

Luis
 

Erwin

Dabbler
Joined
Sep 21, 2011
Messages
30
I did this successfully with FreeNAS-8.0.4-RELEASE_MULTIMEDIA-p1-x64 (11076). The manual was not really useful in my eyes.

1. Identify the defective disk. To get all the information you need, use for example smartctl -a /dev/ada3 (ada3 was the problem disk in my case).
2. Shut down the NAS.
3. Swap the problem disk for the new one.
4. Boot the NAS again.
5. In the GUI, do a View All Volumes (the volume was degraded, as expected) and, for this volume, a View Disks (this displays the removed disk with a Serial of "unknown"). The other disks (ada0, ada1, ada2, ada4, ada5...) are still displayed with a real Serial.
6. Select exactly this "unknown" entry (the last shadow of the dead, removed disk) and do a Replace. The pop-up window says "Replacing disk NONE" and offers "in-place (ada3) 3.0TB" as Member Disk; this is the new disk. After this, the new disk appeared in the list as ada3 with the new Serial. The old entry is still there.
7. The system automatically starts the resilvering process. Have a look at the zpool in the CLI: [root@freenas] ~# zpool status. It will be in state degraded, resilvering.
8. Without waiting until resilvering is done (it can take a day, depending on how much of the disk is used): in the GUI, do a "Detach" on the entry of the old disk (Serial "unknown").
9. Have a look at the zpool in the CLI again: [root@freenas] ~# zpool status. It will be in state online, resilvering.

Not counting the resilvering, this took me no more than 15 minutes, straightforward in the GUI.
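Steps 7 and 9 above both come down to reading two things from zpool status: the pool state and whether a resilver is running. Here is a minimal sketch of that check, assuming the output contains a `state:` line and the phrase "resilver in progress". The function name `pool_summary`, the pool name `tank`, and the sample text are hypothetical and hand-written for illustration, not captured from a real system.

```python
def pool_summary(status_text):
    """Return (state, resilvering) read from zpool status style text."""
    state = None
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("state:"):
            # Keep only the value after the first colon, e.g. "DEGRADED".
            state = line.split(":", 1)[1].strip()
            break
    # Assumption: a running resilver is reported with this exact phrase.
    resilvering = "resilver in progress" in status_text
    return state, resilvering


# Illustrative text only; real output has more lines and may differ by version.
sample = """\
  pool: tank
 state: DEGRADED
 scrub: resilver in progress
"""
print(pool_summary(sample))
```

Text saved from a real zpool status run could be passed to the same function. After the detach in step 8, Erwin reports the state changing to online while the resilver continues.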

erwin
 

Kimba

Dabbler
Joined
Feb 3, 2012
Messages
36
No, they don't need to be connected at the same time:

1. Replace the tired / old / dead hard drive with the new one.
2. Boot the machine and log into the FreeNAS web GUI; you will see the yellow caution thingy in the upper right corner.
3. Select Storage and click on the offending pool.
4. Click Edit on the offending drive and replace it with the new drive (from a drop-down, I believe).
5. Remove the old drive (via the web GUI), and it will automatically resilver the new drive.
6. After about 20 minutes or so, the yellow lighty thingy turns green.

or just follow this: http://www.freenas.org/images/resou...8.0.3_guide.html#__RefHeading__9510_322745209
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
The old failing drive should be *detached* BEFORE you shut down and remove it!

I've seen many problems arise here because that wasn't done.

Other than that, Erwin's instructions should be OK.

So step #8 should happen after step #1.
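To make the suggested reordering concrete, here is a small sketch that moves step 8 of Erwin's list to directly after step 1. The step texts are paraphrased from his post, and the helper `move_step` is hypothetical. One caveat: as Erwin wrote it, step 8 targets the "unknown" entry, which only shows up once the disk is gone, so under this ordering the detach would presumably be applied to the failing disk's entry while it is still connected.

```python
# Erwin's steps, paraphrased and keyed by his original numbering.
erwin_steps = {
    1: "Identify the defective disk (smartctl -a)",
    2: "Shut down the NAS",
    3: "Swap the problem disk for the new one",
    4: "Boot the NAS again",
    5: "View All Volumes, then View Disks",
    6: "Replace the old entry with the new disk",
    7: "Check zpool status (resilvering starts)",
    8: "Detach the old disk in the GUI",
    9: "Check zpool status again",
}


def move_step(order, step, after):
    """Return a new order with `step` placed directly after `after`."""
    remaining = [s for s in order if s != step]
    index = remaining.index(after) + 1
    return remaining[:index] + [step] + remaining[index:]


# ProtoSD's suggestion: step 8 should happen after step 1.
new_order = move_step(list(range(1, 10)), step=8, after=1)
for number in new_order:
    print(number, erwin_steps[number])
```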
 

luisvale

Dabbler
Joined
Mar 1, 2012
Messages
32
Thanks Kimba and Erwin

I believe we really need to not RTFM...

Seriously, the documentation, not only here but in several other areas, needs to be rewritten to make it clear.


ProtoSD, thanks.


Best regards
 

luisvale

Dabbler
Joined
Mar 1, 2012
Messages
32
:)

The problem with documentation for this kind of product is that, IMHO, the people who write the docs must be very close to the dev team. If a user tries to write documentation without knowing the reasoning behind certain choices in the software, the docs will be confusing and misleading.

I saw how it is done at the largest software maker in the world for more than 10 years... and even there the docs were sometimes (many times) not OK.
 