SOLVED How to replace the drive in Freenas 8.04?

Status
Not open for further replies.

benjy23

Dabbler
Joined
Jun 3, 2011
Messages
21
What if I just restored the system to how it was before I pulled out the old HD, the supposedly problematic one?
 
gcooper

Guest
Well, this is where I believe you went wrong to begin with: you should have offlined the bad disk before swapping it with the new one. Don't swap it back now!

The process that I remember working is:
Code:
# If swapoff doesn't work and there's a magic integer in place of the device,
# you need to reboot now to avoid eventual kernel panics.
swapoff <old-disk>p1
zpool replace <pool> <old-id-or-device> <new-device>
zpool offline <pool> <old-id-or-device>
reboot
zpool detach <pool> <old-id-or-device>


Once zfsd is available and fully functional, hotswap-capable hardware won't require the reboot step.
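
Before running swapoff, it may help to confirm which partition of the old disk is actually in use as swap. The following is only a minimal sketch: the function name active_swap_on_disk and the disk name ada3 are placeholders, and it assumes the usual FreeBSD swapinfo layout in which each swap device appears in the first column as /dev/<disk>p<N>.

```shell
# Sketch: list the active swap devices that live on one disk.
# Reads `swapinfo` output on stdin; the disk name is passed as $1.
active_swap_on_disk() {
    disk="$1"
    # Swap lines start with the device node, e.g. "/dev/ada3p1 2097152 0 ...".
    # Match on the prefix "/dev/<disk>p" so that ada1 does not match ada10.
    awk -v d="/dev/${disk}p" 'index($1, d) == 1 { print $1 }'
}

# On the NAS itself (not executed here):
#   swapinfo | active_swap_on_disk ada3
```

Whatever device this prints is the one to hand to swapoff. If it prints nothing, no swap is active on that disk.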
 

benjy23

Dabbler
Joined
Jun 3, 2011
Messages
21
Thanks Gcooper.
However, does that mean I have to have the old and new HD plugged in at once? Because it looks like it.
If that's the case, I don't have enough SATA ports...
 
gcooper

Guest
What you'll need to do then (in effect) is something like this:

Code:
swapoff <old-disk>p1
zpool offline <pool> <old-id-or-device>
# power off the machine
# attach the new disk
# power on the machine
# add the new drive as a hot spare of <pool> in the GUI
zpool replace <pool> <old-id> <new-device>p2
zpool detach <pool> <old-id>
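
The p1 and p2 suffixes in these steps are GPT partition numbers, so it can be worth checking which partition is swap and which is ZFS on a given disk rather than assuming. The helper below is a hedged sketch: partition_index is a made-up name, and it assumes the standard `gpart show` layout where the third column is the partition index and the fourth is the partition type.

```shell
# Sketch: print the partition index of a given GPT type.
# Reads `gpart show <disk>` output on stdin; the type is passed as $1.
partition_index() {
    ptype="$1"    # e.g. freebsd-swap or freebsd-zfs
    # Partition rows look like: "128  4194304  1  freebsd-swap  (2.0G)".
    awk -v t="$ptype" '$4 == t { print $3 }'
}

# On the NAS itself (not executed here):
#   gpart show ada3 | partition_index freebsd-swap
#   gpart show ada3 | partition_index freebsd-zfs
```

If the first call prints 1 and the second prints 2, the swap device is <disk>p1 and the ZFS member is <disk>p2.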
 

stuom

Dabbler
Joined
Jan 12, 2012
Messages
10
I am in exactly the same situation:
[root@freenas] ~# zpool status
  pool: tank
 state: DEGRADED
 scrub: none requested
config:

    NAME                        STATE     READ WRITE CKSUM
    tank                        DEGRADED     0     0     0
      raidz1                    DEGRADED     0     0     0
        ada6                    ONLINE       0     0     0
        replacing               DEGRADED     0     0     0
          8275006482299000461   UNAVAIL      0     0     0  was /dev/ada0/old
          ada1                  ONLINE       0     0     0
        ada2                    ONLINE       0     0     0
        ada5                    ONLINE       0     0     0
        ada3p2                  ONLINE       0     0     0
        ada4                    ONLINE       0     0     0

errors: No known data errors

I had a 6 x 1 TB raidz array with one disk UNAVAIL and another showing SMART errors. So I powered FreeNAS down, swapped the UNAVAIL disk for a new one, and executed
zpool replace ada3
FreeNAS replaced the disk, and after that all disks as well as the pool were ONLINE.
I then powered it down again, swapped the other disk, and ran zpool replace on it.
After resilvering, the pool did not come back ONLINE but stayed in the DEGRADED state shown above.

Detaching ada0/old or the long numeric ID doesn't work.
I get
cannot detach: no valid replicas
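
When the old device node is gone, the numeric ID shown in zpool status is the name ZFS still knows the member by. As a rough sketch only (unavail_members is a hypothetical helper, and it assumes device rows of the form NAME STATE READ WRITE CKSUM as in the output above), the leftover members can be pulled out of the status text like this:

```shell
# Sketch: print the names of pool members whose state is UNAVAIL.
# Reads `zpool status` output on stdin.
unavail_members() {
    # Device rows have the state in the second column.
    awk '$2 == "UNAVAIL" { print $1 }'
}

# On the NAS itself (not executed here):
#   zpool status tank | unavail_members
```

This only identifies the stale entry. It does not by itself explain or fix the "no valid replicas" error.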
 

benjy23

Dabbler
Joined
Jun 3, 2011
Messages
21
Just thought I'd update my status rather than leave things hanging. I later ran into a problem where I could see my files on the NAS but could not transfer any of them out.
Then I checked the disks through the web GUI, and although they were listed, 2 out of 6 of them somehow weren't recognized.
I then unplugged all the hard drives, plugged them all back in, and VOILA, it started to work again, but in a degraded state. Whether it was resilvering or not, I didn't check.

To me, what was important was getting my files off a NAS that had become unreliable in my mind. Don't get me wrong, FreeNAS gave me a great solution that lasted well over 2 years, but I think the hardware was probably just getting unreliable, and swapping drives isn't something simple to do through the web GUI.

I bought an off-the-shelf solution. Painful as it is, I hope it saves me time in future so that I don't have to DIY so much. I thank this community for trying to help me with my issue, but alas, I think I'm going to stop my own DIY, especially when things can get so complex that I can't diagnose the problem. I will say this though: I never knew about some of the features FreeNAS has until I bought an off-the-shelf solution, link aggregation being one of them. FreeNAS truly is pretty powerful, but it also uses lots of electricity :p

Thanks again. Maybe one day I'll come back to FreeNAS, but I don't think that day will be too soon.

P.S. Thanks Protosd and Gcooper :)
 

dstwins

Cadet
Joined
Apr 22, 2012
Messages
8
I have the same problem. It seems the instructions in the guide are incomplete.

What is interesting is that two drives failed on me in my configuration. I have 4 x 2TB drives (zpool name: Thing2) and 6 x 1TB drives (zpool name: Thing1). For Thing2 I was able to pop in another drive, run the "Replace" command from the GUI, and presto, 5 hours later everything was perfect. For Thing1, however, I keep getting middleware errors about the drive being too small, even though it's a 1TB drive. All the drives use 4K sectors. So I'm doing it via the CLI, but there are a lot of conflicting opinions on the "right way" to do things.

So far, I've followed the steps in the 8.0.3 guide (though I was running 8.0.4p1; I switched to 8.2.0-BETA3 and it has the same problems). Running zpool replace Thing1 <long numerical value of old drive> ada5 works after a very long resilvering process, but after that the CLI detach doesn't work, since there is no "old" anymore. I can do a detach from the GUI, but I remember reading that mixing and matching CLI and GUI commands is a sure way to break things.
 

Erwin

Dabbler
Joined
Sep 21, 2011
Messages
30
I followed the thread with a lot of interest, because this is the most important part of the whole story: maintaining the redundant system over time. You have to accept that disks will break over time, but you cannot accept that the repair process fails.

Just my personal remarks:

Don't wait until a real problem occurs to learn the workflow. It is easy to set up a mini FreeNAS system with lots of drives in VMware (or another virtualization platform). Simulate a disk problem and try to repair it. And document what you have learned for the day when you have to fix your production unit.

Speaking of documentation: the FreeNAS guide is not very detailed in describing a disk failure in the chapter "Replacing a failed Drive". And the later chapter in the appendix, "How do I replace a bad drive?", is, in my eyes, complete bullshit, as it lists commands which will not work at all, at least not on a FreeNAS device.

But back to your 6-drive zpool: I am running a similar configuration, but with 6 x 3TB WD30EZRX drives. And ZFS is a killer feature. I say this after years of experience with different types and configurations of "traditional" RAID systems. So I decided to set up a raidz2, to be able to replace a disk without losing redundancy. I purchased the drives (antediluvian, meaning before the big flood) from different dealers. One of the drives vibrated more and ran warmer than the others. After 6 months, this unit was showing more and more Offline_Uncorrectable and Current_Pending_Sector counts. So I ordered a replacement unit (advance replacement by WD). The unit I got was a bit different (WD3009FYPX, apparently the brand new enterprise-grade 3 TB drive, which was not listed on the WD website), but it had exactly the same size in blocks and bytes.

I am using FreeNAS-8.0.4-RELEASE_MULTIMEDIA-p1-x64 (11076)

To get all of this information, use, for example, smartctl -a /dev/ada3 (ada3 was my problem disk).
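
The two counters mentioned above can be picked out of the attribute table so they are easy to compare from one check to the next. This is a small sketch under assumptions: smart_watch is an invented name, and it relies on the usual ten-column smartctl attribute table, where the attribute name is the second column and the raw value is the tenth.

```shell
# Sketch: print the raw values of the two sector-health attributes.
# Reads the smartctl attribute table on stdin.
smart_watch() {
    awk '$2 == "Current_Pending_Sector" || $2 == "Offline_Uncorrectable" {
        print $2 "=" $10    # column 10 is RAW_VALUE
    }'
}

# On the NAS itself (not executed here):
#   smartctl -A /dev/ada3 | smart_watch
```

A raw value that keeps climbing between checks is the kind of trend that prompted the replacement described here.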

Then I disabled the active services, like CIFS and AppleTalk (as my daughter's MacBook is doing TM backups the whole day..)

I shut down the NAS.

I replaced the problem disk with the new one.

I booted the NAS again.

In the GUI, I did a View All Volumes (degraded, as expected) and, for this volume, a View Disks (displaying the removed disk with a Serial of "unknown"). The other disks, ada0, ada1, ada2, ada4 and ada5, were still displayed with a real Serial.

I selected exactly this "Unknown" entry (the last shadow of the dead and removed disk) and did a Replace. It then offered "in-place (ada3) 3.0TB))" as Member Disk. After this, the new disk appeared in the list as ada3 with the new Serial. The old entry was still there.

I had a look at the zpool in the CLI:
[root@freenas] ~# zpool status
  pool: ZFS6x3T
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h1m, 0.07% done, 27h9m to go
config:

    NAME                                            STATE     READ WRITE CKSUM
    ZFS6x3T                                         DEGRADED     0     0     0
      raidz2                                        DEGRADED     0     0     0
        gptid/4a6f4bd2-e16b-11e0-8df7-f46d04d89638  ONLINE       0     0     0
        gptid/4b4370a8-e16b-11e0-8df7-f46d04d89638  ONLINE       0     0     0
        gptid/4c19e405-e16b-11e0-8df7-f46d04d89638  ONLINE       0     0     0
        replacing                                   DEGRADED     0     0     0
          12902692225261382052                      UNAVAIL      0     0     0  was /dev/ada3p2/old
          ada3p2                                    ONLINE       0     0     0  1.85G resilvered
        gptid/4dc98083-e16b-11e0-8df7-f46d04d89638  ONLINE       0     0     0
        gptid/4ea2e01f-e16b-11e0-8df7-f46d04d89638  ONLINE       0     0     0

errors: No known data errors
[root@freenas] ~#

Then, in the GUI, I did a "Detach" on the entry for the old disk.

Then I looked at the zpool in the CLI again:
[root@freenas] ~# zpool status
  pool: ZFS6x3T
 state: ONLINE
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h2m, 0.15% done, 24h28m to go
config:

    NAME                                            STATE     READ WRITE CKSUM
    ZFS6x3T                                         ONLINE       0     0     0
      raidz2                                        ONLINE       0     0     0
        gptid/4a6f4bd2-e16b-11e0-8df7-f46d04d89638  ONLINE       0     0     0
        gptid/4b4370a8-e16b-11e0-8df7-f46d04d89638  ONLINE       0     0     0
        gptid/4c19e405-e16b-11e0-8df7-f46d04d89638  ONLINE       0     0     0
        ada3p2                                      ONLINE       0     0     0  3.93G resilvered
        gptid/4dc98083-e16b-11e0-8df7-f46d04d89638  ONLINE       0     0     0
        gptid/4ea2e01f-e16b-11e0-8df7-f46d04d89638  ONLINE       0     0     0

errors: No known data errors
[root@freenas] ~#

Now I have to wait until the resilvering process is finished. The zpool is already ONLINE again. The new member will stay listed under the short name ada3p2 rather than a gptid, but this is not an issue at all.
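
For anyone who wants to watch the resilver without reading the whole status output each time, the percentage can be extracted from the scrub line. This is only a sketch: resilver_percent is a hypothetical helper, and it assumes the exact wording of the scrub line shown in the output above, which other ZFS versions may phrase differently.

```shell
# Sketch: print the "% done" figure from a resilver progress line.
# Reads `zpool status` output on stdin and prints nothing if no resilver is running.
resilver_percent() {
    sed -n 's/.*resilver in progress.* \([0-9.]*\)% done.*/\1/p'
}

# On the NAS itself (not executed here):
#   zpool status ZFS6x3T | resilver_percent
```

Running it every few minutes gives a rough feel for whether the estimated time to go is realistic.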

Not counting the resilvering itself, this took me no more than 15 minutes. What I wanted to say is that replacing a bad disk seems to work in a straightforward way in the GUI.

erwin
 

fisheater

Explorer
Joined
Jun 29, 2011
Messages
53
Hi,

I was reading this thread and realized I was unsure what my swap size is. Two questions: 1. Can I find that info in the web GUI? 2. If not, how do I telnet in and check it via the command line?

I would like to know if there is going to be a more complex problem if a drive fails, before it happens. With my level of understanding, it would be tricky enough to swap a drive with the current method of replacement.

Thanks in advance,

FE
 

gpsguy

Active Member
Joined
Jan 22, 2012
Messages
4,472
In order to get to the command line from another machine, you'll want to enable SSH (under services).

If you're using Windows, you can use the PuTTY client to connect to your FreeNAS box.

I believe that by default it's 2 GB per disk. From the command line, run "swapinfo" to get the info.
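
To turn the block counts that swapinfo reports into a per-disk size, something like the following could be used. It is a sketch under assumptions: swap_sizes_gib is an invented name, and it expects sizes in 1K blocks in the second column, which is what `swapinfo -k` prints.

```shell
# Sketch: print each swap device with its size in GiB.
# Reads `swapinfo -k` output on stdin (sizes in 1K blocks).
swap_sizes_gib() {
    # Skip the header and the Total row; 1048576 one-KiB blocks make 1 GiB.
    awk '$1 ~ /^\/dev\// { printf "%s %.1f\n", $1, $2 / 1048576 }'
}

# On the NAS itself (not executed here):
#   swapinfo -k | swap_sizes_gib
```

Each output line then shows one swap partition and its size, which is the per-disk figure being asked about.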


edited: added swapinfo command
 

fisheater

Explorer
Joined
Jun 29, 2011
Messages
53
Thanks for the info. I managed to find the size of my swap in the GUI. Here is how, for fellow SSH refugees.

Log in to the web GUI -> System -> Settings -> 'Advanced' tab. The 5th line down reads “Swap size on each drive in GiB, affects new disks only. Setting this to 0 disables swap creation completely (STRONGLY DISCOURAGED).” Mine is set to 2.

I tried to log in via PuTTY on my Linux desktop but was unable to get past the login. I tried admin:GUI password, GUI login:GUI password, and admin:freenas. No luck.

Thanks!

FE
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
The GUI login "admin" is strictly a GUI account. You'd need to create a new user to log in with SSH. That user can also be called "admin", because it's separate from the GUI username.

The place you found in the GUI shows the swap size setting, which affects new disks only. It doesn't tell you what your existing swap sizes are. Once 8.2 is released you won't need to worry about PuTTY, since you'll be able to open a Shell from the GUI and run commands like "swapinfo".
 