WebGui Stopped Working Replacing Drive

Status
Not open for further replies.

ss4johnny

Explorer
Joined
Nov 15, 2013
Messages
55
I think I've screwed up replacing a drive. Here's what I've done.

I got an email that a disk was degraded (I'm using ZFS2 on my Freenas). I went into the webgui, identified the disk (/dev/ada2) and physically replaced it with a new one. When I boot up the machine, the first time it failed to mount and start Freenas. So I turned it off and on again. Now it goes to the main screen, but I can't connect to the WebGui (which pisses me off to no end). I figure I need to replace the drive manually and then the WebGui will work again.

This is where things get a little tricky. I know I need to run zpool replace, but the zpool status is only giving me some id information, rather than simple things I recall from the gui (like /dev/ada2, I have a total of 6 drives so it goes ada0-5). Since I had restarted the drives, I couldn't get the id information from before, but there is an obviously new one showing up that is just a number with no letters (didn't work to use that as a target for zpool replace). So I then ran glabel status, which listed ada0 through ada4 with some others that I wasn't sure of. I figured that the machine considered the ada2 not there, so then it moved down what was ada3-5 to ada2-4. So then I ran

zpool replace [name of Volume] [some id number that the zpool status is using for the failed drive] /dev/ada5

It is re-silvering now, but I'm really unsure if I did the right thing of using ada5 rather than ada2. So I have two main questions, 1) did I do the right thing, 2) if I did the wrong thing, should I cancel the re-silver (is that possible?), or should I wait for it to finish and do something else. I have sufficient spare drives that after this resilver is done, I could replace another one properly.
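For anyone in the same spot, one way to tie the IDs in zpool status back to adaN device names is to read the Name and Components columns of glabel status. The following is a minimal Python sketch, assuming the pool members are listed as gptid labels (as FreeNAS usually sets them up). The sample text and the helper name map_labels_to_devices are made up for illustration and are not output from this system.

```python
# Hypothetical sample in the general shape of `glabel status` output.
# The labels and devices below are placeholders, not real data.
SAMPLE_GLABEL = """\
                                      Name  Status  Components
gptid/11111111-aaaa-11e3-8000-000000000001     N/A  ada0p2
gptid/22222222-bbbb-11e3-8000-000000000002     N/A  ada1p2
gptid/33333333-cccc-11e3-8000-000000000003     N/A  ada5p2
                            ufs/FreeNASs1a     N/A  da0s1a
"""


def map_labels_to_devices(glabel_output: str) -> dict:
    """Return {label: component} for every data row of `glabel status`."""
    mapping = {}
    for line in glabel_output.splitlines():
        fields = line.split()
        # Data rows have exactly three columns: Name, Status, Components.
        if len(fields) != 3 or fields[0] == "Name":
            continue
        label, _status, component = fields
        mapping[label] = component
    return mapping


if __name__ == "__main__":
    for label, device in sorted(map_labels_to_devices(SAMPLE_GLABEL).items()):
        print(f"{label} -> /dev/{device}")
```

On a real box the text would come from running glabel status at the console. A label that appears in zpool status but not in this mapping probably belongs to a disk that is no longer attached.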
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
It is re-silvering now, but I'm really unsure if I did the right thing of using ada5 rather than ada2. So I have two main questions, 1) did I do the right thing, 2) if I did the wrong thing, should I cancel the re-silver (is that possible?), or should I wait for it to finish and do something else. I have sufficient spare drives that after this resilver is done, I could replace another one properly.
This is going to be one of those hand-wavy posts ...

I'm not sure if you did the right thing, but you probably did what you had to do in the circumstances. I don't think canceling the re-silver makes any sense.

I don't think ZFS would have started re-silvering if you had picked the wrong drive. If the drive was already part of a pool, can't do that. If the drive was too small (your boot device), can't do that.

But you still have the issue of a non-working GUI to troubleshoot. I don't know why a failed drive would cause that, but maybe someone else has seen it. Are there other issues with your system, e.g. not meeting minimum requirements for RAM or similar? Maybe you'll get lucky and it will come back after the re-silver. If not, you might consider a clean install of FreeNAS followed by a restore of your backed up config file.
 

ss4johnny

Explorer
Joined
Nov 15, 2013
Messages
55
Thanks for the reply. I had been using the machine for a while without issues. Plenty of RAM etc. However, I have had the issue of the GUI not working before, and I think I did a clean install that time. At this point I'll just hope for the best and check on the machine tomorrow.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
List specific hardware specs like the forum rules say to. Also, you didn't do it correctly because you didn't use the GUI. You are most likely going to have to rebuild your pool from scratch to get it stable.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Yeah, I don't have the same sentiment as Robert has. You did it from the CLI. You can be incredibly stupid and do things like put the same disk twice in the same vdev, the same disk in two different vdevs, etc.

Resilvering cannot be "cancelled". It will auto-restart if you remount the pool.

Unfortunately, you are basically committed to this, for better or worse. You're also taking risks by thinking the CLI is the way to do business. While I hope you don't lose your data, if your next post is "my pool won't mount on reboot" I'm not going to bother to respond because you didn't follow the manual to begin with.

There's also no such thing as ZFS2, so you might want to actually use proper terminology when you are asking a question.
 

ss4johnny

Explorer
Joined
Nov 15, 2013
Messages
55
It finished re-silvering today. It was showing 6 drives online. The non-working drive was still there. I tried to detach it with zpool detach [x] where x is whatever information was in the zpool status screen for that drive, but it didn't work. So I tried restarting it. I got some Status ATA errors that it kept saying retrying, but eventually it said it stopped retrying and took me to the main FreeNas screen. I ran zpool status -v and now it has 5 (rather than six) online devices and the faulted one is still there. It was as if the restart removed something. However, the ada5 from the zpool replace is listed there as online (wasn't before). So now I'm not sure what's going on. When I run camcontrol devlist, ada5 is a 4TB drive, but I'm not sure if it is the right 4TB drive (there's also one listed on ada4).

I'm not sure on my next steps now. Before when there were 6 online drives, I was pretty confident in running zpool detach, but it was weird that that didn't work. Now that there are 5, I'm not sure that makes sense. zpool status also tells me I can run zpool clear to remove transient errors, but I'm not sure if I should do that either.
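Counting online members by eye is error-prone, so here is a rough Python sketch of pulling the device states out of zpool status text. It is not a substitute for posting the raw output. The pool name tank, the labels, and the whole sample are invented for illustration, and the helper device_states is hypothetical.

```python
# Invented sample in the general shape of `zpool status` output for a
# six-disk RAIDZ2 with one missing member. Not data from this thread.
SAMPLE_STATUS = """\
  pool: tank
 state: DEGRADED
config:

    NAME                        STATE     READ WRITE CKSUM
    tank                        DEGRADED     0     0     0
      raidz2-0                  DEGRADED     0     0     0
        gptid/aaaa0001          ONLINE       0     0     0
        gptid/aaaa0002          ONLINE       0     0     0
        9876543210123456789     UNAVAIL      0     0     0  was /dev/gptid/aaaa0003
        gptid/aaaa0004          ONLINE       0     0     0
        gptid/aaaa0005          ONLINE       0     0     0
        gptid/aaaa0006          ONLINE       0     0     0

errors: No known data errors
"""

KNOWN_STATES = {"ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "UNAVAIL", "REMOVED"}


def device_states(status_output: str) -> list:
    """Return (name, state) pairs from the config table of `zpool status`."""
    states = []
    in_table = False
    for line in status_output.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_table = True  # the table starts right after this header row
            continue
        if in_table:
            if not fields:
                break  # a blank line ends the table
            if len(fields) >= 2 and fields[1] in KNOWN_STATES:
                states.append((fields[0], fields[1]))
    return states


if __name__ == "__main__":
    for name, state in device_states(SAMPLE_STATUS):
        if state != "ONLINE":
            print(f"{name}: {state}")
```

The list includes the pool and vdev rows as well as the disks, so a degraded pool shows up at every level it affects.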

@SweetAndLow I had provided what I thought were the most relevant details, sorry if that was not sufficient. If I have to rebuild from scratch, wouldn't that mean I lose all my data?

Processor: Intel Pentium Dual Core G2030 (3GHZ)
Motherboard Supermicro MBD-X9SCM-F-O LGA 1155
RAM: 32GB of ECC RAM
Hard Disks: 6 Western Digital (4 3TB Greens, 2 4TB Reds; one of the Reds is new, replacing a Green drive)
I currently have two ethernet cables connected to the motherboard, through a switch, and into a router.

@cyberjock I've read the manual section on replacing drives, though perhaps not the whole manual. I've replaced drives before using the GUI. I wanted to do it through the GUI. The GUI stopped working the moment I switched out my faulted drive. That's the problem (and it has happened before).

The weirdest part is that the FreeNas console was telling me to go to an IP address that the machine is pretty clearly not connected to. When I went into my router today, I noticed that I had a wired device attached on a different IP address than usual. This address takes me to a SuperMicro log in page (and I only have the one SuperMicro board, so I know this is my NAS). I got instructions to get into this through the motherboard manual, but I'm not sure it gets me anything to open it up. The console redirection wasn't working.

I had confused RAIDZ2 with ZFS2. Sorry. But you're right, one big concern I had when I ran the zpool replace was that I would mess something up. Robert Trevellyan seemed to suggest that my worries were a bit overblown, but now you've got me worried again.

EDIT: I was just thinking that a probable reason why the GUI stopped working is that I had 6 drives in 6 slots. So I had to remove one of the drives to put a new one in, which I assume the GUI didn't like (for whatever mysterious reasons). I would guess if I have 5 drives in 6 slots, one fails, then I add a new one, it would still work. Then I could remove the relevant one as appropriate. If this is the case, it should either be fixed or communicated to people better.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Using the zpool replace did mess up your pool. Your theory on why the GUI stopped working is incorrect. My theory is that your NAS was configured with a DHCP address, and when you rebooted it to replace the drive it got a new IP address while you were still connecting to the old one. You are most likely going to need to back up your data, rebuild your pool, and read the manual cover to cover twice.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
The weirdest part is that the FreeNas console was telling me to go to an IP address that the machine is pretty clearly not connected to. When I went into my router today, I noticed that I had a wired device attached on a different IP address than usual. This address takes me to a SuperMicro log in page (and I only have the one SuperMicro board, so I know this is my NAS). I got instructions to get into this through the motherboard manual, but I'm not sure it gets me anything to open it up. The console redirection wasn't working.
To me, this looks like a symptom of some kind of conflict between your motherboard's IPMI interface and the FreeNAS GUI - probably with a dose of DHCP mixed in as SweetAndLow suggests.
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
The weirdest part is that the FreeNas console was telling me to go to an IP address that the machine is pretty clearly not connected to.
This is why your GUI looks like it is broken. It likely isn't broken at all. You need to configure dhcp correctly, or set the interface up manually. For troubleshooting give it a manual IP outside the dhcp range on your router. That gives you an exact address to hit. The console won't lie or get something wrong. Also skip IPMI for now and use physical access to keep things simple.
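To make the "manual IP outside the dhcp range" advice concrete, here is a small Python sketch that checks whether a candidate static address sits outside the router's DHCP pool but still on the same subnet. All addresses in it are placeholder assumptions, not values taken from this system, and the function names are made up for the example.

```python
import ipaddress


def in_dhcp_pool(address: str, pool_start: str, pool_end: str) -> bool:
    """True if address falls inside the inclusive DHCP range."""
    addr = ipaddress.ip_address(address)
    return ipaddress.ip_address(pool_start) <= addr <= ipaddress.ip_address(pool_end)


def on_subnet(address: str, network: str) -> bool:
    """True if address belongs to the given network, e.g. 192.168.1.0/24."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(network)


def is_safe_static(address: str, network: str, pool_start: str, pool_end: str) -> bool:
    """A static address should be on the LAN but not handed out by DHCP."""
    return on_subnet(address, network) and not in_dhcp_pool(address, pool_start, pool_end)


if __name__ == "__main__":
    # Placeholder values: a /24 LAN with a DHCP pool of .100 to .200.
    lan, start, end = "192.168.1.0/24", "192.168.1.100", "192.168.1.200"
    for candidate in ("192.168.1.50", "192.168.1.150", "10.0.0.5"):
        print(candidate, is_safe_static(candidate, lan, start, end))
```

This only checks the arithmetic. It cannot tell whether another device (the IPMI interface, for instance) has already claimed the address.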

The guys on the "broken network" or conflict track are all over this. Last time you broke your GUI, the reinstall reset to the default network parameters which typically work, and got you back up. You just experienced, likely, the same network issue again. It has absolutely nothing to do with your pool disks or what you may have plugged in to the system. The trigger was the reboot on device change.

I doubt you've fubar'd your pool at this stage. But who really knows. Fix the network problem. Log in to the GUI. Post the details of the pool. The trick here is to just relax and tackle one thing at a time. If you feed this board good info, they will give good advice.

A possible shortcut would be to remove the existing USB install and set it aside. Run a clean install (hoping the default network settings will work as they did before). Auto-import the pool. Log in to the GUI and see what's up. Haven't seen you list what version of FreeNAS you are running.

Don't describe the output of 'zpool status'; post the data in code tags. We can tell instantly what is going on... while the descriptions are nebulous.

I suppose I'm an optimist, but I'm not scared for you yet. System and data are separate for a reason. 1 failed drive in a raidz2 array is not a cause for concern.

Good luck, you'll get there.
 

ss4johnny

Explorer
Joined
Nov 15, 2013
Messages
55
Thanks for the reassurance mjws00, and the optimism. I didn't have much luck with the DHCP suggestion. I had thought everything previously was correctly set up, as I haven't had a problem reaching it before. I tried a few things. The one I was most convinced would work was this: the DHCP range on the router went up to 192.168.1.254, so I changed it to end at 192.168.1.253 and then set the FreeNAS machine to 192.168.1.254, but that didn't work.

Last time I had the problem with not being able to access the pool, I think I had used a clean install on a new pool like you had suggested. I think it's coming to that again. I had wanted to avoid it.

I think I have a pretty recent version of FreeNAS, upgraded it late last year. It's easier to check that in the GUI, and I've powered off my machine for now (I'm going to do the USB thing tomorrow after work).
 

ss4johnny

Explorer
Joined
Nov 15, 2013
Messages
55
With fresh installs of FreeNAS, I was able to get into the GUI. Auto-importing the pool worked, though it did about another hour's worth of re-silvering before being perfectly fine. I'm still having some lingering issues connecting to the GUI. Not initially, but when I play around with settings, which leads me to believe it's my own damn fault. Now that I have a better idea of what's not working, I think I can get it more stable when I work on it later today.

One lingering concern, which I could post somewhere else if that's appropriate.

I'm starting completely over with new settings. However, my jails, including a rather large MySQL server, are still on the machine. Obviously, none of these are showing up in the Plugins or Jails section, only in the Volume and Dataset parts of the Storage tab. Is there any way to get them back how they were before? Or should I remove the datasets and start them all over as well?
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
I'm starting completely over with new settings. However, my jails, including a rather large MySQL server, are still on the machine. Obviously, none of these are showing up in the Plugins or Jails section, only in the Volume and Dataset parts of the Storage tab. Is there any way to get them back how they were before? Or should I remove the datasets and start them all over as well?
You should be able to get your jails back by going to Jails Configuration and setting Jail Root correctly. Doesn't seem to be as straightforward with plugins.
 