FreeNAS Boots but becomes unresponsive

Status
Not open for further replies.

Jorg Roper

Cadet
Joined
Mar 19, 2017
Messages
3
Hi

I installed this system about 6 months ago on brand-new hardware, and it has run fine since. Over the weekend, the friend I built it for called to say his network drives were offline. I remoted into his PC and couldn't reach FreeNAS via either the web interface or SSH: the web page wouldn't load, and SSH would prompt for a username and password but not get any further. I got him to power cycle the server; it came back up and the shares came back online. Great!

I logged in via SSH and looked at zpool status. It showed a few issues, on only 1 of the 2 hard drives, so I manually started a scrub and left it running, as it was going to take hours. This morning I got a call: same situation. Again, the web interface and SSH were unresponsive (beyond the username/password prompt in SSH). A power cycle brought it back up, and the web interface and SSH started working again. I was at work, so I only had time to confirm it was up and to grab the screenshot below. However, after a few hours the shares went offline again.
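For anyone who wants to script the "which drive has issues" check rather than eyeball it, here is a minimal Python sketch that reads the text printed by zpool status and lists every row of the config table that is not ONLINE or has a non-zero READ/WRITE/CKSUM counter. It assumes the usual NAME/STATE/READ/WRITE/CKSUM table layout. The function name find_problem_devices is hypothetical, and the sample text (pool "tank", ada0p2, ada1p2) is made up for illustration; it is not the output from the screenshot in this thread.

```python
# Sketch only: assumes the usual "NAME STATE READ WRITE CKSUM" table layout.
KNOWN_STATES = {"ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "UNAVAIL", "REMOVED"}

# Made-up sample for illustration; NOT the output from this thread.
SAMPLE = """\
  pool: tank
 state: DEGRADED
config:

    NAME        STATE     READ WRITE CKSUM
    tank        DEGRADED     0     0     0
      mirror-0  DEGRADED     0     0     0
        ada0p2  ONLINE       0     0     0
        ada1p2  UNAVAIL      0     0     0  cannot open

errors: No known data errors
"""


def find_problem_devices(status_text):
    """Return (name, state, read, write, cksum) for every unhealthy table row."""
    problems = []
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if not in_table:
            # The table starts right after the NAME/STATE header row.
            if fields[:2] == ["NAME", "STATE"]:
                in_table = True
            continue
        if not fields:
            # A blank line ends the table (rows after it are ignored).
            in_table = False
            continue
        if len(fields) < 5 or fields[1] not in KNOWN_STATES:
            continue
        name, state, read, write, cksum = fields[:5]
        if state != "ONLINE" or (read, write, cksum) != ("0", "0", "0"):
            problems.append((name, state, read, write, cksum))
    return problems


if __name__ == "__main__":
    for row in find_problem_devices(SAMPLE):
        print(row)
```

To run it against a real system you would feed it the captured output of zpool status instead of SAMPLE. Counters are compared as strings on purpose, since large values may be printed with a suffix rather than as plain integers.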

I'm at my friend's house now, and again the SMB shares are down, with no response via the web interface or SSH.

Any ideas on how to even diagnose this? If I could get in, I could at least look at the logs, but nothing is available following a reboot, and the system doesn't seem to stay up long enough for me to get at the logs while SSH is still working.

I'm going to try rebooting again now and see if I can find anything before SSH dies. I have tried searching but can't find solutions fitting this issue...

Thanks!


Hardware:
2 x WD 4TB Caviar Red 3.5" 64MB cache
16GB (2 x 8GB) Kingston KVR16E11/8I 8GB 1600MHz DDR3 ECC
HP ProLiant MicroServer Gen8 G1610T

OS: FreeNAS 9.something - whatever the latest was about 2 months ago when I last patched it. Can't get in to see!

Screenshot of zpool status: http://imgur.com/a/IM2S9
 

Jorg Roper

Update:

After hooking up a monitor and watching it boot and then almost immediately crash and restart (fortunately I had my phone filming the screen, so I could replay what happened), I could see errors on ada0. I powered down the system, removed one of the two 4TB hard drives, and booted again. The system came up normally with no errors. The web GUI and SSH now work, and the shares came online. I immediately backed up the config from System -> General.
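Spotting "errors on ada0" from a phone video works, but if you can capture console or log text at all, a few lines of Python can tally which disk the error lines mention. This is only a sketch under stated assumptions: the function name count_disk_mentions and the keyword list ("error", "timeout", "retry") are my own picks, and the sample lines in the usage note are placeholders, not log output from this server.

```python
import re
from collections import Counter

# Matches FreeBSD-style disk device names such as ada0 or da1.
DISK_RE = re.compile(r"\b(a?da\d+)\b")


def count_disk_mentions(lines, keywords=("error", "timeout", "retry")):
    """Count, per disk, the lines that mention it together with a keyword.

    The keyword list is an assumption for illustration; adjust it to
    whatever your own logs actually print.
    """
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        if not any(word in lowered for word in keywords):
            continue
        # set() so a disk named twice on one line is counted once.
        for device in set(DISK_RE.findall(line)):
            counts[device] += 1
    return counts


if __name__ == "__main__":
    # Placeholder lines for illustration; NOT from the server in this thread.
    example = [
        "(ada0:ahcich0:0:0:0): CAM status: ATA Status Error",
        "ada1: disk attached",
        "(ada0:ahcich0:0:0:0): Retrying command",
    ]
    print(count_disk_mentions(example).most_common())
```

A disk that dominates the tally is a reasonable first candidate to pull, which is the same conclusion reached here by watching the boot screen.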

Only 1 disk is now visible under Storage -> View Disks (duh!). https://doc.freenas.org/9.3/freenas_storage.html#replacing-a-failed-drive talks about taking the disk offline before removing it from the ZFS pool. I can't do that in my case, as the GUI won't stay up long enough with the failed drive installed. I guess I power down, add a new disk, and then use the GUI to extend onto the new disk?
 

Jorg Roper

Update:

I powered down the system and swapped the failed hard drive for a brand-new one of the same make, model, and size. It powered up and booted OK. Courtesy of a post in https://forums.freenas.org/index.php?threads/best-practices-for-drive-replacement.35359/ I figured out how to replace the failed drive.

For anyone else looking: go to Storage and select the volume that had the failed drive in it, then look at the bottom of the screen for the icon that looks like a piece of paper (it's the one on the right), which is "Volume Status". Click it, wait 5-10 seconds, and the screen will load. In my case it showed the volume and dataset as "degraded" because while one hard drive was "online", the other was "unavailable" (because I had pulled it out of the server).

I selected the unavailable drive and then clicked the "Replace" button at the bottom. It offers you the option to replace the failed drive with another; in this case I selected the one I had just added and hit OK. Wait 30 seconds and you start to see the resilver process running.

If you want to check on the resilver, go to Storage -> select the volume -> "Volume Status" (as above) and you will see how far it has progressed. At this point it's 29% complete and still going...
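If you would rather check progress from an SSH session than keep reloading the GUI, the same percentage appears in the scan section of zpool status. Below is a small Python sketch that pulls it out of that text. It assumes the scan section contains the phrases "resilver in progress" and "NN.NN% done". The name resilver_progress is hypothetical, and the sample text is made up to match the 29% figure above; it is not copied from this server.

```python
import re

# Assumes the scan section reports progress as e.g. "29.00% done".
PERCENT_DONE_RE = re.compile(r"(\d+(?:\.\d+)?)%\s+done")

# Made-up sample for illustration; NOT real output from this thread.
SAMPLE = """\
  pool: tank
 state: DEGRADED
  scan: resilver in progress since Sun Mar 19 20:00:00 2017
        1.00T scanned out of 3.00T at 100M/s, 5h49m to go
        300G resilvered, 29.00% done
"""


def resilver_progress(status_text):
    """Return the resilver percentage as a float, or None if none is running."""
    if "resilver in progress" not in status_text:
        return None
    match = PERCENT_DONE_RE.search(status_text)
    return float(match.group(1)) if match else None


if __name__ == "__main__":
    progress = resilver_progress(SAMPLE)
    if progress is None:
        print("no resilver in progress")
    else:
        print(f"resilver {progress:.2f}% done")
```

On a live system you would pass in the captured output of zpool status for your pool instead of SAMPLE. A return value of None means either no resilver is running or the text did not match the assumed format.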
 