Help! The Mirrored boot volume seems to have died.

Status
Not open for further replies.

geopfarth

Explorer
Joined
Apr 14, 2017
Messages
96
I got a notice after one of my scheduled SMART tests that one of the 2 thumb drives that make up the mirrored boot volume on my system had died. When I tried to access the FreeNAS GUI, I found it to be unresponsive. The web page simply would not open. I opened the console using the IPMI connection and I have a ton of errors showing on the screen, but I can't get to a prompt. The thumb drive that has the error is politely flashing a light, so I know which one needs to be replaced, but I can't find a way to gracefully shut down the system or even verify that I have the configuration saved off successfully.

Is there a way for me to gracefully shut down the server without access to the FreeNAS console? Can I issue the shutdown command from the IPMI without damaging the data on the array?
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
Are you still able to access your shares? My guess is no. When your boot volume dies, you've pretty much shut down, gracefully or not.

However, your boot volume should not be "dead" if only one stick is gone. Are you sure the one that's bad is the one that's flashing?
 

geopfarth

Explorer
Joined
Apr 14, 2017
Messages
96
I am not sure it is the one with the flashing light. Both the Plex and PlexPY applications on the server are up and running. I am pretty sure that at least one boot drive is functioning; I was just hoping that I would be able to issue a shutdown from FreeNAS itself.

I have noticed that I can't get to the shared folders on the server from any other computer on the network. I had shared out the media library so that my download drone could add movies and TV shows to the system.

These are the errors that I received:

1st:
Code:
The boot volume state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.

2nd:
Code:
The boot volume state is DEGRADED: One or more devices are faulted in response to persistent errors. Sufficient replicas exist for the pool to continue functioning in a degraded state.

3rd:
Code:
Checking status of zfs pools:
NAME           SIZE  ALLOC   FREE  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH    ALTROOT
Jails          109G  28.7G  80.3G         -   58%  26%  1.00x  ONLINE    /mnt
freenas-boot  14.9G  2.07G  12.8G         -     -  13%  1.00x  DEGRADED  -
tank            29T  16.3T  12.7T         -   15%  56%  1.00x  ONLINE    /mnt

  pool: freenas-boot
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: scrub repaired 0 in 0 days 00:01:50 with 0 errors on Thu Feb 15 03:46:50 2018
config:

        NAME            STATE     READ WRITE CKSUM
        freenas-boot    DEGRADED     0     0     0
          mirror-0      DEGRADED     0     0     0
            da8p2       FAULTED      0   242     0  too many errors
            da9p2       ONLINE       0     0     0

errors: No known data errors

-- End of daily output --

The thumb drives are in the back of the server. The thumb drive in the top USB port is the one that is flashing. I have more thumb drives on order and they should get here today, but I didn't have a duplicate thumb drive available to replace the bad one.
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
It looks like everything failed "gracefully" from the error messages. At first, your system detected an error but was able to correct it. However, additional write errors then caused the system to mark da8p2 as FAULTED, which left the mirror degraded, as would be expected.
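If you want to pull the failed member out of a status report without reading it by eye, here is a minimal sketch in Python. It assumes the report has the same layout as the daily output posted above; the function name `failed_devices` and the set of states treated as failures are my own choices for illustration, not anything FreeNAS provides.

```python
# States that mark an individual device as failed. DEGRADED is left out on
# purpose because the pool and mirror rows carry it as an aggregate state.
FAILED_STATES = {"FAULTED", "UNAVAIL", "REMOVED", "OFFLINE"}


def failed_devices(status_text):
    """Return (device, state, note) tuples for failed rows in the config table."""
    failed = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "config:":
            in_config = True
            continue
        if fields[0] == "errors:":
            in_config = False
            continue
        # Ignore everything outside the config table, and its header row.
        if not in_config or fields[0] == "NAME":
            continue
        # Rows look like: NAME STATE READ WRITE CKSUM [optional note]
        if len(fields) >= 5 and fields[1] in FAILED_STATES:
            failed.append((fields[0], fields[1], " ".join(fields[5:])))
    return failed


# Sample text copied from the report in this thread.
SAMPLE = """\
  pool: freenas-boot
 state: DEGRADED
config:

        NAME            STATE     READ WRITE CKSUM
        freenas-boot    DEGRADED     0     0     0
          mirror-0      DEGRADED     0     0     0
            da8p2       FAULTED      0   242     0  too many errors
            da9p2       ONLINE       0     0     0

errors: No known data errors
"""

for device, state, note in failed_devices(SAMPLE):
    print(device, state, note)  # prints: da8p2 FAULTED too many errors
```

On a live system you would feed it the text of `zpool status freenas-boot` instead of the pasted sample.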

Are you able to access your shares? Can you SSH in?
 

geopfarth

Explorer
Joined
Apr 14, 2017
Messages
96
No. Plex, PlexPY and FreeNAS are all refusing to let me SSH in.
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
At this point, I would issue the shutdown command in IPMI, and hope for the best.
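If the BMC's web page is awkward to reach, the same request can usually be sent from another machine with `ipmitool`. The sketch below only builds the command line; it assumes `ipmitool` is installed on the machine you run it from and that the BMC accepts the `lanplus` interface. The host address and user name are hypothetical placeholders, so substitute your own.

```python
import subprocess


def soft_shutdown_command(host, user):
    """Build an ipmitool command that asks the OS to shut down via ACPI.

    -E makes ipmitool read the password from the IPMI_PASSWORD environment
    variable, so it does not appear in the process list.
    """
    return [
        "ipmitool", "-I", "lanplus",
        "-H", host, "-U", user, "-E",
        "chassis", "power", "soft",
    ]


# Hypothetical BMC address and account, for illustration only.
command = soft_shutdown_command("192.0.2.10", "ADMIN")
print(" ".join(command))

# To actually send the request, uncomment the next line:
# subprocess.run(command, check=True)
```

`chassis power soft` is only a request to the operating system, so whether the shutdown completes cleanly still depends on how healthy the OS is at that moment.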
 

geopfarth

Explorer
Joined
Apr 14, 2017
Messages
96
I received my new flash drives today, so I shut down the host using the IPMI console, replaced the faulty flash drive and then rebooted. The system came up fine and it doesn't look like I lost anything. The boot status page showed a "degraded state" with one of the 2 drives unavailable.
upload_2018-2-20_12-44-34.png


Do I detach and then replace or just replace?
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
Just replace. Once it's done, then you can pull the old drive out.
 

geopfarth

Explorer
Joined
Apr 14, 2017
Messages
96
So, I don't think this is normal:

upload_2018-2-20_13-0-9.png


I highlighted da8p2 and clicked "Replace". This window pops up, but I can't select a member disk in the drop-down (nothing shows up).

Also of interest, da8p2 now shows up as "online":
upload_2018-2-20_13-22-32.png
 


Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
That is not normal. I would make sure you back up your config, and then run a scrub on your boot pool. Have you updated to 11.1U1?
 

geopfarth

Explorer
Joined
Apr 14, 2017
Messages
96
Yes
upload_2018-2-20_14-5-1.png
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
The flashing thumb drive was possibly the one that was still active.

I use a dd read from the good drive to tell which one is bad, as the read will activate that drive's activity light.
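For anyone who hasn't done this before, the idea is just to read a chunk of the device and throw the data away, along the lines of `dd if=/dev/da9 of=/dev/null bs=1m count=64`. Below is a rough Python equivalent as a sketch. The function name `read_and_discard` is made up for this example, and `/dev/da9` is assumed to be the whole disk behind the da9p2 partition shown in the status output; check your own device names before running it.

```python
def read_and_discard(path, total_bytes=64 * 1024 * 1024, block_size=1024 * 1024):
    """Read up to total_bytes from path and discard the data.

    Returns the number of bytes actually read. Keep total_bytes and
    block_size multiples of the sector size when reading a raw device.
    """
    done = 0
    # buffering=0 so that every read() call goes straight to the device.
    with open(path, "rb", buffering=0) as source:
        while done < total_bytes:
            chunk = source.read(min(block_size, total_bytes - done))
            if not chunk:  # reached the end of the device or file
                break
            done += len(chunk)
    return done


# Example (needs root, and the device name is an assumption):
# read_and_discard("/dev/da9")
```

While the read runs, the activity light on the drive being read should flicker, which tells you that stick is the one still answering.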

Shouldn’t have frozen out your UI.

But an IPMI shutdown should initiate a graceful shutdown, if one is possible.
 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
It would make your life much easier in the future if you replaced the USB drives with small SSDs. Massively faster and more reliable.
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
Yes, it's exactly the scenario the OP is recounting that got me there. Same sticks and possible confusion created by the flashing LEDs, too. I had a used laptop HDD on the shelf, stuck that in as a temporary (worked fine as a boot, slow but fine, still have it on hand as a spare), then got a used SSD on eBay. Never again a USB stick as boot device for FreeNAS for me.
 

geopfarth

Explorer
Joined
Apr 14, 2017
Messages
96
So I copied off the configuration and ran a scrub on the boot pool. Now da9p2 is showing as Faulted and the boot volume as degraded. I am still having the issue where the system will not let me designate the drive when I try to replace the disk. Is there a way to do this from the command line? Do I have any options on this?

I downloaded and installed the 11.1-U2 update. The system seemed to install it just fine and rebooted as normal. The Boot status page now looks like this:
upload_2018-2-21_11-27-36.png


I am still unable to select the drive name in the popup window when I click on the "Replace" button.
 

geopfarth

Explorer
Joined
Apr 14, 2017
Messages
96
Does anyone have any advice on how to get around this? Can I simply "Detach" the bad drive and attach a new one? I have other thumb drives that I could copy the config and OS to.
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
You should be able to do this. It's weird that you can't use the replace GUI window.
 

geopfarth

Explorer
Joined
Apr 14, 2017
Messages
96
Well, this is what I got back when I tried to detach:
Code:
Request Method: POST
Request URL: http://192.168.0.23/system/bootenv/pool/detach/10908009586187677159/
Software Version: FreeNAS-11.1-U2 (c636d1f4b)
Exception Type: ClientException
Exception Value:
[EINVAL] Failed to find vdev for 10908009586187677159
Exception Location: /usr/local/lib/python3.6/site-packages/middlewared/client/client.py in call, line 394
Server time: Thu, 22 Feb 2018 12:47:20 -0800

Not sure if that is good or bad.
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
That is not good. We're probably going to have to dive into the command line to get this solved.
 

geopfarth

Explorer
Joined
Apr 14, 2017
Messages
96
Well, I have 4 new USB drives, and I have the 11.1 ISO downloaded and burned onto another one in case we have to start from scratch. I have the configuration saved off to another machine. If you have a list of commands you would like me to run, I'm game! And I'm grateful for the help.
 