ZFS Pool Degraded after GUI Update to 8.2 : Please Help me Understand the Error Info!

Status
Not open for further replies.

xtracold
Dabbler · Joined Jul 27, 2011 · Messages: 10
Hi there,

So I updated via the GUI to v8.2 last night. It all looked to have gone well, but then I noticed the alert was yellow, so I clicked on it and the message is:

"WARNING: The volume NAS_1_2 (ZFS) status is UNKNOWN: One or more devices could not be opened. Sufficient replicas exist for the pool to continue functioning in a degraded state. Attach the missing device and online it using 'zpool online'."

My system is two 1TB drives configured to mirror each other; they run inside a VM on a Linux Mint workstation. They have run for months quite happily until now...

Rather than panic (which I am on the edge of), I browsed around the GUI to see what information I could gather on the problem.

The volume status is listed as follows

[screenshot volStatus.jpg: GUI volume status, showing ada2p2 ONLINE and the other mirror member unavailable]

This would say to me that ada2p2 is working fine and the problem is with the other disk. I struggle to believe the coincidence of it failing during the upgrade, but I'll run with that theory for now.

From searching around the web, it seems "zpool status" and "gpart show" would help guide me to the problem disk and a path to a resolution.

Here is the output of zpool status

Code:
[root@freenas ~]# zpool status                                                  
  pool: NAS_1_2                                                                 
 state: DEGRADED                                                                
status: One or more devices could not be opened.  Sufficient replicas exist for 
        the pool to continue functioning in a degraded state.                   
action: Attach the missing device and online it using 'zpool online'.           
   see: http://www.sun.com/msg/ZFS-8000-2Q                                      
 scrub: scrub in progress for 0h33m, 40.98% done, 0h48m to go                   
config:                                                                         
                                                                                
        NAME                                            STATE     READ WRITE CKSUM
        NAS_1_2                                         DEGRADED     0     0     0
          mirror                                        DEGRADED     0     0     0
            5644546525308100890                         UNAVAIL      0     0     0  was /dev/gptid/5f444b11-32c0-11e1-abe8-080027430ca7
            gptid/5fc15fbc-32c0-11e1-abe8-080027430ca7  ONLINE       0     0     0


This seems consistent with the GUI information.

Here is the output from gpart show

Code:
[root@freenas ~]# gpart show                                                    
=>     63  6291369  ada0  MBR  (3.0G)                                           
       63  1930257     1  freebsd  (943M)                                       
  1930320       63        - free -  (32K)                                       
  1930383  1930257     2  freebsd  [active]  (943M)                             
  3860640     3024     3  freebsd  (1.5M)                                       
  3863664    41328     4  freebsd  (20M)                                        
  3904992  2386440        - free -  (1.1G)                                      
                                                                                
=>        34  1953519935  ada2  GPT  (932G) [CORRUPT]                           
          34          94        - free -  (47K)                                 
         128     4194304     1  freebsd-swap  (2.0G)                            
     4194432  1949325537     2  freebsd-zfs  (930G)                             
                                                                                
=>      0  1930257  ada0s1  BSD  (943M)                                         
        0       16          - free -  (8.0K)                                    
       16  1930241       1  !0  (943M)                                          
                                                                                
=>      0  1930257  ada0s2  BSD  (943M)                                         
        0       16          - free -  (8.0K)                                    
       16  1930241       1  !0  (943M)                                          


I don't fully understand what this is telling me. Is it saying ada2 is corrupt and is therefore the problem disk? Doesn't that conflict with the zpool information?

Any help or guidance you can give me is much appreciated. If I can locate the real problem disk I would be open to removing it from the pool and trying to reinstate it. I assume that if it were recognised I would be able to format it and start cleanly, with the mirror rebuilding against it.
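For reference, the recovery path I'm imagining maps onto commands something like these (a sketch only, based on the zpool status output above; `<new-device>` is a placeholder for whatever replacement disk or partition ends up being used):

```shell
# If the original disk comes back, bring it online and let it resync:
zpool online NAS_1_2 5644546525308100890

# Otherwise, replace the UNAVAIL member (identified by its numeric GUID)
# with a fresh device; ZFS resilvers the mirror onto it automatically:
zpool replace NAS_1_2 5644546525308100890 <new-device>

# Watch resilver progress:
zpool status NAS_1_2
```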

Thanks again for any help, as you can likely tell the ZFS pools etc are all quite new to me.

Jamie
 

cyberjock
Inactive Account · Joined Mar 25, 2012 · Messages: 19,526
Since you are running in a VM, you should look at the vmdk files and see if they are available. You have an extra layer of potential issues because you are running inside a virtual machine. I'd recommend you back up the data on your FreeNAS server, if it is important, before trying to troubleshoot the issue.

The vmdk file could also be corrupt, in which case the solution is to make a new vmdk and resilver again. Of course, figuring out which vmdk is good and which is bad is hard to do if there aren't any blatant error messages pointing to the exact vmdk file.

If you are using direct disk mode (I forget the exact name for it) you obviously won't be looking for vmdk files and should start looking at your disks. I know that in Windows, direct disk mode doesn't work as well in practice as it does in theory. All sorts of nasty stuff can happen because Windows will continually offer to format the disk for you (which of course you don't want to do).

It's really not recommended that users who are new to ZFS pools use VMs for anything except experimenting/testing, because the extra layer of virtualization can cause you to lose data if you don't know exactly what you are doing.
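For what it's worth, VirtualBox's version of direct disk mode is "raw disk" access, set up roughly like this (a sketch only; the filename and /dev/sdb are placeholders, and the VirtualBox user needs read/write access to the device):

```shell
# Create a small .vmdk descriptor that points at a whole physical disk
# instead of a disk-image file (VirtualBox "raw disk" access):
VBoxManage internalcommands createrawvmdk \
    -filename ~/nas-disk1.vmdk \
    -rawdisk /dev/sdb
```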
 

paleoN
Wizard · Joined Apr 22, 2012 · Messages: 1,403
You forgot 2 useful commands:
Code:
zpool status -v

camcontrol devlist

gpart show

glabel status


Oh, and if you are unsure, go with what the CLI tells you.
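To connect those outputs: `glabel status` maps the gptid labels that `zpool status` reports onto device nodes. A minimal sketch of reading that mapping (the sample output below is illustrative, modelled on the labels earlier in the thread; on a live system you would pipe the real `glabel status` instead):

```shell
# Illustrative sample of `glabel status` output (placeholder data):
sample='                                      Name  Status  Components
gptid/5f444b11-32c0-11e1-abe8-080027430ca7     N/A  ada1p2
gptid/5fc15fbc-32c0-11e1-abe8-080027430ca7     N/A  ada2p2'

# Skip the header line and print each gptid label next to its device node,
# so labels from `zpool status` can be matched to physical disks:
echo "$sample" | awk 'NR > 1 { print $1, "->", $3 }'
```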
 

xtracold
Dabbler · Joined Jul 27, 2011 · Messages: 10
paleoN said:
> You forgot 2 useful commands:
> Code:
> zpool status -v
> camcontrol devlist
> gpart show
> glabel status
>
> Oh, and if you are unsure, go with what the CLI tells you.

Thanks, I'll try those additional commands out.

I have since removed the vmdks and rebooted to try to identify the disk causing trouble. The GUI information never changed, and in fact I no longer expect it to: the SATA controllers apparently re-index, so ada1p2 etc. will always be misleading - as you say, stick to the CLI information! I also found the following info on labelling the disks to remove doubt, which I will follow in future: http://forums.freebsd.org/showthread.php?t=28883.
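The labelling approach from that FreeBSD thread is roughly this (a sketch; the label name and device are placeholders, and the disk must not be in use when labelled):

```shell
# Write a GEOM label onto the disk (placeholder name/device):
glabel label -v nas_disk1 /dev/ada1
# The disk is then always reachable as /dev/label/nas_disk1, regardless
# of how the controller renumbers ada* devices across reboots.
```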

Right now, I am cloning my virtual machines to increase the disk size. On trying to replace the disks, FreeNAS kept hanging, complaining about disk space issues. It seems my .vdi was at a fixed 3GB. I will increase it to 6GB and continue my effort to resolve this issue.

In fairness, I do agree that the VirtualBox setup adds an unnecessary layer of complexity. I have been contemplating building a dedicated box for some time, so I might follow that route.
 

xtracold
Dabbler · Joined Jul 27, 2011 · Messages: 10
So over the weekend I got the system to a point where it has recovered. The first thing I did was to try to import the volumes from the disks into a clean install of 8.2 running on an 8GB USB stick (no VM to contend with). The disks were not even recognised on this install.

So I reinstalled my system entirely: a clean Linux Mint 13 install, the latest VirtualBox, and FreeNAS 8.2 in a fresh virtual machine. I recreated the .vmdks for my data disks and navigated to the GUI to import the volumes. They were immediately picked up and all my data was accessible, but the volume was again degraded.

So I scrubbed the volume, then from the terminal executed "zpool status". The response was that everything was online but there was a reported error. So I ran "zpool clear" to get rid of that error flag; "zpool status" now shows everything is good, BUT the GUI still shows the amber alert occasionally.
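In command form, the sequence above was essentially:

```shell
zpool scrub NAS_1_2    # re-read and verify every block in the pool
zpool status NAS_1_2   # wait for the scrub to finish; note any errors
zpool clear NAS_1_2    # reset the pool's error counters
zpool status NAS_1_2   # should now report the pool healthy
```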

As of last night, "zpool status" was returning no issues with the volumes, the alert was showing green, and everything was healthy. But I fully expect it to go amber at some point, as I have not seen consistency between the GUI and the "zpool status" reports.

So, two questions remain; if anyone has input on them, it would be much appreciated:

1. Why does a vmdk present a disk differently from direct disk access? What are the differences? What does VirtualBox do with a vmdk such that the disk is presented properly to the virtual machine?

2. Why does the GUI alert not show consistency with the command-line "zpool status"?

Finally... I will never run the GUI upgrade again; it screwed with my discs somehow. I will follow the slower, more structured approach of a clean install, import config, and reimport volumes manually. Oh... and make sure I back up first :)
 

paleoN
Wizard · Joined Apr 22, 2012 · Messages: 1,403
xtracold said:
> 1. Why does a vmdk present a disk differently from direct disk access? What are the differences? What does VirtualBox do with a vmdk such that the disk is presented properly to the virtual machine?

This is a VirtualBox question; perhaps it's best to ask in a VirtualBox forum? Personally, I have no idea.

xtracold said:
> 2. Why does the GUI alert not show consistency with the command-line "zpool status"?

There is a time delay between the GUI and the CLI.

xtracold said:
> Finally... I will never run the GUI upgrade again; it screwed with my discs somehow. I will follow the slower, more structured approach of a clean install, import config, and reimport volumes manually. Oh... and make sure I back up first :)

I doubt it screwed with your discs, but anything is possible. IMHO, this is the "best" way to upgrade and how I personally do it. Plus, it's simple to do with VirtualBox. There is no need to reimport the volumes: when you import the config, it already knows about your existing volumes.
 