Degraded Pool; Volume UNAVAIL - What to do next

Status
Not open for further replies.

nathank1989

Contributor
Joined
Aug 29, 2016
Messages
103
So I have a production NAS running on an ASRock Rack C2550D4I with 16GB of Crucial DDR3 and these drives:
2 3TB Hitachi HDs
1 4TB Western Digital HD
2 2TB Western Digital HD
1 6TB Seagate HD
1 120GB Kingston SSD

I have 3 pools:
Code:
NAME           SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH    ALTROOT
SETV_Archive  7.25T  4.73T  2.52T         -    43%    65%  1.00x  DEGRADED  /mnt
SETV_Cloud    1.81T  2.26G  1.81T         -     2%     0%  1.00x  ONLINE    /mnt
WowzaVOD2BAK  3.62T   423G  3.21T         -     0%    11%  1.00x  ONLINE    /mnt
freenas-boot  14.9G  6.70G  8.18G         -      -    45%  1.00x  ONLINE    -


I got an email this morning:
Code:
The volume SETV_Archive state is DEGRADED: One or more devices could not be opened. Sufficient replicas exist for the pool to continue functioning in a degraded state.



Code:
root@setv-015-cloud:~ # zpool status SETV_Archive
  pool: SETV_Archive
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 0 days 09:29:05 with 0 errors on Sun Jun  3 09:29:10 2018
config:

        NAME                                            STATE     READ WRITE CKSUM
        SETV_Archive                                    DEGRADED     0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/a7fe554b-7b91-11e6-8fe6-d050991b6521  ONLINE       0     0     0
            gptid/a898ba6d-7b91-11e6-8fe6-d050991b6521  ONLINE       0     0     0
          mirror-1                                      DEGRADED     0     0     0
            14329082346761459697                        UNAVAIL      0     0     0  was /dev/gptid/95aeb18f-bbb7-11e7-af0e-d05099c2bc4f
            gptid/974d7470-bbb7-11e7-af0e-d05099c2bc4f  ONLINE       0     0     0
        cache
          gptid/a8d7e1e5-7b91-11e6-8fe6-d050991b6521    ONLINE       0     0     0

errors: No known data errors




Code:
root@setv-015-cloud:~ # glabel status
                                      Name  Status  Components
gptid/7b3a30e2-7bc8-11e6-9b8b-d050991b6521     N/A  da0p1
gptid/7b47abcc-7bc8-11e6-9b8b-d050991b6521     N/A  da0p2
gptid/973ad4f5-bbb7-11e7-af0e-d05099c2bc4f     N/A  ada0p1
gptid/974d7470-bbb7-11e7-af0e-d05099c2bc4f     N/A  ada0p2
gptid/38722602-11c3-11e8-bce2-d05099c2bc4f     N/A  ada1p1
gptid/387f84c0-11c3-11e8-bce2-d05099c2bc4f     N/A  ada1p2
gptid/a8d7e1e5-7b91-11e6-8fe6-d050991b6521     N/A  ada2p1
gptid/4dd9389b-7b92-11e6-8fe6-d050991b6521     N/A  ada3p1
gptid/4df0a430-7b92-11e6-8fe6-d050991b6521     N/A  ada3p2
gptid/a7f7999a-7b91-11e6-8fe6-d050991b6521     N/A  ada4p1
gptid/a7fe554b-7b91-11e6-8fe6-d050991b6521     N/A  ada4p2
gptid/4ec4991f-7b92-11e6-8fe6-d050991b6521     N/A  ada5p1
gptid/4ed6c50a-7b92-11e6-8fe6-d050991b6521     N/A  ada5p2
gptid/a8920629-7b91-11e6-8fe6-d050991b6521     N/A  ada6p1
gptid/a898ba6d-7b91-11e6-8fe6-d050991b6521     N/A  ada6p2



Code:
root@setv-015-cloud:~ # camcontrol devlist
<ST3000DM008-2DM166 CC26>          at scbus1 target 0 lun 0 (pass0,ada0)
<WDC WD40EZRZ-00GXCB0 80.00A80>    at scbus2 target 0 lun 0 (pass1,ada1)
<KINGSTON SV300S37A120G 603ABBF0>  at scbus5 target 0 lun 0 (pass2,ada2)
<Marvell Console 1.01>             at scbus9 target 0 lun 0 (pass3)
<ST2000DL001-9VT156 CC41>          at scbus10 target 0 lun 0 (pass4,ada3)
<HGST HDN726050ALE610 APGNT517>    at scbus11 target 0 lun 0 (pass5,ada4)
<ST2000DM001-1CH164 CC29>          at scbus12 target 0 lun 0 (pass6,ada5)
<HGST HDN726050ALE610 APGNT517>    at scbus13 target 0 lun 0 (pass7,ada6)
<PNY USB 2.0 FD 1100>              at scbus17 target 0 lun 0 (pass8,da0)




I know I am missing something because I cannot find /dev/gptid/95aeb18f-bbb7-11e7-af0e-d05099c2bc4f and don't know what to do from here.

I looked at other posts but couldn't find any clear information. I don't have extra hard drives at the moment to back up ~4.75TB of data, so how can I fix this? What went wrong, and how can I find the problem drive? The FreeNAS labels don't clearly help me determine which drive or partition is causing the issue.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Looks like your missing hard drive is not being recognized at all by FreeNAS. My advice is to identify all the online drives by serial number: run smartctl -a /dev/ada0 and write down the serial number, then do the same for the remaining drives (ada1, ada2, and so on) until you have all the serial numbers. Next, look at your physical drives to see which one is not listed. It also looks like you are missing partition 2 from ada2.

Have you rebooted? Shut down and power off (remove physical power), make sure the SATA connections are seated well, then power up and boot the computer. See if the problem still exists.
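
The serial-number survey suggested here can be scripted. A minimal sketch in Python, assuming the `camcontrol devlist` output from the first post is pasted in as text (whitespace normalized); the helper name `disk_devices` is hypothetical, and the sketch only prints the `smartctl` commands to run rather than running them:

```python
import re

# Output of `camcontrol devlist` as posted in the first post (whitespace normalized).
CAMCONTROL_DEVLIST = """\
<ST3000DM008-2DM166 CC26> at scbus1 target 0 lun 0 (pass0,ada0)
<WDC WD40EZRZ-00GXCB0 80.00A80> at scbus2 target 0 lun 0 (pass1,ada1)
<KINGSTON SV300S37A120G 603ABBF0> at scbus5 target 0 lun 0 (pass2,ada2)
<Marvell Console 1.01> at scbus9 target 0 lun 0 (pass3)
<ST2000DL001-9VT156 CC41> at scbus10 target 0 lun 0 (pass4,ada3)
<HGST HDN726050ALE610 APGNT517> at scbus11 target 0 lun 0 (pass5,ada4)
<ST2000DM001-1CH164 CC29> at scbus12 target 0 lun 0 (pass6,ada5)
<HGST HDN726050ALE610 APGNT517> at scbus13 target 0 lun 0 (pass7,ada6)
<PNY USB 2.0 FD 1100> at scbus17 target 0 lun 0 (pass8,da0)
"""


def disk_devices(devlist_text):
    """Return (device, model) pairs for every entry that exposes a disk node."""
    pattern = re.compile(r"<(?P<model>[^>]+)>.*\(pass\d+,(?P<dev>\w+)\)")
    found = []
    for line in devlist_text.splitlines():
        match = pattern.search(line)
        if match:  # entries such as "(pass3)" have no disk node and are skipped
            found.append((match.group("dev"), match.group("model")))
    return found


for dev, model in disk_devices(CAMCONTROL_DEVLIST):
    # Run each printed command on the NAS and note the "Serial Number" line.
    print(f"smartctl -a /dev/{dev}   # {model}")
```

Any physical drive whose serial number never shows up in that survey is the one the system is not seeing.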
 

nathank1989


I have rebooted the NAS and it still shows as UNAVAIL.

I will try running smartctl on each of the drives until I find an issue. So far it appears to me that every drive is connected and seen by FreeNAS.
 

nathank1989

So it does appear that one of the two 3TB drives is just not being recognized. I have to get on-site to see what is going on and whether a simple remove and replace works. I'm hoping I don't need to rebuild the zpool.

I am beating myself up over the lack of documentation. I built this 3-4 years ago and it has been running without intervention for months. I am wondering how I combined 2 3TB drives with 2 5TB drives and achieved a pool size of 7.2TB. Both 5TB drives are mirror-0 and both 3TB drives are mirror-1, so how does that work? I want to go back in time and beat myself silly to document the hows and whys lol
I feel like it would be a good idea to remove both 3TBs entirely and replace them with 2 5TB drives so both mirrors use the same type of drive. (This was a really budget-tight build.)
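
On the 7.2TB question above: a pool stripes across its mirror vdevs, each mirror contributes roughly the capacity of its smallest drive, and `zpool list` reports sizes in binary units. A rough check in Python, assuming nominal decimal drive sizes (5 TB and 3 TB) and ignoring partitioning and swap overhead, so the figures are only approximate:

```python
TB = 10**12   # decimal terabyte, as printed on drive labels
TIB = 2**40   # binary tebibyte, the unit behind the "T" in `zpool list`


def mirror_capacity(drive_sizes):
    """A mirror keeps one full copy per drive, so its smallest drive sets the limit."""
    return min(drive_sizes)


def pool_capacity(mirrors):
    """A pool of mirrors stripes across them, so their capacities add up."""
    return sum(mirror_capacity(m) for m in mirrors)


# mirror-0: two 5 TB drives, mirror-1: two 3 TB drives (nominal sizes)
archive = [[5 * TB, 5 * TB], [3 * TB, 3 * TB]]
size_tib = pool_capacity(archive) / TIB
print(f"current layout: {size_tib:.2f} TiB")  # close to the 7.25T shown by zpool list

# Swapping the 3 TB drives for 5 TB drives, one at a time:
one_swapped = [[5 * TB, 5 * TB], [5 * TB, 3 * TB]]   # still limited by the remaining 3 TB drive
both_swapped = [[5 * TB, 5 * TB], [5 * TB, 5 * TB]]
print(pool_capacity(one_swapped) // TB, "TB ->", pool_capacity(both_swapped) // TB, "TB")
```

So the extra space only becomes available once both drives in mirror-1 have been replaced.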
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
nathank1989 said:
I know I am missing something because I cannot find /dev/gptid/95aeb18f-bbb7-11e7-af0e-d05099c2bc4f and don't know what to do from here.

You might want to set up some of these scripts to monitor the health of your system; one of them will even produce a list of drives:

Github repository for FreeNAS scripts, including disk burnin
https://forums.freenas.org/index.ph...for-freenas-scripts-including-disk-burnin.28/

nathank1989 said:
So it does appear 1 of 2 3TB drives is just not being recognized. I have to get on-site to see what is going on and if a simple remove and replace works. I'm hoping I don't need to rebuild the zpool.

You should be able to simply resilver a new drive in place of the old. A pool of mirrors is not the most robust configuration.

nathank1989 said:
2 3TB Hitachi HDs
1 4TB Western Digital HD
2 2TB Western Digital HD
1 6TB Seagate HD
1 120Gb Kingston SSD

I am having trouble figuring out which of these drives go where. You said that you have 3 pools; are some of them single-drive pools with no redundancy?
 

nathank1989

So this is the setup

WowzaVOD2BAK is a single-drive pool, a secondary off-site backup of a server drive holding video files from another server. It's weird, I know, but it works for our use case.

SETV_Cloud has 2 drives: the two Western Digital 2.0TB drives as a 1:1 mirror with a capacity of 1.8TB.

SETV_Archive is supposed to have 4 drives. This was the weird configuration. If I remember correctly, it started as just the 2 5TB Seagates with a 4.0TB pool size, then we ran out of space and added 2 3TB drives we had on hand to expand the pool to 7.2TB.

Now, because of the two different sizes, I am wondering how difficult it would be to swap out the 2 3TBs for 2 brand new 5TBs to make the pool a total of 10TB with a 1:1 mirror.

If there is a more efficient method, let me know. Currently that pool sits at 4.7 TB and I really, really would prefer to not dump it to a single drive and rebuild a fresh pool.

Thanks for the advice, I am not as experienced with FreeNAS as I'd like to be.
 

Chris Moore

Your list of drives doesn't quite match the camcontrol devlist output you posted. Separately, I noticed that you only have a single USB memory stick (PNY USB 2.0 FD 1100) as your boot device. You might want to add a mirror device for it while you are working on the system, so a boot device failure doesn't take the system down. You might also want to ensure there are good backups for any drives that are not in a redundant set.
 

Chris Moore

You and I must have posted at the same time.

nathank1989 said:
SETV_Cloud has 2 drives; the two Western Digital 2.0TB drives as a 1:1 mirror with a capacity of 1.8TB

The camcontrol list shows there are no Western Digital 2TB drives. There are two Seagate 2TB drives:
Code:
<ST2000DL001-9VT156 CC41>          at scbus10 target 0 lun 0 (pass4,ada3)
   gptid/4dd9389b-7b92-11e6-8fe6-d050991b6521     N/A  ada3p1
   gptid/4df0a430-7b92-11e6-8fe6-d050991b6521     N/A  ada3p2

<ST2000DM001-1CH164 CC29>          at scbus12 target 0 lun 0 (pass6,ada5)
   gptid/4ec4991f-7b92-11e6-8fe6-d050991b6521     N/A  ada5p1
   gptid/4ed6c50a-7b92-11e6-8fe6-d050991b6521     N/A  ada5p2

nathank1989 said:
SETV_Archive is supposed to have 4 drives. This was the weird configuration. If I remember correctly it started as just the 2 5TB Seagates, with a 4.0TB pool size, then we ran out of space and added 2 3TB drives we had on hand to expand the pool to 7.2TB

Looking at your camcontrol list and the gptid info, these are the drives in the SETV_Archive pool:
Code:
<HGST HDN726050ALE610 APGNT517>    at scbus13 target 0 lun 0 (pass7,ada6)
   gptid/a8920629-7b91-11e6-8fe6-d050991b6521     N/A  ada6p1
   gptid/a898ba6d-7b91-11e6-8fe6-d050991b6521     N/A  ada6p2

<HGST HDN726050ALE610 APGNT517>    at scbus11 target 0 lun 0 (pass5,ada4)
   gptid/a7f7999a-7b91-11e6-8fe6-d050991b6521     N/A  ada4p1
   gptid/a7fe554b-7b91-11e6-8fe6-d050991b6521     N/A  ada4p2

<ST3000DM008-2DM166 CC26>          at scbus1 target 0 lun 0 (pass0,ada0)
   gptid/973ad4f5-bbb7-11e7-af0e-d05099c2bc4f     N/A  ada0p1
   gptid/974d7470-bbb7-11e7-af0e-d05099c2bc4f     N/A  ada0p2

The two HGST drives (I looked up the model) are 4TB and the ST3000DM008 is a 3TB Seagate model.
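
The cross-referencing above was done by hand; it can also be sketched in Python. This assumes the `glabel status` output and the member labels from `zpool status` are pasted in as text (only the lines relevant to SETV_Archive are reproduced), and the helper name `locate_members` is hypothetical:

```python
# Data lines from `glabel status`, trimmed to the partitions relevant to SETV_Archive.
GLABEL_STATUS = """\
gptid/974d7470-bbb7-11e7-af0e-d05099c2bc4f N/A ada0p2
gptid/a8d7e1e5-7b91-11e6-8fe6-d050991b6521 N/A ada2p1
gptid/a7fe554b-7b91-11e6-8fe6-d050991b6521 N/A ada4p2
gptid/a898ba6d-7b91-11e6-8fe6-d050991b6521 N/A ada6p2
"""

# Member labels of SETV_Archive from `zpool status`, including the "was ..." label.
POOL_MEMBERS = [
    "gptid/a7fe554b-7b91-11e6-8fe6-d050991b6521",  # mirror-0
    "gptid/a898ba6d-7b91-11e6-8fe6-d050991b6521",  # mirror-0
    "gptid/95aeb18f-bbb7-11e7-af0e-d05099c2bc4f",  # mirror-1, reported UNAVAIL
    "gptid/974d7470-bbb7-11e7-af0e-d05099c2bc4f",  # mirror-1
    "gptid/a8d7e1e5-7b91-11e6-8fe6-d050991b6521",  # cache
]


def locate_members(glabel_text, members):
    """Map each pool member label to its partition, or None if no disk provides it."""
    label_to_part = {}
    for line in glabel_text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            label, _status, part = fields
            label_to_part[label] = part
    return {member: label_to_part.get(member) for member in members}


for label, part in locate_members(GLABEL_STATUS, POOL_MEMBERS).items():
    print(f"{label}  ->  {part or 'MISSING (no disk presents this label)'}")
```

The one label that maps to nothing is the partner of ada0p2 in mirror-1, which is why only three data drives appear in the list above.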

nathank1989 said:
Now, because of the two varying sizes, I am wondering the difficulty in swapping out the 2 3TBs with 2 more brand new 5TBs to make the pool a total of 10TBs with a 1:1 mirror.

Based on the camcontrol list you provided, there are no 5TB drives in the system.
You can replace the UNAVAIL drive with a larger drive and then replace each of the other drives in the pool in turn, waiting for a resilver in between. Once complete, the pool should auto-expand.
 

nathank1989

Chris Moore said:
The camcontrol list shows there are no Western Digital 2TB drives. There are two Seagate 2TB drives.

Seagate basically is Western Digital, I think. A few drives we purchased that year from Amazon had either Seagate or WD branding on them. I'll have to see what they are branded with when I get on-site.

Chris Moore said:
The two HGST drives (I looked up the model) are 4TB and the ST3000DM008 is a 3TB Seagate model. There (based on the camcontrol list you provided) are not any 5TB drives in the system.

That doesn't add up to what I see in the GUI
(Again, I have to wait until I get on-site to visually confirm the drives)
 

Attachments

  • NAS.PNG (125.1 KB)

Chris Moore

Google gave me bad results. The two HGST drives, listed as "HDN726050ALE610 APGNT517", are 5TB Deskstar models.
They appear to have been discontinued, though, because they are not widely available.

Good luck.
 

nathank1989

So that's why I saw them on Amazon for $600 from resellers....

I'll be on-site in a few hours so I'll try a few things and report back
 

nathank1989

I got on-site and did some quick testing. The drive has completely failed; the server blade won't even detect it. To make sure it wasn't a power issue, I plugged a spare drive into the slot and booted FreeNAS to see whether it showed up in the list of disks, and it did.

I'll have to put the failed drive through our lab to see why it failed, but it appears to be mechanical in nature. The drive is a total loss, but the pool is intact. We're overnighting a new drive to replace it and hope to have the pool resilvered by next week, barring any issues.
 

Chris Moore

If there is budget for it and they want to move to larger drives, it would be better to switch to a 4-drive RAIDZ2. It would have the same relative capacity as a pair of mirrors, but any two drives could fail, instead of the situation where you lose the whole pool if the wrong two fail.
If there is interest in doing that, they would need to get the four new / larger drives, establish the new pool, and migrate the data over. With all the drives locally connected in the same chassis, it could be done in a matter of a few hours, depending on the amount of data involved.
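
The redundancy comparison above can be checked by brute force. A small sketch in Python, assuming four drives labelled A to D (hypothetical labels), that enumerates every two-drive failure for a pool of two mirrors and for a 4-drive RAIDZ2:

```python
from itertools import combinations

DRIVES = ["A", "B", "C", "D"]
MIRRORS = [{"A", "B"}, {"C", "D"}]   # pool of two 2-way mirrors


def mirrors_survive(failed):
    """A pool of mirrors is lost as soon as any one mirror loses all of its drives."""
    return all(not mirror <= failed for mirror in MIRRORS)


def raidz2_survive(failed):
    """RAIDZ2 tolerates the loss of any two drives in the vdev."""
    return len(failed) <= 2


pairs = [set(p) for p in combinations(DRIVES, 2)]
fatal_for_mirrors = [sorted(p) for p in pairs if not mirrors_survive(p)]
fatal_for_raidz2 = [sorted(p) for p in pairs if not raidz2_survive(p)]

print(f"two-drive failures considered: {len(pairs)}")
print(f"fatal for the mirror pool: {fatal_for_mirrors}")
print(f"fatal for RAIDZ2: {fatal_for_raidz2}")
```

With equal-size drives, both layouts give roughly two drives' worth of usable space, which is the "same relative capacity" mentioned above; the difference is that two of the six possible double failures destroy the mirror pool and none destroy the RAIDZ2.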
 

nathank1989

That’s not a bad idea. There will be a bit of a budget for upgrades in the near future.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
At this stage, your pool is just degraded, probably because of a failed drive. Replacing the drive will fix your pool.

If you have a failed 3TB in a mirror of a 3 and 5, you should consider replacing it with something bigger. When your other 3 fails, or you replace it anyway, you will be able to access the additional space.
 