Degraded Volume, not sure which drive

Status
Not open for further replies.
Joined
Jul 3, 2015
Messages
926
Ok buddy, well best of luck and I hope you figure it out.
 

g4m3r7ag

Dabbler
Joined
Nov 25, 2017
Messages
31
Back to the original question. Is the spare showing as faulted in my initial post because it is currently resilvering? Once the resilver is done, can I just run an extended SMART test on the drive that FreeNAS dropped from the mirror? If it passes, how would I go about getting it back into the pool and returning my spare to being a spare? I believe the drive is not actually failing; rather, the HBA overheated and incurred write errors, which made FreeNAS think the drive was failing and drop it from the pool. The spare is currently logging an occasional write error during the resilver, and I am worried FreeNAS will drop the spare before the resilver finishes. I find it highly unlikely that all of my drives started failing simultaneously after 6 months of use; they began showing ATA error counts but no SMART failures.
 
Joined
Jul 3, 2015
Messages
926
Honestly it looks odd. This is what one of my systems looked like earlier in the year when a drive failed and one of the spares kicked in.

EDIT: I tell a lie, this was after I replaced the failed drive.

Code:
NAME                                              STATE     READ WRITE CKSUM
tank                                              ONLINE       0     0     0
  raidz2-0                                        ONLINE       0     0     0
    gptid/c88f6323-3a0f-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/c8ffad58-3a0f-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/c973c524-3a0f-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/c9e95c86-3a0f-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/ca592c43-3a0f-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/cacf9d80-3a0f-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
  raidz2-1                                        ONLINE       0     0     0
    gptid/20b6e5c7-3a10-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/212d601f-3a10-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/21a15bce-3a10-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/22140f72-3a10-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/22918479-3a10-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/2307ece0-3a10-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
  raidz2-2                                        ONLINE       0     0     0
    gptid/6b6640b6-3a10-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/6bd87b34-3a10-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/6c4dc1f0-3a10-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/6cbba93e-3a10-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/6d30a80c-3a10-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/7668798e-654e-11e6-b9bf-0cc47aa992a4    ONLINE       0     0     0
  raidz2-3                                        ONLINE       0     0     0
    gptid/a4109ce9-1b69-11e7-b2b2-0cc47aa992a4    ONLINE       0     0     0
    spare-1                                       ONLINE       0     0     0
      gptid/c480a7ed-f12f-11e7-94cc-0cc47aa992a4  ONLINE       0     0     0  (resilvering)
      gptid/66a91655-6522-11e6-b9bf-0cc47aa992a4  ONLINE       0     0     0
    gptid/69a77ecc-3a11-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/6a1eb93e-3a11-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/6a95c0c8-3a11-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/6b0e592f-3a11-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
  raidz2-4                                        ONLINE       0     0     0
    gptid/b4559908-3a11-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/b4d3e877-3a11-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/b54c9828-3a11-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/b5c48081-3a11-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/b63e4972-3a11-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/b6c2242f-3a11-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
  raidz2-5                                        ONLINE       0     0     0
    gptid/10db35e8-3a12-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/11581ec1-3a12-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/11d67465-3a12-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/1253bce9-3a12-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/12cd01fe-3a12-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/13457226-3a12-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
  raidz2-6                                        ONLINE       0     0     0
    gptid/5046dd26-3a12-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/50c6eb44-3a12-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/5140952b-3a12-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/51ba7e22-3a12-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/523a391c-3a12-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
    gptid/52bb6399-3a12-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
  raidz2-7                                        ONLINE       0     0     0
    gptid/9b7f8b39-6534-11e6-b9bf-0cc47aa992a4    ONLINE       0     0     0
    gptid/1ca07863-7db4-11e7-86e9-0cc47aa992a4    ONLINE       0     0     0
    gptid/37ac7176-d684-11e7-86e9-0cc47aa992a4    ONLINE       0     0     0
    gptid/9fb5014d-a1d6-11e7-86e9-0cc47aa992a4    ONLINE       0     0     0
    gptid/ef0a1e85-7ce0-11e7-86e9-0cc47aa992a4    ONLINE       0     0     0
    gptid/8ca00cdd-3a12-11e6-800f-0cc47aa992a4    ONLINE       0     0     0
  raidz2-8                                        ONLINE       0     0     0
    gptid/37203df0-648d-11e6-b9bf-0cc47aa992a4    ONLINE       0     0     0
    gptid/38593972-648d-11e6-b9bf-0cc47aa992a4    ONLINE       0     0     0
    gptid/39974ea7-648d-11e6-b9bf-0cc47aa992a4    ONLINE       0     0     0
    gptid/3ad4c7b1-648d-11e6-b9bf-0cc47aa992a4    ONLINE       0     0     0
    gptid/3c0d5651-648d-11e6-b9bf-0cc47aa992a4    ONLINE       0     0     0
    gptid/3d4d3308-648d-11e6-b9bf-0cc47aa992a4    ONLINE       0     0     0
spares
  5749079118928938505                             INUSE     was /dev/gptid/66a91655-6522-11e6-b9bf-0cc47aa992a4
  gptid/56c4990c-655b-11e6-b9bf-0cc47aa992a4      AVAIL
 
Joined
Jul 3, 2015
Messages
926
This was another system during the rebuild of a hot spare.

Code:
NAME                                              STATE     READ WRITE CKSUM
tank                                              DEGRADED     0     0     0
  raidz2-0                                        ONLINE       0     0     0
    gptid/dc389c2a-589c-11e6-ad4d-0cc47aa992a6    ONLINE       0     0     0
    gptid/2d0e139b-379a-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/2d789821-379a-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/2de1b76e-379a-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/2e4febcf-379a-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/970f5bf9-4e78-11e6-b91c-0cc47aa992a6    ONLINE       0     0     0
  raidz2-1                                        ONLINE       0     0     0
    gptid/0c19c57e-379b-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/0c87bfc8-379b-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/0cf6fef0-379b-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/0d6167dc-379b-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/a3024cd5-3c50-11e6-869b-0cc47aa992a6    ONLINE       0     0     0
    gptid/0672a2e7-3c50-11e6-869b-0cc47aa992a6    ONLINE       0     0     0
  raidz2-2                                        ONLINE       0     0     0
    gptid/d3131676-379b-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/d3803ebb-379b-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/d3ed9919-379b-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/d45c61aa-379b-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/d4ca23d3-379b-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/d539b7a6-379b-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
  raidz2-3                                        DEGRADED     0     0     0
    gptid/abe7d00a-379c-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/ac5c5df3-379c-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    spare-2                                       UNAVAIL      0     0     0
      3475867383828594346                         UNAVAIL      0     0     0  was /dev/gptid/accc3ee7-379c-11e6-9ec4-0cc47aa992a6
      gptid/94e2e7e2-6538-11e6-ad4d-0cc47aa992a6  ONLINE       0     0     0
    gptid/ad3c87a2-379c-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/adae9275-379c-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/ae1d1748-379c-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
  raidz2-4                                        ONLINE       0     0     0
    gptid/95bad3c0-654b-11e6-ad4d-0cc47aa992a6    ONLINE       0     0     0
    gptid/808c72b5-379d-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/80ff645f-379d-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/8176354c-379d-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/81e459d8-379d-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/8258a48e-379d-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
  raidz2-5                                        ONLINE       0     0     0
    gptid/52c4586a-379e-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/533ef168-379e-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/53b6d730-379e-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/5430a250-379e-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/54a508c7-379e-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/551aaea1-379e-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
  raidz2-6                                        ONLINE       0     0     0
    gptid/28442a55-379f-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/28c0f2dd-379f-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/293a4e75-379f-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/29bb0b44-379f-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/2a6e3818-379f-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
    gptid/2af23948-379f-11e6-9ec4-0cc47aa992a6    ONLINE       0     0     0
  raidz2-7                                        ONLINE       0     0     0
    gptid/d72b6bc3-3c41-11e6-ad7b-0cc47aa992a6    ONLINE       0     0     0
    gptid/e5734600-5500-11e7-ad29-0cc47aa992a6    ONLINE       0     0     0
    gptid/d8fa86bf-3c41-11e6-ad7b-0cc47aa992a6    ONLINE       0     0     0
    gptid/21cb2223-7359-11e6-ad4d-0cc47aa992a6    ONLINE       0     0     0
    gptid/d9ff9fdb-3c41-11e6-ad7b-0cc47aa992a6    ONLINE       0     0     0
    gptid/db19031b-3c41-11e6-ad7b-0cc47aa992a6    ONLINE       0     0     0
  raidz2-8                                        ONLINE       0     0     0
    gptid/4c57b647-6392-11e6-ad4d-0cc47aa992a6    ONLINE       0     0     0
    gptid/4cf2942a-6392-11e6-ad4d-0cc47aa992a6    ONLINE       0     0     0
    gptid/4d93eac5-6392-11e6-ad4d-0cc47aa992a6    ONLINE       0     0     0
    gptid/915d72bc-6536-11e6-ad4d-0cc47aa992a6    ONLINE       0     0     0
    gptid/4ed633bb-6392-11e6-ad4d-0cc47aa992a6    ONLINE       0     0     0
    gptid/4f77b5c3-6392-11e6-ad4d-0cc47aa992a6    ONLINE       0     0     0
spares
  7354878102559310426                             INUSE     was /dev/gptid/94e2e7e2-6538-11e6-ad4d-0cc47aa992a6
  gptid/e23920cf-65e0-11e6-ad4d-0cc47aa992a6      AVAIL
 
Joined
Jul 3, 2015
Messages
926
So perhaps the 'faulted' wording in your case just mirrors the state of the failed drive, as mine says unavailable.

However, the errors you mention on the spare are worrying. Clearly something is wrong with the system, be it the HBA or the cabling.
 

g4m3r7ag

Dabbler
Joined
Nov 25, 2017
Messages
31
OK, the resilver completed successfully, so part of my worry is gone. So you're saying I should have one SFF-8088 connection from my HBA to the chassis, with that chassis port connected to J0 on the front backplane, then an SFF-8087 from J1 on the front backplane to J0 on the rear backplane, and keep only the one connection back to the HBA? I think I'm going to just order a new HBA and wire it up according to spec. If my faulted drive passes a long SMART test, will issuing a zpool clear resilver it back into the mirror and drop the spare back to being a spare? Here is the current zpool status and volume status after the resilver.

Code:
root@freenas:~ # zpool status
  pool: Media
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: resilvered 5.55T in 0 days 12:43:36 with 0 errors on Fri Sep 21 13:30:07 2018
config:

        NAME                                              STATE     READ WRITE CKSUM
        Media                                             DEGRADED     0     0     0
          mirror-0                                        ONLINE       0     0     0
            gptid/3e730f1a-2fb5-11e8-b89e-d4ae52805daa    ONLINE       0     0     0
            gptid/3fda3caa-2fb5-11e8-b89e-d4ae52805daa    ONLINE       0     0     0
          mirror-1                                        DEGRADED     0     0     0
            spare-0                                       FAULTED      0     0     0
              gptid/413d518d-2fb5-11e8-b89e-d4ae52805daa  FAULTED      6   230     0  too many errors
              gptid/49b536ed-2fb5-11e8-b89e-d4ae52805daa  ONLINE       0     0     0
            gptid/42a1b153-2fb5-11e8-b89e-d4ae52805daa    ONLINE       0     0     0
          mirror-2                                        ONLINE       0     0     0
            gptid/4407b94c-2fb5-11e8-b89e-d4ae52805daa    ONLINE       0     0     0
            gptid/457be9ff-2fb5-11e8-b89e-d4ae52805daa    ONLINE       0     0     0
          mirror-3                                        ONLINE       0     0     0
            gptid/46edc9ba-2fb5-11e8-b89e-d4ae52805daa    ONLINE       0     0     0
            gptid/4845c8bc-2fb5-11e8-b89e-d4ae52805daa    ONLINE       0     0     0
        spares
          13635575311792425251                            INUSE     was /dev/gptid/49b536ed-2fb5-11e8-b89e-d4ae52805daa

errors: No known data errors

  pool: VMWare
 state: ONLINE
  scan: scrub repaired 0 in 0 days 01:03:16 with 0 errors on Sun Sep 16 06:03:16 2018
config:

        NAME                                            STATE     READ WRITE CKSUM
        VMWare                                          ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/9bfcecf3-37e4-11e8-8a42-d4ae52805daa  ONLINE       0     0     0
            gptid/9d4f42cc-37e4-11e8-8a42-d4ae52805daa  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:07 with 0 errors on Fri Sep 14 03:45:07 2018
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            da0p2     ONLINE       0     0     0
            da1p2     ONLINE       0     0     0

errors: No known data errors
root@freenas:~ #
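
To pick the problem member out of a long status listing like the one above, a small filter along these lines may help. It is only a sketch: the function name `unhealthy_devices` is made up for illustration, and it assumes members are listed either as `gptid/...` labels or as bare numeric GUIDs, as they are in this thread.

```sh
#!/bin/sh
# Sketch: read `zpool status` text on stdin and print every leaf device whose
# state is not ONLINE, as: name state read write cksum.
# Pool, mirror/raidz and spare-N rows are skipped because their names match
# neither a gptid label nor a bare GUID; the INUSE/AVAIL rows in the spares
# section are skipped because their third column is not an error count.
unhealthy_devices() {
    awk '$1 ~ /^(gptid\/|[0-9]+$)/ && $2 != "ONLINE" && $3 ~ /^[0-9]/ {
        print $1, $2, $3, $4, $5
    }'
}

# Typical use on the live system (not run here):
#   zpool status Media | unhealthy_devices
#   glabel status    # FreeBSD: shows which daX node carries each gptid label
```

Fed the Media pool listing above, this should leave only the `gptid/413d518d-...` row with its 6 read and 230 write errors.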


[Attachment: freenas volume after resilver.PNG]

Here is the console output of the errors that occurred every couple of hours during the resilver:

[Attachment: freenas console errors.PNG]
 
Joined
May 10, 2017
Messages
838
"I think I'm going to just order a new HBA and wire it up according to spec."

It can be wired both ways; both are supported by Supermicro, and the way you have it now will give you more bandwidth.
 

g4m3r7ag

Dabbler
Joined
Nov 25, 2017
Messages
31
Thank you for confirming. I thought I had wired it in a supported fashion, but I was beginning to wonder with the errors. I'm going to order a new HBA so I can swap it out next week. Afterwards I will run an extended SMART test on the FAULTED drive, and if it passes, will a zpool clear resilver it back into the mirror and return my spare to being a spare?
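
A rough sketch of the sequence being asked about, for reference. It only prints the commands so they can be reviewed before anything is run. The assumptions are worth stating: `recovery_plan` is a made-up helper name, `/dev/daX` is a placeholder for the faulted disk's real device node, and the final `zpool detach` assumes the spare is still attached after the original disk has resilvered, which may not hold on every setup, so check `zpool status` before that step.

```sh
#!/bin/sh
# Sketch only: prints a candidate recovery sequence for review, runs nothing.
# Args: pool, faulted member, in-use spare, device node of the faulted disk.
recovery_plan() {
    pool=$1; faulted=$2; spare=$3; disk=$4
    cat <<EOF
# 1. Run the extended self-test, then read its log once it has finished
smartctl -t long $disk
smartctl -l selftest $disk
# 2. If the test passed, clear the fault so ZFS reopens the disk and resilvers
zpool clear $pool $faulted
# 3. Wait until this reports the resilver finished with 0 errors
zpool status $pool
# 4. Detach the spare so it goes back to the spares list
zpool detach $pool $spare
EOF
}

# Labels are taken from the zpool status output earlier in the thread.
# /dev/daX is a placeholder; "glabel status" shows the real node for a gptid.
recovery_plan Media \
    gptid/413d518d-2fb5-11e8-b89e-d4ae52805daa \
    gptid/49b536ed-2fb5-11e8-b89e-d4ae52805daa \
    /dev/daX
```

Printing rather than executing is deliberate: with suspect HBA hardware still in the loop, each step is better run by hand with a look at `zpool status` in between.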
 