zpool faulted after power failure

antipop

Dabbler
Joined
Jan 16, 2013
Messages
21
Hi all,

My FreeNAS server has been running for years without any issues, but overnight the PSU died and now I'm having problems with the zpool.

Doing a zpool import gives me this
Code:
   pool: gDisk
	 id: 4321208912538017444
  state: FAULTED
 status: The pool metadata is corrupted.
 action: The pool cannot be imported due to damaged devices or data.
		The pool may be active on another system, but can be imported using
		the '-f' flag.
   see: http://illumos.org/msg/ZFS-8000-72
 config:

		gDisk										   FAULTED  corrupted data
		  raidz1-0									  ONLINE
			gptid/db835a9b-665a-11e2-b37c-a0b3cce25a83  ONLINE
			gptid/dc882ef3-665a-11e2-b37c-a0b3cce25a83  ONLINE
			gptid/3a5b5d0b-943e-11e4-8c0e-a01d48c76648  ONLINE
			gptid/bd2204b8-3024-11e4-beb2-6805ca1cb42a  ONLINE


Doing a zpool import -F gDisk
Code:
cannot import 'gDisk': I/O error
		Destroy and re-create the pool from
		a backup source.


However, once I do this I get the following messages, with the same four vdev GUIDs repeating over and over:

Code:
Nov 17 09:43:59 freenas ZFS: vdev state changed, pool_guid=4321208912538017444 vdev_guid=13109049489029127203
Nov 17 09:43:59 freenas ZFS: vdev state changed, pool_guid=4321208912538017444 vdev_guid=4774203770015519164
Nov 17 09:43:59 freenas ZFS: vdev state changed, pool_guid=4321208912538017444 vdev_guid=9019238602065831635
Nov 17 09:43:59 freenas ZFS: vdev state changed, pool_guid=4321208912538017444 vdev_guid=11673891713223961018
Nov 17 09:43:59 freenas ZFS: vdev state changed, pool_guid=4321208912538017444 vdev_guid=13109049489029127203
Nov 17 09:43:59 freenas ZFS: vdev state changed, pool_guid=4321208912538017444 vdev_guid=4774203770015519164
Nov 17 09:43:59 freenas ZFS: vdev state changed, pool_guid=4321208912538017444 vdev_guid=9019238602065831635

Some info on the system: it's an HP MicroServer Gen8 running FreeNAS 11.2 (upgraded too early) with 4 disks in RAID-Z1. I have a lot of data on it, and anything I can recover would already be great. I'm conscious it may all be gone now.
 

antipop

Dabbler
Joined
Jan 16, 2013
Messages
21
Just to mention, running smartctl doesn't give any errors on any of the disks. I believe they are all in good condition and they are not the issue.
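For reference, this is roughly what I ran on each disk (ada0 through ada3 on my box; the exact attribute names can vary by vendor):
Code:
smartctl -H /dev/ada0    # overall health self-assessment
smartctl -A /dev/ada0 | egrep "Reallocated_Sector|Current_Pending|Offline_Uncorrectable|UDMA_CRC"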

The hardware all seems fine, so I'm hoping the issue is with ZFS itself.
 

Platter7

Dabbler
Joined
Sep 22, 2018
Messages
35
Do you have a backup / replicated snapshot? Also interested in the hardware you used: what kind of PSU, and did you use ECC RAM?
I reckon it's a RAID-Z1 pool?
 

antipop

Dabbler
Joined
Jan 16, 2013
Messages
21
Do you have a backup / replicated snapshot? Also interested in the hardware you used: what kind of PSU, and did you use ECC RAM?
I reckon it's a RAID-Z1 pool?
Unfortunately no... I'm only now learning about snapshots :(

The PSU is a microPSU, but the part that failed was the Leicke power brick. I bought a new one and the server started just like before. I've been using them 24/7 since 2015 with no issues until now.

My RAM is this one, and it does have ECC:
http://uk.crucial.com/gbr/en/proliant-microserver-gen8/CT7168757

It is a RAID-Z1 pool indeed.
 

antipop

Dabbler
Joined
Jan 16, 2013
Messages
21
Anyone?

I've looked at a few things, but I'm afraid of losing data by trying them. Any help would be greatly appreciated.

Additional information: doing a zdb -l /dev/ada0 says it can't find any label, but I get all the labels when I do zdb -l /dev/ada0p2. I'm not sure what to make of that.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Additional information: doing a zdb -l /dev/ada0 says it can't find any label, but I get all the labels when I do zdb -l /dev/ada0p2. I'm not sure what to make of that.

FreeNAS actually partitions the drive for alignment and swap reasons, so it's expected to have no ZFS label on the raw disk itself - ada0p2 will be where it's at by design.
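You can see that layout with gpart show; on a typical FreeNAS system it looks something like this (the numbers below are illustrative, not from your machine):
Code:
# gpart show ada0
=>        40  7814037088  ada0  GPT  (3.6T)
          40          88        - free -  (44K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  7809842696     2  freebsd-zfs  (3.6T)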

Can you paste the output of the zdb -l /dev/ada0p2 in [.CODE][./CODE] tags?

You may be able to roll back to an earlier transaction but a pool with damaged metadata is a nasty state to be in. :( You're probably looking at some manner of data loss.

Code:
zpool clear -F gDisk
zpool import gDisk      # this one will probably kick out a warning about "will result in some data loss"
zpool import -F gDisk


Edited, think I mixed up implementations
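If those don't work, two variations may be worth knowing: -n turns the rewind into a dry run (it just reports whether a usable txg can be found, changing nothing), and a read-only import avoids writing anything at all to the pool:
Code:
zpool import -F -n gDisk                      # dry run: report whether the rewind could succeed
zpool import -o readonly=on -f -R /mnt gDisk  # read-only import; nothing gets written to the pool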
 
Last edited:

antipop

Dabbler
Joined
Jan 16, 2013
Messages
21
Can you paste the output of the zdb -l /dev/ada0p2 in [.CODE][./CODE] tags?

Code:
------------------------------------
LABEL 0
------------------------------------
	version: 5000
	name: 'gDisk'
	state: 0
	txg: 34596237
	pool_guid: 4321208912538017444
	hostid: 2970101908
	hostname: 'freenas.local'
	top_guid: 16522616241267246162
	guid: 11673891713223961018
	vdev_children: 1
	vdev_tree:
		type: 'raidz'
		id: 0
		guid: 16522616241267246162
		nparity: 1
		metaslab_array: 31
		metaslab_shift: 36
		ashift: 12
		asize: 11993762234368
		is_log: 0
		create_txg: 4
		children[0]:
			type: 'disk'
			id: 0
			guid: 11673891713223961018
			path: '/dev/gptid/db835a9b-665a-11e2-b37c-a0b3cce25a83'
			phys_path: '/dev/gptid/db835a9b-665a-11e2-b37c-a0b3cce25a83'
			whole_disk: 1
			DTL: 487
			create_txg: 4
		children[1]:
			type: 'disk'
			id: 1
			guid: 13109049489029127203
			path: '/dev/gptid/dc882ef3-665a-11e2-b37c-a0b3cce25a83'
			phys_path: '/dev/gptid/dc882ef3-665a-11e2-b37c-a0b3cce25a83'
			whole_disk: 1
			DTL: 486
			create_txg: 4
		children[2]:
			type: 'disk'
			id: 2
			guid: 4774203770015519164
			path: '/dev/gptid/3a5b5d0b-943e-11e4-8c0e-a01d48c76648'
			whole_disk: 1
			DTL: 485
			create_txg: 4
		children[3]:
			type: 'disk'
			id: 3
			guid: 9019238602065831635
			path: '/dev/gptid/bd2204b8-3024-11e4-beb2-6805ca1cb42a'
			whole_disk: 1
			DTL: 484
			create_txg: 4
	features_for_read:
		com.delphix:hole_birth
		com.delphix:embedded_data
------------------------------------
LABEL 1
------------------------------------
	version: 5000
	name: 'gDisk'
	state: 0
	txg: 34596237
	pool_guid: 4321208912538017444
	hostid: 2970101908
	hostname: 'freenas.local'
	top_guid: 16522616241267246162
	guid: 11673891713223961018
	vdev_children: 1
	vdev_tree:
		type: 'raidz'
		id: 0
		guid: 16522616241267246162
		nparity: 1
		metaslab_array: 31
		metaslab_shift: 36
		ashift: 12
		asize: 11993762234368
		is_log: 0
		create_txg: 4
		children[0]:
			type: 'disk'
			id: 0
			guid: 11673891713223961018
			path: '/dev/gptid/db835a9b-665a-11e2-b37c-a0b3cce25a83'
			phys_path: '/dev/gptid/db835a9b-665a-11e2-b37c-a0b3cce25a83'
			whole_disk: 1
			DTL: 487
			create_txg: 4
		children[1]:
			type: 'disk'
			id: 1
			guid: 13109049489029127203
			path: '/dev/gptid/dc882ef3-665a-11e2-b37c-a0b3cce25a83'
			phys_path: '/dev/gptid/dc882ef3-665a-11e2-b37c-a0b3cce25a83'
			whole_disk: 1
			DTL: 486
			create_txg: 4
		children[2]:
			type: 'disk'
			id: 2
			guid: 4774203770015519164
			path: '/dev/gptid/3a5b5d0b-943e-11e4-8c0e-a01d48c76648'
			whole_disk: 1
			DTL: 485
			create_txg: 4
		children[3]:
			type: 'disk'
			id: 3
			guid: 9019238602065831635
			path: '/dev/gptid/bd2204b8-3024-11e4-beb2-6805ca1cb42a'
			whole_disk: 1
			DTL: 484
			create_txg: 4
	features_for_read:
		com.delphix:hole_birth
		com.delphix:embedded_data
------------------------------------
LABEL 2
------------------------------------
	version: 5000
	name: 'gDisk'
	state: 0
	txg: 34596237
	pool_guid: 4321208912538017444
	hostid: 2970101908
	hostname: 'freenas.local'
	top_guid: 16522616241267246162
	guid: 11673891713223961018
	vdev_children: 1
	vdev_tree:
		type: 'raidz'
		id: 0
		guid: 16522616241267246162
		nparity: 1
		metaslab_array: 31
		metaslab_shift: 36
		ashift: 12
		asize: 11993762234368
		is_log: 0
		create_txg: 4
		children[0]:
			type: 'disk'
			id: 0
			guid: 11673891713223961018
			path: '/dev/gptid/db835a9b-665a-11e2-b37c-a0b3cce25a83'
			phys_path: '/dev/gptid/db835a9b-665a-11e2-b37c-a0b3cce25a83'
			whole_disk: 1
			DTL: 487
			create_txg: 4
		children[1]:
			type: 'disk'
			id: 1
			guid: 13109049489029127203
			path: '/dev/gptid/dc882ef3-665a-11e2-b37c-a0b3cce25a83'
			phys_path: '/dev/gptid/dc882ef3-665a-11e2-b37c-a0b3cce25a83'
			whole_disk: 1
			DTL: 486
			create_txg: 4
		children[2]:
			type: 'disk'
			id: 2
			guid: 4774203770015519164
			path: '/dev/gptid/3a5b5d0b-943e-11e4-8c0e-a01d48c76648'
			whole_disk: 1
			DTL: 485
			create_txg: 4
		children[3]:
			type: 'disk'
			id: 3
			guid: 9019238602065831635
			path: '/dev/gptid/bd2204b8-3024-11e4-beb2-6805ca1cb42a'
			whole_disk: 1
			DTL: 484
			create_txg: 4
	features_for_read:
		com.delphix:hole_birth
		com.delphix:embedded_data
------------------------------------
LABEL 3
------------------------------------
	version: 5000
	name: 'gDisk'
	state: 0
	txg: 34596237
	pool_guid: 4321208912538017444
	hostid: 2970101908
	hostname: 'freenas.local'
	top_guid: 16522616241267246162
	guid: 11673891713223961018
	vdev_children: 1
	vdev_tree:
		type: 'raidz'
		id: 0
		guid: 16522616241267246162
		nparity: 1
		metaslab_array: 31
		metaslab_shift: 36
		ashift: 12
		asize: 11993762234368
		is_log: 0
		create_txg: 4
		children[0]:
			type: 'disk'
			id: 0
			guid: 11673891713223961018
			path: '/dev/gptid/db835a9b-665a-11e2-b37c-a0b3cce25a83'
			phys_path: '/dev/gptid/db835a9b-665a-11e2-b37c-a0b3cce25a83'
			whole_disk: 1
			DTL: 487
			create_txg: 4
		children[1]:
			type: 'disk'
			id: 1
			guid: 13109049489029127203
			path: '/dev/gptid/dc882ef3-665a-11e2-b37c-a0b3cce25a83'
			phys_path: '/dev/gptid/dc882ef3-665a-11e2-b37c-a0b3cce25a83'
			whole_disk: 1
			DTL: 486
			create_txg: 4
		children[2]:
			type: 'disk'
			id: 2
			guid: 4774203770015519164
			path: '/dev/gptid/3a5b5d0b-943e-11e4-8c0e-a01d48c76648'
			whole_disk: 1
			DTL: 485
			create_txg: 4
		children[3]:
			type: 'disk'
			id: 3
			guid: 9019238602065831635
			path: '/dev/gptid/bd2204b8-3024-11e4-beb2-6805ca1cb42a'
			whole_disk: 1
			DTL: 484
			create_txg: 4
	features_for_read:
		com.delphix:hole_birth
		com.delphix:embedded_data
 

antipop

Dabbler
Joined
Jan 16, 2013
Messages
21
Code:
zpool clear -F gDisk
zpool import gDisk      # this one will probably kick out a warning about "will result in some data loss"
zpool import -F gDisk


Edited, think I mixed up implementations

Thank you for your help. As long as I can recover something I'm happy.

Running those commands gives me the same errors as before:
Code:
[root@freenas ~]# zpool clear -F gDisk
cannot open 'gDisk': no such pool
[root@freenas ~]# zpool import gDisk
cannot import 'gDisk': I/O error
		Destroy and re-create the pool from
		a backup source.
[root@freenas ~]# zpool import -F gDisk
cannot import 'gDisk': I/O error
		Destroy and re-create the pool from
		a backup source.
 

antipop

Dabbler
Joined
Jan 16, 2013
Messages
21
One thing I've noticed is that the drives are not associated with the pool.

Could that explain the issue, and can I reattach them to the pool?
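For what it's worth, I matched the gptid labels from the zdb output back to device nodes like this (FreeBSD's glabel; names obviously from my box):
Code:
glabel status | grep gptid   # maps gptid/... labels to the underlying adaXp2 devices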
 

Attachments

  • Screenshot 2018-11-24 at 08.44.34.png (1.1 MB)

styno

Patron
Joined
Apr 11, 2016
Messages
466

Evi Vanoost

Explorer
Joined
Aug 4, 2016
Messages
91
Not sure if you're still looking for a solution, but given the vdev state changed notifications, your controller (SATA/SAS?) may actually be connecting/disconnecting repeatedly, or there may be some other cabling or communication issue between the drives and the host. The power surge could have blown all your drives at once (if they're the same brand, a brownout on the 12V or 5V line may have damaged the electronics), or your power supply could simply have started providing unstable power after the outage.

Never trust SMART status to tell you much of anything valuable. If you know the SMART system was functioning properly beforehand, you could look at the stats and see whether the error counts are increasing, but I've rarely seen SMART report a proper error before or after a failure.

My first suggestion would be to attempt to import the disks in a completely different system. You can even import using 4 external hard drive enclosures on a laptop if you need to.

How often do you scrub the pool? Is it possible you never did, and there are actually multiple undetected issues across multiple disks? If the system never reads the files, ZFS won't detect an error and can't heal it. With RAIDZ2 you may be able to get away with that, but I just don't trust hardware enough. Given that modern desktop drives (10TB) are rated to spit out at least one unrecoverable error for every ~10TB read, you're statistically bound to hit an issue eventually.
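Once a pool is imported, running and checking a scrub is just the standard pair of commands (pool name taken from your earlier posts):
Code:
zpool scrub gDisk       # read and verify every allocated block, repairing from parity where possible
zpool status -v gDisk   # shows scrub progress, per-device error counts, and any damaged files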

RAIDZ or ZFS is no guarantee against hardware failure. Restore from a backup if it comes to that. Also, it would help to know a bit more about the hardware, more context from the logs, and the full SMART output to make a proper diagnosis.
 

antipop

Dabbler
Joined
Jan 16, 2013
Messages
21
Not sure if you're still looking for a solution, but given the vdev state changed notifications, your controller (SATA/SAS?) may actually be connecting/disconnecting repeatedly, or there may be some other cabling or communication issue between the drives and the host.

My first suggestion would be to attempt to import the disks in a completely different system. You can even import using 4 external hard drive enclosures on a laptop if you need to.
Interesting. Can you tell me more about how to import the drives? Should I just boot into FreeNAS from a laptop?

I could buy two of these to connect the drives to the laptop:
https://www.amazon.co.uk/TeckNet-Do...qid=1546965861&sr=8-5&keywords=hard+disk+dock
 

Evi Vanoost

Explorer
Joined
Aug 4, 2016
Messages
91
Hi,

You can boot FreeNAS in a VM on a laptop if that's what you have. Hook the USB drives up to the VM and then run zpool import.
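With no arguments, zpool import scans the default device directory and lists anything importable; if the USB enclosures show up under unexpected names, you can point it at a device directory explicitly (a sketch; the paths will vary on your system):
Code:
zpool import                 # scan for importable pools and list what's found
zpool import -d /dev gDisk   # search a specific device directory for the pool's members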
 

antipop

Dabbler
Joined
Jan 16, 2013
Messages
21
I've purchased a dock to be able to connect the drives to the laptop. I'll report back once I've tried it.
 

pro lamer

Guru
Joined
Feb 16, 2018
Messages
626
just boot into FreeNAS from a laptop?
You can do that. Just make sure you don't run the wizard - IIRC it could easily erase the disks' partition tables or something.

Sent from my phone
 

antipop

Dabbler
Joined
Jan 16, 2013
Messages
21
So I've installed FreeNAS on the laptop in a VM and I still get the same errors as before, so it seems it's not the NAS hardware.
 

antipop

Dabbler
Joined
Jan 16, 2013
Messages
21
I've been trying commands with zdb but I can't seem to get anywhere. When I check the labels they're all there, but whenever I try to read anything deeper in the pool I get the I/O error, and I still don't know where it comes from.
How can I rule out the disks?
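The only thing I can think of is reading every disk end to end and watching for errors; something like this (read-only, but it will take hours per disk):
Code:
dd if=/dev/ada0 of=/dev/null bs=1m conv=noerror   # read every sector; errors land in dmesg
smartctl -t long /dev/ada0                        # or kick off a long SMART self-test instead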
 