ZFS pool RW import causes reboot

Status
Not open for further replies.

Cyderic

Cadet
Joined
Oct 30, 2016
Messages
3
Hi everyone,

I know there are already many threads about reboots on pool import, but my situation is a bit different.

First of all my system:
Build: FreeNAS-9.10.1-U2 (f045a8b)
Platform: AMD Athlon(tm) II X3 450 Processor
Memory: 8124MB (non-ECC)
HDD: 8x 4TB RED WD40EFRX

Short story:
I'm not able to import my ZFS pool anymore since I started a scrub that caused a system reboot (please see the pre-story below for why I'm trying to import).
Every time I try to import the pool, hundreds of messages scroll across the screen - partly overlapping each other (it looks very weird) - and finally the screen goes blank and the system reboots.
The system comes back up again, but without the pool.
I also tried to upload my last config backup to a freshly installed FreeNAS. After the config-import reboot, the system hangs in a "kdb panic" state.

What I can do is import my pool read-only. Everything is fine then.
zpool import -f -o readonly=on pool01
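For reference, this is roughly what I do to get at the data read-only. A minimal sketch only; the -R altroot and the paths are just examples, not my exact layout:

Code:
# import read-only under an alternate root so nothing collides with the fresh install
zpool import -f -o readonly=on -R /mnt pool01
# browse the datasets and files to confirm they are reachable
zfs list -r pool01
ls /mnt/pool01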

What can I do?


Here is some more info:
Code:
[root@freenas ~]# zpool import
   pool: pool01
     id: 12316858264337946
  state: ONLINE
 status: The pool was last accessed by another system.
 action: The pool can be imported using its name or numeric identifier and
         the '-f' flag.
    see: http://illumos.org/msg/ZFS-8000-EY
 config:

        pool01                                          ONLINE
          raidz1-0                                      ONLINE
            gptid/a4340e86-fe3e-11e4-86b6-3085a93c9a92  ONLINE
            gptid/50c291bd-095f-11e5-bd5a-3085a93c9a92  ONLINE
            gptid/36980e9b-279c-11e5-96b6-3085a93c9a92  ONLINE
            gptid/e4d99054-04a3-11e6-89a9-3085a93c9a92  ONLINE
          raidz1-1                                      ONLINE
            gptid/985ad909-9e3c-11e6-8250-3085a93c9a92  ONLINE
            gptid/f75484c6-9dab-11e6-a69a-3085a93c9a92  ONLINE
            gptid/cde8b704-9d40-11e6-819e-3085a93c9a92  ONLINE
            gptid/8fac6843-9c65-11e6-97c1-3085a93c9a92  ONLINE
[root@freenas ~]#


Code:
[root@freenas ~]# zpool import -f -o readonly=on pool01 mnt
[root@freenas ~]# zpool status mnt
  pool: mnt
 state: ONLINE
  scan: scrub in progress since Sun Oct 30 03:04:09 2016
        9.04G scanned out of 13.6T at 1/s, (scan is slow, no estimated time)
        0 repaired, 0.06% done
config:

        NAME                                            STATE     READ WRITE CKSUM
        mnt                                             ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/a4340e86-fe3e-11e4-86b6-3085a93c9a92  ONLINE       0     0     0
            gptid/50c291bd-095f-11e5-bd5a-3085a93c9a92  ONLINE       0     0     0
            gptid/36980e9b-279c-11e5-96b6-3085a93c9a92  ONLINE       0     0     0
            gptid/e4d99054-04a3-11e6-89a9-3085a93c9a92  ONLINE       0     0     0
          raidz1-1                                      ONLINE       0     0     0
            gptid/985ad909-9e3c-11e6-8250-3085a93c9a92  ONLINE       0     0     0
            gptid/f75484c6-9dab-11e6-a69a-3085a93c9a92  ONLINE       0     0     0
            gptid/cde8b704-9d40-11e6-819e-3085a93c9a92  ONLINE       0     0     0
            gptid/8fac6843-9c65-11e6-97c1-3085a93c9a92  ONLINE       0     0     0

errors: No known data errors
[root@freenas ~]#

I can list my files under the mount point in this state, so I would say my pool is not corrupted or destroyed.

Pre Story so far:
This system ran smoothly for years with 8x 2TB Green HDDs. I also had some drive failures in the past, but ZFS handled them after I replaced the failed disks.

Over the last few weeks I started to expand this pool by replacing one drive after another with a 4TB RED.

Today I replaced the last one successfully and the resilver process also finished without errors.

After that I was able to expand the pool and everything was fine; access was possible.


And then... I obviously did something bad: still on FreeNAS 9.3, I started the scrub. After a few minutes I realised that the system was rebooting. However, it didn't come back, because it hung every time with a "kdb panic" error.

So I decided to try a fresh installation of the latest FreeNAS. With the freshly installed system I tried to import the volume, but... (see the short story above).
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
  1. Add more RAM, if possible.
  2. Did you burn in the new disks?
  3. Are you sure your PSU can handle the increased load?
  4. Post the output of smartctl -a for each disk (in CODE tags); a quick way to gather that is sketched below.
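For point 4, something along these lines should collect it on FreeBSD (the adaX device names are only an assumption; use whatever camcontrol devlist shows for your disks):

Code:
# list the attached disks first
camcontrol devlist
# then dump full SMART data for each one (device names are examples)
for d in ada0 ada1 ada2 ada3 ada4 ada5 ada6 ada7; do
    smartctl -a /dev/$d
done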
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
It looks a lot like the pool is hosed.
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
Run a memory test on the hardware, just to rule that out.

After that, transfer all your data out of that pool and re-create it from scratch.
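Since the pool still imports read-only, replication off it should still work. A rough sketch only; 'pool01/data', the snapshot name, and the 'backup' target pool are all placeholders for whatever you actually have:

Code:
# a read-only pool can't take new snapshots, so look for an existing one to send;
# if there is none, fall back to rsync/cp from the mounted filesystems
zfs list -t snapshot -r pool01
# replicate the dataset (and its children) to another pool
zfs send -R pool01/data@snap | zfs receive -F backup/data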
 

Cyderic

Cadet
Joined
Oct 30, 2016
Messages
3
  1. Add more RAM, if possible.
  2. Did you burn in the new disks?
  3. Are you sure your PSU can handle the increased load?
  4. Post the output of smartctl -a for each disk (in CODE tags).
Thanks for this reply. Really appreciate that one.


Today I swapped the RAM for 3x 8GB ECC RAM. Nothing changed.

No, I didn't burn in the disks.

Yes, the PSU can handle the load.

SMART doesn't show any errors.


Guys, I don't understand this. I can mount it RO but not writable anymore. This is weird.
Sorry, but ZFS is designed to be reliable and robust and shouldn't (at least IMHO) crash because of a random reboot.

Why should I even use ZFS when such a dramatic thing can happen? Don't get me wrong... I know that nothing can "replace" a backup. This is totally clear.

However, in this situation nothing really bad happened, like electrical issues, random drive failures or anything like that. It was just a "random restart". Maybe caused by a RAM issue, but I really can't believe that this can destroy the whole pool. One of the good things about ZFS is the fast "rebuild" time compared to classical RAID solutions. But now I think this is a joke. In my case I would have to transfer 10TB out of a backup system, which will take much longer than a rebuild - even in enterprise environments.

It is just sooooo disappointing to me. I don't know why I should continue recommending ZFS to my business customers.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Sorry, but ZFS is designed to be reliable and robust and shouldn't (at least IMHO) crash because of a random reboot.
Your timeline is wrong. What probably happened is serious corruption; when ZFS comes across it, it panics the kernel because there's nothing safe it can do.
It was just a "random restart".
Again, probably due to a kernel panic.
Maybe caused by a RAM issue, but I really can't believe that this can destroy the whole pool.
You'd better believe it.
One of the good things about ZFS is the fast "rebuild" time compared to classical RAID solutions. But now I think this is a joke.
None of that has anything to do with the current problem.
In my case I would have to transfer 10TB out of a backup system, which will take much longer than a rebuild - even in enterprise environments.
If it bothers you, use ECC RAM on recommended hardware and don't use RAIDZ1.
It is just sooooo disappointing to me. I don't know why I should continue recommending ZFS to my business customers.
Again, you did not follow the recommendations, so it's rather unfair to blame ZFS. If you ruin your car's engine by running it with some slop made out of whatever used oils you could find, are you going to go around telling people your car sucks, despite not having used proper fuels?
 

Cyderic

Cadet
Joined
Oct 30, 2016
Messages
3
Thanks for your response, Ericloewe.

Your timeline is wrong. What probably happened is serious corruption; when ZFS comes across it, it panics the kernel because there's nothing safe it can do.
Well, I thought ZFS was designed to detect that and fix it. Why else do I run a scrub every second week?
It just feels so weak to me at the moment, sorry.

If it bothers you, use ECC RAM on recommended hardware and don't use RAIDZ1.

OK, what would you recommend for an 8x 4TB system?

Since I don't have any options left, I will start from scratch again.
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
I agree; it doesn't seem that this should happen.

However, if you assume corruption in the spacemap in RAM, which then gets to disk as 'valid' metadata, it does make sense that very bad things will happen. My understanding is that readonly mode doesn't access the spacemaps, because they are only needed to find free space for writing.

Some explanations from ZFS developers would be very useful to understand what might be happening and if there are any ways to reduce it or detect it earlier. For instance, an invalid spacemap could just be considered permanently 100% full and the pool continues operating normally.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Well, I thought ZFS was designed to detect that and fix it.
Yes, but ZFS is not magic.
However, if you assume corruption in the spacemap in RAM, which then gets to disk as 'valid' metadata, it does make sense that very bad things will happen. My understanding is that readonly mode doesn't access the spacemaps, because they are only needed to find free space for writing.
Interesting theory that explains what might have happened.
For instance, an invalid spacemap could just be considered permanently 100% full and the pool continues operating normally.
"Operating normally" would be "read only", as literally no changes could be made. Not too dissimilar to the current situation. Might actually be exactly what's happening - vdev write fails, kernel panics.
Some explanations from ZFS developers would be very useful to understand what might be happening
Yeah, it would be fascinating.

OK, what would you recommend for an 8x 4TB system?
RAIDZ2, it's much safer.
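For eight disks that means a single 8-wide RAIDZ2 vdev. In FreeNAS you would build it through the volume manager in the GUI; on the command line it would look roughly like this (the 'tank' pool name and daX device names are placeholders):

Code:
# one 8-wide RAIDZ2 vdev: any two disks can fail without losing the pool
zpool create tank raidz2 da0 da1 da2 da3 da4 da5 da6 da7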
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
ZFS is very, very good, but it isn't magic. We don't know what happened to your pool, whether it was a hardware problem, a bug in ZFS, or some other layer of the FreeNAS software; we can only speculate. However, we do know that you can still access your data, despite whatever disaster happened, by importing the pool read-only. That's a pretty impressive outcome in my book.
SMART doesn't show any errors
I'm still curious to see those smartctl outputs if you feel like posting them.

EDIT: we also know that, at some point, you were running a 32TB pool on 8GB of non-ECC RAM.
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Guys, I don't understand this. I can mount it RO but not writable anymore. This is weird.
Sorry, but ZFS is designed to be reliable and robust and shouldn't (at least IMHO) crash because of a random reboot.
If you can mount RO, but not RW (which causes a crash), then my guess is you have corrupt pool metadata. Maybe it's from non-error-correcting RAM, maybe it's from a pool version upgrade in the middle of a scrub. I don't know what caused it, but that's what it smells like to me.
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
"Operating normally" would be "read only", as literally no changes could be made. Not too dissimilar to the current situation. Might actually be exactly what's happening - vdev write fails, kernel panics.

I think it would be read-write. A pool is usually broken up into about 200 metaslabs per vdev, each with a separate spacemap. So, if the current spacemap is bad, just stop using that spacemap for writes; 99.5% of the spacemaps would still be available.

This presumes that a 'bad' spacemap can even be detected.
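If anyone wants to look at the metaslabs and space maps being discussed here, zdb can dump them from a pool. A sketch only, with 'pool01' as the example name, and expect it to be slow on a big pool:

Code:
# -m prints each top-level vdev's metaslabs and their allocation summary;
# -mm additionally dumps the individual space map entries
zdb -m pool01
zdb -mm pool01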
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
This presumes that a 'bad' spacemap can even be detected.
Well, finding a block where there should be none should handle that.
I think it would be read-write. A pool is usually broken up into about 200 metaslabs per vdev, each with a separate spacemap. So, if the current spacemap is bad, just stop using that spacemap for writes; 99.5% of the spacemaps would still be available.
Possibly, but I'd probably stop trusting all spacemaps, since something is very wrong.
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
Well, finding a block where there should be none should handle that.
That is actually harder to determine than you would think. The spacemap is the only record, and the only way to know whether you have a valid block (vs. an abandoned block) is to walk the entire metadata tree looking for a reference. Think about those multi-hour dedup pool imports for an idea of what that means.
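zdb's block traversal is essentially that walk: it follows every block pointer in the pool and checks the result against the space accounting, which gives a feel for how expensive such detection would be. A sketch only, with 'pool01' as the example name; it can run for hours on a pool this size:

Code:
# traverse all block pointers and verify the space accounting (reports leaked blocks)
zdb -b pool01
# -bb adds per-object-type block statistics to the report
zdb -bb pool01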

Ericloewe said:
Possibly, but I'd probably stop trusting all spacemaps, since something is very wrong.
Maybe; that is where some ZFS developer analysis would be nice. If it is survivable, and the user can be warned after the first occurrence, it might improve reliability. It is a fair assumption that for every person who manages to recover their pool, there are a couple more who have no idea what has happened.
 