FreeNAS hangs when zpool goes unavailable

Status
Not open for further replies.

rodfantana

Dabbler
Joined
Jun 10, 2017
Messages
27
I have a test FreeNAS setup on a VM. FreeNAS is installed on an 8 GB vmdk. An LSI 9207-8i is presented to the VM in passthrough mode, which makes 4x500 GB disks available for a raidz1 zpool. All worked as expected until I started testing drive failure scenarios.

1. When I pull one of the 500 GB drives out, everything continues to work.
2. As soon as I pull a second 500 GB drive out, the VM hangs: I can still run commands from an established SSH session, but I can't establish new ones or get anywhere in the web UI. Logins to the web UI also fail at this point. I also tried writing a file via 'touch /var/log/blah', which never returned the prompt, so that failed too.

I thought it was a swap striping issue, so I ran 'swapoff' on all of the swap partitions on the zpool disks and pulled them again. Same result.
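For reference, this is roughly what I did to check and disable the swap on the pool disks; the device names below are just examples from my setup, yours will differ:

# show the swap devices currently in use (FreeBSD)
swapinfo -h

# turn off the encrypted swap partitions FreeNAS puts on each pool disk
swapoff /dev/da1p1.eli
swapoff /dev/da2p1.eli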

I understand that I can't run a 4-drive raidz1 with only 2 drives, but I would like to know why the FreeNAS OS is affected when it lives on a completely separate drive.

Thanks in advance.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
@m0nkey_, @Ericloewe and I are all in the Mumble server. Eric thinks it's because your system dataset is also coming offline (it's presumably on the pool that you killed). That's probably the issue. You ain't going anywhere if the system dataset goes offline :)
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The .system dataset gets automagically moved to the first pool set up by the admin, to avoid wearing out USB flash drives. It can be moved back to the boot pool, but you should only do that if you have a reliable boot pool (a single SSD at the very minimum, mirrored SSDs recommended).
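If you want to confirm where it currently lives, something along these lines from a shell should show it (the pool name on the left of the output is whichever pool holds it); in the GUI it's under System -> System Dataset:

# list datasets and pick out the .system tree
zfs list -o name,used,mountpoint | grep "\.system"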
 

rodfantana

Dabbler
Joined
Jun 10, 2017
Messages
27
OK, thanks for the quick reply. That makes sense! The datastore that the boot vmdk sits on is a single spindle, so not super reliable, but enough for these purposes. I don't anticipate adding more spindles to this pool, i.e. it will most likely remain a 4-drive raidz1 and the only space available to this FreeNAS instance. The 4x500 GB drives will be swapped out for 4x6 TB drives, but the rest will stay the same. So running with two drives missing is really not a scenario I need to survive. Do you see any value in relocating .system back to the 8 GB vmdk in this case?
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
Well, your question was, "when the main system pool goes down, I lose my system," and I believe we've answered why, or at least offered a compelling explanation.

Most of us have our .system dataset on the main pool, and we see to it that the main pool is not in danger of going down :) So for us, your "test" case here is a little fanciful.

I think in general we would not recommend putting the system dataset back on the boot device. But once you are virtualizing things, you're really on your own. There are too many variables and too many different levels of skill out there for us to give very specific support for virtualized scenarios. Just be careful.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I think in general we would not recommend putting the system dataset back on the boot device.
I can see pros and cons of that. On the one hand, presumably the .vmdk that's being used for the FreeNAS boot volume is on a device that can tolerate a reasonable amount of I/O; it's not likely to fall over and die in short order the way a USB stick can (which, I expect, is the reason the .system dataset is moved to the pool as soon as a pool is created). OTOH, it's only 8 GB, and .system can take up a bit of space. On the gripping hand, if .system is on your boot device and your boot device dies, there go your config backups.
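If you want to gauge whether it would fit on an 8 GB boot volume, it's easy to check how much space .system actually uses (the pool name here is just a placeholder):

# show space used by the .system dataset and its children
# ("tank" is an example pool name; substitute yours)
zfs list -r -o name,used,avail tank/.system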
 

toadman

Guru
Joined
Jun 4, 2013
Messages
619
Yep, just mirror a couple .vmdks as the boot devices and put the system dataset there if you like. If you want more redundancy, put the .vmdks on separate datastores.
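Roughly, the attach looks like this from the shell; the boot pool is named freenas-boot on a stock install, but the device and partition names below are only examples, so check zpool status for yours (the GUI can also do this from the boot pool status page by attaching a new device):

# current layout of the boot pool
zpool status freenas-boot

# attach the second vmdk's ZFS partition to the existing one,
# converting the single-disk vdev into a mirror
# (the new disk needs a matching partition layout first)
zpool attach freenas-boot ada0p2 ada1p2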
 