ESXi Datastore offline

Status
Not open for further replies.

Swarfega

Cadet
Joined
Aug 11, 2011
Messages
7
Hi all,

My test environment is having issues. My FreeNAS is a virtual machine used to present storage to two ESXi hosts. My son managed to power off my host, so down went the FreeNAS install. I have two iSCSI shares. One is 500GB, the other is about 100GB. The 500GB share is offline. The storage is now showing two paths, one dead and the other alive.
This happened the other day (when I upgraded to 9.1.1 on another FreeNAS install), but at the time there were no virtual machines on the datastore, so I just recreated the iSCSI targets, which seemed to work.
Whereabouts are the logs stored for the iSCSI targets?

I can restore from backup, but the backups are a few weeks old and, liking a challenge, I would like to know what the cause is :)

This is a straight disc-based LUN, so no ZFS.

Shutting down the VM I see the messages:

GEOM: da1: the secondary GPT table is corrupt or invalid.
GEOM: da1: using the primary only -- recovery suggested
GEOM: label/extent_da1: corrupt or invalid GPT detected
GEOM: label/extent_da1: GPT rejected -- may not be recoverable

Again it said this the other day on the other VM. The da1/da2 are VMDKs!
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
So, really, there's a reason I wrote this long message about not running FreeNAS as a VM.

So you have FreeNAS as a VM, and da1 and da2 are VMDKs.

That means ESXi's I/O layers are sitting in between your FreeNAS and the storage.

It is not unusual for ESXi to totally trash VMDKs when power is lost unexpectedly.

What you're seeing is the result of your choice to stick ESXi between FreeNAS and the hardware. This shouldn't be done. There are a number of scenarios that can cause you to lose data. This is just another one, really more of an edge case. FreeBSD is smart enough to detect that something is seriously amiss, and reports that, but the real problem here is the design choices. Do not use VMDKs. Do not use RDM. Preferably don't run FreeNAS as a VM at all, though I'm convinced that if you avoid the landmines, there are ways to do so safely.
 

Swarfega

Cadet
Joined
Aug 11, 2011
Messages
7
This is a home test environment. If I had masses of money to spare I would have hardware storage, but I don't. I already know the risks, so please spare me. I don't care that it's lost data; I have backups. I just want to know why it's done what it's done.

Can anyone advise on diagnosing FreeNAS?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
This isn't a FreeNAS problem. ESXi took your VMDK - and would I be correct in guessing a thin provisioned one, since those seem to be most prone to this - and scrambled the far end of it in some way. There's nothing all that complicated to diagnose, except perhaps the entertainment value of seeing if the disk label is floating around out there at some other location, which might provide some insight as to the manner of the corruption. You can Google various combinations of "ESXi" and "VMDK" and "corruption" and "lost power".
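
As a rough illustration of that hunt for a displaced label, here is a minimal Python sketch. It is not anything from FreeNAS or ESXi. It assumes 512-byte sectors and relies on the fact that every GPT header begins with the 8-byte signature "EFI PART", with the primary header normally at LBA 1 and the backup at the last LBA of the disk. The function names are hypothetical, and the demo runs against a synthetic in-memory image rather than a real VMDK.

```python
import io

SECTOR_SIZE = 512            # assumption: 512-byte logical sectors
GPT_SIGNATURE = b"EFI PART"  # 8-byte signature that starts every GPT header


def find_gpt_headers(stream, sector_size=SECTOR_SIZE):
    """Return the sector numbers (LBAs) whose first 8 bytes are the GPT signature."""
    hits = []
    lba = 0
    stream.seek(0)
    while True:
        sector = stream.read(sector_size)
        if len(sector) < len(GPT_SIGNATURE):
            break  # end of image (or a trailing fragment too short to matter)
        if sector.startswith(GPT_SIGNATURE):
            hits.append(lba)
        lba += 1
    return hits


def expected_gpt_lbas(total_sectors):
    """A healthy GPT disk has its primary header at LBA 1 and its backup at the last LBA."""
    return [1, total_sectors - 1]


# Demo on a synthetic 16-sector image (hypothetical data, not a real disk).
image = bytearray(SECTOR_SIZE * 16)
image[SECTOR_SIZE * 1:SECTOR_SIZE * 1 + 8] = GPT_SIGNATURE  # primary header where it belongs
image[SECTOR_SIZE * 9:SECTOR_SIZE * 9 + 8] = GPT_SIGNATURE  # backup header displaced to LBA 9

found = find_gpt_headers(io.BytesIO(image))
print("found:", found, "expected:", expected_gpt_lbas(16))
# prints: found: [1, 9] expected: [1, 15]
```

Against a real disk you would pass a file object opened read-only in binary mode on a raw copy of the device. A sector-by-sector scan of 500GB in Python will be slow, so treat this as a sketch of the idea only. A signature turning up anywhere other than LBA 1 and the last LBA would hint at how the data got shifted or truncated.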

From a FreeNAS perspective, if the data stored on a hard disk radically changes or undergoes reality-warping effects, that's a hardware problem and we'll be happy to spend some time advising you how to fix your hardware. In this case, your storage "hardware" is actually VMware's software, which is known to behave badly on unexpected power loss or any of a dozen other things.

So the diagnosis is "you ran FreeNAS on top of ESXi and then did something very bad to ESXi. ESXi has damaged your VMDKs. FreeNAS noticed and was alarmed by this. Restore from backup and don't do this again." What the heck else could the diagnosis possibly be?

I hear you with the "masses of money" thing. But you got into a Yugo, didn't put the seat belt on, and had the misfortune to get into an accident. I have invested lots of time carefully guiding people away from doing this sort of thing, because we see a constant stream of people come in here in tears after losing data. Be thankful you have backups.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
It's exactly as jgreco said. The diagnosis is "your data in the VM is gone". The treatment (if you will) is "don't put FreeNAS in a VM, or expect the VMDKs to randomly trash your data for you".

Just like jgreco said. Everyone hears the arguments that money is tight, but that doesn't change the reality that it just doesn't work right and there's no fix. If you don't like that answer, feel free to spend more money and go with a VM that uses VT-d technology. ;)

We can't all have our cake and eat it too.
 

Swarfega

Cadet
Joined
Aug 11, 2011
Messages
7
I've never actually had issues with VMDKs in ESX other than now twice with FreeNAS. One was after a clean shutdown, the other not.
I'm aware virtualising FreeNAS is not recommended; however, it works fine for what I need it for. I just felt seeing two instances of the same issue was worth mentioning.

Oh, and btw, they are Thick Eager Zeroed discs, not thin.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Well, right. You probably won't have issues with VMDKs in ESXi as long as they're used correctly. When something nasty happens, though, like an in-use datastore becoming unavailable ("unplugged the drive", "SAN failure", etc.), then all bets are off.

A lot of systems would not even detect that something is amiss. Or worse, if the VM were to stay running while the underlying storage rebooted and lost some writes, you can run into the horrifying situation where, at some point later, the VM crashes because what's on the VMDK isn't what the VM thought was there, and you end up with a totally hosed filesystem.

But really this is probably not a FreeBSD thing. You'd have to look at your ESXi system to see why bits were winding up (or not winding up) in places FreeBSD wasn't expecting them. Generally speaking, you need to do all the right things with ESXi... RAID controllers with battery-backed write cache, redundancies to make sure bad things don't actually happen to the VM and storage environment, etc. It is very typical to hear someone who runs ESXi say "but I couldn't afford all that expensive crap, so I cheaped out and got a ServeRAID M5014 and hacked ESXi to enable write-back." Or something ghetto like that. And then when something inevitably goes wrong, they'll get all up in arms about how it must be their FreeNAS VM that's at fault, not their choice of storage.

Generally speaking, those of us who successfully run ESXi have seen dozens of people pass through here who have unsuccessfully run ESXi. We hear people say

"one is dead"

and then

"it works fine for what I need it for"

and then if you're me, you realize you have nothing in common with someone who is willing to deploy something that is designed to fail, and then actually fails, but is still deemed to "work fine." I'm very sorry, but I cannot help you.
 

Swarfega

Cadet
Joined
Aug 11, 2011
Messages
7
It doesn't matter now; I trashed the virtual discs and recreated them. I will continue to use FreeNAS for my shared storage, and I am fully aware that it's a risk, which is why I back up my test environment.
 