FreeNAS 8.2.0 BETA 3 sees 3TB ESXi 5 RDM as 0MB

Status
Not open for further replies.

oswa

Cadet
Joined
Apr 10, 2013
Messages
6
Good luck with virtualization and raw device mapping.
It's not recommended for a reason. Freenas likes raw drives to play with, not drives being mapped by esxi.
Even with vt-d passing through the entire controller card, I'd still be wary of virtualizing freenas for production.
Kind of a sad state of affairs. For me it'll mean I remove FreeNAS, not the virtualization setup.
ESXi and raw device mapping is a pretty solid technology that works for enterprise usage/SAN configs to allow pass-through multi-pathing to guests.

Oh, and FreeNAS has been rock solid in this setup with RDM since I installed it when 8.0.1 came out.
I'd bet that this is actually a pretty minor issue/bug, but I don't have the skill set to hunt it down on FreeBSD/FreeNAS.
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
Maybe someone else can chime in that knows more about RDM and how it relates to freenas.

I would guess though, that in your example, the SAN system is running on bare metal, and is presenting a LUN to esxi. Esxi then presents the lun through rdm to the vm. In my mind, that's a high level rdm case.

You're trying to use rdm as low level access to freenas. That's totally different. The SAN would have low level access to its storage. After the san does its thing, you involve rdm up to the vm level.

I in fact use RDM under ESXi. But not to present freenas with disks. I have physical disks in the esxi host that I (individually) present to the VMs through rdm. Not ideal, as the VMs don't have access to SMART. But I accept the risk. Nothing critical is running under those VMs.

To use rdm the way I thought it was intended to be used would be to have a SAN backend doing all the low level stuff, with the san having direct access to its own disks. Then the san handles the lun as esxi sees it. RDM from there makes sense. The san is handling all the gory depths of disk management, etc. I'll bet the san isn't accessing its disks through rdm. I think it has direct access.

The only semi-sane way to virtualize freenas is with vt-d and direct io passthrough. You let freenas have direct access to the whole sata / sas controller. Since the freenas vm is accessing the disk controller directly, it has low level access to the disks. It's been pointed out that vt-d support in prosumer hardware is sometimes not all there compared to vt-d on server grade hardware.

Except for testing purposes, I run freenas on bare metal. I have two freenas machines, a primary, and a backup / replication target. I have one other machine running esxi with most of the vm's accessing local storage through rdm.
 

oswa

Cadet
Joined
Apr 10, 2013
Messages
6
Maybe someone else can chime in that knows more about RDM and how it relates to freenas.

I would guess though, that in your example, the SAN system is running on bare metal, and is presenting a LUN to esxi. Esxi then presents the lun through rdm to the vm. In my mind, that's a high level rdm case.

You're trying to use rdm as low level access to freenas. That's totally different. The SAN would have low level access to its storage. After the san does its thing, you involve rdm up to the vm level.

I in fact use RDM under ESXi. But not to present freenas with disks. I have physical disks in the esxi host that I (individually) present to the VMs through rdm. Not ideal, as the VMs don't have access to SMART. But I accept the risk. Nothing critical is running under those VMs.

To use rdm the way I thought it was intended to be used would be to have a SAN backend doing all the low level stuff, with the san having direct access to its own disks. Then the san handles the lun as esxi sees it. RDM from there makes sense. The san is handling all the gory depths of disk management, etc. I'll bet the san isn't accessing its disks through rdm. I think it has direct access.

The only semi-sane way to virtualize freenas is with vt-d and direct io passthrough. You let freenas have direct access to the whole sata / sas controller. Since the freenas vm is accessing the disk controller directly, it has low level access to the disks. It's been pointed out that vt-d support in prosumer hardware is sometimes not all there compared to vt-d on server grade hardware.

Except for testing purposes, I run freenas on bare metal. I have two freenas machines, a primary, and a backup / replication target. I have one other machine running esxi with most of the vm's accessing local storage through rdm.
Thank you for your response, I appreciate it.
In the SAN use case, a disk/LUN is exposed from the array to the hypervisor through Fibre Channel.
This should effectively be the same as a direct attach disk (in theory).
Now I'll admit that reality and theory are sometimes different :)
The LUN is then shared through RDM to individual VMs the same way a locally attached disk would be, so there really shouldn't be any difference here.

However, you make a good point on the difference between a virtual disk RDM and physical RDM (vmkfstools -r vs -z).
This is also where I'm seeing a difference in behavior from FreeNAS.
"Virtual" RDM is what I have been using for a couple of years with FreeNAS and it has worked fine, but it presents a "VMware disk" to the guest and has a 2TB limit.
The physical/passthrough RDM that I now need to switch to for 3TB disk support presents the "real" disk and allows SMART etc. to work, but does NOT work with FreeNAS due to the geometry problems.
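
For reference, the two mappings are created roughly like this from the ESXi shell (the device identifier and datastore path below are just placeholders, not my exact ones):

# Virtual compatibility RDM (-r): disk is presented as a VMware virtual disk, 2TB limit
vmkfstools -r /vmfs/devices/disks/t10.ATA_____<drive_identifier> /vmfs/volumes/datastore1/rdm/disk1-rdm.vmdk

# Physical compatibility / passthrough RDM (-z): SCSI commands (SMART etc.) pass through, needed for the 3TB drive
vmkfstools -z /vmfs/devices/disks/t10.ATA_____<drive_identifier> /vmfs/volumes/datastore1/rdm/disk1-rdmp.vmdk

The resulting .vmdk mapping file is then attached to the FreeNAS VM as an existing disk.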

I will say though that this is a FreeNAS/FreeBSD problem, and not a VMWare problem.
I'm currently testing Nexenta on the same rig, same disk and it works fine.
So it looks like my short term solution will be to migrate off FreeNAS to Nexenta.
I would much rather not have to migrate my data, but I also need to resolve my redundancy problem (and grow my capacity) quickly.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
As has been said many times in the forums.... I'll repeat it again.

If you are using RDM everything works great... until it doesn't. And when it doesn't, things go bad VERY quickly. Then you lose all of your data. If you have a failure with RDM and ZFS, don't expect the VMware/ZFS gods in the forum to help. They've abandoned all hope because people don't listen (why RDM is stupid for ZFS has been explained in excruciating detail). They got tired of hashing the whole story out about 18 months ago, and they're tired of people not doing their homework to understand what they're doing, blindly thinking they have some great novel plan for ESXi and ZFS.

If you aren't using ESXi's PCI passthrough feature, you shouldn't be using ZFS, for a whole bunch of reasons related to ZFS expecting direct disk access only. See above for why you should be using PCI passthrough instead.
 

oswa

Cadet
Joined
Apr 10, 2013
Messages
6
As has been said many times in the forums.... I'll repeat it again.

If you are using RDM everything works great... until it doesn't. And when it doesn't, things go bad VERY quickly. Then you lose all of your data. If you have a failure with RDM and ZFS, don't expect the VMware/ZFS gods in the forum to help. They've abandoned all hope because people don't listen (why RDM is stupid for ZFS has been explained in excruciating detail). They got tired of hashing the whole story out about 18 months ago, and they're tired of people not doing their homework to understand what they're doing, blindly thinking they have some great novel plan for ESXi and ZFS.

If you aren't using ESXi's PCI passthrough feature, you shouldn't be using ZFS, for a whole bunch of reasons related to ZFS expecting direct disk access only. See above for why you should be using PCI passthrough instead.
Fair enough, that recommendation didn't exist when I set this environment up, and it's been chugging along nicely for the last two years.

Quite honestly, RDM isn't the culprit here, but it's clearly not worth having that conversation so I'll move along and move off FreeNAS.
It's a hassle, but quite doable at this point.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Fair enough, that recommendation didn't exist when I set this environment up, and it's been chugging along nicely for the last two years.

Technically, if you understood how RDM works and how ZFS works (going back to v1 of ZFS), then you'd have figured out on your own that RDM on ZFS = bad back in 2005. Even if you read the Wikipedia article on ZFS or the manual you'd know bare metal is best (both mention this). The real issue is that you had a knowledge gap you didn't know about. Nothing new on this forum, a lot of people don't know a lot of stuff they didn't realize they didn't know. That's the whole freakin' problem. People don't know what they don't know and don't want to do the required homework to figure out what they don't know.

Quite honestly, RDM isn't the culprit here, but it's clearly not worth having that conversation so I'll move along and move off FreeNAS.
It's a hassle, but quite doable at this point.

Are you willing to bet money on that?

I am trying to replace an older 2TB drive that just died with a new Seagate 3TB (STBD3000100) and after recreating the RDM mapping the VM (FreeNAS) throws the "unsupportable block size 0" for that device.

Never seen that error before... except for people that use VMs. Still sure it's a FreeNAS problem?

The main difference here is that the older 2TB was mapped as a virtual device (vmkfstools -r ... ) vs the new 3TB drive needs to be mapped as physical device (vmkfstools -z ... ) as VMFS does not support files larger than 2TB.

That sure does sound like an ESXi problem...

ESXi handles the 3TB drive just fine, but it seems like the VM with FreeNAS is not able to get the correct geometry/information.

diskinfo daX does not work - it does not find the device.
camcontrol inquiry daX seems to work:
camcontrol inquiry da4
pass5: <ATA ST3000DM001-1CH1 CC26> Fixed Direct Access SCSI-5 device
pass5: Serial Number Z1F2KLYG
pass5: 6.600MB/s transfers (16bit), Command Queueing Enabled

I'm not very familiar with FreeBSD in general so I'm struggling here.
Is there a way I can check what the OS thinks the geometry is, and/or lock it to specific numbers?

The only time I've seen geometry errors (aside from the standard USB geometry error on FreeNAS bootup) is with bad hard drives.

So big picture, the issue is with ESXi and how it virtualizes stuff, and not FreeNAS/FreeBSD. And like many people, they'll assume it's a FreeNAS issue. Then they'll go to something else (Windows, Linux, whatever) and the problem will magically go away, further proving their (incorrect) assumption that FreeNAS is the problem. But correlation is not causation. And if you dig deep enough, you'll probably find something is not quite right with ESXi.

But at the end of the day it's your server and your data. I wish you luck in your endeavors!

There isn't anything blowing my hair back with ESXi and RDM. The issues are very complex, not solved by the average user, and for most people not obvious until they find they've screwed up and lost their zpool. I did play this ESXi game in Dec 2012. I spent 3 weeks experimenting with it, trying to figure out if RDM and PCI passthrough could work on ESXi. I gave up after 3 weeks because it became very obvious that RDM just won't work and my hardware didn't want to behave with PCI passthrough. Not surprising, considering my hardware wasn't purchased for ESXi and PCI passthrough in particular.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
No, I agree, RDM isn't the culprit, excessive complexity is. As much as I'm a fan of virtualization, I'm also a fan of not losing important data. Every layer you put between your FreeNAS VM and the actual physical block on disk is an additional complication. It also adds layers that ZFS isn't expecting to be there, and potentially eats away at some of the assumptions ZFS makes about storage integrity. Cyberjock has been around long enough to witness a bunch of these cases gone bad. I've been around long enough that I barely stop to think "wow, that sucks." The problem here lies with people who engineer systems that eat away at the underpinnings of a well-designed system and wind up compromising integrity.
 

oswa

Cadet
Joined
Apr 10, 2013
Messages
6
The real issue is that you had a knowledge gap you didn't know about. Nothing new on this forum, a lot of people don't know a lot of stuff they didn't realize they didn't know. That's the whole freakin' problem. People don't know what they don't know and don't want to do the required homework to figure out what they don't know.
I'm the first to acknowledge that there are a lot of things I don't know - this may be one of them.
That's kind of why I went looking for help/advice...

Never seen that error before... except for people that use VMs. Still sure it's a FreeNAS problem?

That sure does sound like an ESXi problem...
Pretty confident.
OpenSolaris/Nexenta has no problems, and neither does any Linux distribution I tried.
I only experience problems with FreeNAS/FreeBSD.
Of course, that doesn't mean that the others aren't working around a bug in ESXi, but it seems less likely that they do.

The only time I've seen geometry errors (aside from the standard USB geometry error on FreeNAS bootup) is with bad hard drives.
That is not the case here. The disk works fine outside of the server, in the server, and for other VMs (using the same device mapping). It's only an issue with ESXi + physical RDM and FreeNAS.

So big picture, the issue is with ESXi and how it virtualizes stuff, and not FreeNAS/FreeBSD. And like many people, they'll assume it's a FreeNAS issue. Then they'll go to something else (Windows, Linux, whatever) and the problem will magically go away, further proving their (incorrect) assumption that FreeNAS is the problem. But correlation is not causation. And if you dig deep enough, you'll probably find something is not quite right with ESXi.
I came here after confirming as much as I could that this wasn't a problem with other OSes in the same config. Rather, it seems pretty specific to FreeBSD in this particular scenario: physical RDMs.
This may be an ESXi bug/feature, but given that several other OSes work in the exact same config and are able to detect the geometry correctly, it may just be that I am correct and there's something unique about the way FreeBSD/FreeNAS reads the geometry.
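
For anyone wanting to compare against a working setup, these are the kinds of checks I'd run from the FreeBSD/FreeNAS shell (da4 is just the device number on my box, yours will differ); as quoted above, diskinfo doesn't even find the physical RDM device in my case:

diskinfo -v da4      # sector size, media size and the geometry FreeBSD calculated
gpart show da4       # partition table, if the device is usable at all
dmesg | grep da4     # what CAM reported when the disk attached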

There isn't anything blowing my hair back with ESXi and RDM. The issues are very complex, not solved by the average user, and for most people not obvious until they find they've screwed up and lost their zpool. I did play this ESXi game in Dec 2012. I spent 3 weeks experimenting with it, trying to figure out if RDM and PCI passthrough could work on ESXi. I gave up after 3 weeks because it became very obvious that RDM just won't work and my hardware didn't want to behave with PCI passthrough. Not surprising, considering my hardware wasn't purchased for ESXi and PCI passthrough in particular.
Sorry you spent so long and were not able to get it to work.
My setup worked very well and took minimal effort to set up.
That is until last week when I lost a drive and decided that I wanted to expand capacity rather than do a direct replacement.
I am confident that I could have replaced my drive with a 2TB drive without any issues as I could have stayed with the old RDM setup.

As it is now I will just drop this whole thing.
It's not worth it for me to spend time trying to fix something that isn't considered broken by others.
I will retire the FreeNAS setup as soon as I'm done migrating the data to a different VM running Nexenta on the new drives.


cyberjock, thank you for taking the time to reply!
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
No, I agree, RDM isn't the culprit, excessive complexity is. As much as I'm a fan of virtualization, I'm also a fan of not losing important data. Every layer you put between your FreeNAS VM and the actual physical block on disk is an additional complication. It also adds layers that ZFS isn't expecting to be there, and potentially eats away at some of the assumptions ZFS makes about storage integrity. Cyberjock has been around long enough to witness a bunch of these cases gone bad. I've been around long enough that I barely stop to think "wow, that sucks." The problem here lies with people who engineer systems that eat away at the underpinnings of a well-designed system and wind up compromising integrity.

I wasn't putting the blame on RDM specifically, just on something with ESXi and virtualization in general. Virtualization adds an additional layer of stuff that can go wrong. Even if PCI passthrough were used, there's no guarantee that this problem wouldn't be occurring right now. Just look at my failed experiment thread from Dec 2012. When stuff goes wrong, as an end user it's virtually impossible to pin the blame on exactly what is wrong.

@ oswa-

Don't take my comments as me insulting you. Especially the part about people not knowing what they don't know. We're all screwed in "that" department. It's really hard to determine when you've gotten to the bottom of the rabbit hole. I've seen smart people make dumb mistakes, and dumb people do amazingly smart stuff. The forums are the best place to go looking for what you don't know. The problem is that even if I tell you how it really is, what says that I'm actually right? Some people argue that RAID is a backup, others don't. Some people don't even consider a full copy of your data in an adjacent building as a "backup" but merely a "local copy". This stuff is far from trivial to explain, to understand, and to research. I have lots of free time, and I love to read about new technical stuff for fun.

Personally, I'd really like to understand more about what is going wrong with ESXi in many cases (you're one of them), but I'm doubtful I'd ever really know. ESXi isn't a trivial piece of software to understand, and without source code it would probably be impossible to properly identify.
 

oswa

Cadet
Joined
Apr 10, 2013
Messages
6
@ oswa-

Don't take my comments as me insulting you. Especially the part about people not knowing what they don't know. We're all screwed in "that" department. It's really hard to determine when you've gotten to the bottom of the rabbit hole. I've seen smart people make dumb mistakes, and dumb people do amazingly smart stuff. The forums are the best place to go looking for what you don't know. The problem is that even if I tell you how it really is, what says that I'm actually right? Some people argue that RAID is a backup, others don't. Some people don't even consider a full copy of your data in an adjacent building as a "backup" but merely a "local copy". This stuff is far from trivial to explain, to understand, and to research. I have lots of free time, and I love to read about new technical stuff for fun.
Understood, and I can't help but smile in agreement...
I've earned my living as a consultant/dev/technologist in the enterprise storage and high availability business for the last 15 years.
One could say that, in general, I have a pretty good understanding of how things work, and also of how they interoperate.
Unfortunately I don't have a lot of free time anymore, so troubleshooting at a deeper level won't happen.
Hence taking the easy way out in this case.

Personally, I'd really like to understand more about what is going wrong with ESXi in many cases (you're one of them), but I'm doubtful I'd ever really know. ESXi isn't a trivial piece of software to understand, and without source code it would probably be impossible to properly identify.
Agreed, you won't get very far, and unfortunately VMware likes to keep it that way. They play things very close to the chest with ESXi.
 

ClaytonRoss

Cadet
Joined
Feb 2, 2014
Messages
2
Hello,
I hate to reply to this thread, but alas I have the same problem.
I have four 2TB drives in my HP Gen8 MicroServer. ESXi boots off a microSD card, and there is a 180GB SSD for datastores for the VMs.
I have made an RDM for each drive using vmkfstools -z.
First I used the t10.ATA... name, then I ran the command with the vml.010000... name.
In both cases the RDMs show up and work in CentOS and Windows XP64, but in FreeNAS they show up as size zero!!
I have tried partitioning the drives with MBR and GPT, and formatting with UFS. No luck, FreeNAS can't see them.
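
For reference, this is roughly the sequence I used from the ESXi shell (the drive identifier is shortened and the datastore path is just a placeholder):

ls -l /vmfs/devices/disks/      # lists the t10.ATA and vml identifiers for each drive
vmkfstools -z /vmfs/devices/disks/t10.ATA_____<drive_id> /vmfs/volumes/datastore1/rdm/drive1-rdmp.vmdk

The resulting .vmdk was then added to each VM as an existing disk.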

I am out of ideas and would love some help. I really want to get this going. I love the way RDM works; it's very nice to be able to pull a drive, stick it in a standalone box, and have it show up.
This is really the only way I would want to run FreeNAS on my server. I know it's not best practice, but I have one box with 16GB of RAM and a 4-core 40-watt Xeon, and it can and should be able to run all my servers.
I really want to consolidate.
 

ClaytonRoss

Cadet
Joined
Feb 2, 2014
Messages
2
BTW, I tried to load pfSense on one of these RDM drives and it worked fine. So I don't know what the deal is, but it's not just BSD, as the pfSense I used runs on 8.3-P11.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I'm really not even sure why you necro'd this thread. We've already discussed this to death, we said all there was to say in April 2013, and then we even created a sticky that politely said "you're a damn fool if you virtualize FreeNAS". If you aren't happy with the virtualization, take it up with ESXi. That's their problem. We never have these problems on bare metal servers, so clearly the virtualization is to blame somehow.

Generally, if you complain that virtualization isn't working, you get no answer. Consider it a privilege that anyone even answered you... then stop using virtualization!!!!
 

pirateghost

Unintelligible Geek
Joined
Feb 29, 2012
Messages
4,219
And you felt the need to necro the thread to say that?

Locking this
 
Status
Not open for further replies.