ESXi 5.5, QLogic HBA, FreeBSD RDM drama in my jungle

newbstarr · Cadet · Joined Feb 5, 2015 · Messages: 4
My Lego:
Dell T110, Xeon 2.4 GHz, 16 GB RAM, three local 250 GB SATA disks in a host RAID 1 with a hot spare (I don't think they are SAS), running VMware ESXi 5.5.
In the x8 slot, a QLogic QLE2464 HBA connected directly to a NetApp DS14 MK2 AT shelf (dual controllers), no filer head and no MPIO configured (yet), loaded with 14x 1 TB drives on a single loop.

VMware won't pass through my HBA, so RDM it is (expletives). Apparently only the 8 Gb 25xx series and some of the newer 16 Gb cards are compatible with passthrough. That makes very little sense, so I tried to arbitrarily edit the passthrough config file with an entry to allow my 4 Gb card through; it was a shot in the dark and it failed.
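
For anyone repeating the experiment: the file in question on an ESXi host is /etc/vmware/passthru.map, and the hand-added entry would look roughly like the lines below. 1077 is QLogic's PCI vendor ID; 2432 is the ISP2432 device ID I believe the QLE2464 reports (confirm with lspci on the host); the d3d0 reset method is a guess, and none of this made the 4 Gb card eligible for passthrough here.

# /etc/vmware/passthru.map -- entry appended by hand (the "shot in the dark")
# vendor-id  device-id  resetMethod  fptShareable
1077         2432       d3d0         false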

If you have a suggestion for a better course of action, please share it; I'm definitely open to a better solution, within the following restrictions.

I've exhausted my Lego budget for a while, so purchasing another box to run a bare-metal controller in front of my FC 'SAN' drawer isn't an option. Since there's no money for another box, there certainly isn't any money for a filer head from NetApp.

My configuration is such that the NAS handles a ZFS RAID 6-style (RAIDZ2) volume presenting a few iSCSI datasets.

My plan is to have the NAS boot off the local disks on the VMware host, come up, bring up the clustered storage and ZFS, which would then bring up the other VMs and their storage.

The current problem, other than that it takes an eon to boot the SAN and VMware and have them play nicely together, is that when writing to the SAN I get wonderful bursts of quite brilliant speed, the speeds I expect; however, something resets and the whole transfer pauses mid-stream, then bursts again for a short period of sustained transfer before dying again.

The errors reported are not unheard of, but I'm not sure how to resolve them:
WRITE(6) / READ(10)
CAM status: SCSI Status Error
SCSI status: Busy
Retrying command

This would seem to be BSD receiving responses from the drives and not handling them properly, though that could be a red herring; aren't those normal responses from a controller? Why would that cause a drop/pause in transmission?

In any case, I can live with the poor startup if I can get my storage working sufficiently efficiently. Ideas?
 

cyberjock · Inactive Account · Joined Mar 25, 2012 · Messages: 19,526
If you have a suggestion for a better course of action, please share it; I'm definitely open to a better solution, within the following restrictions.

I've exhausted my Lego budget for a while, so purchasing another box to run a bare-metal controller in front of my FC 'SAN' drawer isn't an option. Since there's no money for another box, there certainly isn't any money for a filer head from NetApp.

Well, I think you just shot yourself in the face then, because the only good options I have (even if I were to endorse virtualizing FreeNAS, which I don't) would all require hardware changes. So I think you really said something like "Tell me how to fix it, but don't tell me my crappy hardware is... crappy." OK, so I won't tell you your hardware is crappy. I'll just say that proper hardware would work properly, and let you fill in the details. :)
 

newbstarr · Cadet · Joined Feb 5, 2015 · Messages: 4
So you could have said: please don't virtualise FreeNAS, I (you, in this case) can't see a good way of using my hardware sanely to do this.

Sure, the proper hardware would do the things, and having the funds to do all the things is always great, but in this case I have to work with what I've got, and what I've got is so close to working the way it was designed that it's insane. I should have mentioned that I've read (I think in your posts here and in a FreeBSD bug report) that virtualising FreeNAS is a bad idea and that much pain and loss of data will follow. But frankly it should work, and this is not production; it is my home setup, so as long as my drives and drawer don't die I should be able to read my data with anything I can stick an HBA into. I know and understand some of the risks, and I've tried to do my due diligence in terms of research.

What I was hoping for was someone who had seen FreeBSD errors like this and worked out how to deal with them, most likely in this environment. I think you have definitely seen these errors, because you've said in other posts that you do this kind of testing from time to time. I suppose I should go to a forum of people who deal with FreeBSD, but I was hoping this was something someone here (most hopefully you) had seen before and had worked out how to tell the OS not to freak out over drive/controller responses it is supposed to understand. I realise it is no longer the convention for OSes to expose waits and pauses this way, thanks to buffering, but VMware doesn't buffer here; weird in my mind, but probably an efficiency choice. I'm getting false-positive hints that the drives are going bad, because they are telling the OS "wait, buffer full", and the OS then posts errors about handling it. The OS does handle it, and I get fantastic speed for a few minutes until another one happens; then I get a pause, a reset, and it starts again. What I want is for the OS to swallow these exceptions gracefully and keep rolling, because the pause is currently long enough for a human to notice, when it should be invisible to human perception.

TL;DR: Essentially my problem is that FreeBSD doesn't handle all the damned controller responses; it doesn't deal with all the drives' status messages properly.

Thanks for trying, even if it was overly sarcastic.
 

mav@ · iXsystems · Joined Sep 29, 2011 · Messages: 1,428
A BUSY SCSI status is what the virtual VMware HBA reports when it has problems connecting to the storage. In such cases FreeBSD/FreeNAS does exactly what it is asked to do -- wait and try again. Of course, if that happens too often, performance will suffer heavily, so I believe it is VMware that should be asked about this problem. All FreeNAS can do in such a case is wait. Just a few days ago I committed a patch to make it wait indefinitely when asked to by VMware's virtual disks, without returning errors to applications, only nagging the admin. So you may want to update, just in case.
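
For what it's worth, the retry behaviour lives in CAM's da(4) driver, and the knobs are visible from a FreeNAS shell. A rough sketch, assuming stock FreeBSD tunable names; raising them only papers over the ESXi-side stalls rather than fixing them:

sysctl kern.cam.da.retry_count       # how many times a failed/BUSY command gets retried
sysctl kern.cam.da.default_timeout   # per-command timeout, in seconds
# for example, allow more retries until the next reboot:
sysctl kern.cam.da.retry_count=20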
 

jgreco · Resident Grinch · Joined May 29, 2011 · Messages: 18,680
[...] 16 GB RAM [...] 14x 1 TB drives.

So you barely have enough RAM to run FreeNAS as a normal filer - the rule of thumb is 1 GB per TB for normal use. But you want to do VM storage on it? The rule of thumb for THAT is 2 GB per TB at the low end. So you need to add more memory to the FreeNAS config; you probably need 32 GB. And then you want to run ESXi and have other VMs? Looks like you need maybe 48 GB of memory, but that E3 platform only supports 32 GB. So bump the memory up and ditch ESXi.
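
Putting rough numbers on that for the 14x 1 TB shelf:

~14 TB x 1 GB/TB ≈ 14 GB  -> a plain filer already eats most of the 16 GB in the box
~14 TB x 2 GB/TB ≈ 28 GB  -> VM/iSCSI duty wants roughly 32 GB for FreeNAS alone
add ESXi itself and the other guest VMs on top -> somewhere around 48 GB total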

I'm pretty sure that under-resourcing FreeNAS is a bad idea.

https://forums.freenas.org/index.ph...nas-in-production-as-a-virtual-machine.12484/

(yup)

My configuration is such that the NAS handles a ZFS RAID 6-style (RAIDZ2) volume presenting a few iSCSI datasets.

That's going to be very low performance; a single RAIDZ2 vdev gives you roughly the random IOPS of one disk, which is a poor fit for iSCSI-backed VMs.

My plan is to have the NAS boot off the local disks on the VMware host, come up, bring up the clustered storage and ZFS, which would then bring up the other VMs and their storage.

I'm pretty sure this is advised against.

https://forums.freenas.org/index.ph...nas-in-production-as-a-virtual-machine.12484/

(yup)

The current problem, other than that it takes an eon to boot the SAN and VMware and have them play nicely together, is that when writing to the SAN,

I'm pretty sure this is warned about.

https://forums.freenas.org/index.ph...nas-in-production-as-a-virtual-machine.12484/

(yup)

I get wonderful bursts of quite brilliant speed, the speeds I expect; however, something resets and the whole transfer pauses mid-stream, then bursts again for a short period of sustained transfer before dying again.

The errors reported are not unheard of, but I'm not sure how to resolve them:
WRITE(6) / READ(10)
CAM status: SCSI Status Error
SCSI status: Busy
Retrying command

This would seem to be BSD receiving responses from the drives and not handling them properly, though that could be a red herring; aren't those normal responses from a controller? Why would that cause a drop/pause in transmission?

Slow-ass storage? Just a theory. You have a system (FreeNAS) with a lot of I/O potential in front of a poorly designed I/O subsystem and this is just one of many reasons why this kind of thing is a bad idea, even if you can get it to kinda-work sometimes-mostly.

In any case, I can live with the poor startup if I can get my storage working sufficiently efficiently. Ideas?

Yeah. Since you already said you don't want the wise advice, try mav@'s patch; then you can smash your head against a multidimensional storage problem when ESXi sees its iSCSI storage acting wonky. You will then have your guest VMs randomly getting I/O hangs that may be due to FreeNAS pausing, which is in turn due to your underlying I/O subsystem sucking, which will lead to loss of hair when you try to figure out what else to twiddle.

virtualising FreeNAS is a bad idea and that much pain and loss of data will follow. But frankly it should work, and this is not production; it is my home setup, so as long as my drives and drawer don't die

A production environment is generally an environment where software is put to use for its intended purpose. It is the opposite of a testing environment, where failures are acceptable and even expected. A "home setup" is still a production environment unless I can show up in the middle of some random night, DBAN all your drives, and the entirety of your response is "meh."

So some clues:

You Are Not The First Person To Have The Idea Of A Beautiful All-In-One ESXi Box.
Your Hardware Is Insufficient.
Most People Who Try This With Random Hardware Fail And Fail Hard.
Your Hardware Is Not Well Chosen.
Most People Who Have Had Success With This Have Followed My Guides Or Are Sufficiently VMGods To Know Why They Can Bend My Rules.
Your Hardware Sucks.
VM Storage On Anything Other Than Mirror VDevs And A Heavily Resourced FreeNAS Box Is Like Asking For Someone To Pound Nails Into Your Head.
Your Hardware Is Bad.
When You Soldier On And You Finally Realize Life Sucks Doing This, Just Remember, "We Told You So."
 

newbstarr · Cadet · Joined Feb 5, 2015 · Messages: 4
Thank you all for taking the time to respond. If this frustrates you, please do not support my wildly hopeful assumptions. I realise I am not the first person to try this, and you have literally written long posts about why it is a bad idea. I understand this is fraught with issues, and you told me so in the posts of yours I had read before you actually told me here. We just divided by zero and the world didn't explode.
 

jgreco · Resident Grinch · Joined May 29, 2011 · Messages: 18,680
We like wildly hopeful, but storage is a pragmatic business. Few things are more frustrating than trusting your storage platform and then having it blow up in your face, taking valuable data and lots of effort with it.

ZFS and FreeNAS are somewhat more difficult than other options might be, because they are designed to burn resources to get you awesomeness. That is at odds with running them on your typical virtualization platform, which is trying to share limited resources and use them effectively.
 

mjws00 · Guru · Joined Jul 25, 2014 · Messages: 798
So since safety and hardware have been addressed. ;)

As an experiment, I'd mix up the layers. Can you get ESXi to play very nicely with that QLogic controller? If so, you have the option of running FreeNAS on redundant datastores, à la one of jgreco's other threads: https://forums.freenas.org/index.ph...ative-for-those-seeking-virtualization.26095/

The bottom line is that you need all the IOPS you can get for your VMs, and you have to sacrifice space for speed. But if you set up underlying redundant storage, your FreeNAS VM can be fed a bunch of disks that you can configure as mirrors... then test and hope for the best. You might be able to work around your timeouts and stalls. Dunno, but I'd try it.
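
As a sketch of what "fed a bunch of disks that you can configure as mirrors" could look like from inside the FreeNAS VM -- the da1..da8 device names and the pool/zvol names here are placeholders, not anything from the actual box:

# striped mirrors: far better random IOPS for VM storage than one big RAIDZ2 vdev
zpool create tank mirror da1 da2 mirror da3 da4 mirror da5 da6 mirror da7 da8
# thin (sparse) zvol to hand back to ESXi over iSCSI
zfs create -s -V 2T tank/vmstore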

The all-in-one thing is what it is; there are lots of rants around on that. You haven't specified a use case... so who knows if it's appropriate. It also sounds like you only have one ESXi box to feed, so stick a few SSDs in it for the VMs that need high IOPS. At least then you may be able to live with middling performance from ZFS. You have to spend a crap-ton on ZFS (plus 10 GbE) to touch local SSD speeds.

The last question I'd ask is why bother with FreeNAS at all in this scenario? If putting the old gear to work is the primary goal, you want to use the system that supports it best. That could be ESXi, Linux, OmniOS, BSD, even Windows. But we know FC is something of an unsupported area on FreeNAS, and virtualizing it is even less supported.

Good luck. We all like an interesting experiment or two. I'd play with those Legos and see what they can do. It should never make it past the disposable-data stage, but that doesn't mean it isn't useful or fun. If you're trying to do REAL work and store the kids' photos, back up your network, etc., it's probably gonna cost more money.
 

cyberjock · Inactive Account · Joined Mar 25, 2012 · Messages: 19,526
Thanks for trying, even if it was overly sarcastic.

I wasn't being sarcastic. I was 100% serious, and hopefully, now that the two highest-post-count members of the forum have told you this is a "no-go" with your hardware, you will take it to heart. We can't give you the answer you want because you don't like the real answer.

That being said, I'm locking this thread before we have "yet another virtualization flamewar". Not all hardware can do everything. Yours is not capable of doing what you ask. Sorry if this is inconvenient, but it is also the truth.

Thanks.
 
Status
Not open for further replies.