My Dream System (I think)

Status
Not open for further replies.

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
Obviously, with a RAID controller, the system will boot even if one of the SSDs dies, and I believe you can also have a hot spare configured even on the IR cards, so you can put three SSDs in (two active, one standby) and have it magically and automatically replace a failed unit, and continue to boot without any interruption.

For whatever it is worth, you might also consider trawling around fleabay for an LSI 9260 or one of its OEM equivalents, along with the BBU or supercap unit; the supercap units are more expensive.

I too became interested in investing (continuing the tradition of overkill vis-à-vis my real needs) in a datastore RAID solution.
This is probably a quite stupid question - does ESXi have a proper way to interact with these controllers to inform about drive failures?
I'd be interested in a quick description of the 'overall steps' from 'drive failure' -> 'notification' -> ... -> 'replacement of drive' -> 'back to normal redundant functionality' ?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Yeah, that's a great question.

The answer is different between Free ESXi and vSphere. For vSphere, you can set it up to notify you (thru vSphere's capabilities). Free ESXi, no such luck.

So the big thing is that even with a typical Windows or FreeBSD server, LSI's strategy for RAID card management tends to revolve around having some second system laying around that runs "MegaRAID Storage Manager" (hereinafter, "MRSM") which allows monitoring of a network of hosts with RAID cards. Normally these things also have an agent on the host that can perform some rudimentary notification operations, but this is not the case on ESXi (whether free or paid). ESXi rides the LSI short bus, which means that there's no LSI-provided mechanism to monitor the damn things, and even if you're running MRSM it may lie and report that the status is "OK" unless you actually log in with MRSM and poke around.

It's frickin' exasperating.

But ESXi does show you hardware health. And it will properly read out status assuming you've installed the LSI drivers, which I'll summarize shortly. Here's an example from a pair of boxes I installed yesterday for a client. I sent them a pair of LSI 9280's with marginal batteries, but since the write cache isn't super-important in the application, we decided to leave 'em as is.

This image shows an LSI 9280-4i4e that's almost healthy. The battery is in "partially charged" state. The volumes are in OPTIMAL state.

But this is the one I want to show you:

[Image: lsi-9280-marginal.png]


So here, the battery status is Unknown. I'm actually not clear on why this unit is picking it up that way whereas the other isn't. Both of the batteries are about equally dodgy.

Also, the LSI batteries are known to be rather dodgy and have a limited lifetime. If you can, avoid BBU units and instead go with the supercap based units. They're better in a number of ways.

In any case, the thing to note here is that the health status of the battery has caused the health status of "Storage" to also be listed as Unknown. If a drive had failed, this would also be showing here.

So the trick is that you have to install all the LSI stuff. It isn't super rocket science but obtaining the right stuff is a bit of an art form. You need to install on the ESXi host:

esxcli software vib install -d megaraid_sas-6.605-00.00-offline_bundle-2132901.zip
esxcli software vib install -d VMW-ESX-5.5.0-lsiprovider-500.04.V0.59-0004.offline_bundle-3663115.zip
esxcli software vib install -v vmware-esx-storcli-1.19.04.vib --no-sig-check

Plus for MRSM to work you pretty much have to disable the ESXi firewall.

esxcli network firewall set --enabled false

That last vib is actually the linchpin most people use for monitoring RAID health on Free ESXi. It has the LSI StorCLI tool as /opt/lsi/storcli/storcli which allows you to make virtually any change to the MegaRAID controller. For example,

/opt/lsi/storcli/storcli /c0 /vall show

shows the state of the virtual drives on the first RAID controller. Smart monkeys will take this and either have a remote host monitor their ESXi once an hour by ssh'ing in and parsing the output, or will put a cron script on the ESXi host (greater difficulty level). Actually managing an array with StorCLI is a nightmare of arcane syntax invented by some masochistic hardware driver author who had no idea what "ease of use" means.
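The remote-ssh flavor of that monitoring can be sketched in a few lines of shell. Everything environment-specific here is a placeholder (the host name, the alert address), and it assumes the storcli vib above is installed plus key-based ssh login to the ESXi host. The one fact it leans on is that a healthy virtual drive reports State "Optl" (Optimal) in the /vall table.

```shell
#!/bin/sh
# Hypothetical hourly cron job on a separate monitoring box.
count_unhealthy_vds() {
    # Reads "storcli /c0 /vall show" output on stdin and prints how many
    # virtual drive rows have a State other than "Optl" (Optimal).
    awk '$2 ~ /^RAID/ && $3 != "Optl" {n++} END {print n+0}'
}

# Placeholder host name; BatchMode avoids hanging on a password prompt.
OUT=$(ssh -o BatchMode=yes root@esxi-host.example.invalid \
        "/opt/lsi/storcli/storcli /c0 /vall show" 2>/dev/null)

if [ "$(printf '%s\n' "$OUT" | count_unhealthy_vds)" -gt 0 ]; then
    # mail(1) here is a stand-in for whatever alerting you actually use
    printf '%s\n' "$OUT" | mail -s "RAID ALERT" admin@example.com
fi
```

The same parsing works for a cron script on the ESXi host itself; you just drop the ssh and call storcli directly.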

Fortunately for us, in almost all cases, replacing a failed drive is a matter of removing the failed disk and replacing it with an acceptable replacement disk. By default, an LSI controller will assume that's a disk replacement and will start a rebuild. ESXi is not involved at all in this process. It's all the RAID controller and its onboard software. If you do need to poke at the RAID controller for some reason, then StorCLI on ESXi comes in handy for the task, or for a graphical version, MRSM from another host on the network.
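For the rare case where you do have to nudge the controller, the StorCLI commands look roughly like this. The enclosure/slot numbers (e252/s3) are made-up placeholders for illustration; identify your actual failed slot with the first command, and double-check the syntax against the StorCLI reference for your firmware before running anything.

```shell
# List every drive on controller 0 so you can find the replaced slot
/opt/lsi/storcli/storcli /c0 /eall /sall show

# Placeholder enclosure 252, slot 3: start a rebuild if it didn't
# auto-start, then check on its progress
/opt/lsi/storcli/storcli /c0/e252/s3 start rebuild
/opt/lsi/storcli/storcli /c0/e252/s3 show rebuild
```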

I deem it a crappy overall solution, especially VMware's deficient options for alerting on Free ESXi, but one that can be made to work if you know what I've outlined here and you take the time to instrument your systems appropriately.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
First, thanks to @Dice for asking the question, because it was not obvious to me that there wouldn't have been some sort of notification.

Now for my question...
So the trick is that you have to install all the LSI stuff. It isn't super rocket science but obtaining the right stuff is a bit of an art form. You need to install on the ESXi host:

esxcli software vib install -d megaraid_sas-6.605-00.00-offline_bundle-2132901.zip
esxcli software vib install -d VMW-ESX-5.5.0-lsiprovider-500.04.V0.59-0004.offline_bundle-3663115.zip
esxcli software vib install -v vmware-esx-storcli-1.19.04.vib --no-sig-check

Plus for MRSM to work you pretty much have to disable the ESXi firewall.

esxcli network firewall set --enabled false
Since I'm using ESXi 6.0U2, are these files still good to use?

I have found something like this, for example: "VMware ESXi 6.0 scsi-megaraid-sas 6.611.03.00-1OEM SAS Driver for Avago Megaraid 9361 Based Adapters", and I'm not sure whether something built for ESXi 6 should be used or if the 5.5 stuff is fine.

As for email notification, that would be nice; once I can see my adapter status, I'll worry about the email notification.

Here is a screenshot of my Health screen. You will note that I have a fan failure. Well, I unplugged a case fan while the system was running.
[Screenshot: Capture.JPG]
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
First, thanks to @Dice for asking the question, because it was not obvious to me that there wouldn't have been some sort of notification.

There is, it's just crappy notification. VMware's strategy with Free ESXi is to provide you with the virtualization functionality but basically no other significant bells or whistles.

Now for my question...

Since I'm using ESXi 6.0U2, are these files still good to use?

Wrong question. They're compatible with 6.0U2, yes. The driver framework in 5.5 is compatible with 6, so any driver that's made for 5.5 is also supposed to work on 6.

But the megaraid-sas is for the MegaRAID stuff. I'm pretty sure that there's a different driver used for the IR stuff but I've got a great big pile of #whocares this morning. So I'm going to say that you still need to install the lsiprovider stuff (gives you health monitoring from ESXi and the C client) and the storcli stuff (gives you the ability to twiddle as shown above).

Also, when downloading those files, they probably come from a variety of places, because anything involving ESXi is usually made as difficult as possible by the vendors. Maybe that's good for those of us in IT, who make our money by knowing arcane crap, but I still think it really sucks. I download and cache everything so that it never becomes an issue later to re-find something after, say, 3Ware gets sold to LSI gets sold to Avago gets sold to Broadcom.

I have found something like this for example "VMware ESXi 6.0 scsi-megaraid-sas 6.611.03.00-1OEM SAS Driver for Avago Megaraid 9361 Based Adapters" and not sure if something built for ESXi 6 should be used or if the 5.5 stuff is fine.

As for having an email notification, that would be nice so once I can see my adapter status, I'll worry about the email notification.

Here is a screen shot of my Health screen. You will note that I have a fan failure. Well I unplugged a case fan while the system was running.
[screenshot attachment]

Yeah okay so you don't have the lsiprovider stuff installed. Install it and reboot. Takes a few minutes to populate, so don't freak out if it doesn't show you stuff right away.

All the arcane driver and CIM stuff is the ESXi equivalent of all those fun-but-nonobvious things we have to tell users about FreeNAS, like crossflashing to IT mode, specific firmware phases, SMART testing on drives, etc.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
I've got a great big pile of #whocares this morning.
Well it is Independence Day so maybe you should go celebrate and take the day off from these forums or whoever is getting under your skin (sorry, relatives I can't help you with). Alcohol and Fireworks, it's a great mix. I'm in favor of doing the BBQ today before the storms come in. Gonna be a bad day. Most of the fireworks in the area were shot off Saturday and Sunday because of the expected weather. I'll be trolling here periodically today.

And I'll try to install those VIB files and see where it takes me. I would at least like to get a status to show up on the screen.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Well it is Independence Day so maybe you should go celebrate

"Today is the Fourth! It's practically here!"
Then he growled, with his Grinch fingers nervously drumming,
"I MUST find some way to stop this holiday from coming!"

And then! Oh, the noise! Oh, the Noise!
Noise! Noise! Noise!
That's one thing he hated! The NOISE!
NOISE! NOISE! NOISE!
Then the Whos, young and old, would sit down to a feast.
And they'd feast! And they'd feast! And they'd FEAST!
FEAST! FEAST! FEAST!
They would feast on Who-pudding, and rare Who-roast beast.
Which was something the Grinch couldn't stand in the least!
 

gpsguy

Active Member
Joined
Jan 22, 2012
Messages
4,472
I am watching the neighborhood 4th of July parade.


 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
I'm watching the rain. I can BBQ in the rain.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
"Today is the Fourth! It's practically here!"
Then he growled, with his Grinch fingers nervously drumming,
"I MUST find some way to stop this holiday from coming!"

And then! Oh, the noise! Oh, the Noise!
Noise! Noise! Noise!
That's one thing he hated! The NOISE!
NOISE! NOISE! NOISE!
Then the Whos, young and old, would sit down to a feast.
And they'd feast! And they'd feast! And they'd FEAST!
FEAST! FEAST! FEAST!
They would feast on Who-pudding, and rare Who-roast beast.
Which was something the Grinch couldn't stand in the least!
I should edit your avatar again and have him holding an American Flag or something like that. You're such a Grinch during the holidays.
 
Black Ninja

Joined
Nov 11, 2014
Messages
1,174
The SSD's would actually be a RAID1 datastore, they'd just also happen to have an ESXi boot partition on them. Either drive failing results in things still working. Plus you can then insert a replacement drive and it'll rebuild, no downtime.

I was reading what @joeschmuck said and was thinking of doing the same thing:
Make a first mirror with 2 small SSDs (say 80GB, in RAID1) for the ESXi boot partition, and then a second, bigger mirror (like 2x 512GB SSDs in RAID1) for the VMs. Do you think this will be better from a performance and management standpoint than just making the second mirror with bigger SSDs and having the ESXi boot partition and the VMs share the same datastore?

P.S. Coming from Windows, and this applies to FreeNAS as well, we like to have the OS in one place and the data in another, but I don't know if this practice carries over to ESXi, or if it would just waste 2 ports and 2 small SSDs?
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
@Black Ninja That sounds like pretty much overkill.

I think it is a shame there is so little interest in having FreeNAS share space for the VMs.
If a virtualized FreeNAS can be sufficiently fast to serve other ESXi hosts, I fail to see how it wouldn't be suitable once you remove the copper cables and switches from the equation.
Especially since support for hardware RAID on free ESXi appears to be anything but easy, intuitive, and straightforward.

There are some caveats, yet IMO rather insignificant in comparison: some performance penalty, and some old reports of ESXi failing to scan for datastores that are not available at boot. If you run into problems, I've come across (but not tested) workarounds (@Spearfoot, did you need any workaround?)

To recap the config:
Install ESXi on a small SSD. Use as boot and datastore.
Install the FreeNAS VM on that datastore.
On FreeNAS, create a VM pool. Mirror vdev of the SSDs (or ...just skip this step)
Create a dataset, share it as iSCSI (with all its glorious complete lack of intuitive configuration pattern - there are guides!)
On ESXi, configure FreeNAS to boot first. Add a delay of a few minutes before the other VMs are set to boot. Likewise, ensure the FreeNAS VM is shut down last, with some margin for the other VMs to shut down first.
Profit.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
this will just waste 2 ports and 2 small SSDs

Answered it yourself. :smile:

The typical ESXi box is not making heavy use of its boot datastore once booted. And you may not even want it to be SSD; it is more difficult to tag an SSD datastore as SSD once it is mounted, and of course it'd be hard to unmount the boot datastore while you're trying to make configuration changes that have to be recorded on it.

So here, we run with the basic idea of having a HDD RAID1 datastore for ESXi boot, on something like a pair of WD Red 2.5" 1TB NAS HDDs with a spare. It serves a secondary purpose as a completely acceptable 1TB datastore for non-intensive purposes (DHCP server VMs, DNS server VMs, authentication VMs, maybe UTM devices, etc.). I like to save the SSD for stuff that actually benefits from it.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I think it is a shame there is so little interest in having FreeNAS share space for the VMs.
If a virtualized FreeNAS can be sufficiently fast to serve other ESXi hosts, I fail to see how it wouldn't be suitable once you remove the copper cables and switches from the equation.
Especially since support for hardware RAID on free ESXi appears to be anything but easy, intuitive, and straightforward.

It isn't that there's "so little interest." It's rather that it just doesn't work as well as you'd think. The average AIO user was coming in here with their 32GB E3 based system and hoping to get away with using only 8GB of RAM to drive ESXi over NFS or iSCSI on RAIDZ2. THAT just doesn't work out. I mean, yes, you can make it "work" for values of the word that include "I got a VM to install and boot on my virtualized FreeNAS VM" but it doesn't "work" where "work" means "I can use this as a useful VM datastore."

If you're willing to dedicate at least two cores and 32GB of RAM to it, and you're willing to run mirror vdevs, then that changes things. Until recently this required Xeon E5 or similar.

By comparison, just getting a hardware RAID controller and letting it do its thing can be a lower-cost route to datastore happiness. The average price of the LSI 9260-type units on eBay has fallen to ~$200-$300, and there's even a strong argument for an AIO box to just grab a low-end IR controller and RAID1 some inexpensive SSDs, for a relatively low-cost datastore that FreeNAS just can't compete with. The proper configuration of hardware RAID isn't actually that bad, but I'll absolutely grant that it isn't as easy as it ought to be. VMware isn't that interested in making it *easy* because in their view it is the RAID manufacturer's problem, and the RAID manufacturers, they've all made it suck.

I recently posted a summary of what's needed to get a fully functional LSI RAID install: https://forums.freenas.org/index.php?threads/my-dream-system-i-think.41072/page-18#post-298778
Plus, for anyone here who wants some help working through the LSI side of it, I might take pity on you :smile:

There are some caveats, yet IMO rather insignificant in comparison: some performance penalty, and some old reports of ESXi failing to scan for datastores that are not available at boot. If you run into problems, I've come across (but not tested) workarounds (@Spearfoot, did you need any workaround?)

To recap the config:
Install ESXi on a small SSD. Use as boot and datastore.
Install the FreeNAS VM on that datastore.
On FreeNAS, create a VM pool. Mirror vdev of the SSDs (or ...just skip this step)
Create a dataset, share it as iSCSI (with all its glorious complete lack of intuitive configuration pattern - there are guides!)
On ESXi, configure FreeNAS to boot first. Add a delay of a few minutes before the other VMs are set to boot. Likewise, ensure the FreeNAS VM is shut down last, with some margin for the other VMs to shut down first.
Profit.

Yeah, the "add delay of a few mins" thing seems like it "ought to" work, but still doesn't do so reliably in practice, because ESXi scans for its datastores at boot time, and if it doesn't find them, it doesn't necessarily try again. With software-defined networking and storage now being a thing, there are hooks in ESXi that apparently you can hack into to make this work (more) correctly; I seem to recall it involves the "Agent VM" framework.

The more traditional route is to have FreeNAS ssh in to your ESXi host when it boots and issue "esxcli storage core adapter rescan --all" to have ESXi rescan for iSCSI devices. That has the advantage of actually working but you're still guessing as to the timing of things. I'm pretty sure it no longer screws up if you do manage to get all your ducks in a row.
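A minimal sketch of that traditional route, with everything environment-specific (host name, delay, auth) as placeholders; it assumes key-based ssh from the FreeNAS VM to the ESXi host is already set up:

```shell
#!/bin/sh
# Hypothetical FreeNAS post-init script: wait a bit for the iSCSI target
# to come up, then tell ESXi to rescan its storage adapters.
ESXI_HOST=${ESXI_HOST:-esxi-host.example.invalid}   # placeholder host
RESCAN_DELAY=${RESCAN_DELAY:-0}                     # use e.g. 180 in production
sleep "$RESCAN_DELAY"
if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$ESXI_HOST" \
        "esxcli storage core adapter rescan --all" 2>/dev/null; then
    RESULT=ok
else
    RESULT=failed   # unreachable host, auth problem, or esxcli error
fi
echo "iSCSI rescan: $RESULT"
```

In practice you'd wire this in as a FreeNAS post-init task and pick a delay long enough that the iSCSI target is actually serving before the rescan fires; as noted, the timing is the guessy part.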
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
but I don't know if this practice carries over to ESXi, or if it would just waste 2 ports and 2 small SSDs?
Nothing is faster, but what I gain is that if one of the boot drives fails, it should be transparent to my family, which is the sole reason I wanted it. While I'm on travel I need the ability to keep the ESXi server running all the time without issue, because my internet connection is managed through the ESXi box using Sophos. And since in ESXi the boot drive is also a datastore when not using a USB flash drive, Sophos sits on this redundant storage too. I do have a backup router that can be connected, but it's such a hassle telling a non-technical person how to do that.

So what I've done suits my purposes perfectly; however, it may not be the best use of the hardware for everyone.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
There are some caveats, yet IMO rather insignificant in comparison: some performance penalty, and some old reports of ESXi failing to scan for datastores that are not availble on boot - if you run into problems I've encountered (but not tested) workarounds (@Spearfoot did you need to pull any workaround?)
The only workaround I had to use was described above by @jgreco, and it works quite reliably:
jgreco said:
...have FreeNAS ssh in to your ESXi host when it boots and issue "esxcli storage core adapter rescan --all" to have ESXi rescan for iSCSI devices.
I also agree with the Grinch that a dual-CPU server w/ 32+GB RAM allocated to the FreeNAS VM is a minimal entry-level virtualized filer for real work; i.e., multiple users running multiple VMs in a production environment.

My use-case is different; I'm a developer and my ESXi-plus-FreeNAS AIO is my essential developer's Swiss Army Knife. I don't regularly run any VMs except for FreeNAS, but I have VMs set up to run several versions of FreeBSD, Linux, Windows w/ Visual Studio, Oracle databases, etc., depending on what project I'm working on. So a single-CPU server with only 16GB of RAM dedicated to FreeNAS suits my needs perfectly.
 
Black Ninja

Joined
Nov 11, 2014
Messages
1,174
So what I've done suits my purposes perfectly; however, it may not be the best use of the hardware for everyone.

Oh, I am 100% with you, Joe, on this one. The idea of having redundancy on the ESXi datastore with a RAID controller has been on my mind for some time. It seems you've gotten even more people considering the idea, which is great, actually.

I am with you on the "family" issue too. It's much better to make this work redundantly so they don't have interruptions with the internet, instead of torturing them with instructions to follow.:smile:

P.S. It's funny: when the internet stops, the first thing is they blame you ("Are you doing something again with the internet?"), and if it turns out you have nothing to do with it, then they need you: "...then go see what is wrong with it!":)
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
P.S. It's funny: when the internet stops, the first thing is they blame you ("Are you doing something again with the internet?"), and if it turns out you have nothing to do with it, then they need you: "...then go see what is wrong with it!":)
LOL, that is exactly the way it is here at my house too. To be honest, my ISP has been very reliable for the past 5+ years, dropping service only a handful of times.
 
Black Ninja

Joined
Nov 11, 2014
Messages
1,174
LOL, that is exactly the way it is here at my house too. To be honest, my ISP has been very reliable for the past 5+ years, dropping service only a handful of times.

At the risk of making myself jealous, I must ask: who is your provider that is so reliable? :smile:
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
Metrocast Cable. It used to be really crappy and unreliable before I joined them; I waited until I started hearing better things about them. They are the fastest ISP in town too, although I don't subscribe to the fastest speed. The pricing is a bit high for my tastes, but so long as I can stream 2 Netflix movies and surf the internet at the same time, all is good. The only ISP options I have are Metrocast Cable and Verizon DSL. If Verizon FIOS were an option, I'd likely jump ship.
 