BUILD ESXi Home Server + FreeNAS

Joined
Nov 11, 2014
Messages
1,174
I agree... I know I've been told quite a few times at work by the "younger" crowd that I overthink things a lot. And yet, I haven't been the one responsible for a system outage at work in over a decade.

I am very much with you on this one. I'd rather trust people who overthink than people who don't think at all.
 

RichTJ99

Patron
Joined
Sep 12, 2013
Messages
384
A question for the ESXi experts: I have FreeNAS running as a VM. This ESXi box has about 6 other VMs running. Everything seems fine, except that transfer rates from a CIFS share on the virtualized FreeNAS to a physical box are significantly slower than from my physical FreeNAS box.

I was thinking that maybe it's too much data coming out of the one Ethernet port.

I have a Supermicro X11 board with 4 Ethernet ports. How can I tell which port is currently in use?

http://i.imgur.com/xZvqmfE.jpg

I would like to pass a port through (assuming these come in pairs) to the FreeNAS VM. I figure it won't hurt to try. I just don't want to pass the Ethernet port the ESXi box is currently using through to FreeNAS and lock myself out.

Thanks,
Rich
 
Joined
Nov 11, 2014
Messages
1,174
A question for the ESXi experts: I have FreeNAS running as a VM. This ESXi box has about 6 other VMs running. Everything seems fine, except that transfer rates from a CIFS share on the virtualized FreeNAS to a physical box are significantly slower than from my physical FreeNAS box.

I was thinking that maybe it's too much data coming out of the one Ethernet port.

I have a Supermicro X11 board with 4 Ethernet ports. How can I tell which port is currently in use?

http://i.imgur.com/xZvqmfE.jpg

I would like to pass a port through (assuming these come in pairs) to the FreeNAS VM. I figure it won't hurt to try. I just don't want to pass the Ethernet port the ESXi box is currently using through to FreeNAS and lock myself out.

Thanks,
Rich

Being slower in a VM compared to physical is normal; it would be abnormal if it were the other way around. How much slower? In which direction is it slower? What type of virtual NIC did you use, and how many other VMs share the physical port that the NIC is assigned to?
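For what it's worth, the ESXi command line can answer the "which port is currently used" part directly. A quick sketch, assuming SSH is enabled on the ESXi host:
Code:
# List the physical NICs (vmnic0..vmnicN) with their link state and speed.
esxcli network nic list

# Show each standard vSwitch with its uplinks and port groups, so you can see
# which physical port is carrying the VM traffic.
esxcli network vswitch standard list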
 

RichTJ99

Patron
Joined
Sep 12, 2013
Messages
384
The slowdown has to do with network transfer speeds. I am surprised how few resources the VM uses for rsync. The VM is giving me about 10 MB/s for rsync and 30 MB/s for CIFS - very slow, as everything on my network is 1 Gb.

All my VMs are using only one NIC. I guess I should assign more NICs to the various VMs?

I should also say that all the other VMs are mostly idle. My biggest data user is the Windows Veeam box, but that backs up VMs in the middle of the night.

EDIT: I didn't install VMware Tools, so I think I am using the E1000 NIC by default?

EDIT 2: I have an X11SSL-CF-O - it has 2 NICs, not 4. I was confused because my pfSense box has 4 NICs.
 
Joined
Nov 11, 2014
Messages
1,174
The slowdown has to do with network transfer speeds. I am surprised how few resources the VM uses for rsync. The VM is giving me about 10 MB/s for rsync and 30 MB/s for CIFS - very slow, as everything on my network is 1 Gb.

Keep things simple and compare apples to apples. Don't involve rsync for speed benchmarking; it adds too many variables when troubleshooting. You should be able to saturate a 1 Gb link in both directions, on bare metal and in a VM, using the VMXNET3 driver, which comes with VMware Tools.
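A minimal apples-to-apples check is a raw TCP test between the FreeNAS VM and the physical box; it takes disks and sharing protocols out of the picture entirely. A sketch, assuming iperf3 is available on both ends (hostnames are examples):
Code:
# On the FreeNAS VM (server side):
iperf3 -s

# On the physical client, test both directions toward the VM:
iperf3 -c freenas -t 30        # client -> FreeNAS
iperf3 -c freenas -t 30 -R     # reverse: FreeNAS -> client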

All my VMs are using only one NIC. I guess I should assign more NICs to the various VMs?

Here is another problem.


I should also say that all the other VMs are mostly idle. My biggest data user is the Windows Veeam box, but that backs up VMs in the middle of the night.

Then put the bandwidth hog on a separate NIC. Other VMs with low usage can share the same NIC.
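If you go that route, giving FreeNAS its own vSwitch backed by a dedicated physical port only takes a few commands. A sketch, with vSwitch1, vmnic1, and the port group name as example values:
Code:
# Create a new standard vSwitch, back it with an unused physical NIC,
# and add a port group for the FreeNAS VM to attach to.
esxcli network vswitch standard add --vswitch-name=vSwitch1
esxcli network vswitch standard uplink add --uplink-name=vmnic1 --vswitch-name=vSwitch1
esxcli network vswitch standard portgroup add --portgroup-name="FreeNAS Network" --vswitch-name=vSwitch1
# Then point the FreeNAS VM's virtual NIC at the "FreeNAS Network" port group.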

EDIT: I didn't install VMware Tools, so I think I am using the E1000 NIC by default?

That's true, but not what you want if you're after 99% saturation. If about 80% is enough, the E1000 can still manage it. It also matters a lot which direction the data is flowing.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
The slowdown has to do with network transfer speeds. I am surprised how few resources the VM uses for rsync. The VM is giving me about 10 MB/s for rsync and 30 MB/s for CIFS - very slow, as everything on my network is 1 Gb.

All my VMs are using only one NIC. I guess I should assign more NICs to the various VMs?

I should also say that all the other VMs are mostly idle. My biggest data user is the Windows Veeam box, but that backs up VMs in the middle of the night.

EDIT: I didn't install VMware Tools, so I think I am using the E1000 NIC by default?

EDIT 2: I have an X11SSL-CF-O - it has 2 NICs, not 4. I was confused because my pfSense box has 4 NICs.
30 MB/s is much slower than what you should be getting. I run the E1000 driver and get >100 MB/s reading and writing on both of the two FreeNAS-on-ESXi All-in-Ones I own. One has 2 NICs and the other has 4; both are plugged into a Dell PowerConnect 2816 switch, using 2 and 4 ports respectively set up as LAG groups. I've included screenshots of the ESXi network configuration below.

Never tried passing a NIC through to the FreeNAS VM, but in theory that should work too.

Never used more than a single NIC for a VM, either.

FreeNAS 9.10 automatically installs the Open VM Tools, which include the VMXNET3 driver. I installed and experimented with VMXNET3 last year and couldn't tell a dime's worth of difference between it and the E1000 driver. YMMV.
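As an aside, you can confirm which virtual NIC FreeNAS actually sees from its own shell; on FreeBSD the interface name gives it away. A small sketch:
Code:
# Inside the FreeNAS VM: list the network interfaces by name.
# 'em0' means the E1000 driver; 'vmx0' (or 'vmx3f0' with the legacy VMware
# tools driver) means VMXNET3.
ifconfig -l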

Virtualizing FreeNAS takes some patience, planning, and study... I found this article at b3n.org very helpful: "FreeNAS 9.10 on VMware ESXi 6.0 Guide".

Good luck!

[Screenshots: network-1.jpg, network-2.jpg, network-3.jpg]
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
What you first need to do is establish a repeatable benchmark test. It doesn't matter what it is, so long as you are consistent. It could be CrystalDiskMark if you like. Post the results. If they are really low, then we need to know the physical build of your VM to ensure you have set it up correctly and we are not just assuming things are right. Things I would want to know are how the hard drives are connected, what the pool configuration is, how much RAM you gave it, the CPU count, etc. You may need to post the .vmx file for the VM. Also, if the results are slow, turn off all your other VMs and try the test again. The same results would indicate it's not a NIC issue, even with the E1000 driver.
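A quick, repeatable baseline is a local test on the pool inside FreeNAS itself, which takes the network out of the picture and can then be compared against the CIFS numbers. A sketch, assuming a pool mounted at /mnt/tank and a dataset without compression (otherwise the zeros just compress away):
Code:
# Sequential write, then read, of a 4 GB test file on the pool itself.
dd if=/dev/zero of=/mnt/tank/ddtest bs=1m count=4096
dd if=/mnt/tank/ddtest of=/dev/null bs=1m
rm /mnt/tank/ddtest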

Take the troubleshooting one step at a time and be consistent with the testing environment.
 

RichTJ99

Patron
Joined
Sep 12, 2013
Messages
384
Thanks for all the replies - it has been very helpful!

For starters, I moved the FreeNAS VM to the 2nd NIC on the motherboard (Intel). I did not pass it through, but I can if that would help.

A CIFS transfer of a 4 GB ISO file from the virtualized FreeNAS to a physical PC (all gigabit) gets me speeds of about 80%-90% of the 1 Gb connection. So I would say moving FreeNAS to its own NIC was helpful (and since that port was unused on the motherboard, it finally has a purpose).

A CIFS transfer of the same 4 GB ISO file from the virtualized FreeNAS to a VM on the same box gets me about 100% of the 1 Gb connection.

In both cases the physical PC and the VM PC were pulling data from the FreeNAS box.

Sending a 1.5 GB file from the physical PC to the FreeNAS VM ran at about 70% of the 1 Gb connection.
Sending a 1.5 GB file from the VM PC to the FreeNAS VM ran at about 80% of the 1 Gb connection.


So the CIFS transfer speeds of 30 MB/s seem to be resolved - or resolved enough for me.


The rsync transfer speeds of 3% of the 1 Gb connection are still a mystery to me. It's not rsync over SSH, and it's all on the same local 1 Gb network.
http://i.imgur.com/oWw7711.jpg (EDIT: The high spikes were when I was testing CIFS transfers.)
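(For reference, one way to rule out rsync's own delta algorithm and checksumming is to copy a single large file with whole-file transfers forced. A sketch; paths and hostnames are examples, and this form goes over SSH, though the same flags apply to rsync daemon transfers.)
Code:
# Copy one large file with the delta-transfer algorithm disabled (-W) and no
# compression, so rsync behaves like a plain sequential copy.
rsync -av --whole-file --progress /mnt/tank/test.iso rich@backupbox:/backup/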

My physical FreeNAS box gives me 100% of the 1 Gb connection during a CIFS transfer.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
So what were your throughputs in Mbps?

A CIFS transfer of the same 4 GB ISO file from the virtualized FreeNAS to a VM on the same box gets me about 100% of the 1 Gb connection.
This is not actually good. If you change to the VMXNET3 driver instead of the E1000, and stay on the same vSwitch, you can get four times that speed. But if you are happy with it, then that is all that really matters.
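For reference, the virtual NIC type can be changed by editing the VM's .vmx file while the VM is powered off (or by removing and re-adding the network adapter in the vSphere client). 'ethernet0' below assumes the first virtual NIC; after the change, FreeNAS should see the adapter as vmx0 instead of em0.
Code:
ethernet0.virtualDev = "vmxnet3"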
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
I remember getting speeds a fair bit above 1 Gbit using the E1000 driver between a Windows VM and a FreeNAS VM on the same box.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
So what were your throughputs in Mbps?


This is not actually good. If you change to the VMXNET3 driver instead of the E1000, and stay on the same vSwitch, you can get four times that speed. But if you are happy with it, then that is all that really matters.
This is true. I get ~3 Gbps between VMs connected to the same vSwitch using the E1000 driver.
 
Joined
Nov 11, 2014
Messages
1,174
This is true. I get ~3 Gbps between VMs connected to the same vSwitch using the E1000 driver.

Because the data is not going through the physical NICs. It's flowing internally within the ESXi host; that's why you are getting those speeds.
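An easy way to see this for yourself is esxtop's network view on the host, which shows per-port traffic and makes it obvious whether a transfer is crossing a physical vmnic or staying inside the vSwitch. A quick sketch:
Code:
# On the ESXi host (SSH session): start esxtop, then press 'n' to switch to
# the network panel and watch per-port transmit/receive rates live.
esxtop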


For anybody interested in the difference between E1000 and VMXNET3, here is a great, simple article explaining it.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Because the data is not going through the physical NICs. It's flowing internally within the ESXi host; that's why you are getting those speeds.
Exactly! I think that's the point @joeschmuck was making too.
 
joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I'm just kind of curious why there was a need to move FreeNAS to a separate NIC. I have to assume that there was a substantial amount of traffic from the other VMs.

@RichTJ99
You may be able to improve that internal performance by selectively placing whatever VMs you desire on the same vSwitch as the FreeNAS VM; then you will not have the NIC slowdown. Just a thought.

Exactly! I think that's the point @joeschmuck was making too.
Yup.

For anybody interested in the difference between E1000 and VMXNET3, here is a great, simple article explaining it.
I'll take it for a spin tomorrow. Getting late here now and I'm old so I need to go to bed before midnight. Yes, I have some TV to watch first.
 
Joined
Nov 11, 2014
Messages
1,174
I'll take it for a spin tomorrow. Getting late here now and I'm old so I need to go to bed before midnight. Yes, I have some TV to watch first.

Check it out. It's not long, but it's very helpful for understanding the difference, so you'll know which is best for any particular scenario.
I always watch TV before bed (movies mostly), but for me the night has just started. I don't get up at 4:00 am like you :p
I am middle-aged, but I am mostly a night owl, except when it comes to driving.
 
Joined
Nov 11, 2014
Messages
1,174
I'm just kind of curious why there was a need to move FreeNAS to a separate NIC. I have to assume that there was a substantial amount of traffic from the other VMs.

Who said it had to?
I lose focus much more easily now that I've gotten older :-(
 

Goof

Cadet
Joined
Jul 28, 2016
Messages
1
With FreeNAS shell scripts executed as post-init tasks.

If you enable SSH on the ESXi server and set up 'passwordless' access with public keys, you can use VMware CLI commands to do pretty much anything you need to do on the ESXi server.
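Setting up the key-based access is quick. A sketch, run from the FreeNAS side; 'felix' stands in for the ESXi hostname, and the path shown is where ESXi 6.x keeps root's authorized keys:
Code:
# Generate a key pair on FreeNAS (skip if one already exists), then append the
# public key to the ESXi host's authorized_keys for root.
ssh-keygen -t rsa -f /root/.ssh/id_rsa -N ""
cat /root/.ssh/id_rsa.pub | ssh root@felix 'cat >> /etc/ssh/keys-root/authorized_keys'

# Verify that passwordless access works before relying on it in post-init tasks:
ssh root@felix vim-cmd vmsvc/getallvms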

I have a set of startup scripts that:

  • Force ESXi to re-scan all datastores. This makes the NFS or iSCSI FreeNAS datastores 'wake up' and become available to ESXi when FreeNAS is restarted.
  • Start virtual machines

I also have shutdown scripts that gracefully power down all running VMs when FreeNAS is shut down. This works great -- except with the current stable release of FreeNAS 9.10. Oddly enough, the scripts work fine with the nightlies, so I anticipate that the next stable 9.10 release will fix the problem. I filed a bug report here (the actual problem is that networking - or name resolution, at least - is stopped before user shutdown tasks are executed):

https://bugs.freenas.org/issues/15323

The shutdown scripts are available at the bug report, if you're interested in downloading and using them.

Here is the startup script I set up as a post-init task:
Code:
#!/bin/bash

L_LOGFILE="/mnt/tank/sysadmin/log/vmware-startup.log"
L_WAIT_DELAY=30

echo "$(date): Force datastore re-scan on FELIX..." > $L_LOGFILE

ssh root@felix.ncs esxcli storage core adapter rescan --all >> $L_LOGFILE

echo "Pausing $L_WAIT_DELAY seconds before starting virtual machines..." >> $L_LOGFILE
sleep $L_WAIT_DELAY
/mnt/tank/sysadmin/start-virtual-machine.sh root felix adonis  >> $L_LOGFILE
/mnt/tank/sysadmin/start-virtual-machine.sh root felix aphrodite  >> $L_LOGFILE

exit

And here is the script that starts up a given VM ('start-virtual-machine.sh'):
Code:
#!/bin/bash

################################################################################
# Usage: start-virtual-machine.sh user_id esxi_host_name vmx_base_filename
#
# Starts guest virtual machine with vmx file (vmx_base_filename'.vmx') on remote
# VMware ESXi server (esxi_host_name) using given user credentials (user_id)
#
# Sends commands to the remote host using SSH, which must be configured before
# using this script.
#
# Tested with FreeNAS 9.3 (STABLE) running as a VM on VMware ESXi v6.0
################################################################################

# Check for usage errors

if [ $# -ne 3 ]
then
  echo "$0: error! Not enough arguments"
  echo "Usage is: $0 user_id esxi_host_name vmx_filename"
  echo "Only specify the vmx basefilename; leave off the '.vmx' extension"
  exit 1
fi

# Gather command-line arguments for user ID, hostname, and vmx base filename:

L_USER=$1
L_HOST=$2
L_VMXNAME=$3

# Get server ID for the VM with matching vmx file:

L_GUEST_VMIDS=$(ssh ${L_USER}@${L_HOST} vim-cmd vmsvc/getallvms | grep "/${L_VMXNAME}.vmx" | awk '$1 ~ /^[0-9]+$/ {print $1}')

echo "$0: $L_USER@$L_HOST vmx=$L_VMXNAME.vmx"

for L_VMID in $L_GUEST_VMIDS
do
  ssh ${L_USER}@${L_HOST} vim-cmd vmsvc/power.getstate $L_VMID | grep -i "off\|Suspended" > /dev/null 2>&1
  L_SHUTDOWN_STATUS=$?

  if [ $L_SHUTDOWN_STATUS -ne 0 ]; then
  echo "Guest VM ID $L_VMID already powered up..."
  else
  echo "Powering up guest VM ID $L_VMID..."
  ssh ${L_USER}@${L_HOST} vim-cmd vmsvc/power.on $L_VMID
  fi
done

exit

Excellent shutdown script - this is just what I've been looking for! However... I found that if I had several VMs running without VMware Tools, or if they hung on shutdown for whatever reason, the script took ages to finish. It would only start shutting down a VM once the previous one had fully shut down. This meant I had to sit through the timeout period for several iterations, which is also not good if you're running on battery and the host shutdown timeout is looming.

I found a script online that promised to provide basic multitasking with sync, and I managed to shoehorn it into the script. With my very basic Bash skills and lots of help from Prof. Google, I also upgraded the script to allow specifying multiple datastores to search for VMs. Now the script does the following:
  • Takes datastores as input and queries ESXi for the VMs that live on them
  • Queries ESXi to see which of those VMs are actually running
  • Spawns a bunch of child scripts that run Spearfoot's original safe-shutdown monitor, forcing the VMs to power off if they don't shut down in time
Basically, the delay when running this script with lots of VMs online is now more predictable, and is generally not much longer than the timeout you set in the script for one VM to shut down.

This script has been running perfectly on both my testing ESXi setup and the real host it lives on; I haven't noticed any dramas so far. I do recommend you have a look through the script and test it well before you trust your VMs to it, since my limited Bash skills might mean I've overlooked some edge case or side effect of these modifications...

The script is still invoked the same way; this time, though, you can add more datastores to the end if you have more than one you want to monitor:
Code:
./shut-down-virtual-machines.sh user_id esxi_host_name datastore_name [datastore_name]...

Code:
#!/bin/bash

################################################################################
# Usage: shut-down-virtual-machines.sh user_id esxi_host_name datastore_name [datastore_name]...
#
# Gracefully powers down all guest virtual machines in a particular datastore (datastore_name)
# on remote VMware ESXi server (esxi_host_name) using given user credentials (user_id).
#
# VMware tools must be installed on the guest VMs; otherwise we can't power them down
# gracefully with the 'shut down' feature.
#
# Sends commands to the remote ESXi host using SSH, which must be configured before
# using this script.
#
# The datastore name is forced to uppercase, as this seems to be the norm for ESXi.
#
# This script is handy for users running FreeNAS virtualised on VMware ESXi. In this
# scenario, FreeNAS is homed on a local datastore different from the datastore(s) it
# provides to ESXi. So this script can safely be used to shut down all of the VM guests
# on the FreeNAS datastore(s) without shutting down FreeNAS itself, thus enabling backing
# up the VM files while they are in a quiescent state.
#
# Borrows heavily from the 'ESXi Auto Shutdown Script' available on GitHub:
#
# https://github.com/sixdimensionalarray/esxidown
#
# Original script by Spearfoot of the FreeNAS forums, modifications by Goof.
#
# Bash Multitasking module by Daniel Botelho:
# http://blog.dbotelho.com/2009/01/multithreading-with-bash-script/
#
# Tested with FreeNAS 9.10.2-U2 (STABLE) running as a VM on VMware ESXi v6.5
################################################################################

# Check for usage errors

if [ $# -lt 3 ]
then
   echo "$0: error! Not enough arguments"
   echo "Usage is: $0 user_id esxi_host_name datastore_name [datastore_name]..."
   exit 1
fi

# Test flag: set to 1 to prevent the script from shutting VM guests down.
L_TEST=0

# Gather command-line arguments for user ID, hostname, and datastore name(s):

L_USER=$1
L_HOST=$2
#Cut away the 1st and 2nd arguments and store the rest to be used later
L_DATASTORE=$(echo $@ | cut -d\  -f 3-)
L_DATASTORE=${L_DATASTORE^^}

# L_WAIT_TRYS determines how many times the script will loop while attempting to power down a VM.
# L_WAIT_TIME specifies how many seconds to sleep during each loop.

L_WAIT_TRYS=30
L_WAIT_TIME=3

# MAX_WAIT is the product of L_WAIT_TRYS and L_WAIT_TIME, i.e., the total number of seconds we will
# wait before gracelessly forcing the power off.

MAX_WAIT=$((L_WAIT_TRYS*L_WAIT_TIME))

# For tests, force the retry max count to 1
if [ $L_TEST -eq 1 ]; then
  L_WAIT_TRYS=1
fi

# Record keeping:

L_TOTAL_VMS=0
L_TOTAL_VMS_SHUTDOWN=0
L_TOTAL_VMS_POWERED_DOWN=0

# Get server IDs for all VMs stored on the indicated datastore. These IDs change between
# boots of the ESXi server, so we have to work from a fresh list every time. We are only
# interested in the guests stored in '[DATASTORE]' and the brackets are important.

echo "$(date): $0 $L_USER@$L_HOST datastore=$L_DATASTORE Max wait time=$MAX_WAIT seconds"
echo -e "\nFull list of VM guests on this server:"
ssh ${L_USER}@${L_HOST} vim-cmd vmsvc/getallvms

#Iterate through each datastore and append which VMs live there to $L_GUEST_VMIDS
for STORE in ${L_DATASTORE}; do
echo -e "\n"
L_GUEST_VMIDS="$L_GUEST_VMIDS $(ssh ${L_USER}@${L_HOST} vim-cmd vmsvc/getallvms | grep "\[${STORE}\]" | awk '$1 ~ /^[0-9]+$/ {print $1}')"
echo "VM guests on datastore ${STORE}:"
ssh ${L_USER}@${L_HOST} vim-cmd vmsvc/getallvms | grep "\[${STORE}\]"
done

echo -e "\n"

# Function for validating shutdown

validate_shutdown()
{
  ssh ${L_USER}@${L_HOST} vim-cmd vmsvc/power.getstate $L_VMID | grep -i "off" > /dev/null 2>&1
  L_SHUTDOWN_STATUS=$?
  if [ $L_SHUTDOWN_STATUS -ne 0 ]; then
	if [ $L_TRY -lt $L_WAIT_TRYS ]; then
	  # if the vm is not off, wait for it to shut down
	  L_TRY=$((L_TRY + 1))
	  status=0
	  if [ $L_TEST -eq 0 ]; then
		echo "Waiting for guest VM ID $L_VMID to shutdown (attempt $L_TRY of $L_WAIT_TRYS)..."
		sleep $L_WAIT_TIME
	  else
		echo "TEST MODE: Waiting for guest VM ID $L_VMID to shutdown (attempt $L_TRY of $L_WAIT_TRYS)..."
	  fi
	  validate_shutdown
	else
	  # force power off and wait a little (you could use vmsvc/power.suspend here instead)
	  status=1
	  if [ $L_TEST -eq 0 ]; then
		echo "Unable to gracefully shutdown guest VM ID $L_VMID... forcing power off."
		ssh ${L_USER}@${L_HOST} vim-cmd vmsvc/power.off $L_VMID
		sleep $L_WAIT_TIME
	  else
		echo "TEST MODE: Unable to gracefully shutdown guest VM ID $L_VMID... forcing power off."
	  fi
	fi
  fi
  exit $status
}

shutdown_vm()
{
  L_VMID=$1
  L_TRY=0

  ssh ${L_USER}@${L_HOST} vim-cmd vmsvc/power.getstate $L_VMID | grep -i "off\|Suspended" > /dev/null 2>&1
  L_SHUTDOWN_STATUS=$?

  if [ $L_SHUTDOWN_STATUS -ne 0 ]; then
	if [ $L_TEST -eq 0 ]; then
	  echo "Attempting shutdown of guest VM ID $L_VMID..."
	  ssh ${L_USER}@${L_HOST} vim-cmd vmsvc/power.shutdown $L_VMID
	else
	  echo "TEST MODE: Attempting shutdown of guest VM ID $L_VMID..."
	fi
	validate_shutdown
  else
	echo "Guest VM ID $L_VMID already powered down..."
	exit 2
  fi
}

#Multitasking script dragged in here against its own will
function jobidfromstring()
{
local STRING;
local RET;

STRING=$1;
RET="$(echo $STRING | sed 's/^[^0-9]*//' | sed 's/[^0-9].*$//')"

echo $RET;
}

runmultitask()
{
JOBLIST=""
TASKLIST[0]=""
i=0

for TASK in "$@"
do
L_TOTAL_VMS=$((L_TOTAL_VMS + 1))
shutdown_vm $TASK &
LASTJOB=`jobidfromstring $(jobs %%)`
JOBLIST="$JOBLIST $LASTJOB"
TASKLIST[$i]=$TASK
i=$(($i+1))
done

i=0
for JOB in $JOBLIST ; do
wait %$JOB
status=$?
if [ $status -eq 1 ]; then
  echo "Guest VM ID ${TASKLIST[$i]} has been forcibly powered off"
  L_TOTAL_VMS_POWERED_DOWN=$((L_TOTAL_VMS_POWERED_DOWN + 1))
elif [ $status -eq 0 ]; then
  echo "Guest VM ID ${TASKLIST[$i]} has been gracefully shutdown"
  L_TOTAL_VMS_SHUTDOWN=$((L_TOTAL_VMS_SHUTDOWN + 1))
fi
i=$(($i+1))
done
}

# Iterate over the list of guest VMs, shutting down any that are powered up
runmultitask $L_GUEST_VMIDS

echo -e "\nFound $L_TOTAL_VMS virtual machine guests on $L_HOST datastore(s) $L_DATASTORE"
echo "   Total shut down: $L_TOTAL_VMS_SHUTDOWN"
echo "Total powered down: $L_TOTAL_VMS_POWERED_DOWN"
echo -e "\n$(date): $0 completed"

exit

 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Excellent shutdown script - this is just what I've been looking for! However... I found that if I had several VMs running without VMware Tools, or if they hung on shutdown for whatever reason, the script took ages to finish. It would only start shutting down a VM once the previous one had fully shut down. This meant I had to sit through the timeout period for several iterations, which is also not good if you're running on battery and the host shutdown timeout is looming.

I found a script online that promised to provide basic multitasking with sync, and I managed to shoehorn it into the script. With my very basic Bash skills and lots of help from Prof. Google, I also upgraded the script to allow specifying multiple datastores to search for VMs.

Nice! When I have some free time... I'll see about incorporating your modifications to my ESXi scripts located on GitHub.
 