Category Archives: vSphere

powercli

Conquering ESXi upgrades with conflicting VIBs using PowerCLI

Today, I ran into an issue where I was upgrading ESXi 6.0 servers to 6.5 Update 1 using an HPE custom ISO.   Here’s another example of how PowerCLI can make you more productive.

Conflicting VIBs problem

While working with a customer on a vSphere 6.0 to 6.5 upgrade, I prepared everything as it should be.  I got the latest custom ISO from HPE for ESXi 6.0 Update 1, created a VUM baseline, and attached it to the clusters in question.  Upon scanning the ESXi hosst with VMware Update Manager, I received a warning that the HPE custom ISO was incompatible.

esxi upgrade conflicting vibs

Note that there aren’t actually four conflicting VIBS.  It repeated the problematic modules twice.  There’s actually only two.

Basically, these conflicting modules should be removed prior to upgrading the ESXi hosts.

Removing conflicting VIBs the manual way

There’s nothing special about how to remove them via ESXCLI.  You need the name of the conflicting module, and enable SSH on the ESXi hosts.  Then, run the following command:

esxcli software vib remove –vibname conflicting-vib-name

In the case above, they are named scsi-qla2xxx and scsi-lpfc820.

Watch for indications if the server needs to be rebooted when you run the command.  If so, reboot the servers before proceeding with the upgrade.

Removing conflicting VIBs the PowerCLI way

It’s even easier with PowerCLI to remove these conflicting VIBs.  You don’t have to enable SSH on all your ESXi hosts.  First, make a text file with the names of each conflicting VIB name, with one name per line.

Next, run the following commands after connecting to your vCenter server via PowerCLI:

$modules = gc c:\scripts\modulesnames.txt
$modules
scsi-qla2xxx
scsi-lpfc820

$esxi = get-vmhost "esxihost.domain.com"
$esxcli = get-esxcli -V2 -VMhost $esxi
$modules | foreach-object{$esxcli.software.vib.remove.Invoke(@{"vibname" = "$_"})}

You could obviously make a variable of all your ESXi hosts and do them all at once, but you might not want to leave your ESXi hosts sitting there waiting for a reboot for a while.  It’s your call how to handle that part, but this is how you can remove conflicting VIBs at a basic level.

Hope this helps!

VMware NTP timekeeping considerations in depth – Intro

Accurate timekeeping is important in almost every environment.  If time is not synced across your environment, authentication errors can occur, services and applications may not function properly, event logs and alerts can be off, which can inhibit troubleshooting.  You’re probably aware already that this is a big deal.  Beyond just referencing KB articles, I want to spend time to discuss NTP timekeeping in general, as well as practical methods and strategies that work, and in my experience what doesn’t work.

This will be a series of posts to try to address all major considerations with timekeeping via NTP, beginning with timekeeping within virtual machines.

NTP – Accuracy vs. Internal Synchronization

Obviously, you need your internal to have accurate time and synced with authentication sources.  Protocols like Kerberos for good reason don’t allow for much clock skewing in order to protect against authentication replay attacks. For example, Active Directory’s default tolerance for clock skewing is five minutes.

But sometimes both of those goals conflict each other.  In these cases, which is more important?  For probably almost all environments that the priority should be that clocks are synced over how accurate the clocks actually are.

Why?  Simple – application and service availability.  Chances are, if clocks are skewed too much within your environment, services and applications will become inaccessible to some or all users.  Generally, razor sharp clock accuracy in the real world if lacking can often be an annoyance, not a downtime event.  Obviously, that may not be the case for everyone, such as real time stock trading companies, but that’s generally for the most part true.  When making choices about how to configure things for time usually through NTP, if faced with a scenario where you must choose better internal synchronization instead of better accuracy to what the real time is, choose better synchronization over actual time accuracy.

When would these goals come into conflict?  As an example, VMs could be set to synchronize their clocks with their VM host via VM Tools, or they could be configured within their OS to use an external NTP server.  It’s theoretically possible that for some reason, your ESXi host’s clock might be more trustworthy than your Domain Controllers more often than not.  For most customers though, even if that were true, prioritize synchronization over clock accuracy.  Allowing VMTools to sync the clock of the VM to the host effectively means VMs running on different hosts could have different time.  Maybe the NTP service stopped on one ESXi host.  Maybe they’re not configured consistently.  It doesn’t matter why.  Prioritize synchronization instead by configuring each VM’s OS to synchronize to the same NTP servers somehow, some way.

How many NTP servers, and which ones?

When configuring anything for NTP, whether it be an ESXi server or  guests, the question always comes up – how many servers should an NTP client be set to use?

Many people know some obvious ones.  More than one, right?  Of course.  Providing more than one offers redundancy in case an NTP server fails.  However, I’ve encountered many environments where there were just two configured.  Of course three would be better just for resiliency, but configuring two NTP servers has risks beyond that.

Remember that NTP clients function by polling all their configured NTP servers, and then adopting the most consensus time values across all of them.  For example, if two NTP servers configured provide different values, the NTP client will adopt a value that’s a compromise between them.  In a scenario where NTP server 1 says the time is off by twenty minutes, but NTP server 2 is correct, the NTP client will likely to adopt a value of 10 minutes too fast, which is incorrect, and worse may cause clock skewing within the environment.  I recommend you use instead an odd number of NTP servers greater than one, and the more the merrier generally speaking.

But which ones?  Diversity that improves availability is good, but diversity that will be more likely to result in disparate values is bad.  Using NTP servers that are for example on separate compute, storage, and physical sites is good.  Mixing and matching for example internal and external NTP servers that are managed by different people on the same NTP client is generally bad, although it might be the best alternative among non-optimal choices.

In my next post in this series, I’ll go into specifics on how I generally apply these considerations to VMware environments.

vSphere Networking Clarifications

I get a lot of questions about the finer points of vSphere networking.  I wanted to provide a consolidated list of general recommendations and info about some of the options within vSphere networking as a quick reference.

Please keep in mind recommendations below for vSphere networking are extremely vague and generally speaking.  You should not necessarily assume any recommendations below are the right choice for your environment!

  • When using Virtual Distributed Switches (VDS) in vSphere networking configurations, what should distributed port group port binding be set to – static, dynamic, or ephemeral?
    • Short answer – Generally use static for most workloads, and consider ephemeral for infrastructure type workloads related to vCenter, such as vCenter, PSCs, domain controllers/DNS servers, etc.
    • Long answer – Dynamic was deprecated in vSphere 5.0.  Don’t use it.  Use static or ephemeral.  Static has a few advantages; a VM will always stay on the same virtual switch port even if powered off, so statistics are easier to get for a particular virtual NIC.  However, VMs will consume more VDS ports since they will consume them even when powered off, but that’s a lot of ports to eat through.  It also generally results in less load on vCenter and ESXi hosts, as ports aren’t constantly allocated and unallocated.  Ephemeral basically is no binding at all.  This reduces the number of VDS ports consumed, as powered off VMs don’t use up VDS ports.  However, ephemeral slows down operations within the VDS as ports are allocated and unallocated when VMs are powered off and on.  One advantage ephemeral does have is it does not require vCenter to be available for VMs to make use of those ports; static sometimes does, hence the recommendation for vCenter and related workloads to use ephemeral port bindings to avoid chicken and egg type scenarios, as vCenter is the control plane for the VDS, and is therefore mainly responsible for port bindings.
  • When using Virtual Distributed Switches (VDS) in vSphere networking configurations, what should distributed port group port allocations be set to – Elastic or Static?
    • Generally, use the default 8 for number of ports, and set allocation to elastic.  This should keep the number of unused virtual switch ports to a minimum while allowing the port groups to scale up and down as needed.
  • What load balancing should be used in vSphere networking – Virtual Port ID, MAC hash, IP Hash, LACP, or Load Based Teaming?
    • Short answer – there’s no right answer for everyone.  Read the long answer.
    • Long answer – There are MANY MANY considerations when selecting between load balancing.  I want to throw out some cavaets first, and then give some general hand rules.  Take these into account before reading my general recommendations!  One or more of these might force you in a specific directions or rule out some of options. They should be considered first and foremost before the general recommendations.
      • Caveats
        • You can only use LACP and Load Based Teaming if you’re using VDS.
        • If you want to use port mirroring for any reason, LACP doesn’t support it.
        • If you’re using Host Profiles to configure host networking, LACP can’t be configured using them, an important consideration when using stateless autodeploy.
        • The only two load balancing modes that can in anyway grant a single vNIC more bandwidth than a single physical NIC in the team are LACP and IP Hash.
        • Both LACP and IP Hash require special switch port configurations.
        • LACP does not support beacon probing.
        • The presence of NSX drastically impacts which modes you can, can’t, should, and shouldn’t use.
          • Load based teaming is not supported with logical switching or edge gateways.
      • General recommendations
        • If you are using standard switches, and no VM’s vNIC requires more bandwidth than a single physical NIC in the team, use virtual port ID.  It keeps CPU utilization the lowest, and requires no special switch configuration, making it generally less problematic than IP Hash.
        • If you’re using Virtual Distributed Switches, and no VM’s vNIC requires more bandwidth than a single physical NIC in the team, use Load Based teaming.  While it costs more in CPU than Virtual Port ID, it provides a worthwhile enhancement to network performance via better load balancing, and it also requires no special switch configuration, making it generally less problematic than IP Hash and LACP.
  • Should Network I/O Control (NIOC) be used in vSphere networking?
    • If you are converging host management, VM, vMotion, Fault Tolerance, and/or Storage on the same physical NICs, NIOC should be enabled.  NIOC helps ensure that no traffic type can overwhelm the others.  This is especially important when it comes to IP storage traffic sharing physical links with other traffic, regardless if it’s iSCSI, NFS, VSAN, or other hyperconverged traffic such as Nutanix.
    • If you aren’t converging different types of traffic on the same physical NICs, there’s little reason not to enable NIOC assuming you’re using Virtual Distributed Switches.
  • Beacon Probing looks like a better failover detection in vSphere Networking.  Should I use it?
    • Generally speaking, no.  It requires three uplinks within the team minimum to use.  Generally, making use of Link State Tracking within switches if possible is a better solution.

I’ll add more, as I get more questions about vSphere networking.

Windows vCenter end of the line is coming

Hello everyone!

Greetings from Las Vegas for VMworld 2017.  I wanted to do a better job of blogging from here this year.

One update I wanted to pass along is VMware has announced that the Windows vCenter is coming to an end.

The next numbered version of vCenter will be the last Windows version f vCenter.  The one after that will be the VCSA appliance only.

I’ve been recommending the appliance since 6.0, as it has performed better and is easier to maintain.  This is an even bigger reason to go to it.

More news from VMworld to come!

powercli script configure vcenter alarm email actions

PowerCLI Script to Configure vCenter Alarm Email Actions

Have you ever gone through your vCenter and configured alarms to email?  If you have, you know that if anything ever screamed for automation within vSphere, it’s this, as it is extremely tedious.  You really want to use a PowerCLI script to masnage vCenter alarm email actions!

For one, the action must be set individually on each alarm. On top of that, you configure each alarm with the email address(es) you want  to be sent alerts.  You can also set repeated email actions on each alarm.  While this provides granularity and customization opportunities, it creates a boring snoozefest, inviting confusion and human error in configuring them.  Plus, vCenter contains many alerts, and many people do not know which ones you should configure for emails, and how critical each one is.

Introduction to PowerCLI Script to Configure vCenter Alarm Email Actions

I found a good script about a year ago to do this with PowerCLI script using a CSV of the alarms.  I cannot seem to find who made it now.  If you stumble on this and know who originally made the similar script, please comment below.  I very much want people recognized for their hard work.

Not to simply take the script, I added additional functionality to it, to provide end to end configuration, including vCenter SMTP email server settings, as well as the ability to configure multiple vCenter servers in one fell swoop.  Also, I really love the design of this script for several reasons:

  • Simplicity rules this script.  Even if you want the script to do something else, it’s easy to follow and adapt.
  • One can easily adapt the script to new versions of vSphere.  Each version of vSphere adds, removes, or changes alarms.  I can easily dump those alarm names into a new CSV and set their values.
  • It provides a three-tiered priority system for email frequency.   Don’t like them?  It’s easy to change them within the script.
  • Ongoing maintenance of the alerts is a snap.  Change the values within the script, and simply rerun it.
  • The CSV file provides an opportunity to track changes to alerts and a deliverable document.  I intend also to maintain the CSV files here as I receive feedback on the alarms.

Ready to use a PowerCLI script to configure vCenter alarm email actions?

Directions for PowerCLI Script to Configure vCenter Alarm Email Actions

This PowerCLI script to configure vCenter alarm email actions is very straight forward.  You simply set the variables at the beginning of the script, which includes info such as vCenter servers, the CSV file to be used, vCenter user name and password, the SMTP server and port to use, etc.

Next, set the values within the CSV according to how you want that alert configured within the Priority column.  The values are as follows :

  • Low Priority – Removal of all email alert actions, and reconfigured to send one email without repeat emails for non-critical alerts.
  • Medium Priority – Removal of all email alert actions, and reconfigured to send one repeated email daily for more serious alerts.
  • High Priority – Removal of all email alert actions, and reconfigured to send one repeated email every four hours for more serious alerts.  (I originally set this to hourly, but customer feedback said this was far too much.)
  • Disabled – Removal of all email alert actions.  Use this on alerts that are far too chatty (looking at you VM CPU and memory usage!), and you wish to turn emails for them off.
  • Blank (aka no value) – Leave the alarm as is in case you manually configured the alert and wish to keep it that way.

$vcenterservers = "vcenter1.vs6lab.local","vcenter2.vs6lab.local"
$vcenterusername = 'administrator@vsphere.local'
$vcenteruserpwd = 'P@ssw0rd'
$alarmfile = import-csv c:\scripts\vsphere60-alarms.csv
$AlertEmailRecipients = @("email1@domain.com","email2@domain.com")
$SMTPServer = "FQDNorIP.domain.com"
$SMTPPort = "25"
$SMTPSendingAddress = "sender@domain.com"

#Import PowerCLI module
import-module -name VMware.PowerCLI

#----These Alarms will be disabled and not send any email messages at all ----
$DisabledAlarms = $alarmfile | where-object priority -EQ "Disabled"

#----These Alarms will send a single email message and not repeat ----
$LowPriorityAlarms = $alarmfile | where-object priority -EQ "low"

#----These Alarms will repeat every 24 hours----
$MediumPriorityAlarms = $alarmfile | where-object priority -EQ "medium"

#----These Alarms will repeat every 4 hours----
$HighPriorityAlarms = $alarmfile | where-object priority -EQ "high"

foreach ($vcenterserver in $vcenterservers){
Disconnect-VIServer -Confirm:$False
Connect-VIserver $vcenterserver -User $vcenterusername -Password $vcenteruserpwd
Get-AdvancedSetting -Entity $vcenterserver -Name mail.smtp.server | Set-AdvancedSetting -Value $SMTPServer -Confirm:$false
Get-AdvancedSetting -Entity $vcenterserver -Name mail.smtp.port | Set-AdvancedSetting -Value $SMTPPort -Confirm:$false
Get-AdvancedSetting -Entity $vcenterserver -Name mail.sender | Set-AdvancedSetting -Value $SMTPSendingAddress -Confirm:$false

#---Disable Alarm Action for Disabled Alarms---
Foreach ($DisabledAlarm in $DisabledAlarms) {
Get-AlarmDefinition -Name $DisabledAlarm.name | Get-AlarmAction -ActionType SendEmail| Remove-AlarmAction -Confirm:$false
}

#---Set Alarm Action for Low Priority Alarms---
Foreach ($LowPriorityAlarm in $LowPriorityAlarms) {
Get-AlarmDefinition -Name $LowPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail| Remove-AlarmAction -Confirm:$false
Get-AlarmDefinition -Name $LowPriorityAlarm.name | New-AlarmAction -Email -To @($AlertEmailRecipients)
Get-AlarmDefinition -Name $LowPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Green" -EndStatus "Yellow"
#Get-AlarmDefinition -Name $LowPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Yellow" -EndStatus "Red" # This ActionTrigger is enabled by default.
Get-AlarmDefinition -Name $LowPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Red" -EndStatus "Yellow"
Get-AlarmDefinition -Name $LowPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Yellow" -EndStatus "Green"
}

#---Set Alarm Action for Medium Priority Alarms---
Foreach ($MediumPriorityAlarm in $MediumPriorityAlarms) {
Get-AlarmDefinition -Name $MediumPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail| Remove-AlarmAction -Confirm:$false
Get-AlarmDefinition -Name $MediumPriorityAlarm.name | Set-AlarmDefinition -ActionRepeatMinutes (60 * 24) # 24 Hours
Get-AlarmDefinition -Name $MediumPriorityAlarm.name | New-AlarmAction -Email -To @($AlertEmailRecipients)
Get-AlarmDefinition -Name $MediumPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Green" -EndStatus "Yellow"
Get-AlarmDefinition -Name $MediumPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail | Get-AlarmActionTrigger | Select -First 1 | Remove-AlarmActionTrigger -Confirm:$false
Get-AlarmDefinition -Name $MediumPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Yellow" -EndStatus "Red" -Repeat
Get-AlarmDefinition -Name $MediumPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Red" -EndStatus "Yellow"
Get-AlarmDefinition -Name $MediumPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Yellow" -EndStatus "Green"
}

#---Set Alarm Action for High Priority Alarms---
Foreach ($HighPriorityAlarm in $HighPriorityAlarms) {
Get-AlarmDefinition -Name $HighPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail| Remove-AlarmAction -Confirm:$false
Get-AlarmDefinition -name $HighPriorityAlarm.name | Set-AlarmDefinition -ActionRepeatMinutes (60 * 4) # 4 hours
Get-AlarmDefinition -Name $HighPriorityAlarm.name | New-AlarmAction -Email -To @($AlertEmailRecipients)
Get-AlarmDefinition -Name $HighPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Green" -EndStatus "Yellow"
Get-AlarmDefinition -Name $HighPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail | Get-AlarmActionTrigger | Select -First 1 | Remove-AlarmActionTrigger -Confirm:$false
Get-AlarmDefinition -Name $HighPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Yellow" -EndStatus "Red" -Repeat
Get-AlarmDefinition -Name $HighPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Red" -EndStatus "Yellow"
Get-AlarmDefinition -Name $HighPriorityAlarm.name | Get-AlarmAction -ActionType SendEmail | New-AlarmActionTrigger -StartStatus "Yellow" -EndStatus "Green"
}
}

File downloads:

PowerCLI script to configure vCenter alarm email actions

Recommended alarm configurations for vSphere 5.5

Recommended alarm configurations for vSphere 6.0

Recommended alarm configurations for vSphere 6.5

These CSV files include the alarms for each version, what I’ve found with customer feedback so far as the best values for each one, a notes field that has any relevant info that may help you decide how to configure each one, and an IfUsed column for each one I’ve set to disabled if you want to enable it what I would recommend.

When to Run PowerCLI Script to Configure vCenter Alarm Email Actions

I designed the PowerCLI script to configure vCenter alarm email actions not just for initial configurations, but also to set any changes that are made.  That’s why the script removes any email alarm actions for Low, Medium, High, and Disabled.  If you wish for the script to not change an alarm’s configuration, leave the Priority column blank for the alarm.

This way, you will have a current documented configuration of alarms that you can simply update and run the script again.p

powercli

Document DRS Rules with PowerCLI

As a consultant, I find myself doing a lot of reconnaissance within customers’ vSphere environments.  Here’s how to document DRS Rules with PowerCLI.

DRS Rules – Why have them?

You can use DRS rules for numerous purposes.  Use them to provide better reliability for applications and services.   You can also use them to ensure licensing compliance, and for other pragmatic purposes.

Here’s a very quick set of hand rules for DRS rules:

  • Use VM anti-affinity DRS rules to ensure redundant VMs are not running on the same host within a cluster.  In these configurations, having only one of the VMs up keeps the service or application online.  Examples include cluster members, web farm members, domain controllers, DNS servers, etc.  (Note: for Microsoft clusters, don’t forget to include File Share Witnesses in the rule as well as the nodes!)
  • Use VM affinity DRS rules for VMs that in total comprise a service and/or application.  In these configurations, downtime for any single VM in the group causes the entire application or service to be inoperable.  Examples include a web front end server, an application middleware server, and database backend server.  Or perhaps an email fax server that depends upon an email server.
  • Use VM host affinity/anti-affinity should DRS rules to try to ensure a VM runs or doesn’t run on specific hosts, but it can violate the rule if required, such as possible preferred nodes are down.  An example would be for vCenter to run on a specific host in case the vCenter VM goes down.  You then likely know which host to log directly into to start it back up.  Otherwise, if that preferred node is down, the VM can run on another node.
  • Use VM to host affinity must DRS rules to ensure a VM will only run on specific VMs, even if that results in downtime if those hosts aren’t up.  Must rules generally should only be used for licensing compliance purposes, where the software vendor licenses the product on all possible potential physical nodes it can run on, not how many hosts it could be actively running on at any given point in time.

Document DRS Rules with PowerCLI – Rules

DRS rules are a little bit of a challenge.  Members are a multi-valued property with VM IDs, which isn’t particularly useful.  We need to work a little magic to translate VM IDs to VM names, and then join the multi-valued property to allow it to be exportable into CSVs, etc.

This can be accomplished using the Get-DrsRule cmdlet.

Get-DrsRule -Cluster ClusterName | Select Name, Enabled, Type, @{Name="VM"; Expression={ $iTemp = @(); $_.VMIds | % { $iTemp += (Get-VM -Id $_).Name }; [string]::Join(";", $iTemp) }}

Now you can tack on an export-csv or what not to it, and it’s readable with useful information us humans would understand.

Note, there is also a specific cmdlet to get DRS to host rules only if that’s what you’re looking for : Get-DrsVMHostRule, but the above gets all DRS rules.

Document DRS Rules with PowerCLI – Groups

DRS rules can also have groups, so it’s important that they’re documented as well.  Members are a multi-valued property, but that’s the only challenge here.  We just need to use a join method to make it readable.

This can be accomplished using the Get-DrsClusterGroup cmdlet.

Get-DrsClusterGroup -Cluster ClusterName | select Name, Cluster, GroupType, @{Name="Member:"; Expression={[string]::Join(";", $_.Member)}}

Now you can tack on an export-csv or what not to it.

useful utilities duct tape

Useful Utilities Are Useful

If you’re an IT pro, no matter if you’re an admin, and engineer, a consultant, a PC technician, you have a toolbox of useful utilities, scripts, and software that you use to fix problems.  As time goes by, some of those tools get used more and more.  Others are used less and less for various reasons.  But what surprises me is how many tools in my toolbox on the surface have less and less use cases, but I still come back to them even when it seems I never would need to again.

Over the last few weeks, I’ve been working with a customer who has had significant turnover from consultants they’ve used.  They are moving off a troubled disparate datacenter environment that had over time developed numerous problems to a more consolidated environment that various SyCom resources including me have built for them that is functioning properly, has updated software and firmware, etc.  Along the way, we’ve run into numerous challenges that you wouldn’t normally anticipate.  Troubleshooting them to fix the problems often would take too much time to fix,.  Finding a duct tape solution was more expedient.

I wanted to give a few examples just to illustrate that having a wide knowledge of utilities out there and experience with them can help you solve problems.

In this case, the task was seemingly simple – move VMs running on a legacy NetApp array and vSphere 5.1 servers to a new(er) cluster running vSphere 5.5.  The clusters were managed by two different vCenter servers.  These clusters were within the same physical datacenter.  They had network connectivity between them.  They did not have access to the same storage arrays.  The customer allowed downtime to move them.  Therefore, the easiest way was a shared nothing cold migration (we’re running 5.1 on the source side, remember).  Simple, right?

Doing It the Textbook Way

I approached this like how any vSphere resource would.  Get the two clusters into the same vCenter instance, shut the VMs down, and migrate them cold.  How many times have you seen that fail?  Me?  Pretty much never.  Well, it wouldn’t work.  I’ll spare you the troubleshooting details, but trust me, doing it the native way wouldn’t work.

At this point, the time had come to get creative and bust out some useful utilities I hadn’t used in a long time.  We had to get the job done.  Tick tick!

Useful Utilities #1 – Veeam FastSCP

The customer wasn’t a Veeam customer (yet).  While the customer could take some downtime off hours, there was a limit to that.  We had to move about 2TB of data, so we needed to move this data as quickly as possible without a ton of labor to reconfigure the networks to get both environments access to the storage.

Sure, I could use WinSCP to just bulk copy the VMs over, but Veeam FastSCP, built into Veeam Backup and Replication trial, is free, and it moves data quicker as it disables encryption on the data transfer, which was acceptable to the customer.  I hadn’t had any reason to use FastSCP in probably five years because cold migration functionality and exporting VMs to OVFs and what not within vSphere made it unnecessary.  But here I was, using it yet again.

And sure enough, it worked like a champ.  We tested a quick procedure using it on a few development workloads.  We then proceeded moving all but the critical VMs, and it worked great… except for the last VM of course.  Come to find out, that was a critical SQL VM that the customer didn’t realize was using physical Raw Device Mappings.

Well, shoot, how do we do this one in a quick manner?

Useful Utilities #2 – VMware Converter

For numerous reasons, including perhaps sheer circumstance of projects I’ve worked on, I hadn’t until this had a need to use VMware Converter in years.  Virtualization is so prevalent now, that P2V is one of those things for me that’s like, “Hey man, remember that time we had to convert like 100 physical machines to virtual back in the day?  Good times!”

Also, I’ve generally recommended to customers to avoid converting physical to virtual anyway.  It should generally be seen as a shortcut, but never optimal.  If you could just build a fresh new VM and get the data moved, the resulting VM would be cleaner.  It would probably perform better.  There’s less chance of instability from old drivers and what would inevitably be a significant change in hardware for the OS and application.  Obviously, if you’re dealing with a ton of machines, rebuilding them all isn’t practical.  In that case, you might have to turn to a P2V tool.

But if you got a VM with physical RDMs, you can’t clone the VM.  You can’t bulk copy the Virtual Machine files over.  You could create new VMDKs and copy everything out of the RDM disks to those and reassign drive letters.  However, this SQL VM was nasty with complex mount points and drive letters assigned.  We had to get it done the weekend the RDMs were discovered.

Solution?  VMware Converter!  I tried installing it on an admin server and set up the job.  That of course failed because of Murphy’s Law.  The Converter agent wouldn’t install due to insufficient permissions.  I installed it directly on the SQL VM (with the same account I tried to push the agent, mind you), stopped the SQL services to ensure the data was static, and ran it.  Other than it shuffling a few drive letters around on the converted VM that a few mouse clicks fixed, it worked like a champ.

How about you?  Any useful utilities you’ve used recently you haven’t used in awhile?

Making life easier using vSphere Tags

One of the least used features in vSphere that I think almost all admins could really make use of but don’t is the ability to create custom vSphere tags within vSphere.

I wanted to take the time to point this feature out, and perhaps give people some ideas on how to make use of of them.  This can help with management and automation quite a bit.

What are vSphere Tags?

vSphere Tags are effectively custom metadata type info that can be applied to objects within vCenter.  You get to make your own to fit your own needs.  They assist basically with locating objects for more efficient administration and management.

They’re unique to other things such as folders for your VMs in that you can assign multiple tags to the same VM or other objects.

Let’s break this down by comparing vSphere tags to MP3 management software like iTunes.  An individual MP3 file must be in one file system folder or another.  It can’t be in both.  But suppose you want to find all your songs by an artist, by genre, or by album?  We intuitively understand this now with MP3s.

But we have the same problem with VMs.  You can organize your VMs into VM folders in vCenter, but a single VM can only be in one folder or another.  What if you wanted to organize your VMs by criticality?  By whether or not they have SQL?  Whether or not they need to be backed up?  Trying to do this with folders would be a nightmare to manage.  Plus, remember a VM folder is the mechanism for assigning permissions, too.  Maybe you don’t want this metadata having any impacts on anyone’s permissions to manage it.

That’s when you use vSphere tags!

Use Cases for vSphere Tags

Use cases for this functionality are numerous:

  • Criticality of VM – this would allow the expedited power up or down of VMs based on this nature.  Running out of resources within your cluster due to sudden host failures?  Power down the non-critical VMs.  It would also be helpful for vSphere Admins who aren’t the application admins to know when to handle a VM with care before doing anything to it.
  • Application groupings – Maybe it doesn’t make sense to put VMs that work together to provide an application or service, but you want to know those groups.  That could allow a SQL server that serves the backend of multiple application groups to be identified for both simultaneously.
  • Presence of a common application like SQL – This can be helpful for locating VMs that may require special settings on backup jobs to quiesce the file system before backing the VM up.  You might also use this to find potential VMs that other VMs are dependent on, so you can set their restart priority so they boot up first in an HA event scenario.
  • Lab/Test VMs – You could set the resource allocation for Lab/Test VMs to low to help ensure they are given less resources than production VMs.

OK, I convinced you (hopefully)!   Let’s make some tags.

Basic Concepts for vSphere Tags You Need To Know

You can create vSphere tags in both within the vSphere Web Client and with PowerCLI.  It’s simple, but you need to know a few concepts.

All vSphere tags belong to a Category.  There are two main types of categories.  This notion is called Cardinality.  It sounds more complicated than it is.  Basically, you can have a category where only a single tag from that category can be applied to any given object.  For example, let’s say you want to tag VMs by criticality.  Logically, a VM will only have one criticality rating, not multiple.  IE, it makes zero since for a VM to be both low and medium as far as how critical they are.

However, sometimes you might want a category that multiple tags could apply to the same object.  For example, let’s say you want to make a category called “Special Applications” to identify very specific apps within a VM to easily identify SQL servers, Domain Controllers, and Exchange servers.  While I wouldn’t recommend it, it’s possible for a single VM to be all three simultaneously.

vSphere tags can apply to all kinds of objects as well, not just VMs.  You can select which objects a tag can be applied to within the category.

Managing vSphere Tags Using the Web Client

To create tags within the vSphere client, navigate to the Tags section of the web client.

vm tags web client nav

You must create a category first if there isn’t one already made.  Click the Categories button, and then click the create categories icon.

For this example, we will make a category for criticality ratings for VMs.  We want one tag per object, not more, and we only want the tag to be applied to VMs or vApps.

vsphere category example

Now that we have our category, we can create tags within it.  Click on Tags, and the new tag icon.  Be sure to select the category during tag creation.

vsphere tags create tag example

Rinse and repeat for all the tags you want to create for the category.  One tip I recommend is to name the tags with incuding their category name, which refers to some kind of concept.  Since you usually search by the tag name, you want for example LowCriticality instead of Low.  (See below for search examples.) Low in and of itself could mean a lot of things.  Low resource usage, low criticality, etc.

To apply a tag to an object, simply right click the object, point to Tags & Custom Attributes > Assign Tag…

vsphere tags assign tag

A new dialog box appears where you can filter categories or see all categories and select the vSphere tags you wish to assign.  Also, notice you can remove tags here, too.

Managing vSphere Tags Using PowerCLI

PowerCLI has full tag management functionality within it, too.

Creating a category:

New-TagCategory -Name VMCriticality -Description "Criticality of the VM" -Cardinality Single -EntityType "VirtualMachine","VApp"

Creating a tag:

New-Tag -Name "LowCriticality" -Description "Non-Critical VMs" -Category VMCriticality

Assigning a tag to a VM:

get-vm Shoretel | New-TagAssignment -Tag "HighCriticality"

You can do lots of things with PowerCLI and tags.

Using vSphere Tags

Now that you have tags created and applied, you can now make use of them to make your life easier.

You can make use of tags in both the vSphere Web Client and via PowerCLI.  To find all VMs with a tag within the vSphere Web Client, simply type the tag value in the search box.  The tag name will automatically populate.

vsphere tags searching

Click on it.  Boom, you got your objects with that tag!

vsphere tags search results

There’s also a parameter on PowerCLI’s Get-VM cmdlet to identify the VMs with that tag.  You can then pipe that to another cmdlet.  Say for example you want to shutdown your non-critical VMs because you suddenly experience multiple host failures, so you need to make sure your more important VMs get the resources they need:

Get-VM –Tag “LowCriticality” | Shutdown-VMGuest

Imagine if you set up vSphere tags to identity all your VMs with SQL.  Imagine you’re setting up Veeam backup jobs, and you need to know which VMs you need to setup special quiescing.  You could easily just get that list of VMs.

That’s how to use vSphere tags!

How do you think you might be able to use them, or how do you use them within your environment?

vSphere 6.5 – New features I can’t wait for!

VMware announced vSphere 6.5 at VMworld Europe.   I don’t want to go through everything that’s new, but I do want to go over the vSphere 6.5 new features I think are the coolest that I can’t wait for.

vSphere 6.5 New Features – Me likey!

Here are the vSphere 6.5 new features I specifically wanted to highlight that I think are going to be the most useful to my customers.

vCenter 6.5 New Features

  • The vCenter Server Appliance (VCSA) FINALLY has an integrated VMware Update Manager.  No more Windows machine for VUM!  Even less excuses for using the Windows version!  Speaking of which…
  • Native VCSA high availability!  In vCenter 6.0, the only way to make vCenter truly highly available was to use Windows Clustering.  Not anymore!  Now the VCSA has its own ability.  VCSA NOW AND FOREVER!
  • File-based backup and recovery for VCSA, so it’s even easier to make any kind of recovery you may need.
  • HTML5 based vSphere Web Client!  Take that, Adobe Flash!  No more Flash vulnerabilities and issues to worry about!
  • Fully supported standalone HTML5 based thick client!

Clustering New Features

  • HA Orchestrated Restarts – Now you can enforce a chain of VMs to ensure VM interdependency for multi-tiered applications!
  • Proactive HA – Now you can integrate HA with hardware vendor monitoring tools to move VMs off hosts that have hardware problems before they actually result in an ESXi host crashing.  How cool is that?
  • DRS now takes network bandwidth into account, to ensure your workloads can be dynamically moved between hosts to ensure the best network performance.

Security New Features

I have numerous customers who for legal and other reasons are extremely security conscious.  These may be of particular interest:

  • vMotion traffic encryption – One of the reasons I recommend segregated isolated non-routable VLANs generally for vMotion traffic is due to the fact that vMotion traffic is unencrypted.  Think about the implications of that.  The running contents of RAM for a VM is copied in the clear over a network during a vMotion!  If that’s a VM processing let’s say credit card transactions or personally identifiable information like a Social Security number, that’s pretty scary!  Now consider the boundaries of vMotions have been lifted to the point you can conceivably vMotion a VM across datacenters.  Now, for the first time, you can encrypt this traffic.
  • VM disk encryption – If your shared storage solution can’t encrypt your data at rest, you used to be out of luck for doing whole VM encryption.  Not anymore!  Now it can be done at the VM level!
  • Better logging now to provide better auditing capability to see who did what within the environment.

There’s a whole lot more in this release.  I’m sure I’ll post more about these and other cool features and capabilities soon!

Before you ask, the tentative release date for vSphere 6.5 is Q4 2016.

Resolving VM MAC Conflict alarm with Veeam Replicas

It’s been awhile since I’ve deployed Veeam using replication with vSphere 6.0.  I recently implemented it for a customer who was replicating VMs to a secondary storage appliance in addition to backing the VMs up to a Data Domain.  Upon running the initial replication for the VM, a “VM MAC Conflict” alarm triggered on the replica VM.

vm mac conflict alarm triggered

Here’s a description of what’s going on and how to prevent the VM MAC Conflict alarm from triggering.

VM MAC Conflict Alarm

The VM MAC Conflict alarm is new to vCenter 6.0 Update 1a.  The intent of the alarm is to warn you if two vNICs on VMs within a vCenter instance have the same MAC address.  This can happen for a variety of reasons:

  1. vCenter malfunctioned and dynamically provided the same MAC address to two or more vNICs.
  2. Either intentionally or mistakenly, an admin or a third party product might have statically assigned a MAC address already in use within the environment.  In this case, Veeam created a copy of the VNX file with identical MAC addresses for the source and replica VM’s vNICs.

It’s a good alarm to have to notify you just in case.  But how do you keep this alarm while stopping it from triggering on replica VMs?

Stopping VM MAC Conflict Alarms from triggering for Veeam Replicas

The solution for preserving the VM MAC Conflict alarm while stopping it from triggering on Veeam replicas is quite simple.  You can modify the alarm itself by setting an exception to exclude VMs.  In the case of Veeam replicas, they have a “_replica” suffix within the VM name by default.  If you changed that suffix in the replica job, just adjust accordingly.

Go to the VM MAC Conflict alarm definition.  It’s in the vCenter inventory object under Manage > Alarm Definitions.  Click the alarm and on the right, click Edit.

Under the bottom box that reads, “The following conditions must be satisfied for the trigger to fire”, add a condition that says the VM name does not end with “_replica”.  Once applied, the alarm disappears for your replica VMs.

vm mac conflict alarm modified

That’s it!