
Adventures in SRM 6.0 and MirrorView

Recently, I set up SRM 6.0 with MirrorView storage-based replication. It was quite the adventure. The environment had been running SRM 5.0 and MirrorView, and we recently upgraded it to vSphere 6.0 and SRM 6.0. I wanted to get my findings down in case they help others setting this up. When I ran into issues, I found it wasn't easy to find people doing this, as many who are using VNXs are using RecoverPoint now instead of MirrorView.

Version Support

First off, you might be wondering why I recently deployed SRM 6.0 instead of 6.1.  That’s an easy question to answer – currently, there is no support for MirrorView with SRM 6.1.  I’m posting this article in 11/2015, so that may change.  Until it does, you’ll need to go with SRM 6.0 if you want to use MirrorView.

Installation of Storage Replication Adapter

I'm assuming you have already installed SRM and configured the pairings and whatnot. At the very least, have SRM installed in both sites before you proceed.

Here's where things got a little goofy. First off, downloading the SRA is confusing. If you go to VMware's site to download SRAs, you'll see two listings for the SRA with different names, suggesting they work for different arrays, do something different, or are different components.

[Screenshot: mirrorsradownload]

As far as I can tell, they're actually two slightly different versions of the SRA. Why are they both on the site for download? No idea. So I went with the newer of the two.

You also need to download and install Navisphere CLI from EMC for the SRA to work. There are a few gotchas on this install to be aware of. Install it first.

During installation, you need to ensure you check the box “Include Navisphere CLI in the system environment path.”

[Screenshot: navispherepath]

That's listed in the release notes of the SRA, so that was easy to know. You also need to choose not to store credentials in a security file.

I originally told it to store credentials, thinking this could allow easier manual use of Navisphere CLI should the need arise, but that messed things up, and I ended up having issues with the SRA authenticating to the arrays. I uninstalled and reinstalled Navisphere CLI without that option, and the bad authentication messages went away.

Next, install the SRA, which is straightforward. After the installation of the SRA, you must reboot the SRM servers, or they will not detect that they have SRAs installed. That takes care of the SRAs.

Configuring the SRAs

Once you have installed the SRAs, it's time to configure the array pairs. First, go into Site Recovery within the vSphere Web Client, and click Array Based Replication.

[Screenshot: arraybasedreplication]

Next, click Add Array Manager.

[Screenshot: addarraymanager]

Assuming you’re adding arrays from two sites, click “Add a pair of array managers”.

[Screenshot: addarraypairs]

Select the SRM Site location pair for the two arrays.

[Screenshot: sralocationpair]

Select the SRA type of EMC VNX SRA.

[Screenshot: selectsratype]

Enter the Display name, the management IPs of the array, filters for the mirrors or consistency groups if you are using MirrorView for multiple applications, and the username and password info for the array for each site.  Be sure to enter the correct array info for the indicated site.

[Screenshot: sraarrayinfo]

I always create a dedicated SRM service account within the array, so it’s easy to audit when SRM initiates actions on the storage array.

You’ll need to fill the information out for each site’s array.

Keep the array pair checked and click next.

[Screenshot: enablearraypairs]

Review the summary of actions and click Finish.

At this point, you can check the array in each site and see if it is aware of your mirrors being replicated.

[Screenshot: checksrareplicationinfo]

So far so good! At this point, you should be able to create your protection groups and recovery plans, and start performing tests and recoveries with a test VM.
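If you'd rather verify this from PowerShell than click through the Web Client, recent PowerCLI releases include a Connect-SrmServer cmdlet that exposes the SRM public API. Here's a minimal sketch; the server name is a placeholder, and this assumes the SRM 6.0 API behaves as documented:

Connect-VIServer -Server "vcenter01.domain.local"    # hypothetical vCenter name
$srm = Connect-SrmServer                             # connects to the SRM server paired with this vCenter
$srmApi = $srm.ExtensionData

# List the protection group and recovery plan names SRM knows about
$srmApi.Protection.ListProtectionGroups() | ForEach-Object { $_.GetInfo().Name }
$srmApi.Recovery.ListPlans() | ForEach-Object { $_.GetInfo().Name }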

Problems

I began testing with a test Consistency Group within MirrorView, which contained one LUN that stored a test VM. Test mode to DR worked immediately. Failover to the DR site failed, as it often does in my experience with most storage-based replication deployments. Normally that's no problem, I simply launch it again and it works, and it did in this case.

With the VM then in the DR site, I performed an isolated test back to production, which worked flawlessly. It's when I tried to fail back to production that I encountered a serious problem. SRM reported that the LUN could not be promoted. Within SRM, I was given only the option to try failover again; the options to do a cleanup or a test were grayed out. Relaunching failover produced the same result. I tried rebooting both SRM servers and vCenter, running rediscovery of the SRAs, you name it. I was stuck.

I decided to just manually clean everything up myself. I promoted the mirror in the production site and had hosts in both sites rescan for storage. The LUN became unavailable in the DR site, but in production, while the LUN itself was visible, the datastore wouldn't mount. Rebooting the ESXi server didn't help. I finally added it as a datastore, selecting not to resignature it. The datastore mounted, but I found it wouldn't mount again after a host reboot. Furthermore, SRM was reporting the MirrorView consistency group as stuck failing over, showing Failover in Progress. I tried recreating the SRM protection group, re-adding the array pairs, and more, but nothing worked.

After messing with it for a while, checking MirrorView and the VNX, VMware, etc., I gave up and contacted EMC support, who promptly had me call VMware support, who referred me back to EMC again because it was clearly an SRA problem for EMC.

With EMC’s help, I was able to cleanup the mess SRM/SRA made.

  1. The Failover in Progress status reported by the SRA was due to leftover text in the MirrorView description fields. Clearing those and rescanning the SRAs fixed that problem.
  2. The test LUN not mounting was due to me not selecting to resignature the VMFS datastore when I added it back in.

At this point, we were back to square one, and I went through the gamut of tests. I got errors because the SRM placeholders were reporting as invalid. Going to the Protection Group within SRM and issuing the command to recreate the SRM placeholders fixed this issue.

We repeated testing again. This time, everything worked, even failback. Why did it fail before? Even EMC support had no answer. I suspect it's because the first failover attempt in any direction in an SRM environment always seems to fail. Unfortunately, it was very difficult to fix this time.

Change VMware MPIO policy via PowerCLI

This is one of those one-liners I think I'll never use again, but once again I found myself using it, this time to fix MPIO policies in a vSphere 5.0 environment plugging into a Nexsan storage array. I've previously used it on EMC and LeftHand when the default MPIO policy for the array type at the time of ESXi installation turns out not to be the recommended one after the fact, or in the case of LeftHand, is wrong from the get-go.

Get-VMHost | Get-ScsiLun | Where-Object Vendor -eq "NEXSAN" | Set-ScsiLun -MultipathPolicy "RoundRobin"

In this case, it was over 300 LUN objects (LUNs x hosts accessing them), and that's about 5 mouse clicks per object to fix via the GUI. Translation: you REALLY want to use some kind of scripting to do this, and PowerCLI can do it in one line.
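Before and after a change like that, it's also worth a quick sanity check of what policies are actually in play. A short sketch along the same lines as the one-liner above:

# Summarize the current multipath policy across all hosts and LUNs
Get-VMHost | Get-ScsiLun -LunType disk | Group-Object -Property MultipathPolicy | Select-Object Name, Count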

You gotta love PowerShell!

Disable CBT on Veeam jobs via PowerShell

If you haven't heard the not-so-great news, VMware has discovered a bug in vSphere 6 with Changed Block Tracking (CBT) that can cause your backups to be corrupt and therefore invalid. Currently, they are recommending not to use CBT with vSphere 6 when backing up your VMs.

I was looking for an easy way to disable this on all jobs in Veeam quickly via PowerShell, but it's not obvious how to do that, so I took some time to figure it out. Here it is, assuming the Veeam PowerShell snap-in is loaded in your session.

$backupjobs = Get-VBRJob | Where-Object JobType -eq "Backup"
foreach ($job in $backupjobs) {
    $joboptions = $job | Get-VBRJobOptions
    $joboptions.ViSourceOptions.UseChangeTracking = $false
    $job | Set-VBRJobOptions -Options $joboptions
}

Here's how to enable it again:

$backupjobs = Get-VBRJob | Where-Object JobType -eq "Backup"
foreach ($job in $backupjobs) {
    $joboptions = $job | Get-VBRJobOptions
    $joboptions.ViSourceOptions.UseChangeTracking = $true
    #$joboptions.ViSourceOptions.EnableChangeTracking = $true
    $job | Set-VBRJobOptions -Options $joboptions
}

Sorry it’s not pretty on the page, but I wanted to get this out ASAP to help anyone needing to do this quickly and effectively.

One thing to note: in the enable script, there's a commented-out line. If you already disabled CBT on your jobs manually and wish to use the script to enable it again, be aware that the job option to enable CBT within VMware (if it is turned off on the VM) gets disabled when you turn CBT off altogether within the job setup. If you disabled CBT with my script, that option doesn't get touched, so you don't need to remove the # on that line. If you want that option enabled again, take out the # before that line, and the script will enable it as well.
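If you want to double-check where your jobs ended up after running either script, a quick check like this (same cmdlets and property names as above) should do it:

# List each backup job and whether it will currently use CBT
Get-VBRJob | Where-Object JobType -eq "Backup" |
    Select-Object Name, @{Name = "UseCBT"; Expression = { ($_ | Get-VBRJobOptions).ViSourceOptions.UseChangeTracking }}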

Hope this helps!

Clarifying vSphere Fault Tolerance

I hear a lot of confusion about some of the new enhancements in vSphere 6. One specifically is Fault Tolerance (FT).

In case you do not know what FT is, it's a feature that basically (was supposed to) fit the need for a handful of your most critical VMs that High Availability (HA) didn't protect well enough. HA restarts a VM on another host if the ESXi physical host it was running on fails, or, if you enable VM Monitoring, if the VM blue screens or locks up. Note the VM is down during the restart of the VM and the boot-up of the OS within it. FT effectively runs a second copy of the VM in lockstep on another host, so should the host the live VM runs on fail, the second copy immediately takes over on the other host, with no downtime.

Please note that neither vSphere 6 nor previous versions of vSphere protect against an application crash itself with Fault Tolerance, unless the application crashed due to a hardware failure. FT effectively only protects against failures pertaining to hardware, like a host failure. There is no change there. If you want protection from application failures, you still should look at application clustering and high availability solutions, like Exchange DAGs, network load balancing, SQL clustering, etc. On the flip side, I have personally seen many environments actually have MORE downtime because of application clustering solutions, especially when customers don't know how to manage them properly, but FT is a breeze to manage.

The problem with FT in the past is that it had so many limitations. The VM's disks had to be eager zeroed thick provisioned, you could not vMotion the VM or the second copy, and more, but the biggest limitation was the VM could only have 1 vCPU. If you're wondering how many critical apps only need 1 vCPU, the answer is pretty much zero. Almost all need more, so FT became the coolest VMware feature nobody used.

That changes in vSphere 6. You can use FT to protect VMs with up to 4 vCPUs, and they can be thin or thick provisioned.

FT-protected VMs can now be backed up with whole-VM backup products that utilize the VMware backup APIs, which is all of them that back up whole VMs: Veeam, VMware Data Protection, etc. This is a pretty big deal!

You can now hot-configure FT for a VM on the fly, without disrupting the VM if it is currently running, which is also really cool. Maybe you've got a two-node MS cluster and one node gets corrupted. Enable FT on the remaining node to provide extra protection until the second node is rebuilt!
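There's no dedicated FT cmdlet in PowerCLI that I'm aware of, but you can reach the underlying vSphere API through ExtensionData. A rough sketch under that assumption (the VM name is made up):

# Enable FT on a running VM by asking vCenter to create its secondary copy.
# Passing $null lets vSphere pick the host for the secondary.
$vm = Get-VM -Name "SQL-Node2"
$vm.ExtensionData.CreateSecondaryVM_Task($null)

# And to turn FT back off later:
# $vm.ExtensionData.TurnOffFaultToleranceForVM_Task()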

Also, the architecture changed, which is both good and bad. In the past, FT required all the VM's disks to be on shared storage, and the active and passive VMs used the same virtual disk files, VM config files, etc. This is no longer the case. Now the storage is replicated as well, and it can be to the same datastore or different datastores. Those datastores can be on completely different storage arrays if you want. On the downside, you need twice the storage for FT-protected VMs that you did before, but the good news is a storage failure may not take out both data sets and kill the VM, too!

In my opinion, these changes have finally made FT something that should definitely be considered, and it will be implemented far more commonly.

So while a lot of the restrictions were lifted, there are still some left, notably:

  • Limit of 4 vCPUs, 64GBs of RAM for a FT protected VM.
  • Far more hardware is supported, but you still need hardware that is officially supported.
  • VM and the FT copy MUST be stored on VMFS volumes.  No NFS, VSAN, or VVOL stored VMs!
  • You cannot replicate the VM using vSphere Replication for a DR solution.
  • No storage DRS support for FT protected VMs
  • 10Gb networking is highly recommended. This is the first resource that runs out when protecting VMs with FT. So if you were thinking FT with the storage replication would be a good DR solution across sites, uhh, no.
  • Only 4 FT active or passive copies per host.

So, if you’re thinking about a vSphere solution for a customer, and you pretty much dismissed FT, consider it now.  And if you support environments with VMware, get ready to see more FT as vSphere 6 gets adopted!

vSphere 6 NFS 4.1 Gotchas

VMware has added some additional NFS features to v6.  I knew it supported NFS 4.1 as well as 3, but there are some significant ramifications related to this and some gotchas.

  1. vSphere does not support parallel NFS (pNFS)! 
  2. It DOES support multipathing if connecting with NFS 4.1. What you do is add the multiple NFS target IPs when setting up your NFS mount (see the PowerCLI sketch after this list).
  3. IMPORTANT: This is the biggest gotcha, and something we all need to be aware of. If your system supports both 4.1 and 3 simultaneously for a mount, you MUST use one or the other for an export used by VMware! V3 and v4.1 use different locking mechanisms. Having both enabled for a mount simultaneously and then having different ESXi hosts mount it with differing versions can corrupt data!!! Best practice is to enable only one of the two protocols for that export, never both.
  4. You can authenticate using Kerberos, which is more secure.
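For reference, here's roughly what mounting a 4.1 datastore with multiple target IPs looks like in PowerCLI. The host name, IPs, export path, and datastore name are all made up, and this assumes the 6.0-era New-Datastore parameters for NFS 4.1:

# Mount an NFS 4.1 export on a host using two target IPs for multipathing
Get-VMHost "esxi01.lab.local" | New-Datastore -Nfs -FileSystemVersion "4.1" -NfsHost "192.168.50.10", "192.168.50.11" -Path "/vol/nfs41_ds01" -Name "NFS41-DS01"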

NFS 4.1 support isn't all roses, though. Here's what you can't do…

  • No Storage DRS
  • No SIOC
  • No SRM
  • No vVol
  • No VAAI (bigger deal now that Standard licensing includes VAAI)
  • Must run vSphere Replication 6.1 or higher to use VR to replicate VMs on NFS 4.1 mounts.

VPLEX Failure Scenarios

I recently set up a VMware Storage Metro Cluster using EMC's VPLEX with synchronous storage replication. I wanted to put out a description of the failover logic that's hopefully easy to understand.

VPLEX has volumes that are synchronously replicated, presented to VMware ESXi hosts that are in a single cluster, half of which are in SiteA, the other half in SiteB, and there’s a VPLEX witness in SiteC.

There are a couple of concepts to get out of the way. First off, VPLEX systems in this configuration have to be able to connect to each other over two different subnets – management and storage replication. The Witness has to be able to connect to both VPLEXs via their management network. You should take into consideration that these network connections are EXTREMELY important, and do everything you reasonably can to keep the VPLEXs, especially, from becoming network isolated from each other. Bad things will often happen if they do. Notice that EMC requires quite a bit of network redundancy.

VPLEX synchronously replicates storage within Consistency Groups, which contain one or more LUNs. For the rest of this explanation, I may say LUN, so assume we’re talking a consistency group that just has one LUN.

VPLEX consistency groups also contain two settings. One dictates which site is the preferred site. This basically means that under various scenarios, the identified site's VPLEX will hold the sole active copy of the LUN. That value can be one of the VPLEXs, or nothing at all. There's also a setting within a consistency group that basically states whether or not the Witness should be used to determine the proper site a LUN should be placed in under various scenarios.

For those of you who aren't familiar with VPLEX, the Witness is either a virtual appliance or a physical server. It is optional but highly recommended. If deployed, it must be in a third site, it must have connectivity to both VPLEX management links, and those links must be completely independent from each other.

Failure scenarios work very similarly to majority node set clusters, such as MS clustering. Most of this works the way you'd probably guess if you've ever dealt with high availability solutions, especially those that cross site links, such as an Exchange 2010 DAG. It's pretty much majority node set / majority node set plus witness logic. I want to focus on specific scenarios that have very significant design implications when it comes to vSphere in a Storage Metro Cluster scenario, and how VPLEX avoids split brain scenarios when network links go down.

The chief concept to remember in all this is that VPLEX must always, always, ALWAYS make sure a LUN doesn't become active in both VPLEX sites simultaneously should they not be able to talk to each other. If that happens, you end up with two inconsistent copies of a single LUN, both potentially holding data that can't be lost, and no real way to sync them up anymore. Under normal operations, VPLEX allows both sites to actively write data, but the minute a VPLEX in a site goes down or gets disconnected, it must be ensured that ONLY one of them has an active copy of the LUNs. The absolute worst thing that could ever happen, even worse than prolonged downtime, is ending up with two disparate copies of the same LUN.

Scenario 1:  What happens if the synchronous storage replication link between the two sites for VPLEX goes down?  What if total connectivity only between the two VPLEX sites is lost?

The problem here is that VPLEX can't synchronously write data to both copies of the LUN in each site anymore. LUNs therefore must become active in SiteA, in SiteB, or worst case, in neither.

How does this work? It depends on what site preference is set on the consistency group. It doesn't really matter whether the option to use the witness is set, or if a witness is even present. If no site preference has been identified for the consistency group, the LUNs must go offline in both sites, because there's no way to determine the right site in this situation. If a site preference is defined, LUNs become active in their preferred sites only. The existence of the witness here is irrelevant because both VPLEXs can still talk to each other via their management link.

There's a VMware implication here – you should probably note somewhere, such as in the datastore name, which site is the preferred failover site, and then create VM-to-host "should" rules that encourage VMs placed on datastores that map to LUNs preferring the SiteA VPLEX to run on SiteA ESXi hosts. This eliminates HA events caused by connectivity problems between the VPLEXs, specifically on the synchronous storage link.
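Newer PowerCLI releases include cmdlets for DRS groups and VM-to-host rules, so you don't have to click these together by hand. A sketch under the assumption those cmdlets are available in your PowerCLI version; the cluster, host, and datastore names are placeholders:

# Keep VMs that live on SiteA-preferred datastores running on SiteA hosts
$cluster    = Get-Cluster "MetroCluster"
$siteAHosts = Get-VMHost -Name "esxa-*"
$siteAVMs   = Get-Datastore -Name "SiteA-*" | Get-VM

$hostGroup = New-DrsClusterGroup -Name "SiteA-Hosts" -Cluster $cluster -VMHost $siteAHosts
$vmGroup   = New-DrsClusterGroup -Name "SiteA-VMs" -Cluster $cluster -VM $siteAVMs
New-DrsVMHostRule -Name "SiteA-VMs-should-run-in-SiteA" -Cluster $cluster -VMGroup $vmGroup -VMHostGroup $hostGroup -Type "ShouldRunOn"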

Scenario 2: What happens if the management connectivity between the VPLEX’s goes down?

Everything continues to work, because the VPLEXs can communicate via the storage replication link. Both VPLEXs keep calm and write I/O on in both sites, just like before. The presence and options concerning the witness are irrelevant.

Scenario 3: What happens if there’s a total loss of connectivity between the two VPLEX sites, both management and storage replication, but both sites can communicate to a witness if there is one?

In this scenario, the outcome is basically one of two things: if the LUN has a preferred site identified, it becomes active on that site only. If it doesn't, it goes offline in both sites. The witness, regardless of whether the option to factor it into the decision is enabled, serves as a communication mechanism to let both sites know this is the scenario. Otherwise, the two VPLEX systems wouldn't know whether this happened or the other side had actually failed.

Scenario 4: What happens if a VPLEX failed in one site?

It depends on whether the witness option on the VPLEX consistency group is enabled (and of course whether you deployed a witness). If it is enabled, the LUN fails over to the second site. If the option isn't enabled, it depends on whether the preferred site is the one that failed. If it is, the LUN goes offline. If the non-preferred site failed, the LUN remains active in the preferred site. You should see the value of a witness now. Usually, having a witness and enabling this option is a good thing. But not always!

Scenario 5: What happens if all sites stay up, but network connectivity fails completely between all of them?

It depends on whether the option to use the witness is turned on or not. If it's off, the LUN becomes active in its preferred site and becomes inaccessible in the other. If the witness option is turned on in the consistency group, then there's no way for each site to know whether the other sites failed or it just got isolated. Therefore, nobody knows if the LUN has become active anywhere else, so the only way to avoid a split brain is to make the LUN unavailable in ALL sites.

There's a design implication here – if a workload should stay up in its preferred site in any situation, even network isolation, at the cost that it may be down if its site goes down, you should place the VM on datastores with a preference for the correct site, and DO NOT enable the consistency group to use the witness.

One last design implication with VPLEX – I see limited use in not identifying a preferred site. I see even less use in having a consistency group set without a preferred site AND without using a witness. In both cases, you're just asking for more instances of a LUN being taken offline in every site. To be honest, I think a witness should almost always be deployed, consistency groups should be set with a preferred site for failure scenarios, and the witness use option should be enabled.

There you have it!

vSphere 6 – Certificate Management Intro

I like VMware and their core products like vCenter, ESXi, etc.  Personally, one thing I really admire is the general quality of these products, how reliable they are, how well they work, and how VMware works to address pain points of them to make them extremely usable.  They just work.

However, certificate management has been a big pain point of the core vSphere product line. There's just no way around it. And certificates are important. You want to ensure the systems you're connecting to when you manage them really are those systems. For many customers I've worked with, because of the pain of certificate management within vSphere, because some customers are too small and don't have an on-premises Certificate Authority, and to ensure the product continues to work, they often don't replace the default self-signed certificates generated by vSphere.

That’s obviously less than ideal.  The good news is certificate management has been completely revamped in vSphere 6.  It’s far easier to replace certificates if you like, and you have some flexibility as to how you go about this.

Three Models of Certificate Management

Now, you have several choices for managing vSphere certificates. This post will outline them.  Later, I’ll show you how you can implement each model.  Much of this information comes from a VMworld session I attended called “Certificate Management for Mere Mortals.”  If you have access to the session video, I would highly encourage viewing it!

Before we get into the models, be aware that certificates basically fall under one of two categories: certificates that facilitate client connections from users and admins, and certificates that allow different product components to interact. Also, vCenter has built-in Certificate Authority functionality. That's a bit obvious since you already had self-signed certificates, but this functionality has been expanded. For example, you can make vCenter act as a subordinate authority of your enterprise PKI, too!

Effectively, this means you have some questions up front you want to answer:

  1. Are you cool with vCenter acting as a certificate authority at all?  The biggest reason to use vCenter is it is easier to manage certificates this way, but your security guidelines may not allow it.
  2. Are you cool with vCenter being a root certificate authority, assuming you're OK with it generating certificates at all? If not, you could make it a subordinate CA.
  3. For each certificate, which certificate authority should generate it? Maybe your security requirement that the internal PKI must be used applies only to certificates visible on client connections, for example.

From these questions, a few models for certificate management typically emerge. You effectively have four, which are combinations of whether vCenter acts as a certificate authority at all, and which certificates it will generate.

Model 1: Let vCenter do it all!

This model is pretty straight forward.  vCenter will act as a certificate authority for your vSphere environment, and it will generate all the certificates for all the things!  This can be attractive for several reasons.

  1. It’s by far the easiest to implement.  It will generate all your certificates for you pretty much, and install them.
  2. It’ll definitely work.  No worries about generating the wrong certificate.
  3. If you don’t have an internal CA, you’re covered!  vCenter is now your PKI for vSphere.  Sweet!  You can even export vCenter’s root CA certificate, and import it into your clients using Active Directory Group Policy, or other technologies to get client machines to automatically trust these certificates!  Note that it is unsupported for vCenter to generate certificates for anything other than vSphere components.

Model 2: Let vCenter do it all as a subordinate CA to your enterprise PKI

Very similar model to the above.  The only exception is instead of vCenter being the root CA, you make vCenter become a subordinate CA for your enterprise PKI.  This allows your vCenter server to more easily generate certificates that are trusted automatically by client machines.  Yet it also ensures that certificates are still easily generated and installed properly.

However, it is a bit more involved than the first model, since you must create a certificate request (CSR) in vCenter to submit to your enterprise PKI, and then install the issued certificate within vCenter manually.

Model 3: Make your enterprise PKI issue all the certificates

Arguably the most secure if your enterprise PKI is secured, this model is pretty self-explanatory. You don't make use of any of the certificate functionality within vCenter. Instead, you must manually generate certificate requests for all vCenter components, ESXi servers, etc., submit them to your enterprise PKI, and install all the resulting certificates on each component yourself.

While this could be the most secure way to go about certificate management, it is by far the most laborious solution to implement, and it is the solution that is most likely to be problematic.  You have to ensure your PKI is configured to issue the correct certificate type and properties, you have to install the right certificates on the right components, etc.  It’s all pretty much on you to get everything right!

Model 4: Mix and match!  (SAY WHAT?!?!?)

When I first heard this being discussed in the session, my security inner conscience's immediate reaction was, "This sounds like a REALLY bad idea!!!"

But as I listened, it actually makes quite a bit of sense when done properly. You can mix and match which certificates are and are not generated by the PKI components within vCenter. The model that makes sense if you go hybrid (a hybrid solution doesn't make sense for everyone!) would be to allow vCenter to manage the certificate generation for all certificates that facilitate vSphere component communication, but use the Model 1, 2, or 3 approach for all other certificates that facilitate client connections. Should this meet your security requirements, it gives you the best of both worlds: certificates issued by your internal PKI that your clients automatically trust, and are thereby (potentially) more secure, plus ease of management and better reliability for all the certificates that clients don't see on internal vSphere components.

Which should you go with?

I hate using the universal consultant answer, but I have to.  It depends.  If you don’t have an internal PKI, go with Model 1.

If you have an internal PKI just because you had to have one for something else, and you want easy trusting of vSphere connections by your clients, go with Model 1 and import vCenter's root CA into your client machines, OR go with Model 2. Which one in this case? If you don't consider yourself really good at PKI management, or if you don't need many machines to be able to connect to vSphere components, probably Model 1. The more clients that need to connect, the more it leans you towards Model 2.

Do you have security requirements that prevent you from using vCenter’s PKI capabilities altogether?  You have no choice, go with Model 3.

For people who think they need to go with Model 3, though, I would generally suggest looking at Model 4's hybrid approach. Unless you absolutely have to go with Model 3, go Model 4.

Hope this helps!

Getting Cisco UCS drivers right with vSphere

I've noticed one pain point with Cisco UCS: drivers. You'd better have the EXACT version Cisco wants for your specific environment. But how do you know which drivers to get, how do you get them, how do you know when you need to upgrade them, and how do you know what drivers you have installed? None of these are necessarily straightforward, and getting the info you need can be a real pain. This post will show how to accomplish this within vSphere. For Windows servers, please see my follow-up post due out in a few days.

Why is getting the drivers so important?

I want to emphasize that getting the exact right version of the Cisco UCS drivers is a big deal! I've personally now seen two environments that had issues because the drivers were not exactly correct. The best part is the issues never turned up during testing of the environment, just weird intermittent issues like bad performance, VMs needing consolidation after backups, or a VM hanging out of nowhere a week or two down the road. Make sure you get the drivers exactly right!

How do I install ESXi on Cisco UCS?

First off, pretty much everyone knows that when you're installing ESXi on Cisco, HP, Dell, IBM, or other vendor servers, you should use the vendor's media. That's common practice by now, I hope. In most but not all cases, you get the drivers you need for an initial deployment from the get-go, you get hardware health info within VMware, sometimes you get management and monitoring hooks for out-of-band management cards, and you ensure vendor support by doing this. We all know by now, I think, to do initial ESXi installs with vendor media, in this case Cisco's. It's especially important for Cisco UCS, since so many installs require boot from SAN, that you have those drivers in the media off the bat.

Now, if you think you're done once you've downloaded the latest Cisco co-branded ESXi media for an initial deployment, you're wrong (see below). Also, don't assume that just because you used the co-branded media to install ESXi on a UCS server, you'll never need driver updates. You likely will when you update UCS Manager and/or update ESXi down the road.

How do I know which drivers should be installed?

This is relatively simple. First, collect some info about your Cisco UCS environment. You need to know these (don’t worry, if you don’t know what info you need, Cisco’s Interoperability page will walk you through it):
1. Standalone C-Series not managed by UCSM or UCSM managed B and/or C-Series? For those of you who don’t know, if you got blades, you got UCSM.
2. If UCSM is present, which version is it running? Ex. 2.2(3c)
3. Which server model(s) are present? Ex. B200-M3. Also note the processor type (ex. Xeon E5-2600-v2). They can get picky about that.
4. What OS and major version? Note the Update number. Ex. ESXi 5.5 Update 2
5. What type and model of I/O cards do you have in your servers? Example – CNA, model VIC-1240

Then head on over to the Interoperability Matrix site. Fill in your info, and you get a clear listing of the required driver and firmware versions.

[Screenshot: ucsdriverlookup]

It’s very straightforward to know which drivers are needed from that.

How do I figure out which drivers are installed?

If you go looking at Cisco for how to find that out, you get treated to esxcli commands.  Do you really want to enable SSH on all your hosts, SSH into each host, run some commands, then have to disable SSH on all those boxes when you’re done, and not have an easy way to document what they are?  Nope!

BEHOLD! POWERCLI!

To get the fnic driver versions for all ESXi hosts:

$hosts = Get-VMHost
$versions = @{}
foreach ($vihost in $hosts) {
    $esxcli = Get-VMHost $vihost | Get-EsxCli
    $versions.Add($vihost, ($esxcli.system.module.get("fnic") | Select-Object Version))
}
$versions.GetEnumerator() | Sort-Object Name | Format-List

You get this:

Name : esxi01.vspheredomain.local
Value : @{Version=Version 1.6.0.12, Build: 1331820, Interface: 9.2 Built on: Jun 12 2014}

Hey! That’s the wrong driver, even though I used the latest co-branded media! SON OF A…!

Let’s get some enic driver versions…

$hosts = Get-VMHost
$versions = @{}
foreach ($vihost in $hosts) {
    $esxcli = Get-VMHost $vihost | Get-EsxCli
    $versions.Add($vihost, ($esxcli.system.module.get("enic") | Select-Object Version))
}
$versions.GetEnumerator() | Sort-Object Name | Format-List

You get:

Name : esxi01.vspheredomain.local
Value : @{Version=Version 2.1.2.59, Build: 1331820, Interface: 9.2 Built on: Aug 5 2014}

Of course, Cisco apparently didn’t update those drivers in their co-branded media either.

Note for both scripts, you will get errors about get-esxcli not being supported without being connected directly to each host. It works for our purposes.

How do I update Cisco UCS drivers?

Now we know, despite using the latest Cisco co-branded media in my implementation, I need some driver updates. If you go to Cisco’s site for how to install these drivers, they’ll tell you to upload the package to each host and install them one at a time manually using esxcli commands. Do you really want to do that?

Let’s be smart/lazy/efficient and use VMware Update Manager. That way if a new host gets introduced, VUM will report that host non-compliant, and it’ll be easy to fix that one, too. And it’s easy to see which hosts do and don’t have those drivers down the road.

I find that if I Google the driver version, a download from VMware's site with that exact version comes up as the first or second link. Here's our fnic driver and enic driver in this case.

Download those to your vCenter server or something with the vSphere thick client. Unzip them into their own folders. Open up a thick vSphere client connection to vCenter (Web Client won’t allow you to do this), click Home, then click Update Manager.

Next, click Patch Repository tab at the top, and then click Import Patches in the top right.

[Screenshot: vumimportpatches]

When the dialogue comes up, browse to select the zip file that was *contained* in the original zip file. If you select just the zip file you downloaded itself, it will fail. Repeat for the fnic and enic drivers.

When you’re finished, you can then build a baseline that includes the updated drivers. Click Baselines and Groups, then Create above the baselines pane.

[Screenshot: vumcreatebaseline]

Call it something like "Cisco UCS Current Drivers". Select "Host Extension" as the host baseline type. In the following pane, find the drivers and click the down arrow to add them into the baseline. Note the Patch ID field has driver version specifics, useful if you've already got some Cisco drivers imported from before.

[Screenshot: vumselectpatches]

You can then attach that baseline directly to the appropriate object(s) within the host and clusters view, or I like to make a Baseline Group called “Critical and non-critical patches with Cisco updated drivers”, add all the appropriate baselines to that group, and attach that group to the appropriate objects in the Hosts and Clusters view.

Then remediate your hosts. When new drivers come out, import them, edit the Cisco baseline to swap out the previous drivers for the new ones, and remediate to push them out.
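If you'd rather script this last part, the Update Manager PowerCLI cmdlets can attach, scan, and remediate against the baseline as well. A rough sketch, assuming the VUM cmdlets are installed and using the baseline name from above (the cluster name is a placeholder):

# Attach the driver baseline to a cluster, scan it, and remediate
$cluster  = Get-Cluster "UCS-Cluster"
$baseline = Get-Baseline -Name "Cisco UCS Current Drivers"

Attach-Baseline -Baseline $baseline -Entity $cluster
Scan-Inventory -Entity $cluster
Remediate-Inventory -Baseline $baseline -Entity $cluster -Confirm:$false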

Done!

When should I check my drivers?

You should do this during any of the following:

  • During initial deployments
  • When UCS Manager or a standalone C-Series BIOS is updated
  • Major ESXi version upgrades
  • Update pack upgrades for ESXi (when ESXi 5.5 servers for example are to be upgraded to Update 2, or 3, etc)

Also, remember, newer drivers aren’t the right drivers necessarily. Check the matrix for what the customer is or will be running to see which drivers should go along with it!

HP NC375T NICs are drunk, should go home

I ran into one of the most bizarre issues I’ve ever encountered in my decade of experience with VMware this past week.

I was conducting a health check of a customer's vSphere 5.5 environment and found that the servers were deployed with 8 NICs, but only 4 were wired up. While the customer was running FC for storage, 4 NICs isn't enough to redundantly segregate vMotion, VM, and management traffic, and the customer was complaining about VM performance issues when vMotioning VMs around. The plan was to wire up the extra add-on NIC ports and take a port each from the quad-port onboard NIC and the add-on HP NC375T.

So first, I checked whether I had the right driver and firmware installed for this NIC according to VMware's compatibility guide. The driver was good, but commands to determine the firmware wouldn't provide any info. Also curious was the fact that this NIC was showing up as additional ports on the onboard Broadcom NIC. FYI, this server is an HP DL380 Gen7, a bit older but still a supported server for VMware vSphere 5.5.

At this point, I wanted to see if the onboard NIC would function, so I went to add the NICs into a new vSwitch.  Interestingly enough, the NICs did not show up as available NICs to add.  However, if I plugged the NICs in and just looked at the Network Adapters info, the NICs showed up there and even reported their connection state accurately.  I tried rebooting the server, same result.  One other server was identical, so I tried the same on that one, same exact behavior – they reported as ports that were part of the onboard NIC, commands to list the firmware version did not work, you could not add them into any vSwitch, but the connection status info reported accurately under the Network Adapters section of the vSphere console.

At this point, I was partly intrigued and enraged, because accomplishing this network reconfiguration shouldn’t be a big deal.  I put the original host I was working on in maintenance mode, evacuated all the VMs, and powered it off.  I reseated the card, powered it back on, and I got the same exact results.  I powered it off, removed the add-on NIC, and powered it back on, expecting to see the NIC ports gone, and they were, along with the first two onboard NIC ports!

This was, and still is, utterly baffling to me.  I did some more research, thinking this HP NC375T must be a Broadcom NIC since it’s messing with the onboard Broadcom adapter in mysterious ways, but nope!  It’s a rebadged Qlogic!  I reboot it, same result.  Cold boot it, same result.  I put the NIC back in, and the add-on NIC ports AND the two onboard NICs come back, all listed as part of the onboard Broadcom NIC!

I researched the NC375T for probably over an hour at this point, finding people having other weird problems, some of them fixed by firmware upgrades.  It took 45 minutes to actually find a spot on HP’s site to download drivers and firmware, but the firmware VMware and everyone else who had issues with this card swore you better be running to have any prayer of stability was not available.  I tried their FTP site, I tried Qlogic’s site, no dice.  I recommended to the customer that we should probably replace these cards since they’re poorly supported, and people were having so many problems, AND we were seeing the most absolutely bizarre behavior I’ve ever seen with a NIC.  The customer agreed, but we needed to get this host back to working again with the four NICs until we could get the replacement NIC cards.

At this point, I had a purely instinctual voice out of nowhere come in to my head and say, “You should pull the NIC out and reset the BIOS to defaults.”  To which, I replied, “Thanks weird oddly technically knowledgeable voice.”

And sure enough, it worked. All onboard NIC ports were visible again. Weird! Just for fun, I stuck the NC375T back in. What do you know, it was now listed as its own separate NIC, not a part of the onboard Broadcom adapter, AND I could add it to a vSwitch if I wanted, AND I could run commands to get the firmware version, which confirmed it was nowhere near the supported version for vSphere 5.5.

In the end, the customer still wanted these NICs replaced, which I was totally onboard with at this point, too, for many obvious reasons.

So, in conclusion, HP NC375T adapters are drunk, and should go home!

VMware dedicated swapfile datastores

Dedicated swapfile datastores in VMware are often overlooked.   Here’s why you might use them, and how to size them easily with PowerCLI.

It’s very often advisable to create dedicated swapfile datastores in your VMware vSphere environment.   There are numerous benefits:

  • Ensure there’s room to start a VM
  • Use different storage type than what the working directory uses for performance or cost savings
  • Reduce replication traffic when using storage based replication, because there’s no reason to replicate this storage
  • You may want to snapshot storage that runs VMs for easy recoverability, but there’s no reason to snapshot swapfile

If you decide to create dedicated datastores, you want to use the following principles:

  • Create datastores that are resilient, so that VMs can be started
  • Have hosts that frequently have VMs VMotion between them, such as a cluster, use the same datastores to reduce vMotion network traffic
  • Carefully monitor their space, and size them correctly, and allow for some overhead for growth.

The swapfile size for each VM is determined by the following:

  • The VM’s defined RAM minus the RAM reservation for that VM.

For example, if a VM is defined with 8GB of RAM but has a 2GB RAM reservation, a 6GB swapfile will be created. By default, a VM has no RAM reservation.

That means this datastore's space consumption can fluctuate as VMs are built, powered off and on, whenever RAM is added to or removed from a VM's definition, or if its memory reservation is adjusted.
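If you want to see what each individual VM contributes, a quick sketch like this works (the cluster name is a placeholder, as in the script below):

# Per-VM swapfile size: defined RAM minus the memory reservation
Get-Cluster "clustername" | Get-VM |
    Select-Object Name, MemoryGB, @{Name = "SwapGB"; Expression = { $_.MemoryGB - ($_ | Get-VMResourceConfiguration).MemReservationGB }}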

This begs the question: how do you easily size these datastores? Harness PowerShell by using PowerCLI! Simply tune the $vms variable portion of the following (or what's piping into it) to grab the VMs that will likely vMotion between the same hosts. This would usually be by cluster.

$vms = Get-Cluster clustername | Get-VM
$RAMDef = $vms | Measure-Object -Sum MemoryGB | Select-Object -ExpandProperty Sum
$RAMResSum = $vms | Get-VMResourceConfiguration | Measure-Object -Sum MemReservationGB | Select-Object -ExpandProperty Sum
$SwapDatastore = $RAMDef - $RAMResSum
Write-Host "Defined amount of RAM within VMs is $RAMDef GBs"
Write-Host "Memory reservation for VMs is $RAMResSum GBs"
Write-Host "A datastore of at least $SwapDatastore GBs will be needed, plus overhead."

Output will look like this:

Defined amount of RAM within VMs is 218 GBs
Memory reservation for VMs is 0 GBs
A datastore of at least 218 GBs will be needed, plus overhead.

For overhead, you probably want to keep at least 25% free at a minimum, just to keep datastore free space alarms from going off, plus allow for any additional growth from the factors outlined above, mostly centered around new VMs being built.
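For ongoing monitoring, a quick check like this (the datastore name is a placeholder) tells you how close you're getting:

# Check free space on the dedicated swapfile datastore(s)
Get-Datastore "Swapfiles-*" |
    Select-Object Name, CapacityGB, FreeSpaceGB, @{Name = "FreePct"; Expression = { [math]::Round($_.FreeSpaceGB / $_.CapacityGB * 100, 1) }}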

Many customers balk when told how big the swapfile datastore will be, but you have to remember if you’re changing this within a customer’s environment, they’re going to gain back swapfile space within their existing datastores as swapfiles get placed on the dedicated datastore.

Also, think of the potential storage space savings you could get if you are snapshotting and replicating your VM datastores, plus the bandwidth savings. Let's say you have VMs that in aggregate are defined with 500GB of RAM and no memory reservations. If you're doing both snapshots and replication and didn't dedicate a datastore to the swapfiles, you're talking savings of 500GB of replicated space, and up to 1TB worth of space savings depending upon how much additional space the swapfiles are taking within your storage snapshots. Pretty worth it!

How do you migrate existing swapfiles?

  1. First, set your cluster to use the host’s swapfile setting instead of the cluster’s.
  2. Set all your hosts to use the same datastore.

To do this in PowerCLI:

$cluster = "clustername"
$swapfiledatastore = "swapfiledatastorename"
get-cluster $cluster | set-cluster -VMSwapfilePolicy InHostDataStore

You'll have to manually set each host's swapfile datastore with the Web or thick client. PowerCLI fails to set the host's swapfile datastore if the host is in a cluster, unfortunately.

You should see the swapfiles deleted from the VMs’ working directories and created in the new datastore as VMs are power cycled.