
vCenter 6 Reconfigure from Embedded to External PSC

There have been some problems with embedded PSC configurations, so I’ve had requests to move away from the embedded PSC (PSC and vCenter in the same OS instance) to an external configuration.  Thankfully, vCenter 6.0 Update 1 and above includes a method to do just this!

Transitioning to External PSC

To accomplish this, I first built a new virtual machine running Server 2012 R2, patched it to current, joined it to the domain, and granted the appropriate rights for the vCenter service account.

It’s also important to note that the existing vCenter 6.0 must be running Update 1 or later for this to work.  You should deploy the new PSC using the same build as the existing vCenter, so patch your current vCenter to Update 1 or higher first if needed.

Also, make sure you have a good rollback plan, like a whole VM backup or snapshots as needed.

This process works just as well for the appliance.

You then install an external PSC, joining the existing SSO domain and site.  Now there are two PSCs, but vCenter is still set up in an embedded configuration, so the external PSC isn’t used yet.

At this point, you need to use the cmsso-util utility with the reconfigure option, located in your vCenter installation folder, typically under C:\Program Files\VMware\vCenter Server\bin.

cmsso-util reconfigure --repoint-psc destpsc.vs6lab.local --username administrator --domain-name "vsphere.local" --passwd "P@ssw0rd"

I immediately ran into my first issue…

[Screenshot: PSC repoint DNS error]

“The provided Platform Services Controller(PSC) is not a replication partner of the localhost. Please make sure to provide the Primary Network Identifier (PNID) of the PSC.”

A little googling quickly led me to this community post, which states the DNS name is apparently case sensitive, so check your DNS records to see whether the PSC’s record is in all caps or mixed case.  Use that exact form, and you’re golden.  In my case, it was DESTPSC.vs6lab.local.

Sit back and be patient.  Mine took probably a solid 10 minutes, but I am running it in a slower lab environment.

When it’s finished, verify the vSphere Web Client is functioning.  Also, verify the PSC has been repointed under your vCenter Server – Manage – Advanced Settings – config.vpxd.sso.admin.uri

[Screenshot: confirming the PSC repoint in Advanced Settings]
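
If you’d rather check from PowerCLI than click through the Web Client, here’s a quick sketch (it assumes you’re already connected to the vCenter with Connect-VIServer):

# Read the SSO admin URI advanced setting from the connected vCenter;
# it should now point at the external PSC's FQDN
Get-AdvancedSetting -Entity $global:DefaultVIServer -Name "config.vpxd.sso.admin.uri" | Select-Object Name, Value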

PSC is done!

Updating vSphere 6 vCenter Server Appliance

If you skipped the first release of vCenter 6 and deployed Update 1, note that a newer build of Update 1 has since been released with security fixes, among other things.  Many people are opting for the appliance version of vCenter for the first time, and patching it isn’t like the Windows version, so I wanted to document my experience with installing updates on the vSphere 6 vCenter Server Appliance.

First off, a friendly reminder: RTFM with this kind of thing.  I was screwing around in my lab, so I didn’t at first and immediately ran into issues, as you’ll see, but that was my fault.

Step 1:  Check interoperability with all vSphere components, third party products, and note upgrade paths.

If you use any VMware products that interact with vCenter, such as Horizon View, vCenter Operations Manager, or Site Recovery Manager, or third-party products such as backup software (Veeam, etc.) or management tools (VMTurbo), ensure the versions you run are supported with the new version of vCenter you are about to upgrade to.  If they aren’t, map out the proper upgrade order and the new versions you need to install in order to preserve functionality for all your products and services.  Don’t forget to check support for your external database if you use one, too.

I’m assuming you’ve taken care of all this already.

Step 2: Download all the files you’ll need.

At a minimum, you’ll need to download the patch file from VMware.  This is NOT the full install version of the appliance!  You need to go to:

https://my.vmware.com/group/vmware/patch

Filter for patches for vCenter, the major version of vCenter, and download the applicable patch file for your deployed version of the appliance.

I didn’t RTFM, so I downloaded the VCSA full installable file ISO, and got greeted with the following:

Command> software-packages stage --iso --acceptEulas
[2016-01-09T19:31:01.009] : Staging software update packages from ISO
[2016-01-09T19:31:01.009] : ISO unmounted successfully
[2016-01-09T19:31:01.009] : CD drives do not have valid patch iso.
[2016-01-09T19:31:01.009] : Staging process failed.

Get the patch file!

If you use the Appliance Management Interface to do this, you can have it automatically download the correct file for you.  The upgrade ISOs aren’t small, though, so I would encourage you to download the file ahead of time and have it ready.  If you’re curious, the patch file I downloaded for this was about 1.5 GB.  You don’t want to eat up your planned downtime waiting for an ISO.

Step 3:  Ensure you have a backout plan in case the update fails.  Take whole-VM backups of all relevant vCenter VMs, both Platform Services Controllers and vCenter.  Take a VM snapshot as well for faster rollback.

The remaining steps are repeated for external PSCs and vCenter servers.  Just ensure you update all external PSCs before you update vCenter server nodes.  Don’t forget to test PSC functionality prior to continuing with the vCenter servers.

Step 4: Mount the patch ISO file in the VM if you are doing this via the command line, or if you wish to use a manually downloaded ISO instead of having vCenter download it for you.

Straightforward step here.  If you don’t know how to do this, you probably should stop now. 🙂

Step 5: Initiate the upgrade command

Command line method

Enable SSH on the appliance via the VCSA DCUI, PuTTY into the VM, and run the following:

software-packages install --iso --acceptEulas

(That’s double hyphens.)

You can stage the install files first as well if you like, but I personally don’t see much advantage in doing so.

GUI

Using a web browser, log in to the vCenter Server Appliance Management Interface (port 5480 over HTTPS).  Ensure the repository is configured properly (probably the “Use default” option) if you want vCenter to download the patch ISO for you, then initiate a check for patches.  Select URL if you want vCenter to download the patch for you, or select Check CDROM if you already downloaded and mounted the ISO.  Finally, click Install Updates.

Step 6: Monitor the install progress and follow the instructions.

Monitor the installation, and ensure that it succeeds.  It’s completed when you are back to the Command> prompt if you’re using the command line.  You should also see:

Packages upgraded successfully, Reboot is required to complete the installation.

Reboot the VCSA VM if you are instructed to do so using:

shutdown reboot -r "vCenter 6.0 Update <whatever version you're installing>"

If you’re updating with the GUI, you should see a Reboot option under Summary.

If you have errors, review the /var/log/vmware/applmgmt/software-packages.log file.

Step 7: Dismount the ISO

Again, simple stuff.

Step 8: Verify functionality of vCenter and integrated products

Step 9: Clear out the VM snapshot

Obviously, do not do this until you’re sure you don’t need to roll back.  With that said, do NOT keep the snapshot indefinitely either, as it will degrade vCenter performance, use up additional space on your datastore, and increase the chance of data corruption the longer you wait.

And there you have it!

Adventures in SRM 6.0 and MirrorView

Recently, I set up SRM 6.0 with MirrorView storage-based replication.  It was quite the adventure.  The environment was using SRM 5.0 and MirrorView, and we upgraded it to vSphere 6.0 and SRM 6.0 recently.  I wanted to get my findings down in case they help others setting this up.  When I ran into issues, it wasn’t easy finding people who were doing this, as many VNX shops are using RecoverPoint now instead of MirrorView.

Version Support

First off, you might be wondering why I recently deployed SRM 6.0 instead of 6.1.  That’s an easy question to answer – currently, there is no support for MirrorView with SRM 6.1.  I’m posting this article in 11/2015, so that may change.  Until it does, you’ll need to go with SRM 6.0 if you want to use MirrorView.

Installation of Storage Replication Adapter

I’m assuming you have already installed SRM and configured the site pairings and whatnot.  At the very least, have SRM installed in both sites before you proceed.

Here’s where things got a little goofy.  First off, downloading the SRA is confusing.  If you go to VMware’s site to download SRAs, you’ll see two listings for the SRA with different names, suggesting they work for different arrays, or do something different, or are different components.

[Screenshot: MirrorView SRA download listings]

As far as I can tell, they’re actually two slightly different versions of the SRA.  Why are both on the site for download?  No idea.  I went with the newer of the two.

You also need to download and install Navisphere CLI from EMC for the SRA to work.  There are a few gotchas on the install of this to be aware of. Install this first.

During installation, you need to ensure you check the box “Include Navisphere CLI in the system environment path.”

[Screenshot: Navisphere CLI system environment path option]

That’s listed in the release notes of the SRA, so that was easy to know.  You also need to choose not to store credentials in a security file.

I originally told it to store credentials, thinking this would allow easier manual use of Navisphere CLI should the need arise, but that caused the SRA to fail authentication against the arrays.  I uninstalled and reinstalled Navisphere CLI without that option, and the bad authentication messages went away.

Next, install the SRA, which is straightforward.  After the installation of the SRA, you must reboot the SRM servers, or they will not detect that they have SRAs installed.  That takes care of the SRAs.

Configuring the SRAs

Once you have installed the SRAs, it’s time to configure the array pairs.  First, go into Site Recovery within the vSphere Web Client and click Array Based Replication.

[Screenshot: Array Based Replication]

Next, click Add Array Manager.

[Screenshot: Add Array Manager]

Assuming you’re adding arrays from two sites, click “Add a pair of array managers”.

[Screenshot: Add a pair of array managers]

Select the SRM Site location pair for the two arrays.

[Screenshot: selecting the SRM site location pair]

Select the SRA type of EMC VNX SRA.

[Screenshot: selecting the EMC VNX SRA type]

Enter the Display name, the management IPs of the array, filters for the mirrors or consistency groups if you are using MirrorView for multiple applications, and the username and password info for the array for each site.  Be sure to enter the correct array info for the indicated site.

[Screenshot: entering the array information]

I always create a dedicated SRM service account within the array, so it’s easy to audit when SRM initiates actions on the storage array.

You’ll need to fill the information out for each site’s array.

Keep the array pair checked and click next.

[Screenshot: enabling the array pair]

Review the summary of action and click finish.

At this point, you can check the array in each site and see if it is aware of your mirrors being replicated.

[Screenshot: checking SRA replication info]

So far so good!  At this point, you should be able to create your protection groups and recovery plans, and start performing tests of a test VM and recoveries as well.

Problems

I began with a test consistency group within MirrorView, which contained one LUN storing a test VM.  Test mode to DR worked immediately.  The first failover to the DR site failed, as it often does in my experience with most storage-based replication deployments.  Normally that’s no problem; I simply launch it again and it works, and it did in this case.

With the VM then in the DR site, I performed an isolated test back to production, which worked flawlessly.  It’s when I tried to fail back to production that I encountered a serious problem.  SRM reported that the LUN could not be promoted.  Within SRM, I was given only the option to try failover again; the options to do a cleanup or a test were grayed out.  Relaunching the failover produced the same result.  I tried rebooting both SRM servers and vCenter, running rediscovery of the SRAs, you name it.  I was stuck.

I decided to just manually clean everything up myself.  I promoted the mirror in the production site and had hosts in both sites rescan for storage.  The LUN became unavailable in the DR site, but in production, while the LUN itself was visible, the datastore wouldn’t mount.  Rebooting the ESXi server didn’t help.  I finally added it as a datastore, selecting not to resignature it.  The datastore mounted, but I found that it wouldn’t mount again after a host reboot.  Furthermore, SRM was reporting the MirrorView consistency group as stuck failing over, showing Failover in Progress.  I tried recreating the SRM protection group, re-adding the array pairs, and more, but nothing worked.

After messing with it for a while, checking MirrorView and the VNX, VMware, etc., I gave up and contacted EMC support, who promptly had me call VMware support, who referred me back to EMC again because it was clearly an SRA problem for EMC.

With EMC’s help, I was able to cleanup the mess SRM/SRA made.

  1. The Failover in Progress status reported by the SRA was due to leftover text in the description fields in MirrorView.  Clearing those and rescanning the SRAs fixed that problem.
  2. The test LUN not mounting was due to me not selecting to resignature the VMFS datastore when I added it back in.

At this point, we were back to square one, and I went through the gamut of tests.  I got errors because the SRM placeholders were reporting as invalid.  Going to the protection group within SRM and issuing the command to recreate the SRM placeholders fixed this issue.

We repeated testing again.  This time, everything worked, even failback.  Why did it fail before?  Even EMC support had no answer.  I suspect it’s because, in my experience, the first failover attempt in any given direction in an SRM environment always fails.  Unfortunately, it was very difficult to clean up this time.

Disable CBT on Veeam jobs via PowerShell

If you haven’t heard the not-so-great news, VMware has discovered a bug in vSphere 6 with Changed Block Tracking (CBT) that can cause your backups to be corrupt and therefore invalid.  Currently, they recommend not using CBT with vSphere 6 when backing up your VMs.

I was looking for an easy way to disable this on all jobs in Veeam quickly via PowerShell, but it’s not obvious how to do that, so I took some time to figure it out.  Here it is assuming the module is loaded in your PowerShell session.

$backupjobs = Get-VBRJob | Where-Object jobtype -eq "Backup"
foreach ($job in $backupjobs) {
    $joboptions = $job | Get-VBRJobOptions
    $joboptions.ViSourceOptions.UseChangeTracking = $false
    $job | Set-VBRJobOptions -Options $joboptions
}

Here’s how to enable it again:

$backupjobs = Get-VBRJob | Where-Object jobtype -eq "Backup"
foreach ($job in $backupjobs) {
    $joboptions = $job | Get-VBRJobOptions
    $joboptions.ViSourceOptions.UseChangeTracking = $true
    #$joboptions.ViSourceOptions.EnableChangeTracking = $true
    $job | Set-VBRJobOptions -Options $joboptions
}

Sorry it’s not pretty on the page, but I wanted to get this out ASAP to help anyone needing to do this quickly and effectively.

One thing to note: the enable script has a commented-out line.  If you disabled CBT manually in the job setup, the separate option to enable CBT within VMware when it is turned off also gets disabled.  If you disabled CBT with my script, that option doesn’t get touched, so you don’t need to remove the # on that line.  If you do want that option enabled again, take out the # before that line, and it’ll be turned back on as well.
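
If you want to double-check where each job stands before or after running either script, here’s a quick read-only sketch (same assumption that the Veeam PowerShell snap-in is loaded):

# List each backup job and whether it will use CBT
Get-VBRJob | Where-Object {$_.JobType -eq "Backup"} | ForEach-Object {
    $options = $_ | Get-VBRJobOptions
    [pscustomobject]@{Job = $_.Name; UseCBT = $options.ViSourceOptions.UseChangeTracking}
}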

Hope this helps!

vSphere 6 – Certificate Management – Part 2

Previously, I posted about certificate management in vSphere 6, which has simplified the process and provides several ways to trust the certificates in use, with flexibility over what issues the certificates for vSphere’s internal functionality as well as its client-facing functionality.

One of the simplest means to allow your clients to trust SSL certificates used by vSphere 6 is to utilize the CA functionality built into vCenter 6, and configure your clients to trust it as a Root CA, which is the focus of this blog post.

Basically, to do this, you need to obtain the root certificates of your vCenter servers and then install them into your clients’ Trusted Root CA stores.  Thankfully, this is a relatively straightforward and easy process.

Obtaining Root CA files for vCenter

Obtaining the root CA files for vCenter is very easy.  Simply connect to the FQDN of your vCenter server using HTTPS.  Do not add anything after it, as you would to connect to the vSphere Web Client.  For example, you would use:

https://vcentersvrname.domain.com

Once connected, you will see a link to obtain the vCenter certificates, as you can see below:

[Screenshot: Download vCenter root CA certs link]

When you download this file, it will be a zip file containing two certificate files for your vCenter server.  If your vCenter server is part of a linked mode installation, you will have two certificate files for every vCenter in the linked mode instance.  The files with a .0 extension are the root CA files you need to import.  Below, you can see the zip file downloaded from a two-vCenter linked mode installation.

[Screenshot: contents of the downloaded vCenter certificate zip file]
Extract the contents of the zip file.  Next, rename the .0 files with a .cer extension.  This will allow the files to be easily installed within a Windows machine.  You can then open them to check out the properties of the files if you like.
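
If you have several of them from a linked mode setup, a quick PowerShell one-liner run from the extracted folder saves some clicking; this is just a convenience sketch:

# Rename every .0 root CA file in the current folder to .cer
Get-ChildItem -Filter *.0 | Rename-Item -NewName { $_.Name -replace '\.0$', '.cer' }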

Installing Root CA file(s)

If you’re familiar with Windows machines, this is pretty straightforward.  You can either import the file into each machine manually, or use a Group Policy Object: import the file(s) into the GPO and let it refresh on your clients.
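
For the manual route on a handful of machines, PowerShell can handle the import as well.  A minimal sketch, assuming Windows 8/Server 2012 or later for the Import-Certificate cmdlet and a renamed file called vcenter-root.cer (hypothetical name):

# Import the vCenter root CA cert into the local machine's Trusted Root CA store
Import-Certificate -FilePath .\vcenter-root.cer -CertStoreLocation Cert:\LocalMachine\Root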

That’s it!  Pretty easy to do!  At the very least, you should do this instead of blindly allowing the exception for the untrusted certificate every time, because we all know we aren’t checking the thumbprints of those certs to ensure we’re connecting to the same server, and it removes the warnings if you’re not creating a permanent exception.

Getting Cisco UCS drivers right with Windows

I’ve already mentioned one pain point with Cisco UCS – drivers – in my previous post concerning vSphere, but the same thing applies to other environments, including Windows servers. You better have the EXACT version Cisco wants for your specific environment. But how do you know which drivers to get, how do you get them, how do you know when you need to upgrade them, and how do you know what drivers you have installed? This post applies to Windows Server, which by extension, includes Hyper-V.

Why is getting the drivers so important?

I want to emphasize that getting the exact right version of Cisco UCS drivers is a big deal! I’ve personally now seen two vSphere environments that had issues because the drivers were not exactly correct. The best part is the issues never turned up during testing of the environment. Just weird intermittent issues like bad performance, VMs needing consolidation after backups, or a VM hanging out of nowhere a week or two down the road. Make sure you get the drivers exactly right!  I don’t work with Windows Servers on bare metal nearly as much as VMware, but I’m sure getting the drivers right is equally, if not more, crucial.

How do I install Windows Server 2012 on Cisco UCS?

You have two choices.  One is to create a Windows Server installation ISO with the drivers slipstreamed into it; the other is to insert the driver image (available for download from Cisco) during setup so the installer loads the proper storage driver and can see the storage to which you’ll install Windows. Also, at least currently, remember that Cisco UCS does not support Windows in a boot-from-SAN configuration using FCoE.

Remember, however, that you’re still not done.  You’ll still need to update the network card drivers.

How do I know which drivers should be installed?

This is relatively simple. First, collect some info about your Cisco UCS environment. You need to know these (don’t worry, if you don’t know what info you need, Cisco’s Interoperability page will walk you through it):
1. Standalone C-Series not managed by UCSM or UCSM managed B and/or C-Series? For those of you who don’t know, if you got blades, you got UCSM.
2. If UCSM is present, which version is it running? Ex. 2.2(3c)
3. Which server model(s) are present? Ex. B200-M3. Also note the processor type (ex. Xeon E5-2600-v2). They can get picky about that.
4. What OS and major version? Note there is a difference between support for Server 2012 and 2012 R2.  Cisco at least at the time of this blog post does not change support depending upon installed Service Packs.
5. What type and model of I/O cards do you have in your servers? Example – CNA, model VIC-1240

Then head on over to the Interoperability Matrix site.  Fill in your info, and you get a clear version of the driver and firmware.

[Screenshot: UCS interoperability matrix results for Windows drivers]

It’s very straightforward to know which drivers are needed from that.

How do I figure out which drivers are installed?

 

You can do this one of two ways.  You can manually check them via Device Manager, or you can use PowerShell.  I’m assuming everyone knows how to check these with Device Manager.  To use PowerShell, use the following:

Get-WmiObject Win32_PnPSignedDriver | select-object devicename,driverversion

 

Note that you can use the -ComputerName parameter to check a remote system, or even a PowerShell array of remote systems for their drivers easily.
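
For example, something like this narrows it down to the Cisco VIC devices across several hosts at once (a sketch; $servers and the "Cisco" filter are just placeholders for your environment):

# Check Cisco VIC driver versions on multiple remote servers at once
$servers = "ucsblade01", "ucsblade02"
Get-WmiObject Win32_PnPSignedDriver -ComputerName $servers |
    Where-Object { $_.DeviceName -like "*Cisco*" } |
    Select-Object PSComputerName, DeviceName, DriverVersion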

How do I update Cisco UCS drivers?

You need to go to Cisco’s support page for UCS downloads, and download the driver ISO that has your driver, which is typically just the driver ISO with the same version as UCS Manager.  For example, if you’re running 2.2.5(a), you need the 2.2.5 driver ISO.

Next, use the virtual media option within the Cisco KVM, mount the ISO, so it’s ready to go.  You could also extract the contents somewhere on the server, it doesn’t matter.

Next, log in to the server, pull up Device Manager, and locate the adapter instances you need to update.  If you’re using a VIC, which is pretty much everyone, you need to find the relevant storage and network adapters.  In this example, the customer wasn’t using the FC functionality, so we’re just doing the eNIC devices.

[Screenshot: updating UCS VIC drivers in Device Manager]

Double click one of the VIC instances, click the Driver tab, note the currently installed driver version, and then Update Driver…

Next, click Browse my computer for driver software.

Next, click “Let me pick from a list of device drivers on my computer”.  Don’t even bother trying any other options, it will continue to want to install the old driver because… REASONS!!!

Next, click Have Disk… and browse on the CD to the right folder for the hardware and OS you’re running.  This is a Server 2012 non-R2 server, so as an example, it’s under Windows\Network\Cisco\VIC\W2K12\x64.

After the driver installs, verify the new driver version is now showing in Device Manager.  Unfortunately, you’re not done yet.  You gotta update the drivers on every other instance, but it’s a little easier for the rest.

For the other instances, double click each one, click Update Driver…, “Browse my computer for driver software”, “Let me pick from a list of device drivers on my computer”, and you will see both the old and new driver versions.  Pick the new one, and click Next.

[Screenshot: picking the new driver version from the list]

Rinse and repeat for all instances, and ensure the driver tab reflects the proper new version.  Remember that FNIC HBA drivers will need to be updated on those instances separately under Storage Controllers.

I took a quick look to see if Microsoft made some device driver PowerShell cmdlets, but unfortunately I don’t see any at this time.

When should I check my drivers?

You should do this during any of the following:

• During initial deployments
• When UCS Manager or a standalone C-Series BIOS is updated
• When you update to a new major OS version, although I’d check when planning to install Windows Service Packs as well.

Also, remember, newer drivers aren’t the right drivers necessarily. Check the matrix for what the customer is or will be running to see which drivers should go along with it!

Getting Cisco UCS drivers right with vSphere

I’ve noticed one pain point with Cisco UCS – drivers. You better have the EXACT version Cisco wants for your specific environment. But how do you know which drivers to get, how do you get them, how do you know when you need to upgrade them, and how do you know what drivers you have installed? These are all not necessarily straightforward, and getting the info you need can be a real pain.  This post will show how to accomplish this within vSphere.  For Windows servers, please see my follow-up post due out in a few days.

Why is getting the drivers so important?

I want to emphasize that getting the exact right version of Cisco UCS drivers is a big deal! I’ve personally now seen two environments that had issues because the drivers were not exactly correct. The best part is the issues never turned up during testing of the environment. Just weird intermittent issues like bad performance, VMs needing consolidation after backups, or a VM hanging out of nowhere a week or two down the road. Make sure you get the drivers exactly right!

How do I install ESXi on Cisco UCS?

First off, pretty much everyone knows that when you’re installing ESXi on Cisco, HP, Dell, IBM, or other vendor servers, you should use the vendor’s media. That’s common practice by now, I hope. In most but not all cases, you get the drivers you need for an initial deployment from the get-go, you get hardware health info within VMware, sometimes management and monitoring integration for out-of-band management cards, and you ensure vendor support by doing this. This is especially important for Cisco UCS, since so many installs require boot from SAN, so you’ve gotta have those drivers within the media off the bat.

Now, if you think you’re done because you’ve downloaded the latest Cisco co-branded ESXi media for an initial deployment, you’re wrong (see below). Also, don’t assume that just because you used the co-branded media to install ESXi on a UCS server, you never need driver updates. You likely will when you update UCS Manager and/or ESXi down the road.

How do I know which drivers should be installed?

This is relatively simple. First, collect some info about your Cisco UCS environment. You need to know these (don’t worry, if you don’t know what info you need, Cisco’s Interoperability page will walk you through it):
1. Standalone C-Series not managed by UCSM or UCSM managed B and/or C-Series? For those of you who don’t know, if you got blades, you got UCSM.
2. If UCSM is present, which version is it running? Ex. 2.2(3c)
3. Which server model(s) are present? Ex. B200-M3. Also note the processor type (ex. Xeon E5-2600-v2). They can get picky about that.
4. What OS and major version? Note the Update number. Ex. ESXi 5.5 Update 2
5. What type and model of I/O cards do you have in your servers? Example – CNA, model VIC-1240

Then head on over to the Interoperability Matrix site.  Fill in your info, and you get a clear version of the driver and firmware.

[Screenshot: UCS interoperability matrix results for ESXi drivers]

It’s very straightforward to know which drivers are needed from that.

How do I figure out which drivers are installed?

If you go looking at Cisco for how to find that out, you get treated to esxcli commands.  Do you really want to enable SSH on all your hosts, SSH into each host, run some commands, then have to disable SSH on all those boxes when you’re done, and not have an easy way to document what they are?  Nope!

BEHOLD! POWERCLI!

To get the fnic driver versions for all ESXi hosts:

$hosts = Get-VMHost
$versions = @{}
foreach ($vihost in $hosts) {
    $esxcli = Get-VMHost $vihost | Get-EsxCli
    $versions.Add($vihost, ($esxcli.system.module.get("fnic") | Select-Object Version))
}
$versions.GetEnumerator() | Sort-Object Name | Format-List

You get this:

Name : esxi01.vspheredomain.local
Value : @{Version=Version 1.6.0.12, Build: 1331820, Interface: 9.2 Built on: Jun 12 2014}

Hey! That’s the wrong driver, even though I used the latest co-branded media! SON OF A…!

Let’s get some enic driver versions…

$hosts = Get-VMHost
$versions = @{}
foreach ($vihost in $hosts) {
    $esxcli = Get-VMHost $vihost | Get-EsxCli
    $versions.Add($vihost, ($esxcli.system.module.get("enic") | Select-Object Version))
}
$versions.GetEnumerator() | Sort-Object Name | Format-List

You get:

Name : esxi01.vspheredomain.local
Value : @{Version=Version 2.1.2.59, Build: 1331820, Interface: 9.2 Built on: Aug 5 2014}

Of course, Cisco apparently didn’t update those drivers in their co-branded media either.

Note for both scripts, you will get errors about get-esxcli not being supported without being connected directly to each host. It works for our purposes.

How do I update Cisco UCS drivers?

Now we know, despite using the latest Cisco co-branded media in my implementation, I need some driver updates. If you go to Cisco’s site for how to install these drivers, they’ll tell you to upload the package to each host and install them one at a time manually using esxcli commands. Do you really want to do that?

Let’s be smart/lazy/efficient and use VMware Update Manager. That way if a new host gets introduced, VUM will report that host non-compliant, and it’ll be easy to fix that one, too. And it’s easy to see which hosts do and don’t have those drivers down the road.

I find that if I google the driver version, a download from VMware’s site with that exact version is usually the first or second link. Here are our fnic driver and enic driver in this case.

Download those to your vCenter server or another machine with the vSphere thick client installed. Unzip them into their own folders. Open a thick vSphere client connection to vCenter (the Web Client won’t allow you to do this), click Home, then click Update Manager.

Next, click Patch Repository tab at the top, and then click Import Patches in the top right.

[Screenshot: VUM Import Patches]

When the dialogue comes up, browse to select the zip file that was *contained* in the original zip file. If you select just the zip file you downloaded itself, it will fail. Repeat for the fnic and enic drivers.

When you’re finished, you can then build a baseline that includes the updated drivers. Click Baselines and Groups, then Create above the baselines pane.

[Screenshot: creating a VUM baseline]

Call it something like “Cisco UCS Current Drivers”.  Select “Host Extension” as the baseline type.  In the following pane, find the drivers and click the down arrow to add them into the baseline.  Note the Patch ID field has driver version specifics, which is useful if you’ve already got some Cisco drivers imported from before.

[Screenshot: selecting the driver patches for the baseline]

You can then attach that baseline directly to the appropriate object(s) within the host and clusters view, or I like to make a Baseline Group called “Critical and non-critical patches with Cisco updated drivers”, add all the appropriate baselines to that group, and attach that group to the appropriate objects in the Hosts and Clusters view.

Then remediate your hosts. When new drivers come out, import them in, then edit the Cisco baseline, swapping out the last updated drivers with the new ones, and remediate to push them out.
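
If you’d rather script the attach/scan/remediate portion, the Update Manager PowerCLI cmdlets can do it.  This is a rough sketch, assuming the VUM PowerCLI plugin is installed and connected, and that the baseline and cluster names (“Cisco UCS Current Drivers” and “Production”) match your environment:

# Attach the driver baseline to a cluster, scan it, and remediate non-compliant hosts
$baseline = Get-Baseline -Name "Cisco UCS Current Drivers"
$cluster  = Get-Cluster -Name "Production"
Attach-Baseline -Baseline $baseline -Entity $cluster
Scan-Inventory -Entity $cluster
Remediate-Inventory -Entity $cluster -Baseline $baseline -Confirm:$false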

Done!

When should I check my drivers?

You should do this during any of the following:

• During initial deployments
• When UCS Manager or a standalone C-Series BIOS is updated
• Major ESXi version upgrades
• Update pack upgrades for ESXi (when ESXi 5.5 servers for example are to be upgraded to Update 2, or 3, etc)

Also, remember, newer drivers aren’t the right drivers necessarily. Check the matrix for what the customer is or will be running to see which drivers should go along with it!

Howto: Fix vCenter 5.5 Syslog Collector bug

I just ran into an issue for a customer running vCenter 5.5.  Apparently, there is a known issue with the 5.5 version of the Syslog Collector that causes the debug log to grow indefinitely; according to the article, it occurs when the collector was upgraded to 5.5.  However, the customer in question was built fresh on 5.5, although I did update to newer 5.5 builds during that time.  Bottom line: if the 5.5 version of the Syslog Collector is running, it should be checked in all cases to be sure.  The KB article outlines steps to stop this, which basically involve turning off debug logging altogether.

The debug log doesn’t contain actual syslog info for hosts, so this log file is only useful for troubleshooting issues with the syslog collector itself, so it’s almost certainly safe to delete.

Please note this only impacts the syslog collector.  If you did not install the syslog collector, this isn’t applicable.

You can copy and paste the following into an elevated PowerShell window to automate stopping the syslog service, changing the collector’s config file to turn off debug logging completely, deleting the probably massive debug log, and starting the syslog collector again.  You can also save it as a .ps1 file and run it in an elevated prompt.

stop-service vmware-syslog-collector
(get-content "C:\ProgramData\VMware\VMware Syslog Collector\vmconfig-syslog.xml") | foreach-object {$_ -replace "<level>1</level>", "<level>0</level>"} | set-content "C:\ProgramData\VMware\VMware Syslog Collector\vmconfig-syslog.xml"
remove-item "C:\ProgramData\VMware\VMware Syslog Collector\logs\debug.log"
start-service vmware-syslog-collector

This assumes, of course, that the Syslog Collector is running on a Windows machine, not the vCenter Appliance.  The article doesn’t make clear whether this issue only applies to the Windows version of vCenter, or how to fix the vCenter Appliance if the issue impacts it as well.

Hope this helps!

Fix AD Lingering Objects with PowerShell

I briefly ran a blog on wordpress.com before, and most of the information there is outdated or probably not relevant today, but there are a few posts covering things I’ve found little else on the internet to address.  These typically hark back to my AD/Exchange-heavy days, but they’re still relevant.  One of those posts is how to fix Active Directory lingering objects using PowerShell.

I ran into a problem in a large forest with multiple child domains and lots of domain controllers – 10 domains and 275 domain controllers!

To protect identities, let’s assume a forest consisted of domain.com, with two child domains – child1.domain.com and child2.domain.com.  Each domain has 2 global catalog servers (gc1, gc2), and one domain controller that is not a global catalog (dc1).

What are lingering objects anyway?

Remember that at least one domain controller in each domain must be a global catalog server.  GCs have a copy of all objects in the forest, but only a subset of each object’s attributes is replicated to the global catalog.  For all objects that are not in that domain controller’s own domain, the GC has a read-only copy.  You cannot manually alter, create, or delete objects directly in the global catalog for objects that reside in another domain.

Lingering objects occur when a global catalog in one domain ends up with objects that no longer exist in another domain.  For example, let’s say a user exists in child2.domain.com and is deleted.  If somehow this deletion doesn’t replicate to a GC in child1.domain.com or domain.com, the global catalogs in domain.com and child1.domain.com now hold that user as a lingering object.  This can happen in a variety of ways, such as replication failures or a global catalog server being disconnected for a long period of time.

Further info can be found here.

To find out whether you have lingering objects on a domain controller, run the following command:

repadmin /removelingeringobjects ServerName ServerGUID DirectoryPartition /advisory_mode

Simply remove the /advisory_mode switch to remove lingering objects.

ServerName is the fully qualified domain name of the global catalog that has lingering objects.  ServerGUID is the GUID of a domain controller in the domain the lingering object came from, which you want to use as the authoritative reference.  DirectoryPartition is the distinguished name of the partition containing the lingering object.  Usually, lingering objects are computer or user account objects, so this would look like dc=domain,dc=com.
Finding the DC’s GUID can be done by looking in the forward lookup zone _msdcs.domain.com.
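
If you have the Active Directory PowerShell module handy, you can also pull that GUID (it’s the objectGUID of the DC’s NTDS Settings object) without opening the DNS console.  A sketch, assuming the AD module is available and using the example DC below:

# Get the DSA (NTDS Settings) object GUID for a reference DC - this is the GUID shown in _msdcs
$dc = Get-ADDomainController -Identity GC2 -Server child2.domain.com
(Get-ADObject -Identity $dc.NTDSSettingsObjectDN -Properties objectGUID).objectGUID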

Lingering objects can cause problems with outdated or invalid group membership, problems with address book generation with Exchange, or basically problems with anything that depends upon valid info within the global catalog.  It can even cause replication failures depending upon your global catalog replication topology, and if you have strict replication enabled.

Scenario
Let’s say you suspect gc1.child1.domain.com has lingering objects from child2.domain.com.  You would first need a GUID of a DC in child2.domain.com that you believe has accurate domain information.  Let’s say you believe that dc is GC2.child2.domain.com.  Use the DNS MMC, connect to a DNS server hosting domain.com, look in the _msdcs.domain.com zone, and you will see all domain controllers in your forest.  Copy the GUID to your clipboard.  Let’s say GC2.child2.domain.com’s GUID is:

85d158d2-a006-4fff-b1e5-f9b6eaabab2b

You would then run:
repadmin /removelingeringobjects gc1.child1.domain.com 85d158d2-a006-4fff-b1e5-f9b6eaabab2b dc=child2,dc=domain,dc=com /advisory_mode

Note you need the Windows Support Tools installed.

This isn’t so tough.

However, if you suspected all your global catalogs had lingering objects for this domain, you’d need to run this command for each GC not in child2.domain.com.  Not terrible for this small of an environment.  To fix them, just chop off the advisory mode switch, and you’re done.

Think Big!

What if your environment was a 10 domain forest with over 100 domain controllers, and no predictable pattern of which domain controllers were global catalogs and which weren’t?!  Even if you knew which were global catalogs, who wants to issue that many commands?!

Wouldn’t it be nice if we could issue this command against every global catalog not in child2.domain.com (the GCs in child2.domain.com hold writable copies of that partition, so theirs is already correct and they clean up lingering objects on their own)?

That is what I faced.  I found replication wasn’t occurring for a domain partition in the global catalog because strict replication was enabled, and all global catalogs outside of a particular domain had lingering objects.  Talk about a pain in the butt!  Unless of course…
PowerShell to the rescue!

We can easily get all the global catalogs in the forest:
$forest = [system.directoryservices.activedirectory.Forest]::GetCurrentForest()
$forest.globalcatalogs | select-object name

You would receive output of the fully qualified domain names of all global catalogs.
But wait.  We only want GCs that are NOT in child2.domain.com.  Simple enough with a where-object filter.

$forest.globalcatalogs | where-object {$_.name -notlike "*.child2.domain.com"} | select-object name

Now we just need to assign this to a variable, so we add “$gcs = ” to the beginning of the second line.  This gives us an array we can then run a command against.  The last part is a bit tricky because we’re intermixing PowerShell with a standard command line.  Usually, you need to use the ' character around phrases.  Also, in this case, we’ve actually grabbed objects into the $gcs variable, so we want to make sure we’re not passing any other properties or code associated with those objects; we literally just want the name of each to be passed.  Remember, $_ means each object in the pipeline.  By adding .name, we’re saying don’t pass any other output related to each object in the array other than its name.  Without it, you get errors because PowerShell puts extra characters in for each global catalog.

Final commands:

$forest = [system.directoryservices.activedirectory.Forest]::GetCurrentForest()
$gcs = $forest.globalcatalogs | where-object {$_.name -notlike "*.child2.domain.com"} | select-object name
$gcs | foreach-object {repadmin /removelingeringobjects $_.name 85d158d2-a006-4fff-b1e5-f9b6eaabab2b dc=child2,dc=domain,dc=com}
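
If you’d like a dry run across all of those GCs first, the same loop works with the /advisory_mode switch tacked on:

$gcs | foreach-object {repadmin /removelingeringobjects $_.name 85d158d2-a006-4fff-b1e5-f9b6eaabab2b dc=child2,dc=domain,dc=com /advisory_mode}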

Auto download/install Dell/HP updates with VUM

Recently, I had a customer run into an issue with a bug in the HP agents included within their co-branded installation media, so I came to realize the importance of updating server vendor custom software.

http://kb.vmware.com/kb/2085618

I decided to look into how to manage updating those a little more easily, since I’m having to update this kind of thing for customers lately. It turns out that with Dell and HP, it’s not tough. (And BTW, Cisco and IBM – come on and get with the times on this!)

Did you know you can add a Dell and/or HP download repository for VUM to check for these updates for you? I knew you could, but I’d never done it until now, since we typically have customers maintain their own stuff, but I’m involved with a few customers who want me to do it for whatever reason. And hey, I’m lazy, so screw doing this the hard way.

Here’s how:
Open the full vSphere Client with the VUM plugin installed and enabled. Open the Update Manager management section. Click on the Configuration tab -> Download Settings. Then, click on Add Download Source.

[Screenshot: VUM Add Download Source]

Next, enter the source URL for your server manufacturer:

Dell: http://vmwaredepot.dell.com/index.xml

HP: http://vibsdepot.hp.com/index.xml

Edit:  HP’s download locations have changed!  Use:

Drivers: http://vibsdepot.hpe.com/index-drv.xml

All other components: http://vibsdepot.hpe.com/index.xml

Enter a description like “HP VIB Depot”.  Click Validate URL to ensure that’s good, then click OK.

[Screenshot: VUM download source settings]

Boom, take a look and make sure the connectivity status is Connected, and you can click Download Now if you want to get the latest updates from them immediately.

Now you need to make a baseline that includes the patches, and you can make a dynamic baseline so it automatically updates with the latest ones.  Go to the Baselines tab, create a baseline, name it something with the software vendor name and ESXi version, and select the Host Patch type.  For Patch Options, select Dynamic.  For criteria, select the server vendor and the specific version of ESXi you’re updating.  Note that this baseline will only work for a specific major version of ESXi; if you don’t select a version and instead include patches for every version, you’ll get errors when you remediate.

[Screenshot: VUM dynamic baseline criteria]
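
If you prefer PowerCLI for this part, a dynamic baseline can also be created with New-PatchBaseline.  Treat this as a rough sketch; it assumes the Update Manager PowerCLI plugin is loaded, and the -SearchPhrase filter is only a stand-in for the vendor/version criteria you’d pick in the GUI:

# Create a dynamic host patch baseline that automatically picks up matching patches
New-PatchBaseline -Name "HP ESXi 5.5 Updates" -Dynamic -SearchPhrase "hp"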

Next, you can exclude any patches you don’t want installed.  Newer versions supersede the older ones, so there’s no need to exclude anything unless you know the latest version causes problems.

[Screenshot: VUM baseline exclusions]

 

There probably isn’t a reason to add additional updates manually to this baseline.  If you need to add other patches, make another baseline for those, and include everything you want in a baseline group.

[Screenshot: VUM add additional patches option]

 

Now add the new baseline to the appropriate Baseline groups as needed, scan and remediate, and you’re off to the races.

How cool is that?