Nutanix administration do’s and don’ts

As a virtualization consultant, I know there’s a wide variety of technologies at every level – hypervisor, storage, networking, and even server hardware is getting to some degree more complex in terms of what you need to know to manage it effectively.  Everyone can’t be an expert in every single storage technology as an example, and with more and more options that are radically different in their architecture, right now I wanted to make my own little contribution to the world for consultants and admins alike on basic things you should and shouldn’t do with one storage solution – Nutanix.  For us consultants, we often find ourselves within environments with something we’re not totally familiar with, so some helpful concise guidance can go a long way.  Admins, too, may have depended upon a consultant or previous colleagues that no longer work there for implementation and support, but now it’s on them, so I thought this would be helpful.

There are quite a few things everyone should know if they ever are working on a environment with Nutanix that aren’t necessarily obvious.  I can see it being pretty darn easy to blow up a Nutanix environment if you’re not aware of some of these things.

Common stuff

  • Contact Nutanix Support before downgrading licensing or destroying cluster to reclaim licenses (unnecessary if you’re using Starter licensing though). This was repeated many times, so I’m guessing if this isn’t done, you’ll be hating life getting licensing straight.
  • Do NOT delete the Nutanix Controller VM on any Nutanix host (CVM names look like: NTNX-<blockid>-<position>-CVM)
  • Do NOT modify any settings of a Controller VM, all the way down to even the name of the VM.
  • Shutdown/Startup gotchas:
    • It’s probably best to never shutdown/reboot/etc. more than one Nutanix node in a cluster at a time. If you do more, you may cause all hosts in the Nutanix cluster to lose storage connectivity.
    • When shutting down a single host or < the redundancy factor (Nutanix number of hosts it is configured to tolerate failure in a Nutanix cluster), migrate/shutdown all VMs on host EXCEPT the controller VM, THEN shutdown the controller VM.
    • If you are shutting down a number of hosts that exceeds the redundancy factor, you need to shutdown the Nutanix cluster. There’s also a specialized procedure to start up the Nutanix cluster in this situation.  That’s beyond the scope of this email.
    • When booting up a host, do the following:
      • start the Controller VM first that resides on it, and verify it’s services are working by SSH to it using:
        • Ncli cluster status | grep –A 15 <controllerVmIP>
      • Then have it rescan its datastores.
      • Then verify the Nutanix Cluster state using the following to ensure cluster services are all up via same SSH session:
        • cluster status
  • Hypervisor Patching
    • Make sure to patch one hypervisor node and ensure Controller VM comes back up with services are good before proceeding to the next one. Also do one at a time in a Nutanix cluster (see above).
    • Follow shutdown host procedure above.

vSphere

  • NEVER use “Reset System Configuration” command in Nutanix.
  • If resource pools are created, Controller VM (CVM) must have the highest share.
  • Do NOT modify NFS settings.
  • VM swapfile location should be the same folder as the VM. Do NOT place it on a dedicated datastore.
  • Do NOT modify the Controller VM startup/shutdown order.
  • Do NOT modify iSCSI software adapter settings.
  • Do NOT modify vSwitchNutanix standard vSwitch.
  • Do NOT modify Vmk0 interface in port group “Management Network”.
  • Do NOT disable ESXi host SSH.
  • HA configuration recommended settings:
    • Enable admission control and use percentage based policy with value based on number of nodes in cluster
    • Set VM Restart Priority for CVMs to Disabled.
    • Set Host Isolation Response of cluster to Power Off
    • Set Host Isolation Response of CVMs to Leave Powered ON.
    • Disable VM Monitoring for all CVMs
    • Enable Datastore Heartbeating by clicking Select only from my preferred datastores and choosing Nutanix datastores. If cluster has only one datastore (which would be common potentially in Nutanix deployments), add advanced option das.ignoreInsufficientHbDatastore=true to avoid warnings about not having at least two heartbeat datastores.
  • DRS stuff:
    • Disable automation of all CVMs
    • Leave power management disabled (DPM)
  • Enable EVC for lowest processor class in cluster.

Hyper-V

  • Do NOT use Validate Cluster within Failover Clustering nor SCVMM, as it is not supported. Not sure what would happen if you did, but I’m guessing it would be pretty awesome, and you probably should make sure you got popcorn ready if you’re gonna do that.
  • Do NOT modify the Nutanix or Hyper-V cluster name
  • Do NOT modify the external network adapter name
  • Do NOT modify the Nutanix specific virtual switch settings

KVM (the Hypervisor… also assuming this means if you’re using Acropolis Hypervisor from Nutanix since it’s KVM based…)

  • Do NOT modify the Hypervisor configuration, including installed packages
  • Do NOT modify iSCSI settings
  • Do NOT modify the Open vSwitch settings

I hope this proves helpful to people who unexpectedly find themselves working on Nutanix and need a quick primer to ensure they don’t break something!