Have you ever opened up the settings on one of your vSphere Clusters, only to find HA wasn’t configured to your standards? Did someone do some work during the day and forget to put the settings right after their change?
After encountering this scenario far too many times, I decided to automate the problem away using vCenter Orchestrator. I created a workflow that runs every night to ensure all settings are correct.
Goal: Create an automated workflow that sets the following
HA Settings
- Enable HA
- Enable Host Monitoring
- Enable Admission Control
- Calculate the correct HA percentages based on the number of hosts in the cluster (I’ll discuss this more later).
DRS Settings
- Enable DRS
- Set DRS to Fully Automated.
In addition, we need to ensure that no other custom settings get overwritten (for example, VM-specific DRS/HA settings).
A quick note about HA admission control options:
The choice really comes down to three policies for enforcing admission control when it is enabled.
- Host failures the cluster tolerates
- Percentage of cluster resources reserved as failover spare capacity
- Specify failover hosts
I won’t go into a deep discussion of each one here. If you select Host failures the cluster tolerates and set it to 1 host, which is fairly common, the HA calculations are based entirely on the slot size. The problem is that very large VMs skew the slot size. You can manually override the slot size that vCenter calculates, but then you have to keep maintaining and updating that override.
Duncan Epping has a great article on this at http://www.yellow-bricks.com/vmware-high-availability-deepdiv/#HA-admission
I decided to calculate my HA Admission control percentage for CPU and Memory based on the resources of a single host failing.
var HApercent = ((1/Hosts.length)*100);
HApercent = HApercent.toFixed(0);
I decided to round the value to the nearest whole number, which is what toFixed(0) does. It could be argued that with 3 hosts this formula reserves 33% (not 34%) of resources, which is slightly less than one full host. Either way, I don’t think the 1% is going to be a huge problem in an HA event.
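To make the rounding trade-off concrete, here is a minimal sketch in plain JavaScript that compares rounding to the nearest whole number with always rounding up. The helper name failoverPercent and its roundUp flag are hypothetical and not part of the workflow. Unlike toFixed(0), which returns a string, this helper returns a number. Inside vCO you would log with System.log rather than console.log.

```javascript
// Hypothetical helper for illustration; not part of the original workflow.
// Returns the percentage of cluster resources that a single host represents.
function failoverPercent(hostCount, roundUp) {
    if (hostCount < 1) {
        throw new Error("Cluster has no hosts");
    }
    var raw = 100 / hostCount;
    // Math.round gives the nearest whole number;
    // Math.ceil never reserves less than one full host.
    return roundUp ? Math.ceil(raw) : Math.round(raw);
}

// Compare both strategies for a few cluster sizes.
[2, 3, 4, 7, 8].forEach(function (n) {
    console.log(n + " hosts: nearest=" + failoverPercent(n, false) +
                "%, rounded up=" + failoverPercent(n, true) + "%");
});
```

If the missing 1% bothers you, rounding up is the conservative choice, at the cost of reserving slightly more than one host's worth of capacity on some cluster sizes.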
How to automate using vCenter Orchestrator:
Now we know what we are going to do, how do we automate it in Orchestrator?
1. Create the workflow to configure these settings on a specific cluster.
Make sure to add the general attributes for drsBehaviour, haHostMonitoring, and HApercent. You can do this now, or you can create them when you create the inputs for each custom script. vCO gives you the option to create a new general attribute with exactly the same name when you do this. This is my preferred way to do it, as it keeps your naming consistent throughout your sub-workflows.
2. Create the first scriptable task to calculate your HA % value.
var Hosts = System.getModule("com.vmware.library.vc.cluster").getAllHostSystemsOfCluster(cluster);
System.log("Number of Hosts in Cluster: " + Hosts.length);
var HApercent = ((1/Hosts.length)*100);
HApercent = HApercent.toFixed(0);
System.log("HA Percent which will be used for cluster is: " + HApercent);
Quick description of variables used:
cluster – The custom script takes the input “cluster”, which is also the input to the main workflow. It is of type VC:ClusterComputeResource.
HApercent – This is an output variable of the custom script which maps back to the HApercent general attribute so it can be used later on in the main workflow
Basically all this script does is take the cluster, find out how many hosts there are, then use this number to calculate the percentage of host resources to reserve when configuring admission control.
3. Create the scriptable task to configure all the relevant settings on the cluster (This is the main piece)
//Check for HA Enabled
haEnabled = System.getModule("com.vmware.library.vc.cluster").haEnabledCluster(cluster);
if (haEnabled) {
    System.log(cluster.name + " is already HA enabled");
} else {
    System.log("HA DISABLED - " + cluster.name + " will be enabled for HA");
}

//Check for DRS Enabled
drsEnabled = System.getModule("com.vmware.library.vc.cluster").drsEnabledCluster(cluster);
if (drsEnabled) {
    System.log(cluster.name + " is already DRS enabled");
} else {
    System.log("DRS DISABLED - " + cluster.name + " will be enabled for DRS");
}

System.log("Creating HA/DRS specifications");
var clusterConfigSpec = new VcClusterConfigSpecEx();
clusterConfigSpec.drsConfig = new VcClusterDrsConfigInfo();
clusterConfigSpec.dasConfig = new VcClusterDasConfigInfo();

//Enable DRS/HA
System.log("Setting HA and DRS to Enabled (even if they were already)");
clusterConfigSpec.dasConfig.enabled = true;
clusterConfigSpec.drsConfig.enabled = true;

//Set DRS to FullyAutomated
System.log("Setting DRS to Fully Automated");
clusterConfigSpec.drsConfig.defaultVmBehavior = drsBehaviour;

//Set the HA admission control policy to a percentage of cluster resources
System.log("Updating HA Admission Control policy for " + cluster.name);
clusterConfigSpec.dasConfig.admissionControlPolicy = new VcClusterFailoverResourcesAdmissionControlPolicy();
clusterConfigSpec.dasConfig.admissionControlEnabled = true;
clusterConfigSpec.dasConfig.hostMonitoring = haHostMonitoring;
clusterConfigSpec.dasConfig.admissionControlPolicy.cpuFailoverResourcesPercent = HApercent;
clusterConfigSpec.dasConfig.admissionControlPolicy.memoryFailoverResourcesPercent = HApercent;

System.log("Executing Cluster Reconfiguration for " + cluster.name);
//Reconfigure the cluster; passing true as the second parameter ensures any previous settings remain
task = cluster.reconfigureComputeResource_Task(clusterConfigSpec, true);
The most important part is right at the bottom
task=cluster.reconfigureComputeResource_Task(clusterConfigSpec,true)
The second argument is the modify flag. With true, the specification is applied incrementally, so any settings you did not specify are left intact. An example might be a specific VM DRS rule you had configured. Without true, the cluster configuration is made to match the specification exactly: everything else is wiped back to defaults and only the settings specified above are set.
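The difference between the two modes can be pictured with a toy model in plain JavaScript. This is only a sketch: plain objects stand in for the real cluster configuration, the applySpec function and the DEFAULTS values are hypothetical, and the per-VM override entry is made-up sample data. It does not call vCenter or reproduce its actual defaults.

```javascript
// Toy model only: plain objects stand in for the real cluster configuration.
// These default values are illustrative assumptions, not vCenter's real defaults.
var DEFAULTS = { haEnabled: false, drsEnabled: false, drsBehavior: "manual", vmOverrides: [] };

function applySpec(current, spec, modify) {
    // modify = true: start from the current config (incremental change).
    // modify = false: start from defaults (config ends up matching the spec).
    var base = modify ? current : DEFAULTS;
    var result = {};
    Object.keys(base).forEach(function (key) { result[key] = base[key]; });
    // Only the properties that are set in the spec are overwritten.
    Object.keys(spec).forEach(function (key) { result[key] = spec[key]; });
    return result;
}

// Made-up starting state with one per-VM override we want to keep.
var current = { haEnabled: false, drsEnabled: true, drsBehavior: "partiallyAutomated",
                vmOverrides: ["example-vm: DRS disabled"] };
var spec = { haEnabled: true, drsEnabled: true, drsBehavior: "fullyAutomated" };

var incremental = applySpec(current, spec, true);
var replaced = applySpec(current, spec, false);
console.log("modify=true keeps overrides: " + (incremental.vmOverrides.length > 0));
console.log("modify=false keeps overrides: " + (replaced.vmOverrides.length > 0));
```

In both cases HA and DRS end up enabled and fully automated; only the incremental call preserves the settings the spec did not mention.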
4. Now run this workflow against a cluster and check that you get the desired result! Change a few settings, and run it again. You should see everything configured to your liking again.
How do I get this to run against my entire environment?
Simple! Create another workflow which has all your clusters stored as general attributes. Then have this new workflow pass through all your clusters, calling the workflow you created above.
Schedule this in vCO to run every night and you can rest comfortably knowing that all your settings are as they should be.
If someone adds a new host to a cluster, there is no need to check the HA settings; they will be corrected that night.
If you need to keep a cluster in, say, partially automated mode for a while, simply remove that cluster’s object from the general attribute. When you are ready to add it back, just click it back in.
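For anyone who prefers to see the wrapper as script rather than as a workflow schema, here is a rough sketch of the loop in plain JavaScript. The names configureAll, configureCluster, and log are hypothetical: clusters stands in for the general attribute holding your clusters, and configureCluster for whatever call runs the per-cluster workflow. The try/catch is an addition of mine so that one failing cluster does not stop the rest of the nightly run.

```javascript
// Hypothetical sketch: `clusters` stands in for the general attribute holding
// the clusters, and `configureCluster` for a call to the per-cluster workflow.
function configureAll(clusters, configureCluster, log) {
    var failed = [];
    clusters.forEach(function (cluster) {
        try {
            log("Configuring " + cluster.name);
            configureCluster(cluster);
        } catch (e) {
            // Record the failure and carry on with the remaining clusters.
            log("FAILED " + cluster.name + ": " + e);
            failed.push(cluster.name);
        }
    });
    return failed;
}

// Stand-in data and callbacks so the sketch runs outside vCO.
var demoClusters = [{ name: "Cluster-A" }, { name: "Cluster-B" }, { name: "Cluster-C" }];
var failedClusters = configureAll(demoClusters, function (cluster) {
    if (cluster.name === "Cluster-B") {
        throw new Error("simulated failure");
    }
}, function (message) { console.log(message); });
console.log("Clusters needing attention: " + failedClusters.join(", "));
```

Returning the list of failed clusters gives you something to log or alert on in the morning, instead of finding out the hard way that the run stopped halfway through.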
This is just one way to keep your environment in check. My next post will be around VCAP topics, and then I will be writing another workflow on creating a Self Service Snapshot portal.
Finally, if anyone wants the actual workflow files for the above workflows, I am more than happy to get them uploaded for you.