In a previous post I talked about how failover in SMA works. As you could read in that post, there is a manual step to transfer jobs from one worker to another. This process can of course be automated in a couple of different ways; you can use Orchestrator, for example. Today I will show how it can be done with Operations Manager.
I have set up a rule in Operations Manager that runs a command every five minutes. The command starts a PowerShell script that checks how all workers are doing. First it checks which workers exist in the environment, and then it checks whether the Runbook Service is running on each of them.
If any of the workers (I have two in my sandbox) is not working, the script will exclude it from the SMA configuration. The script will also stop and start the Runbook Service (rbsvc) on the working worker server. If the script makes any changes, it writes an event to the Application event log. That event is picked up by another rule that generates an alert in Operations Manager.
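The decision the script makes can be sketched roughly as follows. This is a language-neutral illustration in Python, not the PowerShell script itself. The function name `plan_failover`, the `FailoverPlan` type and the worker name SMA01 are hypothetical, and the rule "change nothing if no worker is healthy" is an assumption that the post does not state.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FailoverPlan:
    keep: List[str]      # workers that stay in the SMA configuration
    exclude: List[str]   # workers whose Runbook Service is not running
    changed: bool        # True when the configuration has to be rewritten


def plan_failover(service_running: Dict[str, bool]) -> FailoverPlan:
    """Decide which workers to keep, given worker name -> 'service is running'."""
    keep = [name for name, ok in service_running.items() if ok]
    exclude = [name for name, ok in service_running.items() if not ok]
    # Assumption: with no healthy worker left there is nowhere to move jobs,
    # so the configuration is left untouched in that case.
    changed = bool(keep) and bool(exclude)
    return FailoverPlan(keep=keep, exclude=exclude, changed=changed)
```

Calling `plan_failover({"SMA01": False, "SMA02": True})` gives a plan that keeps only SMA02 and sets `changed` to true. In the real script that would be the point where the configuration is updated, the Runbook Service is restarted on the remaining worker and the event is written.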
If I then run Get-SmaRunbookWorkerDeployment, I can verify that SMA02 is the only worker in my environment. I can also see in my log table that the runbook has resumed on the SMA02 worker.
A couple of comments about the script. First, I have hardcoded https://wap01 in the script; WAP01 is my SMA web service. The SMA management pack discovers this component, so the script could find the web service hostname from that discovery. It could also find the workers based on the default discoveries in the SMA management pack. Second, if a worker goes offline, this script will exclude it, but it will not include the worker again when it comes back online. That has to be done manually or with an updated version of the script.
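One way an updated version could handle the second point is to compare three things on every run: a list of all known workers kept outside the SMA configuration (for example from the management pack discoveries), the workers currently deployed, and the service state of each. The sketch below illustrates that idea in Python. The function name `reconcile_workers` and the worker names are hypothetical, and treating a worker with unknown status as unhealthy is an assumption.

```python
from typing import Dict, List, Tuple


def reconcile_workers(
    known: List[str],
    deployed: List[str],
    service_running: Dict[str, bool],
) -> Tuple[List[str], List[str]]:
    """Return (to_exclude, to_include) for the current run."""
    # Deployed workers whose service is down (or unknown) should be excluded.
    to_exclude = [w for w in deployed if not service_running.get(w, False)]
    # Known workers that are not deployed but are healthy again can be re-included.
    to_include = [
        w for w in known
        if w not in deployed and service_running.get(w, False)
    ]
    return to_exclude, to_include
```

The separate `known` list matters because an excluded worker no longer shows up in the SMA configuration, so the script cannot rediscover it from there.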
Note that this is provided “AS-IS” with no warranties at all. This is not a production-ready solution, just an idea and an example.