I have built a scenario where I use Opalis, Virtual Machine Manager, Operations Manager and Service Manager to control the number of web servers in a web farm. The web farm is behind an NLB cluster. The goal is to have only as many IIS machines running as the load requires, with no extra IIS machines sitting idle. If one IIS can handle the current load, only one IIS should be online.
We use Service Manager to track what is happening. Opalis executes and Service Manager remembers. As soon as Opalis is done, it drops everything; it only keeps data in memory during execution (and maybe some logs). Working with Service Manager gives us a great way to track everything.
Operations Manager monitors my IIS machines. If IIS01 is running low on resources, an alert is raised. If any other IIS in the web farm is idle, another alert is raised. When an alert about high load on IIS01 is raised, the first Opalis policy starts. When an alert about an idle IIS is raised, the “shut down” policy starts.
The purpose of this policy is to start an extra IIS that is already a member of the web farm. That IIS then relieves pressure on IIS01, which is running low on resources.
- The first activity monitors Operations Manager for new alerts about IIS01 running low on resources
- Set the resolution state of the alert in Operations Manager to an Opalis resolution state. This makes sure no one else picks it up; instead you can see in the Operations Manager console that Opalis is working on it
- We use a counter to decide which IIS to start; this activity resets the counter to 0
- This activity gets all running machines from Virtual Machine Manager whose names include IIS0. I have three IIS machines, named IIS01, IIS02 and IIS03.
- For each running IIS machine, add one to the counter (+1)
- The merge step is used to merge multiple threads; we don’t want to run the rest of the policy more than once. If step 4 returns multiple IIS machines, a Junction activity is a good way to terminate parallel threads. One thing to remember here is that the merge step doesn’t publish any data from the activities before it, which is why we need to read the counter again.
- Read the counter value
- Add one (+1) to the counter. Each running VM added one to the counter and we want to start the next free IIS. If there is only one IIS running, the value will be “1”; we then add “1” and get “2”. In step 9 we use that “2” to start IIS02
- Get the VM with the name IIS0<Output from 8>, for example IIS02
- Create a change request in Service Manager saying the web farm is running low on resources and we need to add an extra IIS. We also note that we need to start the machine that we picked up in step 9
- Start the virtual machine. If it fails, we set the change request to failure and add information about it, and also generate an alert in Operations Manager
- Wait four minutes to make sure the virtual machine is up and running
- Trigger a policy to take the machine out of maintenance mode in Operations Manager
- Wait three minutes to make sure maintenance mode has stopped
- Check if the web page is working on the IIS machine. If it fails, we set the change request to failure and add information about it, and also generate an alert in Operations Manager
- Update the change request with the result and change it to completed
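The counter arithmetic in steps 3 to 9 can be sketched in Python. This only illustrates the logic and is not how Opalis implements it; the function name `next_iis_to_start` and the list of running VM names passed to it are hypothetical.

```python
def next_iis_to_start(running_vms, prefix="IIS0"):
    """Return the name of the next IIS VM to start, given the running VM names."""
    counter = 0  # step 3: reset the counter
    for name in running_vms:  # steps 4-5: add one per running IIS VM
        if name.startswith(prefix):
            counter += 1
    counter += 1  # step 8: point at the next free IIS
    return f"{prefix}{counter}"  # step 9: build the VM name, e.g. IIS02


print(next_iis_to_start(["IIS01"]))  # prints IIS02
```

Like the policy, the sketch assumes the IIS machines are started and stopped in numerical order, so the count of running machines always points at the next free name.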
The Stop Maintenance Mode policy is used to stop maintenance mode for machines that were shut down by Opalis earlier. This policy checks whether the affected machine is in maintenance mode by checking Windows Computer (1), Health Service (2) and Health Service Watcher (4). We use a SQL query to get the agent watcher ID (3). These are the three objects that Opalis puts into maintenance mode when it shuts down an IIS in this example. Another example of Opalis and maintenance mode here.
The stop IIS policy looks for an alert in Operations Manager saying an IIS is idle and then shuts down that IIS. In one version of this policy I added a run command activity that drains the active sessions from the IIS before shutdown.
- Monitor for an alert saying that an IIS is idle
- Set the resolution state of the alert in Operations Manager to an Opalis resolution state. This makes sure no one else picks it up; instead you can see that Opalis is working on it
- Create a change request in Service Manager saying we will shut down an IIS, including name and reason
- Get the VM to shut down
- Put the Windows Computer into maintenance mode
- Put the Health Service into maintenance mode
- Query the OperationsManager database for the agent ID
- Put the agent watcher into maintenance mode
- Wait three minutes to make sure the maintenance mode is active
- Shut down the machine (VM)
- Wait four minutes to make sure the machine is down
- Verify that the machine is down; otherwise update the change request with a failure status and generate an alert in Operations Manager
- Update the change request with success and set status to completed
- Close the alert in Operations Manager
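The order of the shut-down steps, from the change request onward, can be sketched as below. This is a rough illustration, not an export of the policy: `FakeOps` and all of its method names are hypothetical stand-ins for the Opalis activities, the change request ID is made up, and closing the alert only on success is my assumption.

```python
import time


class FakeOps:
    """Hypothetical stand-in for the real products; it only records calls."""

    def __init__(self, vm_stops=True):
        self.vm_stops = vm_stops
        self.running = True
        self.log = []

    def create_change_request(self, title):
        self.log.append(("create_cr", title))
        return "CR-1"  # made-up change request ID

    def start_maintenance_mode(self, vm_name, target):
        self.log.append(("maintenance", target))

    def shut_down_vm(self, vm_name):
        self.log.append(("shutdown", vm_name))
        if self.vm_stops:
            self.running = False

    def is_vm_running(self, vm_name):
        return self.running

    def update_change_request(self, cr_id, status):
        self.log.append(("update_cr", status))

    def raise_alert(self, text):
        self.log.append(("alert", text))

    def close_alert(self, vm_name):
        self.log.append(("close_alert", vm_name))


def shut_down_idle_iis(vm_name, ops, sleep=time.sleep):
    """Run the shut-down steps in order; return True if the VM went down."""
    cr_id = ops.create_change_request(f"Shut down idle IIS {vm_name}")
    # The three objects that are put into maintenance mode before shutdown.
    for target in ("Windows Computer", "Health Service", "Health Service Watcher"):
        ops.start_maintenance_mode(vm_name, target)
    sleep(180)  # wait three minutes for maintenance mode to become active
    ops.shut_down_vm(vm_name)
    sleep(240)  # wait four minutes for the machine to go down
    if ops.is_vm_running(vm_name):
        ops.update_change_request(cr_id, "Failed")
        ops.raise_alert(f"{vm_name} did not shut down")
        return False
    ops.update_change_request(cr_id, "Completed")
    ops.close_alert(vm_name)
    return True


# Skip the real waits when trying the sketch out.
print(shut_down_idle_iis("IIS02", FakeOps(), sleep=lambda seconds: None))
```

Passing the wait function in as a parameter keeps the three and four minute delays out of test runs while leaving the step order unchanged.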
A couple of pictures of change requests in Service Manager
In this scenario I used two unit monitors in Operations Manager, each triggering on a performance counter, to decide whether IIS01 was running low on resources or another IIS was idle. As the NLB distributes the load equally between my IIS machines, I only measure load on IIS01; if I had two IIS machines online and IIS01 was low on resources, IIS02 was too.
In the first policy there should be a thread that checks whether all IIS machines are already running and, if so, creates a change request saying we need more IIS machines in the web farm. Alternatively it could trigger another policy to create a new VM, configure it and include it in the web farm.
This export file has met “Mr. Wolf”, so it should not contain any unnecessary settings or objects.
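As a rough sketch of how the two monitors map to the two policies: the function name `decide_action` and both thresholds (80 and 10) are made up for illustration and are not the values used in my monitors.

```python
def decide_action(iis01_load, other_loads, high=80.0, idle=10.0):
    """Map performance-counter samples to the policy that should run."""
    if iis01_load >= high:  # IIS01 is running low on resources
        return "start extra IIS"
    if any(load <= idle for load in other_loads):  # another IIS is idle
        return "shut down idle IIS"
    return "no action"


print(decide_action(92.0, []))      # prints: start extra IIS
print(decide_action(35.0, [4.0]))   # prints: shut down idle IIS
```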