Category Archives: System Center Operations Manager 2007

Contoso.se

Welcome to contoso.se! My name is Anders Bengtsson and this is my blog about Azure infrastructure and system management. I am a senior engineer in the FastTrack for Azure team, part of Azure Engineering, at Microsoft. Contoso.se has two main purposes: first, as a platform to share information with the community, and second, as a notebook for myself.

Everything you read here is my own personal opinion and any code is provided "AS-IS" with no warranties.

Anders Bengtsson

MVP
MVP awarded 2007,2008,2009,2010

Last 30 Minutes Performance Data

I wrote a SQL query that I thought I would share. It shows the performance data collected for a specific machine during the last 30 minutes. I was working on an issue where some agents stopped sending performance data. Until we found the root cause and a fix, we ran this query from a monitor so that we got an alert if an agent was not sending performance data. Information on how to configure a monitor to run a SQL query can be found here.

select Path, ObjectName, CounterName, InstanceName, SampleValue, TimeSampled 
from PerformanceDataAllView pdv with (NOLOCK)
inner join PerformanceCounterView pcv on pdv.performancesourceinternalid = pcv.performancesourceinternalid
inner join BaseManagedEntity bme on pcv.ManagedEntityId = bme.BaseManagedEntityId
where path = 'dc01.contoso.local' AND (TimeSampled < GETUTCDATE() AND TimeSampled > DATEADD(MINUTE,-30, GETUTCDATE()))
order by timesampled DESC
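
The monitor only needs to know whether the query returned any rows inside the window. Here is a minimal sketch of that check in Python, assuming the TimeSampled values have already been read into a list of UTC datetimes. The function name and the sample data are made up for illustration and are not part of the original monitor.

```python
from datetime import datetime, timedelta


def has_recent_samples(sample_times, now, window_minutes=30):
    """Return True if at least one sample falls inside the last window_minutes."""
    window_start = now - timedelta(minutes=window_minutes)
    # Same bounds as the WHERE clause: window_start < TimeSampled < now
    return any(window_start < t < now for t in sample_times)


# Hypothetical sample data: one agent reported 15 minutes ago, one an hour ago.
now = datetime(2012, 4, 10, 12, 0, 0)
fresh = [datetime(2012, 4, 10, 11, 45, 0)]
stale = [datetime(2012, 4, 10, 11, 0, 0)]

fresh_ok = has_recent_samples(fresh, now)
stale_ok = has_recent_samples(stale, now)
```

A monitor built on this idea would go unhealthy when the check returns False, which matches the case where the query returns no rows.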

 

Building groups in Operations Manager, with a bit of Orchestrator magic

In many scenarios you have a list of servers, a database query result or a place in your Active Directory that contains servers that you want to monitor in some special way. Often you need the machines in a group in Operations Manager so that you can, for example, create overrides, maintenance mode and views for that group. Building the group manually and then keeping it updated is pretty boring work.

One way to keep the Operations Manager group in sync with the machine list is to use a runbook that creates a management pack containing a group based on the list. This set of example runbooks reads a list of machines and creates a management pack with a group that includes those machines. The list of servers could be generated by another runbook or another tool. The last runbook imports the management pack into Operations Manager.

The first runbook executes the following steps. In general, this runbook checks whether the machines in the list have an Operations Manager agent, that is, whether they are monitored by Operations Manager.

  1. Delete File. Deletes the old Machines_IDS.txt file if it exists. Machines_IDS.txt is used later in the runbook and needs to be blank before we begin
  2. Get Lines. Reads all lines in the list. The list is simply a text file with servers, one server per row
  3. Get Monitor. Checks whether Operations Manager has a Microsoft.Windows.Computer monitor for the servers in the text file
  4. Append Line. For each machine that has a monitor, we write the machine name to a temporary file. This is the same file that step 1 deleted any old version of
  5. Junction. We merge multiple threads together
  6. Invoke Runbook. Triggers the next runbook

The second runbook executes the following steps. In general, it builds the management pack file in XML.

  1. Delete File. Deletes old MP files
  2. Modify Counter. We use a counter to keep track of the management pack version number. This step adds one to that counter value
  3. Get Counter Value. Gets the counter value for the same counter as in step 2
  4. Append Line. This step writes the first half of the XML code that needs to be in the management pack. The GroupInstanceID is a random ID that the Operations Manager console generated when I created a test group in the console. You could replace that and all the other names in the management pack.
  5. Read Line. This step reads every machine that we wrote to the machine list in the first runbook, step 4
  6. Append Line. This step writes all the machines from step 5 into the management pack file
  7. Junction. We merge multiple threads together
  8. Append Line. Writes the end of the management pack, some more XML
  9. Invoke Runbook. Starts the last runbook and passes the path to the management pack file

The last runbook imports the management pack file into Operations Manager.

The result is that each time you run this set of runbooks they will generate a new management pack version with a group that includes all the machines from your list that have an agent. The management pack is imported into Operations Manager and you can use the updated group. You could include a step to seal the management pack too. You can download my runbook example here, 20120410_GroupSync_WOLF.  Please note that this is provided “as is” with no warranties at all.
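
The second runbook is essentially "bump a version counter, write a head, write one entry per machine, write a tail". A rough sketch of that assembly in Python, outside Orchestrator, could look like the code below. The templates are placeholders written as XML comments and are not real management pack XML; the @VERSION@ and @MACHINE@ tokens and the function name are assumptions for illustration only.

```python
def build_management_pack(machines, version, head_template, machine_template, tail):
    """Assemble the file the same way the runbook does: head, one entry per machine, tail."""
    parts = [head_template.replace("@VERSION@", version)]
    for name in machines:
        # One appended fragment per machine that passed the agent check
        parts.append(machine_template.replace("@MACHINE@", name))
    parts.append(tail)
    return "\n".join(parts)


# The runbook keeps the version in a counter and adds one on every run.
counter = 41
counter += 1

mp_text = build_management_pack(
    machines=["dc01.contoso.local", "web01.contoso.local"],
    version=f"1.0.0.{counter}",
    head_template="<!-- placeholder head, version @VERSION@ -->",
    machine_template="<!-- placeholder entry for @MACHINE@ -->",
    tail="<!-- placeholder tail -->",
)
```

Plain token replacement is used instead of str.format so that curly braces in the real XML, for example around GUIDs, do not need escaping.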

Ubuntu Server in Operations Manager

Earlier this week I did some tests around Ubuntu Server and Operations Manager 2012. I did the same in Operations Manager 2007 R2, and the way to get the monitoring to work is almost the same in both products. All the challenges I met were the same in both products. Before we continue I would like to remind you that Ubuntu is not supported by Microsoft in Operations Manager 2007 or 2012. The management pack and the agent I am using are community projects and are not supported either.

I installed an x86 Ubuntu Server, version 10.04.3. I configured it with a static IP address (sudo vi / etc/network/interfaces; you might need to remove the DHCP client to keep that setting static: sudo apt-get remove dhcp-client), configured DNS settings (sudo vi / etc/resolv.conf) and restarted networking (sudo / etc/init.d/networking restart). Note that there is a space in front of etc, due to some security setting in the blog platform :)

If you are in a sandbox and don’t care about the firewall you can disable it by running sudo ufw disable. I would not recommend that for production servers, but I would not recommend using an unsupported agent either 🙂

The first discovery resulted in this error

The second try, after updating the forward and reverse DNS zones, resulted in this error

As I didn’t have a management pack for Ubuntu or an Ubuntu agent, I thought getting those could be a good next step. There is an Ubuntu agent and an Ubuntu management pack at Codeplex that you can download and extract. You will notice there are two GetOSVersion.sh files. According to the instructions at Codeplex, you should use these files to replace the default file on your management server (C:\Program Files\System Center Operations Manager 2012\Server\AgentManagement\UnixAgents). Operations Manager copies this file over to the Linux/UNIX machine (/ tmp/scx-username) during discovery and executes the script. The script works out what kind of Linux/UNIX it is and reports back to Operations Manager, which decides whether it has a management pack for that version. The challenge is that we have two files. To decide which one to use, you can copy them over to your Ubuntu machine and run them manually. You will then see that only the GetOSVersion.sh that came with the management pack returns valid XML. In other words, copy the GetOSVersion.sh from the agent folder to your UnixAgents folder on the management server.

After that I still had some problems with the discovery, so I installed the agent manually on the Ubuntu machine (sudo dpkg -i scx-1.0.4-265.Ubuntu.10.x86.deb) and restarted the server (sudo reboot). After the reboot I verified that the Microsoft SCX CIM server was running (ps -ef|grep scx).

Then I ran the discovery again and a new error showed up. As you can see in the picture below, there seems to be a problem with the certificate that the Ubuntu machine is trying to use. Normally the Linux machine gets a certificate signed by the management server during the discovery. But in this example we installed the agent manually, so the certificate is self-signed by the Ubuntu machine. If you copy the certificate file (/ etc/opt/microsoft/scx/ssl/scx-host-ubuntu02.pem) to a Windows machine and rename it to .cer you can open it and look at it. To solve this certificate issue, copy the certificate from your Linux box to your management server and run scxcertconfig -sign scx-host-<hostname>.pem scx_new.pem. Then rename scx_new.pem to the name of your Linux generated certificate and replace it on your Linux box. Restart the SCX service (sudo scxadmin -restart).

After that the discovery worked fine and the Ubuntu machine showed up healthy in the console. Don’t forget to configure accounts and profiles for your Ubuntu machine.

Please note that this is unsupported by Microsoft and provided “as is” with no warranties at all.

Maintenance Mode Report (part II)

In the Notification and reporting for maintenance mode post we created a report of every object that is in maintenance mode. I updated that script today: instead of showing all objects that are in maintenance mode, the report now only shows computer objects. You can download the script MMReport.txt (rename to .ps1). As you can see in the last two lines of the script, the script stops itself. These lines are needed if you want to run the script from Orchestrator with the “Run Program” activity; otherwise the activity will not finish and the runbook will not move on.

$objCurrentPSProcess = [System.Diagnostics.Process]::GetCurrentProcess();
Stop-Process -Id $objCurrentPSProcess.ID;

If you want to run this in Orchestrator, for example every 15 minutes to generate an updated maintenance mode report, you can use a “Monitor Date/Time” activity followed by a “Run Program” activity. You can configure the “Run Program” activity with the following settings

  •  Program execution
  • Computer: FIELD-SCO01 (name of a suitable server with Operations Manager shell installed)
  • Program path: powershell.exe
  • Parameters: -command C:\scripts\MMreport.ps1
  • Working folder: (no value)

Remember that your Orchestrator runbook server service account needs permissions in Operations Manager to get the info. With this sample script the output file will be C:\temp\MMreport.htm. Thanks to Stefan Stranger for PowerShell ideas.

Please note that this is provided “as is” with no warranties at all.

Forward Alerts by E-mail

In some scenarios you want to forward an alert to an engineer directly from the Operations Manager console. In this post I will show you an example of how that can be done with a task. The task runs a PowerShell script that picks up properties of the alert and forwards them by e-mail. Start by copying the script, mailforward.ps1, to C:\scripts on the machine running the console. Then create the console task:

  1. In the Operations Manager console, navigate to Authoring > Tasks and create a new task
  2. Create a Console Tasks/Alert command line. Select a destination management pack and click Next
  3. On the General Properties page, input a task name, for example “Forward by e-mail”
  4. On the Command Line page, input
    • Application: %SystemRoot%\system32\WindowsPowerShell\v1.0\powershell.exe
    • Parameters: C:\scripts\mailforward.ps1 ‘$Name$’ ‘$Description$’ ‘$Managed Object Name$’
    • Working directory: C:\
    • Check “Display output when this task is run”
  5. Click OK to save the task

Now, when you select an alert you can see your new task in the actions pane. If you click it you can input a note and a recipient, and in the task output you will see the complete e-mail that is sent.

Note that in this example you only need to input the e-mail alias, not the complete e-mail address. If you need to input a complete e-mail address you will need to update the PowerShell script. You also need to update the script with your e-mail domain, from address and mail server. You can download the script here, mailforward. Place it in C:\scripts on each machine that is running the console, or on a shared disk. Make sure that your task is using the correct path to the script. Do not forget to allow your console workstation to send e-mail through your mail server.
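
To make the alias handling concrete, here is a small sketch in Python of how such a script might compose the message. It is not the downloadable mailforward.ps1; the function name, the subject prefix and the example values are assumptions for illustration.

```python
from email.message import EmailMessage


def build_forward_message(alias, mail_domain, from_address,
                          alert_name, description, managed_object, note=""):
    """Compose the forwarded alert; the recipient is the alias plus the mail domain."""
    msg = EmailMessage()
    msg["From"] = from_address
    msg["To"] = f"{alias}@{mail_domain}"
    msg["Subject"] = f"Forwarded alert: {alert_name}"
    msg.set_content(
        f"Alert: {alert_name}\n"
        f"Object: {managed_object}\n"
        f"Description: {description}\n"
        f"Note: {note}\n"
    )
    return msg


# Hypothetical values; the three alert properties match the task parameters.
message = build_forward_message(
    alias="anna",
    mail_domain="contoso.local",
    from_address="opsmgr@contoso.local",
    alert_name="Health Service Heartbeat Failure",
    description="The health service is not responding.",
    managed_object="dc01.contoso.local",
    note="Please take a look.",
)
```

Sending would then be a matter of handing the message to your mail server, for example with smtplib.SMTP(mail_server).send_message(message), where mail_server is whatever server your workstation is allowed to relay through.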

Please note that this is provided “as is” with no warranties at all.

Windows Computer and associated Health Service watcher in a dynamic group

A common scenario is that you want to group computers based on SLA, responsibility or server team. Groups are used, for example, to filter views and to scope notifications. When looking at the Windows Computer object you see everything you are monitoring on that machine (everything that rolls up health to the Windows Computer object). But you don’t get heartbeat missing alerts, as these are generated by the health service watcher for that machine. In the console you can build a group and dynamically include Windows Computer objects and health service objects. But a challenge is that there are not many attributes to filter on for the health service objects, and there is no feature in the console to “put these machines and associated health service watchers in this group”. We can extend the Windows Computer class with new attributes, and it also includes a lot of attributes out of the box, but we can’t do the same with the health service. Building a suitable dynamic formula for the Windows Computer class is often not a challenge; the issue is getting the associated health service watcher in there.

In this post I will group Windows Computer objects and their associated health service watchers together in a group, based on a registry string. The goal is to build one dynamic group for each SLA level. Each group should contain all machines that have that SLA level and also the associated health service watchers. These groups can later be used for views and notifications. I will start by extending the Windows Computer class with a new attribute. This attribute will be populated with a registry string from each machine.

To extend the Windows Computer class to discover an SLA string, follow these steps

  1. In the Operations Manager Console navigate to Authoring, Management Pack Objects, Attributes
  2. Right-click Attributes and select Create a New Attribute…
  3. In the Create Attribute Wizard, General Properties, input a name, for example Contoso – SLA attribute
  4. In the Create Attribute Wizard, Discovery Method, select Registry as discovery type. Select Windows Computer as Target. Select suitable management pack or create a new management pack, for example Contoso – SLA
  5. In the Create Attribute Wizard, Registry Probe Configuration, input a path to the registry key and change attribute type to string.
  6. In production you should not run the discovery too often; once every 12 or 24 hours is fine. In a sandbox you could run it more often to avoid waiting.
  7. Click Finish and close the Create Attribute Wizard
  8. Navigate to the Monitoring workspace. Create a new state view in the same management pack as you stored the new attribute. Configure the state view to show data related to Windows Computer_Extended, in my example from the Contoso – SLA management pack. Make sure you select to display the SLA attribute, Contoso – SLA Attribute, on the Display tab.
  9. Verify that you see your servers  and a value in the Contoso – SLA column

The next step is to create a dynamic group, that includes all windows computers with Gold as SLA.

  1. Navigate to the Authoring workspace, select Groups, right-click and select Create a new Group
  2. In the Create Group Wizard, General Properties, input a name, for example Contoso – Gold Servers. Select the same management pack as your attribute
  3. In the Create Group Wizard, Dynamic Members, click Create/Edit rules
  4. In the Create Group Wizard – Query Builder, input the same settings as in the image below
  5. In the Create Group Wizard, Exclude Members, click Create
  6. Right-click the new group, View Group Members, verify that all your gold servers are in the group

The next step is a bit more complicated. We now need to export the new management pack, edit the XML code and import it again. Export the management pack under Management Packs in the Administration workspace. Make an extra copy of the XML file to make sure you have a backup. Open the exported management pack, the XML file, and search for <MembershipRules>.  We need to add a second membership rule that groups the associated health service watchers with the computers already grouped by the first membership rule. This second membership rule will add the Health Service Watcher objects associated with a computer that is contained in this group. The management pack should look like this when you export it

We add the second membership rule (see attached MP for code)

Save the management pack and import it into Operations Manager again. Repeat all steps for servers with silver SLA level. After you have imported the management pack, wait a couple of minutes (to let RMS re-calc group members) and then look at the members in the group.
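
Hand-editing XML is easy to get wrong, so a quick check before importing can save a round trip. The sketch below, in Python, only verifies that the file is still well-formed and counts the children under each <MembershipRules> element. The sample XML is a stripped-down stand-in and not a real management pack, and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET


def count_membership_rules(mp_xml):
    """Parse the XML text and return the number of child rules per <MembershipRules> element."""
    root = ET.fromstring(mp_xml)  # raises ET.ParseError if the XML is not well-formed
    return [len(list(rules)) for rules in root.iter("MembershipRules")]


# Stand-in document: one group discovery with two rules after the edit.
sample = (
    "<ManagementPack>"
    "<MembershipRules>"
    "<MembershipRule>first rule</MembershipRule>"
    "<MembershipRule>second rule</MembershipRule>"
    "</MembershipRules>"
    "</ManagementPack>"
)

rule_counts = count_membership_rules(sample)
```

For the exported file you would read the text from disk and expect a count of two for the group you edited.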

You now have two groups, one for servers with silver SLA and one for servers with gold SLA. Both groups contain Windows Computer objects and health service watcher objects. You can use your new groups for, for example, notifications and views. You can download my MP here, Contoso.SLA. Please note that this is provided “as is” with no warranties at all. Thanks to Steve for ideas.

Dynamic web farm with Opalis, VMM, OPSMGR and SCSM

I have built a scenario where I use Opalis, Virtual Machine Manager, Operations Manager and Service Manager to control the number of web servers in a web farm. The web farm is behind an NLB cluster. The goal is to run only as many IIS machines as needed, with no extra IIS machines sitting idle. If one IIS can handle the current load, there should only be one IIS online.

We use Service Manager to track what is happening. Opalis executes and Service Manager remembers. As soon as Opalis is done, it drops everything; it only keeps it in memory during execution (and maybe in some logs). Working with Service Manager gives us a great way to track everything.

Operations Manager is monitoring my IIS machines. If IIS01 is running low on resources an alert will be raised. If any other IIS in the web farm is idle another alert will be raised. When an alert about high load on IIS01 is raised, the first Opalis policy starts. When an alert about an idle IIS is raised, the “shut down” policy starts.

The purpose of the first policy is to start an extra IIS that is already in the web farm. That IIS will then relieve pressure from IIS01, which is running low on resources.

  1. The first activity monitors Operations Manager for new alerts about IIS01 running low on resources
  2. Set the resolution state of the alert in Operations Manager to an Opalis resolution state. This makes sure no one else picks it up; instead you can see in the Operations Manager console that Opalis is working on it
  3. We use a counter to decide which IIS to start; this activity resets the counter to 0
  4. This activity gets all running machines from Virtual Machine Manager whose names contain IIS0. I have three IIS machines, named IIS01, IIS02 and IIS03.
  5. For each running IIS machine, add one to the counter (+1)
  6. The merge step is used to merge multiple threads; we don’t want to run the rest of the policy more than once. If step 4 returns multiple IIS machines, a Junction activity is a good way to terminate parallel threads. One thing to remember here is that the merge step doesn’t publish any data from the activities before it, which is why we need to read the counter again.
  7. Read the counter value
  8. Add one (+1) to the counter. Each running VM added one to the counter and we want to start the next free IIS. If there is only one IIS running the value will be “1”; then we add “1” and get “2”. In step 9 we use that “2” to start IIS02
  9. Get the VM with the name IIS0<Output from 8>, for example IIS02
  10. Create a change request in Service Manager saying the web farm is running low on resources and we need to add an extra IIS. We also include that we need to start the machine that we picked up in step 9
  11. Start the virtual machine. If it fails we will set the change request to failure and add information about it, and also generate an alert in Operations Manager
  12. Wait four minutes to make sure the virtual machine is up and running
  13. Trigger a policy to remove maintenance mode in Operations Manager
  14. Wait three minutes to make sure the maintenance mode is stopped
  15. Check if the web page is working on the IIS machine. If it fails we will set the change request to failure and add information about it, and also generate an alert in Operations Manager
  16. Update the change request with the result and change it to completed
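
The counter logic in steps 3 to 9 can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not an Opalis export; the function name is made up, and the name filter mirrors the "IIS0" match described in step 4.

```python
def next_iis_to_start(running_vm_names, name_part="IIS0"):
    """Count the running IIS machines, add one, and build the name of the next machine."""
    counter = 0                      # step 3: reset the counter
    for name in running_vm_names:    # steps 4-5: one increment per running IIS
        if name_part in name:
            counter += 1
    counter += 1                     # step 8: the next free IIS
    return f"{name_part}{counter}"   # step 9: for example IIS02


only_first = next_iis_to_start(["IIS01"])
two_running = next_iis_to_start(["IIS01", "IIS02", "SQL01"])
```

Note that this scheme assumes the machines are started in order and stopped in reverse order. If IIS01 and IIS03 were the two running machines, the count would point at IIS03 again, so a policy that can stop any idle IIS would need a different way to pick the next machine.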

The Stop Maintenance Mode policy is used to stop maintenance mode for machines that were shut down by Opalis earlier. This policy checks whether the affected machine is in maintenance mode, by checking Windows Computer (1), Health Service (2) and Health Service Watcher (4). We use a SQL query to get the agent watcher ID (3). These are the three objects that Opalis puts into maintenance mode when it shuts down an IIS in this example. Another example of Opalis and maintenance mode here.

The stop IIS policy looks for an alert in Operations Manager saying an IIS is idle, and then shuts down that IIS. In one version of this policy I added a run command activity that drains the IIS of active sessions before shutdown.

  1. Monitor for an alert saying that an IIS is idle
  2. Set the resolution state of the alert in Operations Manager to an Opalis resolution state. This makes sure no one else picks it up; instead you can see that Opalis is working on it
  3. Create a change request in Service Manager saying we will shut down an IIS, including name and reason
  4. Get the VM to shut down
  5. Put the Windows Computer into maintenance mode
  6. Put the Health Service into maintenance mode
  7. Query the OperationsManager database for the agent ID
  8. Put the agent watcher into maintenance mode
  9. Wait three minutes to make sure the maintenance mode is active
  10. Shut down the machine (VM)
  11. Wait four minutes to make sure the machine is down
  12. Verify that the machine is down; otherwise update the change request with a status of failure and generate an alert in Operations Manager
  13. Update the change request with success and set the status to completed
  14. Close the alert in Operations Manager

A couple of pictures of change requests in Service Manager

In this scenario I used two unit monitors in Operations Manager that trigger on a performance counter, to decide if IIS01 was running low on resources or if another IIS was idle. As NLB distributes the load equally between my IIS machines, I only measure load on IIS01; if two IIS machines were online and IIS01 was low on resources, IIS02 would be too.

The first policy should also have a thread that checks whether all IIS machines are already running, and in that case creates a change request saying we need more IIS machines in the web farm, or triggers another policy to create a new VM, configure it and include it in the web farm.

This export file has met “mr Wolf”, so it should not contain any unnecessary settings or objects.

You can download my policies here, ZIP file, please note that this is provided “as is” with no warranties at all. Also please read this blog post about export and import of policies.

Get Old Operations Manager Alerts with Opalis

I read a question on the forum about the “Get Alert” object in Opalis, that it doesn’t support relative dates. That is correct and a bit sad too, it would be really nice to say “now minus 7 days” as we can do in the Operations Manager reporting console for example.

But there is of course a solution to this 🙂 You can start with a Format Date/Time object in your workflow, that will generate the relative date for you. The output can then be used as input in the Get Alerts object.

The Format Date/Time object takes a variable as input; the variable is the current time in yyyy-MM-dd h:m:s format. The Format Date/Time object then re-formats the time and adjusts the output date by minus 7 days.
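
The same "now minus 7 days" calculation, sketched in Python for comparison. The function name is made up, and the zero-padded output format is an assumption; use whatever format your Get Alert input expects.

```python
from datetime import datetime, timedelta


def relative_date(now, days_back=7, fmt="%Y-%m-%d %H:%M:%S"):
    """Return the timestamp that lies days_back days before now, as formatted text."""
    return (now - timedelta(days=days_back)).strftime(fmt)


# Fixed example time so the result is reproducible.
example = relative_date(datetime(2011, 2, 11, 10, 0, 0))
```

With the example time above, the result is 2011-02-04 10:00:00, which is the kind of value you would pass on as the date filter.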

There is a junction object in the policy too. This is used to make sure the following objects only run once, regardless of the data provided by previous objects. Otherwise “Send Platform Event” and “Delete temp file” would run once for every alert the Get Alert object returns. Instead I use an Append File object to write all alerts to a temporary text file. On the other side of the junction object I pick up all the data again with the Get Lines object, and the rest of the policy then only runs once. You can download my example here, OldAlerts

Please note that this is provided “as is” with no warranties at all.

Start Maintenance Mode with Opalis

In this post I want to show you an example of how you can start Operations Manager maintenance mode from Opalis. Operations Manager maintenance mode is used to prevent alerts and notifications from objects that are under maintenance. In maintenance mode, alerts, notifications, rules, monitors, automatic responses, state changes, and new alerts are suppressed at the agent. By design, Operations Manager 2007 monitors that the agent is functioning correctly even if the computer is in maintenance mode. If the Health Service and the Health Service Watcher for the agent are not in maintenance mode when you reboot the machine, you will get alerts saying heartbeat failure and failed to connect to computer. This example will put the Windows computer, the health service and the health service watcher into maintenance mode.

The policy contains a number of objects
1. Custom Start.
2. Start maintenance mode for a windows computer
3. Start maintenance mode for a health service
4. Query the Operations Manager database to get the computer GUID
5. Start maintenance mode for a health service watcher
6. Generate a platform event including a summary

The Start Maintenance Mode object puts an object monitored by Operations Manager into maintenance mode. You can use the Opalis object to browse for the object you want.

To put the health service watcher into maintenance mode we need the GUID of the machine. The other two “start maintenance mode” objects are a bit easier, as we can input the server FQDN. To get the server GUID we run a query against the OperationsManager database.

As you can see in the picture, the database query returns a bit more than the GUID. To filter out everything except the GUID we will use two of the data manipulation functions that Opalis has.

We first split the result from the database query into two parts, split by the “;”. Then from the second part, in this example {B3278151-9AC8-5B3B-8924-5F1F7CE27DE7}, we use the MID function and tell Opalis to get 36 characters starting at position 2. The result when we run this will be three maintenance modes, as shown in the picture below
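
The split and MID steps translate directly to string slicing. A small Python sketch follows; the first field of the sample query result is a made-up placeholder, since only the GUID part is shown in the post.

```python
def extract_guid(query_result):
    """Take the second ';'-separated field and return 36 characters starting at position 2."""
    second_part = query_result.split(";")[1]
    # MID(position 2, length 36) is 1-based, so in Python that is the slice [1:37]
    return second_part[1:37]


raw = "dc01.contoso.local;{B3278151-9AC8-5B3B-8924-5F1F7CE27DE7}"
guid = extract_guid(raw)
```

Starting at position 2 skips the opening brace, and a GUID written with hyphens is exactly 36 characters long, so the closing brace is dropped as well.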

Because Operations Manager 2007 polls maintenance mode settings only once every 5 minutes, there can be a delay in an object’s scheduled removal from maintenance mode. You can download my example here, 20110211_MM . Remember that you need to edit the Query Database object to configure which account to use when querying the database.

Please note that this is provided “as is” with no warranties at all.

Deploy OPSMGR agent to untrusted zones with Opalis

When the agent is located in a domain separate from the domain where the Operations Manager management server is located, and no two-way trust exists between the two AD forests, certificates must be used so that authentication can take place between the agent and the management server. A gateway server could also be part of a solution for a scenario like that. To configure an agent to authenticate with a certificate there are a number of steps to carry out. I have a couple of blog posts around that here, here and here. As you can see, it is a pretty complicated process and it is easy to miss a step or configure something incorrectly. A solution to that could be to use an Opalis workflow. Opalis will then carry out all the steps for you, in the same way every time. In this blog post I will show you a workflow like that.

As you can see in the picture, the workflow is divided into a number of policies. When you are building larger and more complex policies it is good practice to break them down into smaller parts. You can then also call the different parts from different policies and re-use your policies in different scenarios. I tried to put all the information that I will change often in variables, for example domain name, shared folder path and CA name. It is much easier to change one variable than to change the configuration of 10 objects. The following list gives you an overview of each policy in the workflow. Note that this workflow only uses variables starting with 4.X.

All the variables

  • 4.1 is the main policy, the one that triggers the other ones. It starts by creating a sub-folder in a shared folder. This folder is used for all kinds of file transfer between the management server, the CA, Opalis and the agent. The 4.1 policy also includes two objects at the end that delete temporary folders on all machines that have been involved.
  • 4.2 is used to verify name resolution between the Opalis server and the agent.
  • 4.3 is used to install the CA root certificate on the agent. I presuppose that the root CA is already trusted by the Operations Manager management server. The policy also presupposes that the root CA certificate is in the shared folder.
  • 4.4 generates a certificate request file and copies it to the shared folder. The file is generated on the agent. The shared folder is a folder on the network that all involved machines can access. It is important to make sure all the involved accounts have read and write permissions to this folder.
  • 4.5 copies the certificate request file from the shared folder to the CA. It submits the request and receives a certificate (.CER). The certificate is then copied over to the shared folder. This step presupposes that the CA auto-approves the certificate. I don’t want to include any manual steps, so an auto-approving CA was needed. You can configure your CA to only auto-approve based on the templates used, more info about that here.
  • 4.6 copies the certificate from the shared folder to the agent. It then adds the certificate to the local certificate store
  • 4.7 copies the agent files from the shared folder to the agent. It installs the agent and verifies that the Operations Manager agent service is running on the machine
  • 4.8 configures Operations Manager to use the certificate and restarts the Operations Manager service

This is the shared folder before deploying any agents. The folder includes a sub-folder with the agent installation files; in my example the AMD64 folder is renamed to Agent. The shared folder also includes the CA root certificate and a PowerShell script. The PowerShell script is used in policy 4.8. It contains one line

Get-ChildItem cert:\LocalMachine\My | where-object {$_.Issuer -eq "CN=skynet-DC01-CA, DC=skynet, DC=local"} | foreach {$_.SerialNumber} | out-file C:\temp_scom\cert.txt

The PowerShell command gets the serial number of the agent certificate. We need to write this to the registry of the machine so that the Operations Manager agent knows which certificate to use. As you can see, the command lists all certificates issued by a specified CA, skynet-DC01-CA. It then writes the serial number to C:\temp_scom\cert.txt. If you have multiple certificates installed from the CA you will need to add a couple of criteria to filter out the correct certificate.
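
As an illustration of what "a couple of criteria" could look like, here is a sketch in Python that picks one certificate out of several from the same CA. The CertInfo type, the choice of matching on the subject name and taking the latest expiry date, and the sample data are all assumptions for illustration; they are not part of getserial.ps1.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class CertInfo:
    issuer: str
    subject: str
    serial_number: str
    not_after: datetime


def pick_agent_certificate(certs, issuer, agent_fqdn):
    """Keep certificates from the given issuer whose subject is the agent, then take the newest."""
    candidates = [c for c in certs
                  if c.issuer == issuer and c.subject == f"CN={agent_fqdn}"]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.not_after)


ca = "CN=skynet-DC01-CA, DC=skynet, DC=local"
store = [
    CertInfo(ca, "CN=web01.dmz.local", "610A0001", datetime(2011, 6, 1)),
    CertInfo(ca, "CN=web01.dmz.local", "610A0002", datetime(2012, 6, 1)),
    CertInfo(ca, "CN=other.dmz.local", "610A0003", datetime(2013, 6, 1)),
]
chosen = pick_agent_certificate(store, ca, "web01.dmz.local")
```

The same two filters could be added to the where-object clause in the PowerShell one-liner if you run into this situation.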

The workflow includes a total of eight policies. We will now go into each one of them a bit deeper.

The 4.2 policy simply verifies that the Opalis machine can get an IP address for the target machine. If this is not working, nothing else in the workflow will work. It is always a good idea to start by checking all dependencies in your workflow before you start changing anything. Another idea could be to add more tests to check that all involved accounts can write to the correct machines and folders.
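
A dependency check like this is small enough to sketch in Python. The function name is made up, and the Opalis policy does not use this code; it only shows the idea of failing early when the name cannot be resolved.

```python
import socket


def resolve_ip(hostname):
    """Return the IPv4 address for hostname, or None if it cannot be resolved."""
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        return None


# An IP literal resolves to itself, which makes a safe self-test.
loopback = resolve_ip("127.0.0.1")
```

In a workflow you would stop and raise an alert when the function returns None for the target machine, before creating folders or copying any files.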

The 4.3 policy starts by creating a new folder on the agent, the target machine. This folder, by default C:\temp_scom, is used as a temporary area for all files the workflow copies or generates. The second object is a file copy object. It copies the root CA certificate from the shared folder on the network to the agent. The last two objects first insert the certificate into the store and then add it as a Trusted Publisher. Note that some of the “run program” or “run command” objects will run until they time out and are stopped; that will generate a warning but the policy will continue.

The 4.4 policy generates a certificate request file on the agent. It does this by first writing an INF file and then using Certreq to create a new request from that .inf file. The policy then copies the request file (the .req file) over to the shared folder.

The 4.5 policy starts by creating a temporary folder on the CA. It then copies the certificate request from the shared folder to the temporary folder. Then the certificate request is submitted to the CA with the CertReq command. As I have configured the CA to auto-approve requests, CertReq will also save the new certificate directly. The last object copies the new agent certificate to the shared folder.

The 4.6 policy copies the new agent certificate from the shared folder to the agent machine. It then adds the certificate to the local certificate store.

The 4.7 policy includes a number of steps. It starts by creating a folder on the target machine for the agent installation files, by default C:\temp_scom\agent. It then copies the agent installation files from the shared folder to the new temporary folder.

The 4.8 policy starts by copying the getserial.ps1 script from the shared folder to the agent. This script exports the serial number of the new agent certificate. The second object runs this PowerShell script. The next two steps read the serial number from the text file that the PowerShell script generated, and write it as a platform notification. The next step adds the serial number to the registry in the correct order. The Operations Manager agent service is then restarted.
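
The post does not spell out what "the correct order" is. The sketch below assumes it means the serial number with its bytes (pairs of hex digits) in reverse order, which is how the value is commonly described for the agent's certificate setting; verify this against a machine that was configured with MOMCertImport before relying on it. The function name and the sample serial number are made up.

```python
def reverse_serial_bytes(serial_hex):
    """Reverse the order of the bytes (hex digit pairs) in a certificate serial number."""
    cleaned = serial_hex.replace(" ", "").upper()
    if len(cleaned) % 2 != 0:
        raise ValueError("serial number must have an even number of hex digits")
    pairs = [cleaned[i:i + 2] for i in range(0, len(cleaned), 2)]
    return "".join(reversed(pairs))


reordered = reverse_serial_bytes("610A5C2D0005")
```

For the made-up serial 610A5C2D0005 the pairs 61 0A 5C 2D 00 05 come back as 05 00 2D 5C 0A 61.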

Those were all the policies included in the workflow. Some minutes after this the target machine will show up in Operations Manager. In most environments it will show up under pending management (configure it at Administration/Global Settings/Security) and an Operations Manager administrator needs to approve it. This blog post showed you one way to use Opalis together with Operations Manager when deploying agents to machines in untrusted environments, a task that can be pretty complicated and includes a lot of steps. With Opalis you simply input a target machine name and click Start 🙂

For ideas and info on how to build fault tolerance into your workflow, please read this post. It could also be an idea to add some more platform event objects or write-to-logfile objects, to get some info from the workflow. Make sure that you have an unrestricted execution policy on your target machine, so the getserial.ps1 script can run. Make sure no firewall is blocking the traffic and also that the target machine has PowerShell installed. Also, spend a couple of minutes making sure all involved accounts have read and write access to the shared folder. If you want to download the workflow click 4 SCOM Agent 2.

Please note that this is provided “as is” with no warranties at all.