Contoso.se

Welcome to contoso.se! My name is Anders Bengtsson and this is my blog about Azure infrastructure and system management. I am a senior engineer in the FastTrack for Azure team, part of Azure Engineering, at Microsoft. Contoso.se has two main purposes: first, as a platform to share information with the community, and second, as a notebook for myself.

Everything you read here is my own personal opinion and any code is provided "AS-IS" with no warranties.

Anders Bengtsson

MVP
MVP awarded 2007, 2008, 2009, 2010

My Books
Service Manager Unleashed
Orchestrator 2012 Unleashed
Inside the Microsoft Operations Management Suite

Analyze and visualize Azure Firewall with Log Analytics View Designer

A colleague and I have put together a sample view for Log Analytics to analyze and visualize Azure Firewall logs. You can download the sample view here. The sample view visualizes application rule and network rule log data. With View Designer in Azure Log Analytics, you can create custom views to visualize data in your Log Analytics workspace; read more about View Designer here.
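
If you prefer to explore the raw data behind the view, Azure Firewall writes its diagnostics to the AzureDiagnostics table. The following is a minimal query sketch, assuming diagnostic logs are enabled on the firewall and sent to the workspace:

AzureDiagnostics
| where Category == "AzureFirewallApplicationRule" or Category == "AzureFirewallNetworkRule"
| summarize Events = count() by Category, bin(TimeGenerated, 1h)
| render timechart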

 

Monitor Linux Daemon with Log Analytics

In this blog post I would like to share an example of how daemons on Linux machines can be monitored with Log Analytics. Monitoring daemons is not listed as a feature directly in the Log Analytics portal, but it is possible to do. When a daemon is started or stopped, a line is written to Syslog. The Microsoft Monitoring Agent can read Syslog and send the data to Log Analytics.

The only configuration needed is to enable collection of Syslog with the daemon facility.

If the daemon is stopped (the cron daemon in this example), the following lines are written to the syslog logfile:

Soon after, the same lines are written to Log Analytics as events in the Syslog table:

You can now configure an alert, including a notification, for when the daemon stops. The alert can, for example, be visualized in Azure Monitor and sent by e-mail.
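
For reference, a query along these lines can be used as the basis for such an alert. This is a sketch for the cron daemon; the exact message text and process name vary between distributions, so adjust the filters to match what you see in your Syslog data:

Syslog
| where Facility == "daemon"
| where ProcessName contains "cron"
| where SyslogMessage contains "stop" or SyslogMessage contains "terminat"
| project TimeGenerated, Computer, ProcessName, SyslogMessage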


Disclaimer: Cloud is a very fast-moving target. It means that by the time you’re reading this post everything described here could have been changed completely.
Note that this is provided “AS-IS” with no warranties at all. This is not a production-ready solution for your production environment, just an idea, and an example.

Deploying a central auditing workspace with Log Analytics

One or multiple workspaces

A common question when discussing Log Analytics design is whether to use one or multiple workspaces. Should there be one central workspace with all data? Should there be one workspace per application? Should there be one workspace for the auditing team? There are many different ideas and scenarios, but a common component is a central workspace for auditing: one workspace where a central team can track security-related events, assessments, and alerts.

The following topics are often involved in the decision to use one or multiple workspaces:

  • Data Region. In which region do we need to store the data? For example, the data must be stored within the EU.
  • Data Retention. The number of days to store the data is configured at the workspace level. That means we pay for the same retention setting for all data within a workspace. If some data needs to be stored for 7 days and some important data needs to be stored for 200 days, we need to pay for and store all data for 200 days.
  • Data Access. Today the workspace is the security boundary for Log Analytics. If, for example, we have log data from two different teams that are not allowed to see each other’s data, we need to store it in different workspaces.
  • Data Collection. Today there are solutions and data collection settings that are configured at the workspace level. For example, if we enable collection of warnings from the Application log on Windows servers, they will be collected from all connected Windows servers, even if we only need them from some of our servers. This can increase the total cost if data that is not needed is collected. In this scenario, it might be an idea to connect some servers to one workspace and others to another workspace.

When deciding to use multiple workspaces, note that it is possible to multi-home Windows servers to send data to multiple workspaces, while Linux servers and some other data sources, for example many PaaS services, can today only send data to one workspace. One thing to note when configuring multi-homed data sources is that if the same data is collected and inserted into multiple workspaces, we also pay for that data twice. In other words, it is a good idea to make sure that different kinds of data are collected for each workspace, for example audit data to one workspace and application logs to another.

The following figure describes a scenario where two application teams have their own workspaces, and there is one workspace for central auditing. The auditing team needs access to data from both service workspaces, to run analysis and verify that everything is running according to company policies.

To deploy this scenario, simply deploy three workspaces and give the central auditing team read permissions on each service workspace; see Microsoft Docs for more details.
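
A minimal deployment sketch with the AzureRM PowerShell modules is shown below. The workspace names, location, SKU, and the auditing team’s Azure AD group object ID are illustrative placeholders, not values from any real environment.

$rg = 'contoso-loganalytics-rg'

# Create one workspace per service team plus one for the central auditing team
$workspaces = 'contoso-app1-ws', 'contoso-app2-ws', 'contoso-audit-ws' | ForEach-Object {
    New-AzureRmOperationalInsightsWorkspace -ResourceGroupName $rg -Name $_ -Location 'westeurope' -Sku 'PerGB2018'
}

# Give the auditing team read access (Log Analytics Reader) on each service workspace
$auditTeamObjectId = '00000000-0000-0000-0000-000000000000'   # placeholder for the auditing team's AAD group object ID
$workspaces | Where-Object { $_.Name -ne 'contoso-audit-ws' } | ForEach-Object {
    New-AzureRmRoleAssignment -ObjectId $auditTeamObjectId -RoleDefinitionName 'Log Analytics Reader' -Scope $_.ResourceId
}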

Cross workspace queries

The next step is to start authoring queries to analyze and visualize the data. The data is stored in each service workspace, so the auditing team will need to use the cross-workspace query feature, read more about it here (https://docs.microsoft.com/en-us/azure/log-analytics/log-analytics-cross-workspace-search). Data is only stored in the two service workspaces; there is no data in the central auditing workspace.

The following query is a cross-workspace query example: it queries two workspaces and lists failed logon events. In the query we use “isfuzzy” to tell Log Analytics that execution of the query should continue even if one of the underlying tables or view references is not present. We can also see the two workspace IDs, one for each service workspace, and that we use the SecurityEvent table.

union isfuzzy=true
workspace("b111d916-5556-4b3c-87cf-f8d93dad7ea0").SecurityEvent, workspace("0a9de77d-650f-4bb1-b12f-9bcdb6fb3652").SecurityEvent
| where EventID == 4625 and AccountType == 'User'
| extend LowerAccount=tolower(Account)
| summarize Failed = count() by LowerAccount
| order by Failed desc

The following example shows all failed security baseline checks for the two service workspaces

union isfuzzy=true
workspace("b111d916-5556-4b3c-87cf-f8d93dad7ea0").SecurityBaseline, workspace("0a9de77d-650f-4bb1-b12f-9bcdb6fb3652").SecurityBaseline
| where ( RuleSeverity == "Critical" )
| where ( AnalyzeResult == "Failed" )
| project Computer, Description

To make cross-workspace queries a bit easier we can create a function. For example, run the following query and then save it as a function.

union isfuzzy=true
workspace("b111d916-5556-4b3c-87cf-f8d93dad7ea0").SecurityBaseline, workspace("0a9de77d-650f-4bb1-b12f-9bcdb6fb3652").SecurityBaseline

We can then call the function in our queries, for example to get all failed security baseline checks. We don’t need to specify which workspaces to query, as that is handled by the function.

ContosoSecEvents
| where ( RuleSeverity == "Critical" )
| where ( AnalyzeResult == "Failed" )
| project Computer, Description

Another way of using saved functions is the following example.
First, we have a saved function named ContosoCompMissingUpdates that lists all computers missing updates.

union isfuzzy=true
workspace("b111d916-5556-4b3c-87cf-f8d93dad7ea0").Update, workspace("0a9de77d-650f-4bb1-b12f-9bcdb6fb3652").Update
| where UpdateState == 'Needed' and Optional == false and Classification == 'Security Updates' and Approved != false
| distinct Computer

We can then use the ContosoCompMissingUpdates function within a query showing machines with failed security baseline checks. The result is a list of machines that are both missing updates and have failed baseline checks.

ContosoSecEvents
| where ( RuleSeverity == "Critical" )
| where ( AnalyzeResult == "Failed" )
| where Computer in (ContosoCompMissingUpdates)
| project Computer, Description

 

Disclaimer: Cloud is a very fast-moving target. It means that by the time you’re reading this post everything described here could have been changed completely.
Note that this is provided “AS-IS” with no warranties at all. This is not a production-ready solution for your production environment, just an idea, and an example.

Exporting Azure Resource Manager templates with Azure Automation, and protecting them with Azure Backup

Earlier this week I put together a runbook to back up Azure Resource Manager (ARM) templates for existing resource groups. The runbook exports a resource group as a template and saves it to a JSON file. The JSON file is then uploaded to an Azure file share that can be protected with Azure Backup.

The runbook can be downloaded from here, PS100-ExportRGConfig. The runbook format is PowerShell. The runbook might require an Azure PS module upgrade. I have noticed that in some new Azure Automation accounts, the AzureRM.Resources module doesn’t include Export-AzureRmResourceGroup and needs an update.

Inside the runbook, you need to configure the following variables (a minimal sketch of the core runbook steps follows the list):

  • Resourcegrouptoexport, this is the Resource Group you would like to export to a JSON file.
  • storageRG, this is the name of the Resource Group that contains the file share you want to upload the JSON file to.
  • storageAccountName, this is the name of the storage account that contains the Azure file share.
  • filesharename, this is the name of the Azure file share in the storage account. On the Azure file share, there needs to be a directory named templates. You will need to create that directory manually.
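
For reference, the core of the runbook boils down to an export step and an upload step. The following is only a minimal sketch, not the downloadable runbook itself, assuming the AzureRM modules and a Run As connection named AzureRunAsConnection; the variable values are illustrative.

# Authenticate with the Automation Run As account
$conn = Get-AutomationConnection -Name 'AzureRunAsConnection'
Add-AzureRmAccount -ServicePrincipal -TenantId $conn.TenantId -ApplicationId $conn.ApplicationId -CertificateThumbprint $conn.CertificateThumbprint | Out-Null

$Resourcegrouptoexport = 'contoso-app-rg'       # resource group to export (illustrative)
$storageRG             = 'contoso-backup-rg'    # resource group that contains the file share (illustrative)
$storageAccountName    = 'contosobackupsa'      # storage account name (illustrative)
$filesharename         = 'templatesshare'       # Azure file share with a 'templates' directory (illustrative)

# Export the resource group to a JSON template in the sandbox temp folder
$path = Join-Path -Path $env:TEMP -ChildPath "$Resourcegrouptoexport.json"
Export-AzureRmResourceGroup -ResourceGroupName $Resourcegrouptoexport -Path $path -Force

# Upload the JSON file to the templates directory on the Azure file share
$key = (Get-AzureRmStorageAccountKey -ResourceGroupName $storageRG -Name $storageAccountName)[0].Value
$ctx = New-AzureStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $key
Set-AzureStorageFileContent -ShareName $filesharename -Source $path -Path "templates/$Resourcegrouptoexport.json" -Context $ctx -Force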

When you run the runbook you might see warning messages. There might be some cases where the PowerShell cmdlet fails to generate some parts of the template. Warning messages will inform you of the resources that failed. The template will still be generated for the parts that were successful.

Once the JSON file is written to the Azure File Share you can protect the Azure file share with Azure Backup. Read more about backup for Azure file shares here.

Disclaimer: Cloud is a very fast-moving environment. It means that by the time you’re reading this post everything described here could have been changed completely. Note that this is provided “AS-IS” with no warranties at all. This is not a production ready solution for your production environment, just an idea, and an example.

“Argument is null or empty” error when running post-steps script in Azure Site Recovery

A couple of days ago I was working on Azure Site Recovery post-step scripts with my colleague Jonathan. The scenario was to fail over two virtual machines running in the West Europe Azure region to the North Europe Azure region. Enabling replication between two Azure regions is not complicated, but not all components are supported for failover between regions; for example, public IP addresses are not. To get all details about supported scenarios, see the Azure Site Recovery support matrix here. To set up networking after failover, we wrote an Azure Automation runbook and connected it to our recovery plan as a post-steps script. After Azure Site Recovery has run all failover steps it triggers the post-steps scripts. But we ran into some strange errors in our post-step scripts and would like to share the solution with the community.

When doing a test failover, everything looked OK from the Azure Site Recovery perspective. Our two virtual machines failed over and the first script was triggered. The script adds public IP addresses to the two machines.

But when looking at the runbook job in Azure Automation, we could see that something was not working.


The runbook could not find the new virtual machine resources in the pre-created resource group. After a couple of different tests, we realized that the new Azure virtual machine resources in North Europe were not ready when the runbook was triggered by the recovery plan.

If we added a small delay of a couple of minutes in the script, everything worked perfectly 🙂
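
The fix in the runbook can be as simple as a delay, or a small retry loop that waits until the failed-over VM resources exist before configuring networking. A sketch, assuming the runbook is already authenticated and with illustrative resource names:

$rgName = 'contoso-dr-rg'    # pre-created resource group in North Europe (illustrative)
$vmName = 'contoso-vm01'     # failed-over virtual machine name (illustrative)

# Wait for the failed-over VM resource to become visible before adding the public IP address
$vm = $null
for ($i = 0; $i -lt 10 -and -not $vm; $i++) {
    $vm = Get-AzureRmVM -ResourceGroupName $rgName -Name $vmName -ErrorAction SilentlyContinue
    if (-not $vm) { Start-Sleep -Seconds 60 }    # simple delay instead of failing on the first attempt
}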

Building an Azure dashboard with server performance data

I guess all of you have seen the dashboards in Azure, the first page you see when logging in to the Azure portal. In some cases there are resources that you pinned by mistake while deploying, and in some cases it is just blank. In this blog post, I would like to share how to build a simple server health dashboard with basic performance data from servers (CPU workload, free disk space, and free memory). To set this up we need to do three main tasks:

  1. Connect Data sources to get data into Log Analytics
  2. Build queries to visualize the collected data
  3. Pin dashboard/view to Azure Dashboard

Connect Data Sources

If your servers are Azure virtual machines, you can read some performance data (see image) directly from the VM using the Azure VM agent, but unfortunately nothing about free memory, CPU, or free disk space.

To collect the required data we need to install an agent inside the OS. Azure Log Analytics (often called OMS) provides features for collecting data from different sources, as well as features for visualizing and analyzing the collected data.

In Log Analytics we first need to install the agent on all servers, more information about that here. Once all servers are connected to the workspace, the next step is to start collecting performance data. You can enable specific performance counters under Advanced settings / Data / Windows Performance Counters or Linux Performance Counters. The image shows Windows Performance Counters, but the steps are the same for Linux servers: install the agent and then enable the performance counters.

In this example, we will add the following performance counters, and configure a sample interval of 10 seconds.

  • Memory(*)\Available MBytes
  • LogicalDisk(*)\Free Megabytes
  • LogicalDisk(*)\% Free Space
  • Processor(_Total)\% Processor Time

Building Queries

The next step is to configure queries to visualize the collected data. There is a lot of good information about building queries and working with performance data here and here. But to save you some time, you can use the following queries as a foundation.

These queries show the average for each minute (1minutes), based on the data we collect every 10 seconds.

Disk, % Free Space. This query shows % free space on each logical disk that has an instance name containing “:” (this filters out, for example, mount point volumes on DPM servers).

Perf
| where ObjectName == "LogicalDisk" and CounterName == "% Free Space"
| where InstanceName contains ":"
| summarize FreeSpaceP = avg(CounterValue) by bin(TimeGenerated, 1minutes), CounterPath
| sort by TimeGenerated desc
| render timechart

Disk, Free Megabytes

Perf
| where ObjectName == "LogicalDisk" and CounterName == "Free Megabytes"
| where InstanceName contains ":"
| summarize FreeSpaceMb = avg(CounterValue) by bin(TimeGenerated, 1minutes), CounterPath
| sort by TimeGenerated desc
| render timechart

Memory, Available MBytes

Perf
| where ( ObjectName == "Memory" )
| where ( CounterName == "Available MBytes" )
| summarize FreeMemMb = avg(CounterValue) by bin(TimeGenerated, 1minutes), CounterPath
| sort by TimeGenerated desc
| render timechart

Processor, % Processor Time

Perf
| where ( ObjectName == "Processor" )
| where ( CounterName == "% Processor Time" )
| where ( InstanceName == "_Total" )
| summarize CPU = avg(CounterValue) by bin(TimeGenerated, 1minutes), CounterPath
| sort by TimeGenerated desc
| render timechart

It can take some time before the first data is collected. If you don’t see any data when you run the queries, take another cup of coffee and try again a bit later 😊

Building a view

We now have all data sources connected and queries to visualize the data. The next step is to build views in Log Analytics. This is not a requirement for building an Azure dashboard, but it is nice to have.

Log Analytics View Designer is a feature that we can use to build custom views. These views can later be pinned to the Azure dashboard. To save you some time, you can download the “Contoso Example Log Analytics Dashboard” (Contoso Servers) and import it into View Designer.

Pin tiles to the Azure Dashboard – Log Analytics tile

There are two ways to pin a tile to the Azure dashboard that we will look at. The first one is to right-click a tile in Log Analytics and select pin to the dashboard. You can see this process in the following two images. On the Azure dashboard you will see the view tile from the Log Analytics solution; if you click it you will go into Log Analytics and the specific solution. In this example, you can’t see processor, memory, or disk performance directly on the Azure dashboard.

Pin tiles to the Azure Dashboard – Advanced Analytics

The second alternative is to pin charts directly from the Advanced Analytics portal inside of Log Analytics. The Advanced Analytics portal provides advanced functionality not available in the Log Search portal, for example Smart Analytics. In Log Analytics, click Analytics to open the Advanced Analytics portal. In the Advanced Analytics portal, run the queries from this blog post and click “Pin” on the right side. Once the different charts/queries are pinned to the Azure dashboard you can select them, click Edit, and change their title and description.

Disclaimer: Cloud is a very fast-moving target. It means that by the time you’re reading this post everything described here could have been changed completely.
Note that this is provided “AS-IS” with no warranties at all. This is not a production ready solution for your production environment, just an idea and an example.

 

Inside the Microsoft Operations Management Suite [e-book] version 2 now public

It took some time and a lot of writing, testing, editing, investigation, and discussion, but now it is finally here!

Download your copy here.

Big thanks for the discussions and collaboration to the authoring team: Tao Yang, Stanislav Zhelyazkov, Pete Zerger, and Kevin Greene.

Happy reading 🙂

Keep your Azure subscription tidy with Azure Automation and Log Analytics

When delivering Azure training or Azure engagements, there is always a discussion about how important it is to have a policy and a lifecycle for Azure resources. Not only do we need a process to deploy resources to Azure, we also need a process to remove resources. From a cost perspective this is extra important, as an orphan IP address or disk will cost money even if it is not in use. We also need policies to make sure everything is configured according to company policy. Much can be solved with ARM policies, but not everything; for example, you can’t make sure all resources have locks configured.

To keep the Azure subscription tidy and to get an event/recommendation when something is not configured correctly, we can use Azure Automation and OMS Log Analytics. In this blog post, I will show an example of how this can be done 😊 The data flow is:

  1. Azure Automation runbook triggers, based on a schedule or manually. The runbook runs several checks, for example whether there are any orphan disks.
  2. If there is anything that should be investigated, an event is created in OMS Log Analytics.
  3. In the OMS portal, we can build a dashboard to get a good overview of these events.

The example dashboard shows (download the example dashboard here):

  • Total number of recommendations/events
  • Number of resource types with recommendations
  • Number of resource groups with recommendations. If each resource group corresponds to a service, it is easy to see the number of services that are not configured according to policy

The runbook in this example checks if there are any disks without an owner, any VMs without automatic shutdown configured, any public IP addresses not in use, and any databases without a lock configured. The runbook is based on PowerShell and it is easy to add more checks. The runbook submits data to OMS Log Analytics with Tao Yang’s PS module for OMSDataInjection, download here. The data shows up in Log Analytics as a custom log called ContosoAzureCompliance_CL. The name of the log can be changed in the runbook.
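
As an illustration of what such checks can look like, here is a sketch of two of them (orphan disks and unused public IP addresses), assuming the AzureRM modules and an already-authenticated runbook; the object layout is illustrative and not taken from the downloadable runbook.

$findings = @()

# Managed disks that are not attached to any VM
foreach ($disk in (Get-AzureRmDisk | Where-Object { -not $_.ManagedBy })) {
    $findings += [pscustomobject]@{
        Resource       = $disk.Name
        ResourceGroup  = $disk.ResourceGroupName
        ResourceType   = 'Disk'
        Severity       = 'Warning'
        Recommendation = 'Orphan disk, no owning VM'
    }
}

# Public IP addresses that are not associated with any IP configuration
foreach ($pip in (Get-AzureRmPublicIpAddress | Where-Object { -not $_.IpConfiguration })) {
    $findings += [pscustomobject]@{
        Resource       = $pip.Name
        ResourceGroup  = $pip.ResourceGroupName
        ResourceType   = 'PublicIPAddress'
        Severity       = 'Warning'
        Recommendation = 'Public IP address not in use'
    }
}

# Each finding is then submitted to Log Analytics with the OMSDataInjection module,
# where it ends up in the ContosoAzureCompliance_CL custom log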

The figure below shows the log search interface in the OMS portal. On the left side, you can see that we can filter based on resource, resource type, severity and resource group. This makes it easy to drill into a specific type of resource or resource group.
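
A query along these lines can back the dashboard tiles. This is only a sketch: custom log fields get type suffixes, and the exact names depend on what your runbook submits (ResourceType_s and ResourceGroup_s here are illustrative).

ContosoAzureCompliance_CL
| summarize Recommendations = count() by ResourceType_s, ResourceGroup_s
| sort by Recommendations desc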

Disclaimer: Cloud is a very fast-moving target. It means that by the time you’re reading this post everything described here could have been changed completely.
Note that this is provided “AS-IS” with no warranties at all. This is not a production ready solution for your production environment, just an idea and an example.

Process OMS Log Analytics data with Azure Automation

Log Analytics in OMS provides a rich set of data processing features, for example custom fields. But there are scenarios where the current feature set is not enough.

In this scenario, we have a custom logfile that logs messages from an application. From time to time the log file contains information about the number of files in an application queue. We would like to display the number of files in the queue as a graph in OMS. Custom fields will not work in this scenario, as the log entries have many different formats; OMS cannot figure out the structure of the log entries when not all of them follow the same structure. OMS doesn’t support custom fields based on a subquery of the custom log entries, which otherwise could have been a solution.

The example in this blog post is to ship the data to Azure Automation, process it, and send it back to Log Analytics in a suitable format. This can be done in two different ways:

  • 1 – Configure an alert rule in Log Analytics to send data to Azure Automation. Azure Automation processes the data and sends it to OMS as a new custom log
  • 2 – Azure Automation connects to Log Analytics and queries the data based on a schedule. Azure Automation processes the data and sends it to OMS as a new custom log

It is important to remember that events in Log Analytics don’t have an ID. Whichever solution we choose, we must build it so that we can make sure all data is processed. If there is an interruption between Log Analytics and Azure Automation, it is difficult to track which events have already been processed.

One thing to note is that Log Analytics and Azure Automation display time differently. It seems like Azure Automation uses UTC when displaying time properties of the events, but the Log Analytics portal (the OMS portal) uses the local time zone (in my example UTC+2 hours). This could be a bit tricky.

1 – An alert rule pushes data to Azure Automation

In this example we need to do configuration in both Azure Automation and Log Analytics. The data flow will be:

  • An event is inserted into Log Analytics
  • The event triggers an alert rule in Log Analytics, which in turn triggers an Azure Automation runbook
  • Azure Automation gets the data from the webhook and processes it
  • Azure Automation sends data back to Log Analytics as a new custom log

To configure this in Log Analytics and Azure Automation, follow these steps

  1. In Azure Automation, import the AzureRM.OperationalInsights PowerShell module. This can be done from the Azure Automation account module gallery. More information about the module here
  2. Create a new connection of type OMSWorkSpace in the Azure Automation account
  3. Import the example runbook, download from WebHookDataFromOMS
  4. In the runbook, update the OMSConnection name, in the example named OMS-GeekPlayGround
  5. In the runbook, update how the data is split and what data you would like to send back to OMS. In the example I send back Computer, TimeGenerated, and Files to Log Analytics (a sketch of this processing step follows the list)
  6. Publish the runbook
  7. In Log Analytics, configure an alert rule to trigger the runbook
  8. Done!
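
A sketch of what the webhook-triggered runbook can look like is shown below. The payload property names differ between alert versions, so inspect $WebhookData.RequestBody in a test job first; the property names and the regular expression are assumptions for this example, not the contents of the downloadable runbook.

param (
    [object]$WebhookData
)

# The alert posts the matching Log Analytics records as JSON in the request body
$body    = ConvertFrom-Json -InputObject $WebhookData.RequestBody
$records = $body.SearchResult.value     # assumption: classic OMS alert payload layout

$processed = foreach ($record in $records) {
    # Pull the queue length out of the raw log line, e.g. "... 42 files in queue"
    if ($record.RawData -match '(\d+) files in queue') {
        [pscustomobject]@{
            Computer      = $record.Computer
            TimeGenerated = $record.TimeGenerated
            Files         = [double]$Matches[1]    # double, so Log Analytics stores it as a number
        }
    }
}

# $processed is then converted to JSON and sent back to Log Analytics
# as a new custom log with the OMSDataInjection module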

2 – Azure Automation queries Log Analytics

In this example we don’t need to configure anything on the Log Analytics side. Instead, all configuration is done on the Azure Automation side. The data flow will be:

  • Events are inserted into Log Analytics
  • Azure Automation queries Log Analytics based on a schedule
  • Azure Automation gets the data and processes it
  • Azure Automation sends data back to Log Analytics as a new custom log

To configure this in Azure Automation, follow these steps

  1. Import Tao Yang's PS module for OMSDataInjection into your Azure Automation account. Navigate to the PS Gallery and click Deploy to Azure Automation
  2. Import the AzureRM.OperationalInsights PowerShell module. This can be done from the Azure Automation account module gallery. More information about the module here.
  3. Create a new connection of type OMSWorkSpace in the Azure Automation account
  4. Verify that there is a connection to the Azure subscription that contains the Azure Automation account. In my example the connection is named “AzureRunAsConnection”
  5. Import the runbook, download here, GetOMSDataAndSendOMSData in TXT format
  6. In the runbook, update the OMSConnection name, in the example named OMS-GeekPlayGround
  7. In the runbook, update the Azure connection name, in the example named AzureRunAsConnection
  8. In the runbook, update the OMS workspace name, in the example named geekplayground
  9. In the runbook, update the Azure resource group name, in the example named “automationresgrp”
  10. In the runbook, update the Log Analytics query that Azure Automation runs to get data, in the example “Type=ContosoTestApp_CL queue”. Also update $StartDateAndTime with the correct start time. In the example Azure Automation collects data from the last hour (now minus one hour)
  11. In the runbook, update how the data is split and what data you would like to send back to OMS. In the example I send back Computer, TimeGenerated, and Files to Log Analytics (a sketch of the query step follows the list)
  12. Configure a schedule to execute the runbook at suitable intervals.
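
A sketch of the query step in such a runbook, using the AzureRM.OperationalInsights module, is shown below. The resource group and workspace names are the example values from the list above; check the cmdlet documentation for the exact parameters in your module version, and process the returned records the same way as in the webhook variant.

# Sign in with the Run As connection, then query Log Analytics for the last hour of data
$conn = Get-AutomationConnection -Name 'AzureRunAsConnection'
Add-AzureRmAccount -ServicePrincipal -TenantId $conn.TenantId -ApplicationId $conn.ApplicationId -CertificateThumbprint $conn.CertificateThumbprint | Out-Null

$StartDateAndTime = (Get-Date).ToUniversalTime().AddHours(-1)   # now minus one hour
$result = Get-AzureRmOperationalInsightsSearchResults -ResourceGroupName 'automationresgrp' -WorkspaceName 'geekplayground' -Query 'Type=ContosoTestApp_CL queue' -Start $StartDateAndTime -End (Get-Date).ToUniversalTime()

# Each returned row is JSON text; convert it to objects before splitting out Computer, TimeGenerated and Files
$records = $result.Value | ConvertFrom-Json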

Both solutions send the number of files in the queue back to Log Analytics as a double data type. One of the benefits of building a custom PowerShell object and converting it to JSON before submitting it to Log Analytics is that you can easily control the data type. If you simply submit data to Log Analytics, the data type will be detected automatically, but sometimes the automatically detected data type is not what you expect; with the custom PS object you can control it. Thanks to Stan for this tip. The data will be stored twice in Log Analytics: the raw data and the processed data from Azure Automation.

Disclaimer: Cloud is a very fast-moving target. It means that by the time you’re reading this post everything described here could have been changed completely.
Note that this is provided “AS-IS” with no warranties at all. This is not a production ready solution for your production environment, just an idea and an example.

Experts Live Europe

I will deliver a session at Experts Live Europe in Berlin, August 23-25. My session is “Lessons learned from Microsoft Premier Support – Manage and control your Azure resources” and will show you how organizations manage and control their Microsoft Azure resources in real-world situations 🙂

Experts Live Europe is one of Europe’s largest community conferences with a focus on Microsoft cloud, datacenter, and workplace management. Top experts from around the world present discussion panels, ask-the-experts sessions, and breakout sessions covering the latest products, technologies, and solutions.

More info about the event and registration here.  See you there!