Home » Log Analytics (Page 2)

Category Archives: Log Analytics


Welcome to contoso.se! My name is Anders Bengtsson and this is my blog about Azure infrastructure and system management. I am a senior engineer in the FastTrack for Azure team, part of Azure Engineering, at Microsoft.  Contoso.se has two main purposes, first as a platform to share information with the community and the second as a notebook for myself.

Everything you read here is my own personal opinion and any code is provided "AS-IS" with no warranties.

Anders Bengtsson

MVP awarded 2007,2008,2009,2010

My Books
Service Manager Unleashed
Service Manager Unleashed
Orchestrator Unleashed
Orchestrator 2012 Unleashed
Inside the Microsoft Operations Management Suite

Inside Azure Management [e-book]

We are excited to announce the Preview release of Inside Azure Management is now available, with more than 500 pages covering many of the latest monitoring and management features in Microsoft Azure!

March 27, 2019] We are excited to announce the Preview release of Inside Azure Management is now available, with more than 500 pages covering many of the latest monitoring and management features in Microsoft Azure! 

This FREE e-book is written by Microsoft MVPs Tao Yang, Stanislav Zhelyazkov, Pete Zerger, and Kevin Greene, along with Anders Bengtsson.

Description: “Inside Azure Management” is the sequel to “Inside the Microsoft Operations Management Suite”, featuring end-to-end deep dive into the full range of Azure management features and functionality, complete with downloadable sample scripts.  

The chapter list in this edition is shown below:

  • Chapter 1 – Intro
  • Chapter 2 – Implementing Governance in Azure
  • Chapter 3 – Migrating Workloads to Azure
  • Chapter 4 – Configuring Data Sources for Azure Log Analytics
  • Chapter 5 – Monitoring Applications
  • Chapter 6 – Monitoring Infrastructure
  • Chapter 7 – Configuring Alerting and notification
  • Chapter 8 – Monitor Databases
  • Chapter 9 – Monitoring Containers
  • Chapter 10 – Implementing Process Automation
  • Chapter 11 – Configuration Management
  • Chapter 12 – Monitoring Security-related Configuration
  • Chapter 13 – Data Backup for Azure Workloads
  • Chapter 14 – Implementing a Disaster Recovery Strategy
  • Chapter 15 – Update Management for VMs
  • Chapter 16 – Conclusion

Download your copy here

Update Service Map groups with PowerShell

Service Map automatically discovers application components on Windows and Linux systems and maps the communication between services. With Service Map, you can view your servers in the way that you think of them: as interconnected systems that deliver critical services. Service Map shows connections between servers, processes, inbound and outbound connection latency, and ports across any TCP-connected architecture, with no configuration required other than the installation of an agent. Machine Groups allow you to see maps centered around a set of servers, not just one so you can see all the members of a multi-tier application or server cluster in one map. Source Microsoft Docs

A common question is how to update machine groups in Service Map automatically. Last week my colleague Jose Moreno and I was worked with Service Map and investigated how to automate machine group updates. The result was a couple of PowerShell examples, showing how to create and maintain machine groups with PowerShell. You can find all the examples on Jose GitHub page. With these scripts we can now use a source, for example, Active Directory groups, to set up and update machine groups in Service Map.

Building reports with Log Analytics data

A common question I see is how to present the data collected with Log Analytics. We can use View Designer in Log Analytics, PowerBI, Azure Dashboard, and Excel PowerPivot. But in this blog post, I would like to show another way to build a “report” direct in the Azure Portal for Log Analytics data.

Workbooks is a feature in Application Insights to build interactive reports. Workbooks are configured under Application Insights but it’s possible to access data from Log Analytics.

In this example, we will build a workbook for failed logins in Active Directory. The source data (event Id 4625) is collected by the Security and Audit solution in Log Analytics.

If we run a query in Log Analytics to show these events, we can easily see failed login reason and number of events. But we would also like to drill down into these events and see account names. That is not possible in Log Analytics today, and this is where workbooks can bring value.

Any Application Insights instance can be used; no data needs to be collected by the instance (no extra cost) as we will use Log Analytics as a data source. In Application Insights, there are some default workbooks and quick start templates. For this example, we will use the “Default Template.”

In the workbook, we can configure it to use any Log Analytics workspace, in any subscription, as a source. Using different workspaces for different parts of the workbook is possible. The query used in this example is shown below, note it shows data for the last 30 days.

| where AccountType == ‘User’ and EventID == 4625
| where TimeGenerated > ago(30d)
| extend Reason = case(
SubStatus == ‘0xc000005e’, ‘No logon servers available to service the logon request’,
SubStatus == ‘0xc0000062’, ‘Account name is not properly formatted’,
SubStatus == ‘0xc0000064’, ‘Account name does not exist’,
SubStatus == ‘0xc000006a’, ‘Incorrect password’,
SubStatus == ‘0xc000006d’, ‘Bad user name or password’,
SubStatus == ‘0xc000006f’, ‘User logon blocked by account restriction’,
SubStatus == ‘0xc000006f’, ‘User logon outside of restricted logon hours’,
SubStatus == ‘0xc0000070’, ‘User logon blocked by workstation restriction’,
SubStatus == ‘0xc0000071’, ‘Password has expired’,
SubStatus == ‘0xc0000072’, ‘Account is disabled’,
SubStatus == ‘0xc0000133’, ‘Clocks between DC and other computer too far out of sync’,
SubStatus == ‘0xc000015b’, ‘The user has not been granted the requested logon right at this machine’,
SubStatus == ‘0xc0000193’, ‘Account has expirated’,
SubStatus == ‘0xc0000224’, ‘User is required to change password at next logon’,
SubStatus == ‘0xc0000234’, ‘Account is currently locked out’,
strcat(‘Unknown reason substatus: ‘, SubStatus))
| project TimeGenerated, Account, Reason, Computer

In the workbook, on Column Settings, we can configure how the result will be grouped together. In this example, we will group by failed login reason and then account name.

When running the workbook, we get a list of failed login reasons and can expand to see account names and amount of failed logins. It is possible to add an extra filter to the query to remove “noise” for example accounts with less than three failed login events.
It is also possible to pin a workbook or part of a workbook, to an Azure Dashboard, to easily access the information.

In the workbook you can also add more text fields, metric fields and query fields, for example a time chart showing the amount of events per day.

Ingestion Time in Log Analytics

A common topic around Log Analytics is ingestion time. How long time does it take before an event is visible in Log Analytics?
The latency depends on three main areas agent time, pipeline time and indexing time. This is all described in this Microsoft Docs article.

In Log Analytics or Kusto, there is a hidden DateTime column in each table called IngestionTime. The time of ingestion is recorded for each record, in that hidden column. The IngestionTime can be used to estimate the end-to-end latency in ingesting data to Log Analytics. TimeGenerated is a timestamp from the source system, for example, a Windows server. By comparing TimeGenerated and IngestionTime we can estimate the latency in getting the data into Log Analytics. More info around IngestionTime policy here.

In the image below a test event is generated on a Windows, note the timestamp (Logged).

When the event is in Log Analytics, we can find it and compare IngestionTime and TimeGenerated. We can see that the difference is around a second. TimeGenerated is the same as “Logged” on the source system. This is just an estimate, as the clocks on the server and in Log Analytics might not be in sync.

If we want to calculate the estimated latency, we can use the following query. It will take all events and estimate the latency in minutes, and order it by latency.

| extend LatencyInMinutes = datetime_diff('minute', ingestion_time(), TimeGenerated)
| project TimeGenerated, ingestion_time(), LatencyInMinutes
| order by LatencyInMinutes

You can also summaries the average latency per hour, and generated a chart, with the following query. This is useful when investigating latency over a longer period of time.

| extend LatencyInMinutes = datetime_diff('minute', ingestion_time(), TimeGenerated)
| project TimeGenerated, ingestion_time(), LatencyInMinutes
| summarize avg(LatencyInMinutes) by bin(TimeGenerated, 1h)

Disclaimer: Cloud is a very fast-moving target. It means that by the time you’re reading this post everything described here could have been changed completely.
Note that this is provided “AS-IS” with no warranties at all. This is not a production-ready solution for your production environment, just an idea, and an example.

Analyze and visualize Azure Firewall with Log Analytics View Designer

A colleague and I have put together a sample view for Log Analytic to analyze and visualize Azure Firewall logs. You can download the sample view here. The sample view will visualize data around application rule and network rule log data. With view Designer in Azure Log Analytics, you can create custom views to visualize data in your Log Analytics workspace, read more about View Designer here.


Monitor Linux Daemon with Log Analytics

In this blog post I would like to share an example of how daemons on Linux machines can be monitored with Log Analytics. Monitoring daemons are not listed as a feature direct in the Log Analytic portal, but it is possible to do. When a daemon is started or stopped a line is written in Syslog. Syslog is possible to read with the Microsoft Monitoring Agent and send to Log Analytics.

The only thing to configure is to enable collection of Syslog and the daemon facility.

If the daemon is stopped (the cron daemon in this example) the following lines are written to the syslog logfile

Soon after the same lines are written to Log Analytics as events in the Syslog table

You can now configure an alert including notification when the daemon stops. The alert can, for example, be visualized in Azure Monitor and sent by e-mail.




Disclaimer: Cloud is a very fast-moving target. It means that by the time you’re reading this post everything described here could have been changed completely.
Note that this is provided “AS-IS” with no warranties at all. This is not a production-ready solution for your production environment, just an idea, and an example.

Deploying a central auditing workspace with Log Analytics

One or multiple workspaces

A common question when talking Log Analytics design is one or multiple workspaces. Should there be one central workspace with all data? Should there be one workspace per application? Should there be one workspace for the auditing team? There are many different ideas and scenarios, but a common component is a central workspace for auditing. One workspace where a central team can track security-related events, assessments, and alerts.

The following topics are often involved when the decision to use one or multiple workspaces

  • In which region do we need to store the data, for example, the data must be stored within EU
  • Data Retention. The number of days to store the data is configured on a workspace level. That means that we will need to pay for the same retention setting on all our data within a workspace. If we have some data that needs to be stored for 7 days and some important data needs to be stored for 200 days, we will need to pay and store all data for 200 days.
  • Data Access. Today the workspace is the security boundary for Log Analytics. If we, for example, have log data from two different teams, that are not allowed to see each other’s data, we will need to store them in different workspaces.
  • Data Collection. Today there are solutions and data collection settings that are set on workspace level. For example, we enable collection of warnings in the Application log on Windows servers, it will be collected from all connected Windows servers, even if we only need it from some of our servers. This can affect the total cost in a negative way if collecting data not needed. In this scenario, it might be an idea to connect some servers to one workspace and others to another workspace.

When decided to use multiple workspaces it is possible to multi-home Windows servers to send data to multiple workspaces. For Linux servers and some other data sources, for example, multiple PaaS services can today send data to one workspace. One thing to note when configuring multi-home data sources is that if the same data is collected and inserted into multiple workspaces, we also pay for that data twice. In other words, it is a good idea to make sure that different kind of data is collected for each workspace, for example, audit data to one workspace and application logs to another.

The following figure describes a scenario where two application teams have their own workspaces, and there is one workspace for central auditing. The auditing team needs access to data from both service workspaces, to run analyzes and verify that everything is running according to company policies.

To deploy this scenario simple deploy three workspaces and give the central auditing team read permissions on each service workspace, see Microsoft Docs for more details.

Cross workspace queries

The next step is to start author queries to analyze and visualize the data. The data is stored in each service workspace so the Auditing team will need to use the cross-workspace query feature, read more about it (https://docs.microsoft.com/en-us/azure/log-analytics/log-analytics-cross-workspace-search). Data is only stored in the two service workspaces, there is no data in the central auditing workspace.

The following query is a cross workspace query example, query two workspaces and list failed logon events. In the query we use “isfuzzy” to tell Log Analytics that execution of the query will continue even if the underlying table or view reference is not present. We can also see the two workspace ID, one for each service workspace, and that we use the SecurityEvent table.

union isfuzzy=true
workspace(“b111d916-5556-4b3c-87cf-f8d93dad7ea0”).SecurityEvent, workspace(“0a9de77d-650f-4bb1-b12f-9bcdb6fb3652”).SecurityEvent
| where EventID == 4625 and AccountType == ‘User’
| extend LowerAccount=tolower(Account)
| summarize Failed = count() by LowerAccount
| order by Failed desc

The following example shows all failed security baseline checks for the two service workspaces

union isfuzzy=true
workspace(“b111d916-5556-4b3c-87cf-f8d93dad7ea0”).SecurityBaseline, workspace(“0a9de77d-650f-4bb1-b12f-9bcdb6fb3652”).SecurityBaseline
| where ( RuleSeverity == “Critical” )
| where ( AnalyzeResult == “Failed” )
| project Computer, Description

To make cross workspace queries a bit easier we can create a function. For example, run the following query and save it then as a function.

union isfuzzy=true
workspace(“b111d916-5556-4b3c-87cf-f8d93dad7ea0”).SecurityBaseline, workspace(“0a9de77d-650f-4bb1-b12f-9bcdb6fb3652”).SecurityBaseline

We can then call the function in our queries, for example, to get all failed security baseline checks. We don’t need to specify workspaces to join, as they are handled by the function.

| where ( RuleSeverity == “Critical” )
| where ( AnalyzeResult == “Failed” )
| project Computer, Description

Another way of using saved functions is the following example.
First, we have a saved function named ContosoCompMissingUpdates listing all computers that are missing updates.

union isfuzzy=true
workspace(“b111d916-5556-4b3c-87cf-f8d93dad7ea0”).Update, workspace(“0a9de77d-650f-4bb1-b12f-9bcdb6fb3652”).Update
| where UpdateState == ‘Needed’ and Optional == false and Classification == ‘Security Updates’ and Approved != false
| distinct Computer

We can then use the ContosoCompMissingUpdates function within a query showing machines with failed Security baseline checks. The result is a list of machines missing updates and with failed baseline checks.

| where ( RuleSeverity == “Critical” )
| where ( AnalyzeResult == “Failed” )
| where Computer in (ContosoCompMissingUpdates)
| project Computer, Description


Disclaimer: Cloud is a very fast-moving target. It means that by the time you’re reading this post everything described here could have been changed completely.
Note that this is provided “AS-IS” with no warranties at all. This is not a production-ready solution for your production environment, just an idea, and an example.