In this post, I will focus on the basics of availability and performance monitoring within the Azure preview portal. This includes creating custom dashboards, adding and altering charts, enabling diagnostics and creating alerts.
Azure Portal Monitoring basics – Before we start
I think we all agree that service monitoring is crucial to prevent potential issues from happening and to ensure that all failures are promptly detected, minimizing the impact on the business. The question, however, is how this can be achieved within Azure. Well, as with many things in life, it depends. If you don’t have a large number of services hosted within Azure, the Azure preview portal is an excellent option. (Advanced tooling is covered in the section Azure Monitoring: The next level.)
Before we start building dashboards and spamming all IT operations guys with email alerts, we first need to define what we would like to know about our services. Let’s start with the following list, which will most likely cover all of our requirements.
- Are the services running?
- Can they cope with the load?
- How do we know when something crucial happens?
Fortunately, there is some great tooling available within the Azure Portal that allows us to make all of this visible.
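To make the first question ("Are the services running?") a bit more concrete, here is a minimal sketch of an availability probe in plain Python. It is not something the portal provides or requires; the function name probe, its parameters and the rule that any 2xx or 3xx status counts as "up" are my own assumptions for illustration.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return (is_up, status_code, elapsed_seconds) for a single HTTP GET.

    The opener parameter is a hypothetical hook so the check can be
    exercised without touching the network.
    """
    started = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status (4xx / 5xx).
        status = error.code
    except (urllib.error.URLError, OSError):
        # No answer at all: DNS failure, refused connection or timeout.
        return False, None, time.monotonic() - started
    elapsed = time.monotonic() - started
    # Assumption: any 2xx or 3xx response means the service is running.
    return 200 <= status < 400, status, elapsed
```

Calling probe with the address of one of your own endpoints returns a tuple with the up/down verdict, the HTTP status and the response time, which loosely corresponds to the first two questions in the list above.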
1 – Dashboard / Startboard 101
At this point, open the Azure preview portal located at http://portal.azure.com. By default, the Portal Startboard is populated with some generic items. To remove an item from the Startboard, right-click it and select » Unpin from Startboard. To change the size or location of an item, right-click and select » Customize, which enables the edit mode. While in edit mode, you can drag the items around. To change the size, right-click an item and select the desired size from the context menu. When you are finished, click the Done button or right-click » Done customizing.
2 – Add Azure Service Health resources
Within the Right menu pane, select Browse, scroll all the way down and select » Service Health. A new blade will appear listing all the services within Azure. Select the service you are interested in and another blade will pop up on the right. Here you can select the region where your resources are located.
NOTE: For the demo I will be adding Websites (North Europe, SE Asia), Monitoring (global) and Virtual Machines (Japan East).
3 – Adding service related information
Almost all services expose information such as configuration settings, object counters or pricing, depending on the type of Azure service. Some of these items provide useful information for our Startboard, while others can be used as a simple shortcut.
Let’s see how this works by adding some new items to our Startboard. For this example I will be adding the Container count, Estimated spend and the Settings for a given storage account.
From the right menu, select » Browse » Storage and select a storage account. Within the details blade, right-click on Containers and select Pin to Startboard. After rearranging the items, my Startboard looks like this.
4 – Enabling diagnostic collection
Before we start adding charts, it’s important to know a bit more about the underlying data. Every type of service in Azure has its own metrics, which makes sense, and every service instance has its own diagnostic collection setup. By default, diagnostic collection captures only a small number of metrics, which makes it hard to build interesting reports.
To enable or configure diagnostics (I’m using a storage account for the example), open the object and click on any of the charts within the Monitoring section. A new blade will appear with a button called Diagnostics at the top. After clicking on Diagnostics, another configuration blade will show up containing all Diagnostics settings.
NOTE: The collection of diagnostics has to be configured on a service-by-service basis. If you have a VM-based auto-scaling scenario, you need to enable diagnostics on all pre-created VMs.
5 – Adding charts
We are finally at the part where charts get involved. To add a chart, you need to start with an existing chart found on the details blade of the service. I will be using a storage account again: right-click on TotalRequest today and then select » Pin to Startboard.
To customize this chart, go to the Startboard, right-click on the chart and select Edit Chart. The Edit Chart blade will display the available options (time / metrics) based on the service type and the diagnostics logging options enabled.
All services expose different types of diagnostics data, so make sure to explore all available options. But as you can see, this allows the creation of some interesting Startboards.
6 – Alerts
Of course no one has time to stare at the monitor all day and this is where alerts come in.
When it comes to alerts, it’s important to keep a baseline that defines what normal behavior looks like. For example, the CPU usage of a front-end service cannot be compared with that of an instance responsible for processing calculations in batch.
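One simple way to turn such a baseline into a number is to take historical samples of a metric and place the alert threshold a few standard deviations above their mean. The sketch below is only an illustration of that idea, not how the portal calculates anything; the function name baseline_threshold, the default of three standard deviations and the sample values are all assumptions.

```python
import statistics


def baseline_threshold(samples, sigmas=3.0):
    """Suggest an alert threshold: the mean of past samples plus a safety margin.

    samples is a non-empty list of historical readings (for example CPU %),
    and sigmas is how many standard deviations above the mean count as abnormal.
    """
    mean = statistics.fmean(samples)
    spread = statistics.pstdev(samples)
    return mean + sigmas * spread


# Hypothetical CPU readings (%) for two very different workloads.
front_end = [10, 12, 11, 13, 14]
batch_worker = [80, 85, 90, 95, 100]

print("front-end threshold:", round(baseline_threshold(front_end), 1))
print("batch threshold:", round(baseline_threshold(batch_worker), 1))
```

The two workloads end up with very different thresholds, which is exactly why a single CPU limit for every service tends to produce either noise or silence.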
To create a new Alert, open the details blade of the service (I’m using a VM instance) and select Alert rules. In the Alert rules blade select New Alert.
At this point, select the desired metric and conditions. For this example, I want to receive an alert if the CPU utilization is over 1% for a duration of 5 minutes. Once the condition was met, I received the following email.
This is what you will see within the portal. The alert can be added as a dashboard indicator as well.
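To illustrate what a rule such as "over a threshold for a duration of 5 minutes" boils down to, here is a minimal sketch in Python. It assumes the rule compares the average of all samples inside the window against the threshold; I have not verified that the portal evaluates alerts exactly this way, and the function name should_alert and the sample readings are hypothetical.

```python
from datetime import datetime, timedelta


def should_alert(samples, threshold, window=timedelta(minutes=5), now=None):
    """Return True when the average of the samples inside the window exceeds the threshold.

    samples is a sequence of (timestamp, value) pairs. now defaults to the
    newest timestamp so the result does not depend on the wall clock.
    """
    samples = list(samples)
    if not samples:
        return False
    if now is None:
        now = max(timestamp for timestamp, _ in samples)
    # Keep only the readings that fall inside the evaluation window.
    recent = [value for timestamp, value in samples if now - window < timestamp <= now]
    if not recent:
        return False
    return sum(recent) / len(recent) > threshold


# Hypothetical CPU readings (%), one per minute, for illustration only.
start = datetime(2015, 1, 1, 12, 0)
readings = [0.5, 0.8, 2.0, 3.0, 2.5, 4.0, 3.5]
cpu = [(start + timedelta(minutes=i), value) for i, value in enumerate(readings)]

print(should_alert(cpu, threshold=1.0))  # True: the last five readings average 3.0
```

Averaging over the window, rather than reacting to a single reading, keeps a short spike from triggering an email.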
Azure Monitoring: The next level
If you want to take monitoring to the next level, I would recommend looking at the following options:
System Center Operations Manager Monitoring
If you already have SCOM (System Center Operations Manager) running on-premises, installing the System Center Management Pack for Windows Azure enables you to monitor the availability and performance of resources that are running on Windows Azure.
Nagios and Zabbix are alternatives to SCOM. I know that there are many other monitoring platforms, but those are the only two with Azure support as far as I’m aware. So if you have existing investments in Nagios or Zabbix, it’s still possible to have a single place for monitoring your assets.
Nagios is an interesting platform that is often used in Linux-based environments. The Nagios Azure plugin is available from Microsoft Open Technologies:
Nagios for Azure Monitoring and Alerting
Azure Operational Insights
Azure Operational Insights (formerly known as System Center Advisor) is a service that analyzes Microsoft server software. Advisor collects data from your installations, analyzes it, and generates alerts that identify potential issues (such as missing security patches) or deviations from identified best practices with regard to configuration and usage. Advisor also provides both current and historical views of the configuration of servers in your environment.
Application Insights
Application Insights lets you quickly diagnose problems in your live application and understand what your users do with it. Data collection can be enabled by adding the SDK to your Visual Studio solution or by installing the agent on the web server.