This is the start of a series wherein I will cover the Microsoft’s Business intelligence toolset from a IT Admin, DBA’s, and Developers perspective. Each post will cover one of the different tools and solution currently available under the Microsoft BI umbrella as well as best practices and common pitfalls. Having a basic grasp of what’s out there is important when it comes to supporting your users and might prevent developers from re-inventing the wheel again.
Power Query – Analysts love Excel
Within this article we will have a closer look into the first tool on the list; Power Query for Excel. Power Query is an Excel Add-in which simplifies data discovery, access, transformation and collaboration. The add-in isn’t installed nor can it be found on the installation media and therefore requires a separate download which can be found here: https://www.microsoft.com/en-us/download/details.aspx?id=39379. Deployment can be automated using Group Policy Software installation packages if desired.
Perhaps surprising for many, is that most of the tooling and solutions heavily rely on Microsoft Excel. This is a good thing because Excel is familiar to just about everyone with a computer and a keyboard. And therefore allowing people of different skill levels to leverage the capabilities provided by Microsoft’s BI platform.
For the end users, Power Query is the first tool when it comes to building reports and analyzing data. The add-in will help the user with finding internal and external data sources (discovery), connecting to a variety of predefined data source templates (access) and transform the data to the desired format. And last save and share the queries, build using the query editor, within your organization (collaboration).
Let’s take a closer look into the points highlighted above.
In Excel, the data tab is what most people have used to import data. However it is showing its age due to rapid changes in how we store and process data. Due to this there is the need for a better and simpler way of connecting to these data sources. This problem is solved by Power Query, which allows you to connect to data sources using one consistent experience. The source can be text, a relational database, OData, Hadoop, Facebook, Salesforce and many more.
Data is rarely structured and formatted in the way you would like it to be and that’s where the Query Editor comes in place. Within the Query Editor you are able to make simple changes like including or excluding columns as well as more advance options as alter the underlying query. This is also the area where a developer or power user could assist by providing/sharing prebuild queries, which then can be used as a template for others.
Apart from this, the Query Builder also includes a wide verity of very powerful transformation options like setting the datatype, transpose, pivot, unpivot and merge just to name a few. This will allow you to shape a local copy of the data and create one master data set out of multiple data sources.
NOTE: Note: it’s always possible to retrieve the latest version of data by refreshing a query. This will keep the formatting you established while editing. And therefore don’t have to worry about losing any of your data shaping work.
3 – Discovery
In addition to this, Power Query has an option to search for data sources published internally or located on the web. There is a wide variety of business related data available online and some of the data souses have been curated by Microsoft however this is more related to structure than validity of the data itself. The heading of the search dialog is a bit misleading however internal data will be listed under the Organization tab and public data available online under Public.
I highly recommend trying out some simple scenarios, even if you’re not an end user. I’m a developer myself and like working with data and therefore enjoyed playing around connection to various data sources including my Facebook data. This also helped me to get a deeper understanding on how data can be consumed which is important if you would like to expose new data sources for your organization.
4 – Collaboration
Queries can be shared within your organization for both private and public data sources. Sharing a query won’t automatically grant access to the underlying data. This is still the managed by the data source.
Exposing data sources
This topic will be covered in greater detail in a future article however here are some thing that you might want to keep in mind before exposing data.
When it comes to exposing data it’s important to keep your end users in mind. If they are experienced analysts, then pre-shaping the data might limit them in building reports and discovering insights. On the other hand, if data is to detailed, your business users might be struggling building simple reports.
The latter problem can be solved by sharing simplified abstractions in the form of a query. But then you also need to look at overall performance. It would be very bad if the creation of simple reports are taking ages to build and refresh due to excessive amount of complex transformations.
One additional point I would recommend is to appoint a “Power user”. This person should be partially responsible for simple tasks as educating users, managing query assets, keeping an eye on performance and or grating access. But this will depend on the knowledge level of this person and organizational constraints as well.
This was Power Query in a nutshell, one of the BI tools available within the Microsoft BI suite. The next article covers Power pivot, a tool that can used to perform Analytics on top of data gathered using Power Query.