Azure IoT Hub is out! Here's how you connect it to Power BI
Azure IoT Hub is public! And if you're a little bit like me, you want to play with it NOW - which is what I did. Guess what? It turns out to be not only easy, but also pretty cheap (not to mention awesome) to set up your IoT Proof of Concept!
This Christmas holiday I spent some time fiddling with sensors to build a Proof of Concept on Azure IoT. It isn't very hard to do! To give you an idea of how easy it is to set up (and to try yourself), I've covered all the steps in two posts. The first one is about connecting the IoT data to Power BI.
My "Poor Man's IoT"
On a home budget, I've built a small IoT appliance that takes temperature measurements in my freezer (which is kinda old-fashioned - it doesn't even have an internet connection!). I'll go into detail in a follow-up post describing the DIY part (so you can try it at home), but here's the general setup:
This post focuses on where the magic happens: how does data move from Azure IoT Hub into Power BI?
Background: IoT Hub Architecture
In the Reference Documentation for Real Time Event Processing1 you'll find the following diagram with architectural layers of real-time event processing systems:
The Azure IoT Hub isn't mentioned here yet, but it is positioned beside the "Event Hub", the left item on the second-to-last row. This means that data arriving from sensors needs to be transformed and stored before anything can be seen in Power BI. The good news is that this step has been made really easy.
Transforming and storing sensor data
The transformation part is handled by Stream Analytics. Assuming you have already set up an Azure IoT Hub, log on to the Azure Portal and create a Stream Analytics job:
Click "Add", fill in the details:
After you click 'Create', you'll have a Stream Analytics job within a few seconds.
Because we need to transform the incoming IoT data, we'll need three parts, which Azure already shows us (did I mention it's easy?) - Inputs, Query and Outputs:
Adding an input
To add your IoT Hub appliance as an input, click 'Inputs' -> 'Add' and fill in the details. Don't forget to select 'IoT Hub' as the source2!
After clicking 'Create', within a few seconds you'll have your input created.
Transforming the data - building a Query
The data from IoT Hub can arrive as JSON, Avro or CSV, and must be transformed into a tabular structure in order to be consumed by other services (including, in our case, Power BI). In my Poor Man's IoT setup, the sensors already used JSON to communicate with the Field Gateway, so I decided to see whether I could send that information to the cloud and do all transformations there.
Here's an excerpt of some messages the Field Gateway receives from sensors:
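(The original excerpt is a screenshot. As a hypothetical illustration - the field names below are my assumption, not the actual Field Gateway format - such messages could be parsed like this:)

```python
import json

# Hypothetical sensor messages; the real Field Gateway format may differ.
raw_messages = [
    '{"SensorType": "temperature", "ValueType": "current", "Value": -18.5}',
    '{"SensorType": "temperature", "ValueType": "max", "Value": -17.9}',
    '{"SensorType": "temperature", "ValueType": "min", "Value": -19.2}',
]

# Each message is one small JSON document: a sensor type, a value type
# ('current', 'max' or 'min') and the measured value.
for raw in raw_messages:
    msg = json.loads(raw)
    print(msg["SensorType"], msg["ValueType"], msg["Value"])
```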
As soon as you click the 'Query' part of your job topology, a SQL-like script template shows up:
This language is a subset of T-SQL, with some useful extensions specifically for streaming applications. For example, it can aggregate values over time in tumbling windows of a predefined timespan. A full description can be found at https://msdn.microsoft.com/en-us/library/azure/dn834998.aspx; common patterns are at https://azure.microsoft.com/en-us/documentation/articles/stream-analytics-stream-analytics-query-patterns/.
I added the following query, which pivots several input rows into one output row while aggregating values over five-second tumbling windows. On top of that, it adds the current time to each output row:
SELECT
      SensorType
    , MAX(CASE WHEN ValueType = 'max' THEN Value ELSE NULL END) AS MaxValue
    , MIN(CASE WHEN ValueType = 'min' THEN Value ELSE NULL END) AS MinValue
    , AVG(CASE WHEN ValueType = 'current' THEN Value ELSE NULL END) AS AvgCurrentValue
    , MAX(CASE WHEN ValueType = 'current' THEN Value ELSE NULL END) AS MaxValueCurrentWindow
    , MIN(CASE WHEN ValueType = 'current' THEN Value ELSE NULL END) AS MinValueCurrentWindow
    , System.Timestamp AS time
INTO [YourOutputAlias]
FROM [YourInputAlias]
GROUP BY SensorType, TumblingWindow(ss, 5)
As soon as you click 'Test', you'll currently (as of February 10th, 2016) have to move back to the "old" Azure Portal. There you are prompted to upload a sample file. Just put some JSON responses below each other in one '.json' file, upload it, and view the results of your query:
If the results are not what you wanted, you can simply alter the query and click 'Rerun' - a neat, interactive way of testing queries.
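Producing such a sample file is trivial; here's a quick sketch (the message fields are my assumption - match them to whatever your own sensors send):

```python
import json

# Write a few sample sensor messages to one .json file for the query test.
# Field names here are assumptions; use the fields your sensors actually send.
samples = [
    {"SensorType": "temperature", "ValueType": "current", "Value": -18.5},
    {"SensorType": "temperature", "ValueType": "max", "Value": -17.9},
    {"SensorType": "temperature", "ValueType": "min", "Value": -19.2},
]

with open("sample.json", "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")
```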
Adding an output - Save to Power BI
With the data available in tabular form, all that's left to do is move it to Power BI by creating an output. This, too, is not yet supported in the new portal, but it is easy to do using the old one.
To add an output, open the Stream Analytics job, click 'Outputs' and then 'Add an Output'3.
Choose 'Power BI' as output type, either authorize as an existing user or register as a new user, and finally enter the Power BI settings.
Wait, what..? I don't see anything in Power BI!
If everything went okay, only two preconditions need to be met before data can show up in Power BI:
- The Stream Analytics job must be started (click 'Start'; within a few minutes the job will be running)
- Sensor data needs to be available and processed
You read that correctly: if no data is available, the dataset will not show up in Power BI. This is also the case if the sensor data is already "inside" Stream Analytics but your aggregation window hasn't closed yet! In my particular case, data is sent to Power BI only once every five seconds.
As soon as the data starts arriving, though, you can build a report - and even while building the report, it will already show LIVE data from your IoT Hub:
Conclusion: Where does the magic happen?
As you can see, it's really easy to connect your Azure IoT Hub to Power BI. But where does Stream Analytics store the incoming data? It's stored in a dataset inside your Power BI workspace. So make use of the aggregation functions of Stream Analytics, or you'll soon be out of space 😉 .
To finish and summarize, let's fill in the "Magic" gap from the first image. Here's the complete picture: