Row level security in SQL and Power BI

In my previous post I explained how to enable row level security on your SQL database. However, if you use Power BI as a live connection to your database, it will filter according to the user configured in the data gateway. This effectively means that you cannot rely on SQL row level security if you use Power BI on top of it. With SSRS this is not a problem, as you can impersonate the original user rendering the report, even if you use stored credentials for your dataset.

This also applies to Azure SQL: a live connection with Azure Active Directory authentication appears not to be possible yet (thanks to my colleague Jerrold Stolk for trying that out 🙂 ).

What does this mean?

It means you may still want to implement your security in either Power BI or in SSAS. I would recommend an SSAS Tabular model, because there end users cannot alter the row level security. If you define it in Power BI itself, you would have to prevent users from downloading the pbix file directly and altering the security. Personally I think it is much wiser to define row level security in the source/backend and not in the reporting tool.

And at this moment, the only way to do this nicely is SSAS.

 

3 Important Lessons about Test Automation in Business Intelligence

As you might know, I'm keeping an eye on how I can automate my Data Warehousing testing. In "Automated Testing for the Data Warehouse" my observation was

In essence, testing a Data Warehouse isn't that complex

It turns out I'm not the only one thinking about this (really? 😉 ). Some time ago I listened to episode 72 of the SQL Data Partners podcast, starring Lynn Winterboer. There's really deep insight into the requirements for test-driven Data Warehouse work there; I definitely recommend listening to the full podcast.

Continue reading...

Row level security on your SQL database

One of the cool new features of SQL 2016 or SQL Azure is that you can add row level security to your database directly.

So you don't need to use a cube or a tabular model to implement row level security. Let me show you how you can use it. The example is based on AdventureWorks, the sample database from Microsoft. In this example I am going to show how you can restrict access to certain colors of products (not a very realistic scenario I guess, but it's purely an example).

First, we are going to create a table where we store which colors each user can see:
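The original code listing is missing here, but a minimal sketch could look like this (the table and column names, such as dbo.UserRights, are my own choice; nvarchar(15) matches the Color column in AdventureWorks):

```sql
-- Rights table: one row per (user, color) combination that is allowed.
CREATE TABLE dbo.UserRights
(
    UserName sysname      NOT NULL,
    Color    nvarchar(15) NOT NULL,
    CONSTRAINT PK_UserRights PRIMARY KEY (UserName, Color)
);

-- Example rows (hypothetical user names): which colors each user may see
INSERT INTO dbo.UserRights (UserName, Color)
VALUES (N'DOMAIN\alice', N'Black'),
       (N'DOMAIN\alice', N'Silver'),
       (N'DOMAIN\bob',   N'Red');
```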

Second, we are going to create a function that checks whether a user can see a certain color:

Make sure to add yourself to the userrights table, or add an exception in the function below so that certain roles can see all colors regardless.
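A sketch of such a predicate function, assuming the userrights table from the previous step is called dbo.UserRights. Row level security predicates must be inline table-valued functions created WITH SCHEMABINDING; this version also includes the role exception mentioned above:

```sql
-- Returns a row when the current user may see @Color, which is how
-- an RLS filter predicate signals "allow". Members of db_owner see everything.
CREATE FUNCTION dbo.fn_ColorPredicate (@Color AS nvarchar(15))
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
    SELECT 1 AS AllowAccess
    WHERE IS_ROLEMEMBER(N'db_owner') = 1
       OR EXISTS (SELECT 1
                  FROM dbo.UserRights AS ur
                  WHERE ur.UserName = USER_NAME()
                    AND ur.Color = @Color);
```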

And lastly, we need to enable row level security on the products table:
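In AdventureWorks the products table is Production.Product; a security policy binding the predicate to it could look like this (the policy name is my own):

```sql
-- Bind the predicate to the products table. Rows with a color the
-- current user may not see simply disappear from query results.
CREATE SECURITY POLICY dbo.ColorSecurityPolicy
    ADD FILTER PREDICATE dbo.fn_ColorPredicate(Color)
    ON Production.Product
    WITH (STATE = ON);
```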

There you go!

You may want to restrict access based on roles rather than users, but you can build on this example and try it out yourself 🙂

 

 

Recover the lost query

Just recently I was writing a very complicated query. It took me two hours to write. In the end I saved it as a view in the development database. I had not yet synchronized it back to the solution in Visual Studio, because I wanted to add a few related views for the problem I was solving. Maybe it would have been better to develop from Visual Studio, but sometimes you just want to see the results of the query and merge it back to Visual Studio later.

At that moment, however, a colleague of mine did a publish from his version of Visual Studio to the database on the development server. Gone were all my views and my hard work. 🙁 Lessons learned, of course. But then I felt a mission: the mission to find my lost query. I would rather spend two hours trying to recover my query than rewrite it from scratch. In the end I found this gem:
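The query from that Stack Overflow answer looks roughly like this; it digs the text of recently executed statements out of the plan cache:

```sql
-- Lists recently executed statements from the plan cache, newest first.
-- Requires VIEW SERVER STATE permission; the cache is not a full history,
-- so this only works while the plan hasn't been evicted.
SELECT execquery.last_execution_time AS [Date Time],
       execsql.text                  AS [Script]
FROM sys.dm_exec_query_stats AS execquery
CROSS APPLY sys.dm_exec_sql_text(execquery.sql_handle) AS execsql
ORDER BY execquery.last_execution_time DESC;
```

If you remember a distinctive table or column name, narrowing it down with `WHERE execsql.text LIKE '%YourTableName%'` saves a lot of scrolling.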

(taken from: http://stackoverflow.com/questions/15034832/recover-unsaved-sql-query-scripts)

And I did find my lost query!

So if you ran a query against your SQL database and accidentally closed the session, you might want to try this to get it back.

Strangely enough I did not find the ALTER VIEW statement, but I did find my SELECT statement. So you might have to search a little. But at least there is hope.

Trim the fat (put your data warehouse on a diet)

The idea of organizational Business Intelligence was to create "One version of the truth, One Ring to rule them all"... errrr, well, that was the idea. Away with all those Excel files and bring on the conformed BI reports, right? Of course it is a great idea to make sure you have a data warehouse where your data is guaranteed "right". For example, if you want to know how much profit you made last year, you don't want multiple interpretations!

How should we then view the rise of Power BI and self service BI? Hello, different versions of the truth?

The real question is: how many reports need to be exactly the same across the whole enterprise? How much integration do you need? In the past, organizations tried to fit everything into one big Enterprise Data Warehouse. Just saying EDWH makes managers shiver, remembering huge costs and little payback. Or big BI competence centers handling a huge backlog, because everything has to come from the EDWH. Lots of waiting time and saying "no" because there is too much work.

Sure, it helps to have master data available across the business, and data quality is not a bad idea either. But you don't have to push ALL the data into the data warehouse. I think there is more data than we used to think where you can give the data analyst the freedom to decide how he wants to see it.

Instead of this situation, where the blue is the managed BI or enterprise datawarehouse and purple is the self service BI layer:

 

We could "trim the fat" of the data warehouse and give users more freedom to go to this situation (taken from Motion10's data platform model):

 

The most important element here is to make sure you have a good data catalog, where users can easily find the datasets they want to use for their analysis. It will be interesting to see how Azure Data Catalog can play a role in this. At the moment I often use the datasets in Power BI in combination with content packs as a catalog. All datasets have clear, understandable names. That is as good a catalog as any other. If a report is complicated in any way, the first page contains a short self-explanatory description.

 

 

Best Practice: no more queries in frontend tools

You may have experienced it: your report was working just fine, until you published a database change and oops! A report doesn't work anymore. Especially when you have 70+ reports running their own queries, this can be a pain.

Why would a report contain a query? Well, because it is easy, quick to develop, and why not?

The reason is that you don't get to manage all those queries at all. They are the hidden queries: long forgotten, never tuned, and impossible to find from your database.

Please use views or stored procedures to feed your datasets in Reporting Services. If the query lives in the database and you use a daily build to check that the code still works, then you will be notified when the query breaks, before you publish it to production. The same goes for all front-end tools, including Power BI. At the very least use a view. Tools going to the database should not need to know the data model at all.

The idea is that a view can remain consistent while the underlying datamodel can change as needed.
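For example, a report dataset could point at a view like this (a hypothetical AdventureWorks-style example; the report never touches the underlying tables directly):

```sql
-- The report queries this view. The underlying tables can change as
-- needed, as long as the view keeps producing the same columns.
CREATE VIEW dbo.vw_SalesByProduct
AS
SELECT p.Name              AS ProductName,
       SUM(sod.LineTotal)  AS TotalSales
FROM Sales.SalesOrderDetail AS sod
JOIN Production.Product     AS p ON p.ProductID = sod.ProductID
GROUP BY p.Name;
```

When a database change breaks the view, the daily build tells you; when it breaks a query hidden inside a pbix or rdl file, the first person to find out is a user.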

Azure Data Factory, first thoughts

I have been trying out Azure Data Factory.

You may wonder whether ADF is more like SSIS in the cloud or SQL Server Agent in the cloud. It does not quite feel like a pure ETL tool. Coming from SSIS on premise, this is a completely different experience. My first impression is that the UI is not very intuitive. It takes a while before you realize where you need to be when solving a problem. Another thing I found is that there is not very much about it online. Stuck with a problem? You're likely to be on your own. Googling my error message got me nothing; that's quite refreshing for me (unthinkable for a SQL Server error) 🙂

It is quite easy to copy data from A to B. In other words, you can do the E and the L: Extract and Load. The T is not very easily done. If you want to get some data from Azure Blob Storage and load it into your Azure SQL database, this is a good tool. But if you want to do complex transformations? Not so sure yet...

 

 

 

Sharing your Power BI reports

If you are using Power BI, you may want to consider how you are going to share your reports and dashboards. Should you share your dashboards? Should you use groups or content packs?

There are several ways to do this.

 

  1. Share your Dashboards: open your dashboard and click on the "Share" Icon. Your recipients can open the underlying reports but they cannot edit your report or create new reports. This is great for situations where you just want your users to see data, but not make reports on their own.
  2. Create a content pack. This way, you can share a bundle of reports, dashboards and datasets. The recipients cannot alter the reports they receive, but they can make a copy for themselves and edit that one. They can also create new reports on the same dataset. This is a great way to bring out a bundle of formal reports which will always stay the same, yet still allow personal versions of the report.
  3. Create a group: if you create a group, all participants of that group can alter the reports and co-create content packs. This is great if people all work together on content and have the same rights.
  4. Publish to web. Ok this is not really sharing, this is straight out publishing and should be done ever so carefully!

You can mix it together too: at this moment I have a Power BI group where several users can edit the reports. Together they publish the content pack to the rest of the organization that consumes the reports.

How do you use Power BI?

 

Your data is going places

There are still a lot of people who do not believe BI and cloud go together.

"My data is too valuable to store anywhere else but on premise!". Most of the times we are talking about gut feeling here. It is not a strange thought either. But the funny thing is: maybe your data is already in the cloud...

Using Office 365? Your data is in the cloud.

That report someone put in their Dropbox? It's in the cloud.

Someone in your organisation uses Power BI and publishes to powerbi.com? You are in the cloud.

This may seem obvious (and it is). But unless your data governance is under tight control, your data may already be going places.

How good is your data governance?

Building the Poor Man's IoT Presentation (+ inspiration for new posts)

As you might remember, I ran a few posts covering Azure IoT Hub early this year:

Last Saturday I did a talk at SQL Saturday in the Netherlands covering my experiences. If you're interested in my presentation, the slide deck is on the SQLPass website now: http://www.sqlsaturday.com/551/Sessions/Details.aspx?sid=51033.

Also, I got some nice inspiration there for new posts and developments, which I plan to blog about soon. Something IoT-y and some other things about automated testing, deployments and CI:

Continue reading...