Data quality is everything

When you lead a company, every decision you make has an impact. While a lot of people still make decisions on gut feeling, big data analytics, BI and data-driven insights are on the rise. Every decision based on data, however, is only as good as the quality of that data. And data quality is not always what it should be.

Bad data quality is everywhere because there is a lot of human input, and humans make errors. Or there are a lot of data sources with different definitions, because people needed a solution fast. And a lot of different solutions with poor integration lead to, you guessed it, poor data quality.

The answer a lot of BI consultants give to ensure data quality is an enterprise data warehouse, suggesting that if all the data in that warehouse is cleansed, checked and approved, all decisions flowing from that data must be good. Another piece of advice may be to buy a certain data quality tool. The biggest trap you might fall for is believing that a technology can solve data quality. It cannot; rather, it is the people using and creating data who need to make a shift in thinking.

Let's make one thing clear: data quality is not a tool, nor something a platform provides. It is a matter of culture.

Of course, if you ask Microsoft, data quality is very much a tool. They released Data Quality Services and Master Data Services around 2012, with mixed success. We are still waiting for a definitive Azure version of those products, or rather, for something new on that front. But ultimately, it all comes down to the people using and creating data.

Let's state it again: data quality is not a product, it is a culture. It should be part of the data-driven culture that you want to see in your office. It is everyone's responsibility to keep data at a high level of quality. Tools and platforms can help you find data issues, but solving them is best done in the source systems. By cleansing data at the source, you can be sure that all downstream databases and reports use the correct data.
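To make the "find with tools, fix at the source" idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the record layout, the field names (`source`, `email`, `country`), the agreed country code list and the two example source systems are all hypothetical. The point is that the check only reports issues, grouped per source system, so the owner of each system can correct the data there instead of someone patching it downstream.

```python
from collections import defaultdict

# Hypothetical records pulled from two source systems; all field names are assumptions.
RECORDS = [
    {"source": "crm", "id": 1, "email": "anna@example.com", "country": "NL"},
    {"source": "crm", "id": 2, "email": "", "country": "NL"},
    {"source": "erp", "id": 7, "email": "bob@example.com", "country": "Holland"},
    {"source": "erp", "id": 8, "email": "carl-at-example.com", "country": "BE"},
]

# Assumed code list that the business has agreed on.
VALID_COUNTRIES = {"NL", "BE", "DE"}


def find_issues(record):
    """Return a list of human-readable issues for one record."""
    issues = []
    email = record.get("email", "")
    if not email:
        issues.append("missing email")
    elif "@" not in email:
        issues.append("malformed email")
    if record.get("country") not in VALID_COUNTRIES:
        issues.append("country not in agreed code list")
    return issues


def issues_by_source(records):
    """Group issues per source system so each owner gets their own fix list."""
    report = defaultdict(list)
    for record in records:
        for issue in find_issues(record):
            report[record["source"]].append((record["id"], issue))
    return dict(report)


if __name__ == "__main__":
    # Print a simple fix list per source system; nothing is modified here.
    for source, items in issues_by_source(RECORDS).items():
        print(source)
        for record_id, issue in items:
            print(f"  record {record_id}: {issue}")
```

Note that the sketch deliberately never changes a record. Deciding whether "Holland" should become "NL" is something the people who own the source system have to agree on, which is exactly the cultural part a tool cannot do for you.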

Culture matters more because a platform or a product might not be used unless people see it as important. The culture should be such that people want to fix data issues immediately, because they understand what the consequences might be if the data is off. Another important aspect of the culture is that people feel responsible not just for their own report, solution or data silo, but also understand that data is an ecosystem for the company as a whole, and one that needs to be maintainable.

Another reason why data quality may be bad is a culture that focuses too much on delivering value quickly. Delivering solutions fast is a good thing, but it may also lead to a swamp of inconsistent Power BI reports and a lack of integration. Ideally, you deliver a prototype fast when needed, and replace the prototype with an integrated solution afterwards. But try explaining to business users that you are going to spend another two weeks on delivering the same Power BI report, but now "integrated". This is why the culture needs to be as company-wide as possible.

Creating a data-driven culture can be difficult. You may need to introduce onboarding training for employees before they can access the databases or self-service BI tools. Business intelligence is often seen as the domain of IT and technical people. But "business" is in the very name of BI, and the business is what it is all about. Most courses and training focus on the technical aspects, but onboarding is maybe even more important: how do we use data in this company? And that is the business side of things.

Creating a data-driven culture with an emphasis on data quality is essential if you want to be truly successful. It is also key to maintaining the balance between self-service BI reports and enterprise reporting. That balance is a matter of people making agreements on how to handle the difference: When do we use which data source? How are we going to communicate? How are we going to share data?

There is no single right strategy for everyone. Any strategy can work, as long as people follow it and feel responsible. And that is not something a tool can accomplish.

Principal BI consultant at Rubicon
