With Continuous Integration working for my Data Warehousing solution (SQL Server, SSAS Tabular, SSIS), it's time to step forward. Before moving to Continuous Deployment, I want to have a rigorous and automated testing on my EDW. But how to get there?
As you might know, I'm keeping an eye on how I can automate my Data Warehousing testing. In "Automated Testing for the Data Warehouse" my observation was
In essence, testing a Data Warehouse isn't that complex
It turns out I'm not the only one thinking about this (really? 😉 ). Some time ago I listened to episode 72 from SQL Data Partners podcast, starring Lynn Winterboer. Really deep insight in the requirements for test-driven Data Warehouse work there, I definitely recommend listening the full podcast
Wait, didn't I post this already several weeks ago? Well, almost. A few weeks ago, I showed how to set up a build agent using devenv.com. Unfortunately, I ran into some problems like failing builds not reporting failure and SSAS Tabular projects not building correctly. However, it turns out to be pretty easy to build almost everything using MSBuild. Here's how.
In my earlier post "Automated Deployments using Visual Studio" I metioned that the method described was a workaround because I hadn't figured out how to do a continuous build in VSTS (Visual Studio Team Services) yet. With hindsight, that workaround was not really needed: VSTS build turns out to be só easy, that it's ridiculous to try anything else. If you, like me until yesterday, don't know where to start to set up continuous build within VSTS, you're lucky. Below is a step-by-step explanation to setup your automated build for VSTS.
In my earlier post "Automated Testing for the Data Warehouse", I sketched the outlines of what would be needed in order to achieve automated testing for your Data Warehouse solutions. Today, I want to look at the first step: build & deploy. Between the previous post and the current one, some useful content about this has been written already by Jens Vestergaard - he even uses VSTS to do his builds, something I still have to look into. Meanwhile, here's my method of acquiring the latest sources & building them using Visual Studio.
Case: we've integrated two sources of customers. We want to add a third source.
Q: How do we at the same time know that our current integration and solutions will continue to work while at the same time integrating the new sources?
A: Test it.
Q: How do we get faster deployments and more stability?
A: Automate the tests, so they can run continuously.
When integrating data, especially in agile environments, our already-integrated data is very likely to get some more integration. So WHY does automated testing happen so rarely within Data Warehouse projects?
Last week, I had an interesting question from one of my clients about estimates of planned commute distances. For example: What is the distance I travel when I, living at Kalverstraat 7 (Amsterdam) commute five days a week to my customer in Paris (1, rue Victor Cousin)? By the way: it should be calculated for all employees, for all customer locations, so manually entering distances is not considered an option.
In one of my projects, I had two simultaneous sequence containers. One is executed always, and one only rarely. So I decided to control the Task Flow with a package parameter. As soon as there was task I needed to do once all paths were completed, I ran into SSIS behaviour that was (to me) unexpected.
Although you can hardly ignore BIML (with bloggers like Cathrine Wilhelmsen and Bill Fellows sharing good stuff), there are still other options for package generation in SSIS. While enjoying a small conversation on LinkedIn about the automation options, I suddenly remembered still having a slidedeck from a presentation I gave early 2014.
Please note that these are my early views on Biml, dating from end-2013. Please do not base any decisions on this: things are outdated (BimlExpress now provides free syntax highlighting for BimlScript inside Visual Studio; BimlT files can alter existing packages) and technology descriptions may not be 100% accurate (nothing confirmed by Varigence)
The presentation was about SSIS 2012, so not everything will work the same way in SSIS 2014+. For example: from what I've heard, EzAPI doesn't have support for SQL Server 2014+. I'm planning to write an update shortly, but meanwhile this might prove useful for gathering a high-level overview.