In my earlier post "Automated Testing for the Data Warehouse", I sketched the outlines of what's needed to achieve automated testing for your Data Warehouse solutions. Today, I want to look at the first step: build & deploy. Since that post, Jens Vestergaard has already written some useful content on this - he even uses VSTS to do his builds, something I still have to look into. Meanwhile, here's my method of acquiring the latest sources & building them using Visual Studio.
As you might remember, I ran a few posts covering Azure IoT Hub early this year:
Last Saturday, I gave a talk at SQL Saturday in the Netherlands covering my experiences. If you're interested in my presentation, the slide deck is on the SQLPass website now: http://www.sqlsaturday.com/551/Sessions/Details.aspx?sid=51033.
Also, I got some nice inspiration there for new posts and developments, which I plan to blog about soon. Something IoT-y, and some other things about automated testing, deployments and CI:
Working with the newest technologies is both great and frustrating: I get to create solutions I've never made before, while complaining about outdated or incomplete documentation.
I had one of those feelings while working with Azure Stream Analytics (ASA). My solution worked, but there was one 'elementary and simple' thing I wanted: to start the ASA jobs from within my C# code. That shouldn't be hard, and there's some documentation. But no: I had to combine several conflicting solutions into a new one to make it possible. Continue reading...
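The post itself solves this in C#, but underneath, starting an ASA job comes down to a single Azure Resource Manager REST call. As a hedged, language-neutral sketch (the subscription, resource group and job names below are placeholders, and you still need a valid AAD bearer token), it could look like this:

```python
import urllib.request

def build_start_url(subscription_id, resource_group, job_name,
                    api_version="2015-10-01"):
    """Construct the ARM endpoint that starts a Stream Analytics job."""
    return (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.StreamAnalytics"
        f"/streamingjobs/{job_name}/start"
        f"?api-version={api_version}"
    )

def start_job(url, bearer_token):
    """POST to the start endpoint, authenticated with an AAD bearer token."""
    request = urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": "Bearer " + bearer_token},
    )
    return urllib.request.urlopen(request)
```

Acquiring the bearer token (via Azure Active Directory) is exactly the part where the C# post had to combine several solutions, so treat this as an outline of the call, not a drop-in replacement.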
Last week, we trained an xgboost model for our dataset inside R. In order to use your trained model in Azure ML, you need to export & upload it, much like we did two weeks ago in Python. Today, I'll show how to import the trained R model into Azure ML Studio, thus enabling you to use xgboost in Azure ML Studio. If you combine last week's knowledge of training xgboost models with today's knowledge of importing them into Azure ML Studio, it's not too hard to climb the leaderboard of the (still ongoing) WHRA challenge!
Case: we've integrated two sources of customers. We want to add a third source.
Q: How do we know that our current integration and solutions will continue to work, while at the same time integrating the new source?
A: Test it.
Q: How do we get faster deployments and more stability?
A: Automate the tests, so they can run continuously.
When integrating data, especially in agile environments, our already-integrated data is very likely to get some more integration. So WHY does automated testing happen so rarely within Data Warehouse projects?
In previous Azure ML Thursdays we explored how to do our Machine Learning in Python. Python in Azure ML doesn't include one particularly successful algorithm though - xgboost. Python packages for xgboost are available, just not yet for Windows - which means not inside Azure ML Studio either. But it is available inside R! Today, we take the same approach as two weeks ago: first, we move out of Azure ML to do our first ML in R; then (next week) we'll upload and use our trained R model inside Azure ML Studio.
Today, I'll show you how to use xgboost in the still-ongoing Cortana Intelligence Competition "Women's Health Risk Assessment" (WHRA). At the moment of writing, the leaderboard has stayed the same for over three weeks, with only 336 participants - but the competition ends in a week, with a grand prize of $3,000.
So rush to participate, and use the knowledge shared here to win - all code presented below can be run in order and will result in a trained model for the WHRA dataset!
In this fourth post of the Azure ML Thursday series, we move our ML solution out of Azure ML and take our first steps in Python with scikit-learn. Today, we look at using "just" Python for doing ML; next week, we bring the trained models to Azure ML. You'll notice there's a lot more to tweak and improve once you do your Machine Learning here! ML in Python is quite a large topic, so many subjects will only be touched on lightly. Nonetheless, I try to give just enough samples and basics to get your first ML models running!
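As a taste of those first steps, here is a minimal scikit-learn sketch. The dataset and model choice are mine for illustration (the post itself works with the WHRA competition data), but the train/fit/score rhythm is the same:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a toy dataset; in the post, this would be the WHRA data instead.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data to evaluate on.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Train a first model - the constructor arguments are exactly the kind
# of knobs you get to tweak once you leave Azure ML Studio.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Score the trained model on the held-out data.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"held-out accuracy: {accuracy:.2f}")
```

Swapping `RandomForestClassifier` for any other scikit-learn estimator leaves the rest of the script untouched - that uniform fit/predict interface is a big part of why Python ML feels so tweakable.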