Better software through software architecture and devops

@jamessnape

Posts

  • Tableau Visualization

    This week has been dominated by the Tableau Customer Conference. It was sold out, so I was fortunate to get a ticket: one of our architects couldn’t go and I filled in. I’m glad I did.

    It’s been a while since I got to learn about a completely new technology so it is a refreshing change to be a bit of a novice. After a number of Microsoft conferences this one felt quite different too – less geeky with a more mixed crowd. It was interesting to be able to talk with non-technical types such as data analysts, business managers and statisticians.

    I mainly went to the technical sessions but a couple of the keynote sessions were really interesting. Firstly, ‘Creating a culture of data at Facebook’ gave some useful ideas about creating communities and getting more staff comfortable with visualizations. It was also nice to listen to a blogger I’ve read for a while (but only just discovered worked for Facebook). The second was by Prof. Hans Rosling. I’ve seen his TED talk but seeing him in person was completely different – probably because he was talking to a room full of data visualization professionals. He had plenty of anecdotes about how his famous visualizations came about. Ellie Fields gives a good description of his talk.

    So back to the day job now but with some new ideas about business intelligence and data visualization.

    This entry was posted in data-visualization  and tagged #conference #data-visualization #hans-rosling #tableau #tableau-customer-conference  on .
  • Last time we were looking at the mental health project I was discussing the dimensional model. I think it’s time to have a crack at some code now. But this first session is just about setting up my project.

    There are some key things every agile project should do:

    • Automated build with acceptance and unit tests
    • Automated code analysis
    • Automated deployment with integration tests

    Note everything is automated - it has to be repeatable and not need human intervention or it won’t get done. I’m a big fan of continuous integration and continuous deployment so I’m going to use TeamCity as a build service since it’s free for a single agent.

    TeamCity is a very configurable and powerful tool but I want to make sure that I can build and deploy from my local command line in exactly the same way that the TeamCity agent will. That makes debugging issues easier and allows developers to check the build works before committing.

    There are lots of build script tools around such as FinalBuilder but I prefer MSBuild since it’s readily available and uses a text format. Visual Studio uses MSBuild internally but we are not going to change project files; we are going to create a higher level script to tie everything together. Since this is a simple start it’s all going in one build file.

    https://gist.github.com/jsnape/5730292

    The build script is split into two main parts. At the top are property and item definitions – this is the build metadata controlling what gets built and how. Below that are Imports and Targets which deal with the mechanics of building. This split makes it easy to add new projects and settings without having to change your overall build script.

    There are four main targets: Clean, SourceAnalysis, Compile and Test. The last three together make up a build. It’s fairly self-explanatory but if you don’t know MSBuild script, think of anything in $() as a single value or variable and anything in @() as a list of items. Each target has a list of tasks which are executed in order to complete the target.

    So, this script is very simple; it just runs StyleCop over a set of source files, builds a Visual Studio solution and runs xUnit against a set of assemblies. It’s not much, but it gives us a single command line action to build and test the solution as we add features:

    PS> msbuild draco.proj

    This is then set up as a single step in TeamCity. Every check-in causes the build to run and tests to execute.

    The complete set of source for this project is available at https://github.com/jsnape/draco.

    This entry was posted in sample-solution  and tagged #build #build-automation #build-management #ci #continuous-integration #deployment  on .
  • Image: tattoo work by Keith Killingsworth, source http://commons.wikipedia.org/wiki/File:Tattoos.jpg

    So in the comments on a recent post on Risk Driven Architecture, Jamie Thomson asked whether the problems associated with change can be mitigated by using views. I firmly believe that views can help but unfortunately not enough to save you from clients that connect directly with Analysis Services cubes.

    So it got me thinking about a similar mitigation for cubes. Unfortunately nothing came to mind apart from an analogy:

    Dimensional models are like tattoos – you have to live with them for a long time

    Why, you might ask? Well, you can add to them, maybe fill in some extra colour, but basically once you’ve committed you are stuck with them because every spreadsheet and report using your model will need fixing if you try to remove something. Like tattoos, you can remove them but it’s going to be painful and cost a lot of money.

    I don’t have any tattoos (not because I don’t like them; I just can’t decide on one that I’d have to live with for so long). However, I’ve heard plenty of guidance about taking your time before committing – one of the best techniques is to simply draw your new tattoo with a Sharpie and try it on for size for a while.

    How does this help with dimensional models? Well, the same techniques apply. Try a new model on for size, especially if you can arrange for it to fade like the Sharpie as time passes, which automatically limits client usage. Maybe process the cube manually for a while – your users will soon tell you if the data is useful. This fits with an agile approach too - only put measures and attributes in the cube if you need them and don’t add stuff in the hope that it will be used productively.

    This entry was posted in business-intelligence  and tagged #dimensional-model  on .
  • For this week’s post I want to continue the sample solution. Even though I’m going to be as agile as possible we still need to have a rough idea of a roadmap and the best way to do that is with a dimensional model.

    Each business process we want to model is represented as a fact on columns. They are all to be stored at the transactional grain except possibly admissions. The conformed dimensions are listed on rows with the facts they are related to.

    Facts (columns): Referral, Assessment, Treatment, Discharge, Complaint, Incident, Admission

    Conformed dimensions (rows):

    • Date
    • Diagnosis
    • Health Professional
    • Patient
    • Referrer
    • Service (Tier)
    • Time
    • Treatment Outcome
    • Clinic

    It is interesting to note that this is a very patient focused model since that dimension is related to every fact. There are some unanswered questions within the model though:

    • How do we represent treatment outcomes? Is there a standard method? Can this be represented in a single dimension?
    • What grain are admissions? Given the goal of calculating ‘bed days’ we might need to model them as events in progress.
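
    One way to picture the ‘events in progress’ option is to store each admission with an admit date and an optional discharge date, then count the nights that fall inside a reporting window. The sketch below is only an illustration: the Admission class, its field names, the sample dates and the convention that the discharge day itself is not counted are all assumptions, not part of the project.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Admission:
    # Hypothetical fact row: one admission that may still be in progress.
    patient_id: int
    admitted: date
    discharged: Optional[date] = None  # None means the patient is still admitted


def bed_days(admission: Admission, start: date, end: date) -> int:
    """Count nights spent in a bed within the half-open window [start, end)."""
    # Treat an open admission as running to the end of the window.
    stay_end = admission.discharged or end
    # Clip the stay to the reporting window.
    first = max(admission.admitted, start)
    last = min(stay_end, end)
    # A stay entirely outside the window contributes nothing.
    return max((last - first).days, 0)


admissions = [
    Admission(1, date(2013, 5, 28), date(2013, 6, 3)),  # spans the month boundary
    Admission(2, date(2013, 6, 10)),                    # still in progress
]

june_total = sum(bed_days(a, date(2013, 6, 1), date(2013, 7, 1)) for a in admissions)
print(june_total)
```

    Storing one row per admission keeps the fact small; the cost is that bed days have to be calculated per reporting period rather than simply summed.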

    I think we have enough to make a start and I don’t think we will deliver faster if we stop to resolve these issues first. Initially I’m going to concentrate on referrals, assessments and discharges since the number of patients in the system is one of the most useful metrics to monitor.
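
    As a rough illustration of why referrals and discharges are enough to get started, the number of patients in the system on a given day can be derived from referral and discharge dates alone. This is a sketch under assumptions: the tuple layout, the sample dates and the rule that a patient counts as in the system from the referral date up to, but not including, the discharge date are all hypothetical.

```python
from datetime import date
from typing import List, Optional, Tuple

# Hypothetical rows: (patient_id, referral_date, discharge_date or None if still open).
Episode = Tuple[int, date, Optional[date]]


def patients_in_system(episodes: List[Episode], on: date) -> int:
    """Count distinct patients with an open episode on the given day."""
    open_patients = {
        patient_id
        for patient_id, referred, discharged in episodes
        if referred <= on and (discharged is None or on < discharged)
    }
    return len(open_patients)


episodes = [
    (1, date(2013, 6, 3), date(2013, 6, 20)),
    (2, date(2013, 6, 10), None),
    (3, date(2013, 6, 18), None),
]

print(patients_in_system(episodes, date(2013, 6, 19)))
```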

    This entry was posted in sample-solution  and tagged #dimensional-model #mental-health #mental-health-project  on .
  • Whilst researching the previous article I came across this link on Acronyms and Ubiquitous Language. It is well worth reading as everything discussed also applies to dimensional models. There is one quote that I want to reprint from the .NET Framework General Naming Conventions:

    Do not use any acronyms that are not widely accepted, and then only when necessary.

    Your business users should be able to point Excel (or whatever tool you are using) at a dimensional model and intuitively know what the measures, dimensions and attributes are because they describe the business your users work in. Since acronyms obfuscate meaning, they don’t belong in dimensional models.

    The only time I generally relax this rule is when both of the following are true:

    • All business users know the meaning of the acronym
    • The expanded version is so long that it becomes unwieldy
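
    The rule and its exception could be turned into a simple naming check over the model’s metadata. This is a hypothetical sketch: the find_acronyms helper, the allow-list and the sample attribute names are invented for illustration, and treating any all-capital word of two or more letters as an acronym is a deliberately crude assumption.

```python
import re
from typing import Iterable, List, Set

# Words made only of capitals (two or more letters) are treated as acronyms.
ACRONYM = re.compile(r"\b[A-Z]{2,}\b")


def find_acronyms(names: Iterable[str], allowed: Set[str]) -> List[str]:
    """Return the names that contain an acronym not on the allow-list."""
    flagged = []
    for name in names:
        unknown = [word for word in ACRONYM.findall(name) if word not in allowed]
        if unknown:
            flagged.append(name)
    return flagged


# Acronyms every business user is assumed to know (hypothetical allow-list).
allowed = {"NHS"}
names = ["Referral Date", "NHS Number", "LOS Days", "Treatment Outcome"]

print(find_acronyms(names, allowed))
```

    The allow-list is where the exception lives: an acronym only goes on it once every business user knows it and the expanded form is too unwieldy to use.
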
    This entry was posted in business-intelligence  and tagged #dimensional-model #domain-driven-design #naming-conventions #ubiquitous-language  on .