Monthly Archives: May 2013

Dimensional Models are like Tattoos

So in the comments on a recent post on Risk Driven Architecture, Jamie Thomson asked whether the problems associated with change can be mitigated by using views. I firmly believe that views can help but unfortunately not enough to save you from clients that connect directly with Analysis Services cubes.

So it got me thinking about a similar mitigation for cubes. Unfortunately nothing came to mind apart from an analogy:

Dimensional models are like tattoos – you have to live with them for a long time

Why you might ask? Well you can add to them, maybe fill in some extra colour but basically once you’ve committed to you are stuck with them because every spread sheet and report using your model will need fixing if you try to remove something. Like tattoos, you can remove them but its going to be painful and cost a lot of money.

I don’t have any tattoos (not because I don’t like them, I just can’t decide on one that I’d have to live with for so long). However I’ve heard plenty of guidance about taking your time before committing – one of the best techniques is to simply draw your new tattoo with a Sharpie and try it on for size for a while.

How does this help with dimensional models? Well the same techniques apply. Try a new model on for size, especially if you can arrange it for the new model to fade like the Sharpie as time passes which automatically limits client usage. Maybe process the cube manually for a while – your users will soon tell you if the data is useful. This fits with an agile approach too - only put measures and attributes in the cube if you need them and don’t add stuff in the hope that it will be used productively.

This entry was posted in business-intelligence and tagged #dimensional-model on May 26, 2013 .
Discuss this on Twitter or LinkedIn

Mental Health Dimensional Model

For this week’s post I want to continue the sample solution. Even though I’m going to be as agile as possible we still need to have a rough idea of a roadmap and the best way to do that is with a dimensional model.

Each business process we want to model is represented as a fact on columns. They are all to be stored at the transactional grain except possibly admissions. The conformed dimensions are listed on rows with the facts they are related to.

	Referral	Assessment	Treatment	Discharge	Complaint	Incident	Admission
Date	•	•	•	•	•	•	•
Diagnosis		•	•
Health Professional		•	•		•	•	•
Patient	•	•	•	•	•	•	•
Referrer	•
Service (Tier)	•		•	•			•
Time	•		•	•		•	•
Treatment Outcome			•
Clinic		•			•	•	•

It is interesting to note that this is a very patient focused model since that dimension is related to every fact. There are some unanswered questions within the model though:

How do we represent treatment outcomes? Is there a standard method? Can this be represented in a single dimension?
What grain are admissions? Given the goal of calculating ‘bed days’ we might need to model them as events in progress.

I think we have enough to make a start and I don’t think we will deliver faster if we stop to resolve these issues first. Initially I’m going to concentrate on referrals, assessments and discharges since the number of patients in the system is one of the most useful metrics to monitor.

This entry was posted in sample-solution and tagged #dimensional-model #mental-health #mental-health-project on May 18, 2013 .
Discuss this on Twitter or LinkedIn

Acronyms in Dimensional Models
Whilst researching the previous article I came across this link on Acronyms and Ubiquitous Language. It is well worth reading as everything discussed also applies to dimensional models. There is one quote that I want to reprint from the .NET Framework General Naming Conventions:

Do not use any acronyms that are not widely accepted, and then only when necessary.

Your business users should be able to point Excel (or whatever tool you are using) at a dimensional model and intuitively know what the measures, dimensions and attributes are because they describe the business your users work in. Since acronyms obfuscate meaning they don’t belong in dimensional models.

The only time I generally relax this rule is when both the following are true:

All business users know the meaning of the acronym

The expanded version is so long that it becomes unwieldy
This entry was posted in business-intelligence and tagged #dimensional-model #domain-driven-design #naming-conventions #ubiquitous-language on May 10, 2013 .
Discuss this on Twitter or LinkedIn
Applying Domain Driven Design to Data Warehouses

My commute is around two and a half hours each way so I read a lot on the train. One of the subjects I’ve recently become interested in Domain Driven Design or DDD. I’ve found it isn’t really a new topic for me but more like someone has documented many of the techniques I’ve always used.

DDD discusses data warehouses primarily as an output or reporting function within a larger application. The book Implementing Domain-Driven Design by Vaughn Vernon mentions reporting repeatedly as a by-product of DDD (particularly when used with Event Sourcing) but not directly as a possible use case.

I would agree that not all concepts can be reused in data warehouse solutions since the only interface available is often one that transfers a set of mutations (property value changes) without the accompanying reasons (it is a key idea in DDD that you need to design your model not by changes in attributes but by operations performed on entities). For example an order count has decreased and the reason is missing – was the order returned, cancelled, an error etc. So where can it be applied? Are any of the concepts useful when designing data warehouses?

Ubiquitous Language

Lets start with one of the core concepts – Ubiquitous Language is a rigorous shared language used between developers and users. It is used to make sure that conversations are accurate and productive. It should evolve as the team’s understanding of a domain changes. The ubiquitous language is what forms the domain model at the heart of a software solution.

I find this description very similar to an equivalent concept in data warehousing – the Dimensional Model. This model, and its associated dimension bus matrix, is based on real business processes and terminology. The dimensional model is the public face of the data warehouse. It needs to be precise, reflect the terms used by business users and form a common vocabulary between users and the development team. For example when browsing an Analysis Services cube in Excel, the dimensions and facts defined in the dimensional model are directly visible to end users – if they don’t automatically understand what is on-screen then the model doesn’t describe the business.

Entities and Value Objects

There are two types of object in domain driven design – entities and value objects. Value objects are immutable and identified via their attributes. For example $100 USD in one object is the interchangeable with $100 USD in another. Entities on the other had cannot be identified purely by their attributes – there must be some sort of unique identifier (in data warehouse terms this is a business key) to differentiate similar entities. For example, one John Smith may not be the same as another and need a Customer-Id to differentiate the two.

With respect to dimensional models, value objects should not be implemented as top-level dimensions but instead be added at sets of attributes to the entities which own them. For example ‘Product Colour’ is a value type (colour) and should belong in the ‘Product’ dimension. This seems obvious when written this way but happens a lot.

Entities and Aggregate Roots

DDD groups sets of closely related entities under the control of a single ‘Aggregate root’. Some entities make no sense unless the parent entity is also within context; order lines and parent orders are the typical example.

So value objects shouldn’t be dimensions and I don’t think ordinary entities should be either. True dimensions are the aggregate roots and the one thing that seals it for me is that an aggregate root (according to DDD) defines a transactional boundary – you should not update multiple aggregate roots within a single transaction; instead sagas keep your data warehouse in sync (eventually).

Finally

Domain driven design and business intelligence share a number of common concepts yet the two philosophies are rarely seen as related. I think there is a lot to be gained by applying software concepts from different viewpoints which may not ordinarily be considered.

This entry was posted in business-intelligence and tagged #data-warehouse #domain-driven-design on May 3, 2013 .
Discuss this on Twitter or LinkedIn

Monthly Archives: May 2013

Dimensional Models are like Tattoos

Mental Health Dimensional Model

Acronyms in Dimensional Models

Applying Domain Driven Design to Data Warehouses

Ubiquitous Language

Entities and Value Objects

Entities and Aggregate Roots

Finally