Solving the ‘Events in Progress’ problem in MDX, Part 1
I wouldn’t admit to this in public (at least to an audience of normal people, rather than the Microsoft BI geeks reading here), but over the last few years I’ve been obsessed with solving the ‘events in progress’ problem in SSAS and MDX. I’ve tackled it successfully in PowerPivot and DAX (see here and here) but I always thought there was no good solution in SSAS, and certainly other people such as Richard Tkachuk who have written about the same issue have come to a similar conclusion. But… last week I found one! So in this blog post I’ll outline the basic approach, and in my next post I’ll detail an even better, if more complex to implement, solution.
The two blog posts linked to above describe the problem in detail, but here’s a quick overview. In the Internet Sales fact table in Adventure Works each row represents a line item on an invoice, and each line item has an Order Date (ie the date the order was received) and a Ship Date (ie the date the order was shipped to the customer). We would like to know, on any given date, how many invoice line items are part of orders that have been received but not yet shipped and so have an Order Date before that date and a Ship Date after that date.
The key to being able to solve this problem in SSAS and MDX is how you model the data – my previous difficulties were down to the fact that I had the obvious way of modelling it, with one measure group having regular relationships with an Order Date and a Ship Date dimension, fixed in my head. However if you take a completely different approach the problem becomes easy, and here’s my worked solution using the Adventure Works data:
First of all I created my cube with one Date dimension and two measure groups, both based on the Internet Sales fact table:
Since BIDS won’t let you build two measure groups from the same table in the DSV, I created a named query that duplicated the Internet Sales fact table and used that as the basis of the Ship Dates measure group. Both the Order Dates Count and Ship Dates Count have AggregateFunction set to Count, and the Date dimension joins to Order Dates on OrderDateKey and Ship Dates on ShipDateKey.
I then created the following calculated measures:
CREATE MEMBER CURRENTCUBE.MEASURES.OrderedToDate AS
([Measures].[Order Dates Count]));
CREATE MEMBER CURRENTCUBE.MEASURES.ShippedToPrevDate AS
([Measures].[Ship Dates Count]));
CREATE MEMBER CURRENTCUBE.MEASURES.OrderedNotShipped AS
MEASURES.OrderedToDate – MEASURES.ShippedToPrevDate;
The first, OrderedToDate, gives the total number of line items ordered from the beginning of time up to the current date. The second, ShippedToPrevDate, gives the number of line items shipped from the beginning of time up to the date before the current date. If we subtract the second value from the first, as we do in the OrderedNotShipped measure, we can therefore find the number of line items have been ordered but not yet shipped on any given date:
And it’s pretty fast, too – certainly much faster than any other solution I’ve tried before in SSAS. There’s one obvious catch here though, and that is that you have to load the data from your fact table into your cube twice. What if you have a really large fact table, and doubling your processing time and storage in this way is not an option? Well, stay tuned for part two when I’ll show you a trick to avoid doing this and only use one measure group.