Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Solving the ‘Events in Progress’ problem in MDX, Part 1

with 29 comments

I wouldn’t admit to this in public (at least to an audience of normal people, rather than the Microsoft BI geeks reading here), but over the last few years I’ve been obsessed with solving the ‘events in progress’ problem in SSAS and MDX. I’ve tackled it successfully in PowerPivot and DAX (see here and here) but I always thought there was no good solution in SSAS, and certainly other people such as Richard Tkachuk who have written about the same issue have come to a similar conclusion. But… last week I found one! So in this blog post I’ll outline the basic approach, and in my next post I’ll detail an even better, if more complex to implement, solution.

The two blog posts linked to above describe the problem in detail, but here’s a quick overview. In the Internet Sales fact table in Adventure Works each row represents a line item on an invoice, and each line item has an Order Date (ie the date the order was received) and a Ship Date (ie the date the order was shipped to the customer). We would like to know, on any given date, how many invoice line items are part of orders that have been received but not yet shipped and so have an Order Date before that date and a Ship Date after that date.

The key to being able to solve this problem in SSAS and MDX is how you model the data – my previous difficulties were down to the fact that I had the obvious way of modelling it, with one measure group having regular relationships with an Order Date and a Ship Date dimension, fixed in my head. However if you take a completely different approach the problem becomes easy, and here’s my worked solution using the Adventure Works data:

First of all I created my cube with one Date dimension and two measure groups, both based on the Internet Sales fact table:

image

image

Since BIDS won’t let you build two measure groups from the same table in the DSV, I created a named query that duplicated the Internet Sales fact table and used that as the basis of the Ship Dates measure group. Both the Order Dates Count and Ship Dates Count have AggregateFunction set to Count, and the Date dimension joins to Order Dates on OrderDateKey and Ship Dates on ShipDateKey.

I then created the following calculated measures:

CREATE MEMBER CURRENTCUBE.MEASURES.OrderedToDate AS
    SUM(NULL: [Date].[Calendar].CURRENTMEMBER,
        ([Measures].[Order Dates Count]));

CREATE MEMBER CURRENTCUBE.MEASURES.ShippedToPrevDate AS
    SUM(NULL: [Date].[Calendar].CURRENTMEMBER.PREVMEMBER,
        ([Measures].[Ship Dates Count]));

CREATE MEMBER CURRENTCUBE.MEASURES.OrderedNotShipped AS
MEASURES.OrderedToDate – MEASURES.ShippedToPrevDate;

The first, OrderedToDate, gives the total number of line items ordered from the beginning of time up to the current date. The second, ShippedToPrevDate, gives the number of line items shipped from the beginning of time up to the date before the current date. If we subtract the second value from the first, as we do in the OrderedNotShipped measure, we can therefore find the number of line items have been ordered but not yet shipped on any given date:

image

image

And it’s pretty fast, too – certainly much faster than any other solution I’ve tried before in SSAS. There’s one obvious catch here though, and that is that you have to load the data from your fact table into your cube twice. What if you have a really large fact table, and doubling your processing time and storage in this way is not an option? Well, stay tuned for part two when I’ll show you a trick to avoid doing this and only use one measure group.

Written by Chris Webb

January 21, 2011 at 12:00 am

Posted in Analysis Services, MDX

29 Responses

Subscribe to comments with RSS.

  1. Well done!
    After all, also you will join us in the “Model first” mantra…

    Marco Russo

    January 21, 2011 at 12:07 am

  2. When facing a similar problem concerning student enrollment counts based on a start and end date, we were forced to pre-aggrigate on each possible day in a cube to accomplish this. Our approach really bloated the cube, I’m thinking this could really help with some performance issues, Thanks and looking forward to part 2!

    Trev Swarm

    January 21, 2011 at 2:31 am

  3. Nice. Lots of problems get solved eeasier in the model than in MDX :)

    Can you add one more dimension with two members – Shipped and Ordered and add one extra row for each order – when it ships? Then you can have one date dimension too. Of course you get more fact rows, but less measures and dimensions (dimensions as opposed to the original design). Just thinking here – haven’t actually tried it….

    Boyan Penev

    January 21, 2011 at 6:42 am

  4. [...] my last post I described a simple solution to the ‘events in progress’ problem in SSAS and MDX, which [...]

  5. [...] solution is a variation on the ‘events in progress’ model that I blogged about recently but with an interesting twist that I thought was worth describing. First of all, we [...]

  6. I have been trying to use this solution as model for a problem I am having. I have a table with transactions that have start and end dates and a customer ID. I want to know how many distinct customers are active during any time period. I can use your really cool technique to know how many transactions are active, but the distinct customer part is throwing me for a loop. Any ideas?

    Brad G

    May 5, 2011 at 9:42 pm

    • Actually, this should be much easier than the technique I describe here. You just need to create a distinct count measure on your customer id column, and then in the where clause of your query include a set containing the date range you want to look at. Here’s an example from Adventure Works:

      select {[Measures].[Customer Count]} on 0,
      [Product].[Category].members on 1
      from
      [Adventure Works]
      where({[Date].[Calendar Year].&[2002]:[Date].[Calendar Year].&[2004]})

      Chris Webb

      May 5, 2011 at 9:55 pm

      • I was wanting the user to be able to pivot (using Excel or something similar) using any part of the date hierarchy.

        Brad G

        May 5, 2011 at 9:59 pm

      • You can always put your date hierarchy on the filter and multiselect the date range

        Chris Webb

        May 5, 2011 at 10:02 pm

  7. Sorry for cluttering up your blog. I guess I am just missing a step. I was looking to your solution here to figure out how many of those transactions were active across multiple periods (break down by months or quarters). It works great to know how many transactions are active, but I want to know how many of my customers have active transactions I get lost. I do not want to double count customers who have more than one transaction active during any period. Thanks for all the quick replies. I really enjoy this blog.

    Brad G

    May 5, 2011 at 10:07 pm

    • No problem. They key is to use a distinct count measure – you don’t need a second measure group or anything else – and that will avoid double counting customers.

      Chris Webb

      May 5, 2011 at 10:13 pm

  8. Let me ask it this way and see if you give the same answer. In your blog example, you can quickly figure out how many items have been ordered and not sent. How would you modify it to find out how many customers have items that have been ordered but not sent?

    Thanks again for the responses. I feel like I should buy a second copy of your books now :)

    Brad G

    May 5, 2011 at 11:03 pm

    • Ah, I see. Unfortunately I don’t have an answer though… I’ll think about it!

      Chris Webb

      May 5, 2011 at 11:07 pm

      • Hi Chris,
        Great approach to counting events-in-progress. I have the same situation as Brad G. I need to do a distinct count and show it in the Excel based on StartDate and End Date. Using the MDX query and specifying the range in the Where clause works well and is fast, but how to create a calculated member?

        Brad, did you solve your problem? Appreciate a response.
        Artur Anbild

        Artur Anbild

        December 15, 2011 at 9:07 pm

      • Hi Chris, I have the same problem as Brad. The distinct count won’t work with this approach. Is there a good way to handle it yet? Thanks a lot.

        Peter Y

        March 29, 2013 at 8:29 pm

      • Sorry, I haven’t got round to thinking about it properly – it’s not straightforward, for sure. You could certainly do it with sets similar to the approach here: http://cwebbbi.wordpress.com/2010/10/08/counting-new-and-returning-customers-in-mdx/ but if you have a lot of customers, performance might not be great.

        Chris Webb

        March 29, 2013 at 10:12 pm

  9. I had a problem with LinkMember function applied on role playing dimensions(Date) when that role playing dimension was related with another dimension by many-to-many relationship.Besides performance issues related with the LinkMember function the results were not accurate.
    So,I decided create a solution without using LinkMember based on yours solution with a little trick, instead of creating a named query I create a distinct count measure on the same fact table wich automatically creates a new measure group and after i changed the aggregation to Count.
    The solution stays more cleaner, but obviously the catch remains.
    Meanwhile, I read the Part II and i’m considering try it.

    Thanks

    jorg

    September 1, 2011 at 6:24 pm

  10. [...] basic approach is similar to the one I describe here. Using the same Adventure Works data, I can load the DimDate and FactInternetSales tables into [...]

  11. What a nice design trick, it looks exactly what we need for our student enrollment problem. We tried to put deltas into the fact table and that worked fine for the counts, but the dimensions could not be properly utilized so we couldn’t slice properly. This looks like it will solve that problem. Thanks for the article!

    Zoran Stanisic

    December 10, 2011 at 11:37 pm

  12. how are the measures order dates count and sales dates count created? what aggragate, table and attributes are being used

    SG

    April 6, 2012 at 8:31 am

    • The details are in the post: they’re created from the FactInternetSales table with aggregate function Count.

      Chris Webb

      April 6, 2012 at 9:13 pm

  13. […] take the events-in-progress problem, one that I have blogged about many times in the past (see here, here, here and here for example) and see how we can solve it in Power Query. It’s a very common […]

  14. […] take the events-in-progress problem, one that I have blogged about many times in the past (see here, here, here and here for example) and see how we can solve it in Power Query. It’s a very common […]

  15. […] if you were to do a total-to-date in MDX (similar to what I describe here) it would be pretty easy to calculate the outstanding amount on all invoices on any given date. […]

  16. […] if you were to do a total-to-date in MDX (similar to what I describe here) it would be pretty easy to calculate the outstanding amount on all invoices on any given date. […]


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 3,083 other followers

%d bloggers like this: