Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Solving the ‘Events in Progress’ Problem in DAX, Part Two: Aggregating Dates and Multiselect

with 9 comments

You may recall that by the end of last week’s post on solving the ‘events in progress’ problem in DAX, I’d managed to get the right values out of PowerPivot at the Date level with good performance. BUT when, for example, I selected a Calendar Year I got no values out, and I hadn’t even thought of how to handle situations such as when Date was on filter and the user had selected three different, non-consecutive dates. Well, that was last week when I was a DAX newbie – this week, well, I’m slightly less of a newbie and I’ve got the solution.

First, let’s restate the problem a bit more clearly. What I want is this: when I have a date on columns or rows, I want my calculated measure to display the number of rows in the FactInternetSales table that have an OrderDate less than or equal to that date, and have a ShipDate that is greater than or equal to that date. This way I see the number of rows in the table where something has been ordered but not yet shipped. When I select a year, I want to see the number of rows where that OrderDate-ShipDate range intersects with the dates in that year. So if I selected the year 2002, I want all the rows where either the OrderDate is in 2002, or the ShipDate is in 2002, or when the OrderDate is before 2002 and the ShipDate is after 2002. Finally if I make a multiselect when Date is on a filter, for example if I multiselect the 1st January 2002, the 5th of January 2002 and the 10th January 2002, I want to count the number of rows where the OrderDate-ShipDate range includes any of those three dates.

I spent several hours thinking how I could do this in DAX using just the columns I had and I came up with the following formula, which gave me the results I was expecting… when it ever returned, because it was horrendously slow:

=CALCULATE(
COUNTROWS(‘FactInternetSales’)
, FILTER(‘FactInternetSales’
, COUNTROWS(
FILTER(
VALUES(‘DimDate’[DateKey]),
(‘FactInternetSales’[OrderDateKey]<=’DimDate’[DateKey])
&&
(‘FactInternetSales’[ShipDateKey]>=’DimDate’[DateKey])
))>0))

It’s using the Calculate function to evaluate the expression CountRows(FactInternetSales); what it’s trying to do in the outermost filter is to filter all of FactInternetSales, and find all the rows where there is at least one Date in the current selection that is between the OrderDateKey and the ShipDateKey. But, as we learned last week, doing a filter on all the columns in a table is very inefficient, and what we actually want to do is to filter on just the columns we’re interested in: OrderDateKey and ShipDateKey. However what I found was that I really needed all of the distinct combinations of OrderDateKey and ShipDateKey, but I could only get either all of the columns in a table using the table name, or the distinct set of values in a single column using the Values() function. Therefore I couldn’t actually filter on the distinct combinations of OrderDateKey and ShipDateKey where OrderDateKey>=DateKey>=ShipDateKey. This seems like a limitation of DAX from what I can see, though it might turn out that there is a way of doing it – if there is I will of course blog about it.

Anyway, the solution was to create a new column that did contain the distinct combinations of OrderDateKey and ShipDateKey. I went into the PowerPivot designer and created a new column called OrderShip on FactInternetSales with the following expression:

=CONCATENATE(‘FactInternetSales’[OrderDateKey],’FactInternetSales’[ShipDateKey])

As you can see, it simply concatenates the OrderDateKey and ShipDateKey values into a 16 character string:

image

I was then able to use that column in a filter as follows, extracting the OrderDateKey and the ShipDateKey from the string using the Left and Right functions, and turning them back into numbers using the Value function:

=Calculate(
CountRows(FactInternetSales)
, FILTER(
VALUES(‘FactInternetSales’[OrderShip])
, COUNTROWS(
FILTER(
VALUES(‘DimDate’[DateKey]),
(‘DimDate’[DateKey]>=VALUE(LEFT(‘FactInternetSales’[OrderShip], 8)))
&&
(‘DimDate’[DateKey]<=VALUE(RIGHT(‘FactInternetSales’[OrderShip],8)))
))>0))

The logic is the same as with the previous expression, with the all-important exception that the outer Filter only works on the distinct values of OrderShip. It works and it’s fast, so I’m relatively happy although I would like to find a more elegant solution.

It not only gives the correct values for individual dates:

image

but also for Years (and other aggregations of Dates – also notice how the sum of the years is greater than the same as the Grand Total, because of course an order can be in progress in two different years and so needs to be counted in each):

image

and also for multiselect (where again the value is not the sum of the values of the three individual dates):

image

image

I think this illustrates one area where DAX is superior to MDX. MDX is notoriously bad at handling multiselect, but as Marius explained to me in DAX multiselect support is built-in: the Values() function is rather like a CurrentSet function, returning the set of all selected (‘existing’ in MDX terms) values in a column in a particular context.

So, another problem solved… time to think up the next challenge!

Written by Chris Webb

January 12, 2010 at 10:47 pm

Posted in DAX

9 Responses

Subscribe to comments with RSS.

  1. BTW I opened a Connect about the Values() function taking multiple columns as a parameter – vote here please:https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=524952

    Chris

    January 14, 2010 at 4:57 pm

  2. What is the equivalent of Excel Row() function in DAX.If I have a Data in PowerPivot and I just want to add a calculated column called say ID which is 1,2,3…as many as there are rows how can this be done with DAX

    sameer

    January 15, 2010 at 1:35 pm

  3. I guess there isn\’t a concept of Row() in DAX, for the same reason that there\’s no similar concept in relational databases – you can\’t have a row number until data is ordered in some way. What you can do, however, is find the rank of a row based on some particular ordering; this thread on the forums shows how:http://social.msdn.microsoft.com/Forums/en/sqlkjpowerpivotforexcel/thread/ab96f44a-4d0a-4ee5-9d05-0f99ea4e213b

    Chris

    January 18, 2010 at 11:04 am

  4. [...] problem in SSAS and MDX. I’ve tackled it successfully in PowerPivot and DAX (see here and here) but I always thought there was no good solution in SSAS, and certainly other people such as [...]

  5. [...] the 5th or 6th post I’ve written on this problem (most deal with MDX, but I did blog about solving it in DAX early last year) but what can I say – it’s an interesting problem! I came across it at work [...]

  6. Hi Chris,

    You assisted me in a previous incarnation of this in MDX, which worked very well as an aside! I’m trying to resolve the same thing in DAX now, with the scenario below to calculate Payment Days

    Measure – Outstanding Amount = 1000

    We then use a reverse running sum method to count back the days of payments taken to reach the current outstanding amount, eg:

    Date Outstanding Amount Payment Amount Payment Days
    1/1/2013 20
    2/1/2013 30
    3/1/2013 50 30 2
    4/1/2013 1000 40 3

    In the above example, the PaymentDays value on the 4/1/2013 would be 3, as it took 3 days to reach the outstanding amount of 1000 (40 + 30 + 30). On the 3/1/2013 the value would be 2, as it only took two days of payments to reach/exceed the value (30 + 30 = 60) of 50.

    Can the above solution be modified to work in DAX?

    Thanks

    Mike

    mike dobing

    February 12, 2013 at 4:10 pm

    • Hi Mike,

      It sounds doable, although I’d have to think about it! BTW, rather than use the approach described in this post for solving the events-in-progress problem in DAX, I now prefer the approach I wrote about here: http://cwebbbi.wordpress.com/2011/11/10/solving-the-events-in-progress-problem-in-dax-v2-0/

      Chris

      Chris Webb

      February 12, 2013 at 4:21 pm

      • Thanks Chris, well if you have any brain waves on your mental schedule let me know :-) Thanks for update on the new approach, I’ll check it out also.

        Just when I thought I was confident on MDX, along comes DAX and it all starts again!

        Cheers

        Mike

        mike dobing

        February 12, 2013 at 5:10 pm

  7. […] problem. I’ve blogged about it a lot of times, looking at solutions for MDX and DAX (see here and here), and for this project I had to do some performance tuning on a measure that uses a filter […]


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 2,867 other followers

%d bloggers like this: