Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Time Intelligence Without Date Columns in DAX

with 5 comments

One of the things that annoys me a little bit in DAX is the way all the built-in time intelligence functionality relies so much on having columns of type Date. What happens if you don’t have a Date column to use? Many Time dimension tables that I see in data warehouses don’t: they use integers to represent dates instead, often in the form YYYYMMDD (so, for example, 20100525 represents the 25th of May 2010). And what happens if you need to do something time-intelligence-like on a non-time dimension? For example you might have a dimension representing different budgets, and you might want to show the difference between the current budget and the previous budget (I’ll come back to this example at the end of the post). There are no equivalents of MDX’s .PrevMember or .NextMember, or of any of the other hierarchical functions, because there are no hierarchies in PowerPivot.

Anyway, after some thought (and a lot of help from Tomislav and Jeffrey Wang from the dev team) I’ve found out it is possible to create time-intelligence calculations without dates. Let’s take a simple PowerPivot model based on Adventure Works DW, with the DimDate table joined to FactInternetSales on the OrderDateKey column. Now since there’s a column of type Date on DimDate, FullDateAlternateKey, we can use the time intelligence functionality to create a calculation that returns the previous year’s Sales Amount as follows:

=CALCULATE(SUM(FactInternetSales[SalesAmount])
    , DATEADD(DimDate[FullDateAlternateKey], -1, YEAR)
    , ALL(DimDate))

Here’s what a query using this calculation looks like:

[screenshot: the query results]

The above calculation works by returning the sum of SalesAmount for all the dates that are one year before the dates in the current context. So if you have the year 2003 on rows in your query, then for that cell the current context would contain all of the dates in the year 2003; these would then be shifted back one year and you’d get the sum of SalesAmount for all the dates in the year 2002. Incidentally, this explains why in Adventure Works you get strange values for the year 2004: it’s because in DimDate you only have dates from January 1st 2004 to August 31st 2004, so the calculation only returns the sum of SalesAmount for January 1st 2003 to August 31st 2003.
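As an aside, as far as I know SAMEPERIODLASTYEAR is just shorthand for a one-year DATEADD shift, so the same measure could also be written as follows (I haven’t compared the two versions here, so treat this purely as a variation):

=CALCULATE(SUM(FactInternetSales[SalesAmount])
    , SAMEPERIODLASTYEAR(DimDate[FullDateAlternateKey])
    , ALL(DimDate))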

Now, what would happen if we didn’t have the FullDateAlternateKey column and we wanted to do the same calculation? What we need to do is recreate this algorithm and shift our dates back one year without using the DateAdd function. The way to do it relies on the format of the keys used in the DimDate[DateKey] column, the YYYYMMDD format I mentioned before. If we have a date key like 20100525 and we want to find the equivalent date in the previous year, all we need to do is subtract 10000 from that key value, giving 20090525. In our expression, then, what we need to do is get the set of dates in the current context, for each of those dates return the date whose key value is 10000 less, and then sum up SalesAmount for those dates. Here’s the expression:

=CALCULATE(SUM(FactInternetSales[SalesAmount])
    , FILTER(ALL(DimDate[DateKey])
        , COUNTROWS(FILTER(VALUES(DimDate[DateKey])
            , DimDate[DateKey] - 10000 = EARLIER(DimDate[DateKey]))) > 0)
    , ALL(DimDate))

Notice the use of two nested Filter statements here, and the way the Earlier function is used to retrieve the value of DimDate[DateKey] from the outer row context. The outer Filter iterates over the set of all values of DimDate[DateKey], regardless of year or any other selection made, using the All function; for each of those keys, the inner Filter looks through the keys that are currently selected (returned by the Values function), and the outer key is kept only if one of the selected keys is exactly 10000 greater than it. The net result is the set of keys that are one year before the current selection.
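To make it easier to see which bit does what, here’s the same expression again with some comments added (the // comments are just annotations – strip them out if your formula bar doesn’t accept them):

=CALCULATE(SUM(FactInternetSales[SalesAmount])
    // iterate over every key in DimDate[DateKey], ignoring the current selection
    , FILTER(ALL(DimDate[DateKey])
        // keep this key only if one of the currently selected keys is exactly 10000 greater,
        // ie if this key represents the same day one year earlier
        , COUNTROWS(FILTER(VALUES(DimDate[DateKey])
            , DimDate[DateKey] - 10000 = EARLIER(DimDate[DateKey]))) > 0)
    // remove any other filters coming from DimDate (year, month etc.) so the shifted keys can apply
    , ALL(DimDate))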

It works in almost the same way as the original calculation, although I’ve noticed a few differences. First of all, the grand total for the new calculation displays the total of the values shown for each year, which the original version does not:

[screenshot: the two calculations side by side, showing their grand totals]

Secondly, on February 29th in a leap year the DateAdd function returns the value for February 28th in the previous year, whereas our new approach (unsurprisingly) returns nothing, because there is no date with the key 20030229 for 20040229 minus 10000 to match:

[screenshot: the values returned for February 29th 2004 by the two calculations]

It’s probably a matter of taste what should happen here; either a null or the value from February 28th in the previous year makes sense to me. Note, though, that as shown in the last-but-one screenshot the year totals for 2004 are identical for both calculations, so the value for February 28th is not counted twice.
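If you did want to mimic DateAdd and fall back to February 28th, one way I can think of – completely untested, so treat it as a sketch – would be to also allow a key that is 10001 less than a selected key, but only when the key exactly 10000 less doesn’t exist in DimDate at all (20040229 minus 10001 gives 20030228). Something like this (again, strip the // comments if necessary):

=CALCULATE(SUM(FactInternetSales[SalesAmount])
    , FILTER(ALL(DimDate[DateKey])
        , COUNTROWS(FILTER(VALUES(DimDate[DateKey])
            // normal case: the selected key shifted back exactly one year is this key
            , DimDate[DateKey] - 10000 = EARLIER(DimDate[DateKey])
            // leap-day fallback: the selected key minus 10001 is this key, and the key exactly
            // one year earlier (eg 20030229) does not exist anywhere in DimDate
            || (DimDate[DateKey] - 10001 = EARLIER(DimDate[DateKey])
                && COUNTROWS(FILTER(ALL(DimDate[DateKey])
                    , DimDate[DateKey] = EARLIER(DimDate[DateKey]) - 10000)) = 0)
            )) > 0)
    , ALL(DimDate))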

Thirdly, keeping dates on rows and measures on columns, when you add English Month Name to a slicer and choose a single month the new expression works fine, but the original expression returns the following error:

ERROR – CALCULATION ABORTED: MdxScript(Sandbox) (22, 4) Function ‘DATEADD’ only works with contiguous date selections.

[screenshot: the DATEADD error]

This particular error will be the subject of my next blog post, but in this case I think the DAX time intelligence functions are behaving a bit rubbishly when they should be returning results.
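Finally, coming back to the budget example from the start of this post: the same pattern should work for any dimension whose key increments predictably, not just dates. Purely as an illustration – the DimBudget and FactBudget tables and the BudgetVersionKey column here are made up – a “previous budget version” measure might look something like this:

=CALCULATE(SUM(FactBudget[Amount])
    , FILTER(ALL(DimBudget[BudgetVersionKey])
        , COUNTROWS(FILTER(VALUES(DimBudget[BudgetVersionKey])
            , DimBudget[BudgetVersionKey] - 1 = EARLIER(DimBudget[BudgetVersionKey]))) > 0)
    , ALL(DimBudget))

Subtract that from the equivalent “current budget” measure and you get the difference between the two budgets.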

Written by Chris Webb

May 26, 2010 at 10:09 pm

Posted in DAX

5 Responses

  1. This is an important article. Looking at the differences you mentioned, I can conclude nothing but that your calculation is superior. History repeats itself with built-in time-intelligence, wouldn’t you say? :-)

    Tomislav

    May 28, 2010 at 12:18 pm

  2. I wouldn’t say it’s better – the thing I didn’t test was performance. I don’t know how my approach compares to the DateAdd approach, and I wouldn’t be surprised if using DateAdd was faster.

    Chris

    May 28, 2010 at 2:44 pm

  3. Yes, that crossed my mind. And you’re right, it should be slower. But then again, we’re talking about a time dimension here (Values() and Earlier() refer to that). How many members must we have there to notice significant performance issues? Setting the context is probably faster with the built-in approach, or it should be. I remember my experiments with the Earlier() function were several times slower than the same calculations made a different way, without it. Which means I’ve waited for a second or two in the worst case. However, I used a fact table in my calculations, not a small time table. Nevertheless, the performance issue would be interesting to measure. Do you plan to perform tests? In contrast, while I might agree that the second difference is OK either way, differences number 1 and 3 signal a questionable total and an inability to slice the data. Weighing that against an unknown decrease in performance in the already fast Vertipaq is a clear case for me. But then again, it wouldn’t be the first time that I’m wrong :-).

    Tomislav

    May 29, 2010 at 1:56 am

  4. Performance tests would be a good idea – which then raises the question of how to test the performance of a DAX calculation. I guess you’d need to use the built-in trace functionality, and maybe run the query as soon as you opened the workbook to ensure it runs on a cold cache… I’ll ask someone. Anyway, I agree that issue #3 is a big problem, and it’s definitely something that should be fixed in a service pack.

    Chris

    May 29, 2010 at 9:49 pm

  5. One thing you could also do in this particular case is add a column with the following formula: =DATEVALUE(FORMAT(VALUE(DimDate[DateKey]), "0000-00-00")) and then use DATEADD(). I haven’t tested the performance impact though.

    Kirill

    September 4, 2010 at 11:59 pm

