Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Calculating Ages In MDX

with 5 comments

One area where I see people have a lot of problems when they’re learning MDX is when they are working with dates. SQL has a lot of great functionality for working with dates while MDX, to be honest, does not – therefore people tie themselves in knots writing complex MDX expressions (which often don’t perform all that well) to recreate what they would do in SQL. However, as is often the case with MDX, approaching the questions from a completely different angle yields much better results.

For example, let’s imagine that you wanted to calculate the age in years of any customer on any given date. Using the Adventure Works cube you have a Date dimension, which will be how you want to select the date that the age is to be calculated at; there is also, on the Customer dimension, an attribute called Birth Date that gives you each customer’s date of birth. You need a calculated measure that calculates the age of each customer for any date selected on the Date dimension.

If you think about this as a SQL date problem, you’ll probably want to use functions like DateDiff() – one of the VBA functions that is available in MDX (but is not native MDX). The following article shows how DateDiff() and other VBA functions can be used to calculate ages in VBA code:
http://msdn.microsoft.com/en-us/library/aa227466(v=vs.60).aspx

Here’s a query showing how this calculation can be implemented in MDX:

with

--calculate the difference between the years of the 

--current date and the customer's birth date

member measures.yeardiff as

DateDiff("yyyy",

[Customer].[Customer].currentmember.properties("Birth Date", typed),

[Date].[Date].currentmember.member_value

)

--calculate a value which is the month number

--multiplied by 100 plus the day number of the month

--for the current date

member measures.datemonthday as

(month([Date].[Date].currentmember.member_value) * 100)

+

day([Date].[Date].currentmember.member_value)

--calculate a value which is the month number

--multiplied by 100 plus the day number of the month

--for the customer birth date

member measures.birthmonthday as

(month([Customer].[Customer].currentmember.properties("Birth Date", typed)) * 100)

+

day([Customer].[Customer].currentmember.properties("Birth Date", typed))

 

--calculate customer age as the difference in years

--minus 1 if the customer's birthday this year is

--after the current date

member measures.customerage as

measures.yeardiff -

iif(measures.datemonthday>=measures.birthmonthday,0,1)

 

select

--return all dates in 2002 on columns

descendants(

[Date].[Calendar].[Calendar Year].&[2002],

[Date].[Calendar].[Date])

on 0,

--return all customers in Coff's Harbour on rows

descendants(

[Customer].[Customer Geography].[City].&[Coffs Harbour]&[NSW],

[Customer].[Customer Geography].[Customer])

on 1

from

[Adventure Works]

where(measures.customerage)

The output of the query is this:

image

On my laptop this query, which returns 365 columns and 106 rows, returns in around 3.5 seconds. A few things to point out:

  • I’m using the .Member_Value function instead of the .MemberValue function to get the typed date value associated with each date, as Bill Anton’s recent blog post points out that it is marginally faster.
  • I’m using .Properties(“Birth Date”, typed) to get the typed date value for each customer’s birth date. If I set the ValueColumn property on the Customer attribute of the Customer dimension, or enabled the hierarchy on the Birth Date attribute, I would be able to use .Member_Value here too and get a very minor performance improvement.
  • Most of time spent by the query is taken by the yeardiff calculated member, which uses the DateDiff() function. This isn’t surprising because calling any VBA function in MDX is expensive and something to be avoided.

Performance here is ok, but we can do better. The most important thing to do when tuning any MDX calculation is to try to pre-calculate as much as possible. In this case there are two calculations we can move out of MDX into the structure of the cube:

  • The DateDiff() function here just calculates the difference between the years of the two dates. Therefore if you create a new attribute on the Customer dimension to hold the integer representing the year of the customer’s birth date, and use the existing Calendar Year attribute on the Date dimension, you can avoid using DateDiff() completely and do a simple subtraction.
  • Likewise, rather than calculating the combination of month and date in the MDX, these values can be pre-calculated in the dimension and then accessed as properties.

Here’s the rewritten version of the calculation, assuming that the following new attributes have been added to the Customer and Date dimensions:

  • [Customer].[Birth Year] holds the year of the customer’s birth as an integer
  • [Customer].[Birth Month Day] is an integer containing the month number multiplied by 100 plus the day number of the month for the customer’s birth date. For example the date 5th June would have the value 605.
  • [Date].[Month Day] is the same as [Customer].[Birth Month Day] but for the Date dimension.
with

member measures.yeardiff as

[Date].[Calendar Year].currentmember.member_value

-

[Customer].[Birth Year].currentmember.member_value

 

member measures.customerage as

measures.yeardiff

-

iif(

[Date].[Month Day].currentmember.member_value

>=

[Customer].[Birth Month Day].currentmember.member_value

,0,1)

 

select

descendants(

[Date].[Calendar].[Calendar Year].&[2002],

[Date].[Calendar].[Date])

on 0,

descendants(

[Customer].[Customer Geography].[City].&[Coffs Harbour]&[NSW],

[Customer].[Customer Geography].[Customer])

on 1

from

[Adventure Works]

where(measures.customerage)

 

This query runs in around 1.2 seconds, a significant improvement. Of course if you were to create a completely new fact table that contained the age of every customer on every date in the Date dimension then you wouldn’t need to use any calculated measures and performance would be even better, but that may not be practical if you have a lot of customers and a lot of dates.

Written by Chris Webb

June 9, 2014 at 9:15 am

5 Responses

Subscribe to comments with RSS.

  1. Another great post!

    Chris, I’m thinking a hybrid approach that is to populate fact table to certain range of date (let’s say populate the fact for 100 years per customer). Then create a calculated measure with the expression such as:
    IIF(Not IsEmpty(

    )
    	, 
    	, 
    	)

    George Qiao

    June 9, 2014 at 1:39 pm

    • IIF(Not IsEmpty([pre-calculated measure])
      , [pre-calculated measure]
      , [on the fly calculation]
      )

      George Qiao

      June 9, 2014 at 1:42 pm

      • Yes, that might work well, but if you can do the calculation quickly in MDX there’s probably no need to build a large fact table like this.

        Chris Webb

        June 9, 2014 at 9:08 pm

  2. I’m not sure about performace in this code…

    SAP_master

    June 25, 2014 at 7:52 pm

  3. Hi Chris, I’ve made a small extension to your code by making it level aware:

    http://101mdx.blogspot.com/2014/08/calculating-age.html

    John Minkjan

    August 22, 2014 at 10:30 am


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 3,144 other followers

%d bloggers like this: