Optimising MDX Calculations With The Unorder() Function

The Unorder() function is probably one of the least used functions in the whole of MDX. It exists only as a query performance hint and, since I had never before found a scenario where it improved the performance of a calculation, I had pretty much forgotten about it (as Books Online says, the optimisation it performs is applied automatically in many cases). However, I was playing around with some calculations last week and found that it does have its uses…

What does the Unorder() function do? It takes a set and returns the same set with any implicit ordering removed. By default all sets in MDX are ordered, but for some types of operation that ordering is unimportant, and ignoring it can result in faster query performance.
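As a minimal illustration (a sketch against the Adventure Works sample database; the choice of the Product Category set here is arbitrary), Unorder() simply wraps a set expression anywhere a set is expected:

SELECT
{[Measures].[Internet Sales Amount]}
ON COLUMNS,
UNORDER([Product].[Category].[Category].MEMBERS)
ON ROWS
FROM
[Adventure Works]

The members returned are exactly the same; the function only tells the engine that it need not preserve their order.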

Take, for example, the following query on the Adventure Works cube which shows the number of customers who have bought something up to the current date:

WITH
MEMBER MEASURES.CUSTOMERSTODATE AS
COUNT(
    NONEMPTY(
        [Customer].[Customer].[Customer].MEMBERS,
        {[Measures].[Internet Sales Amount]}
        *
        {NULL : [Date].[Calendar].CURRENTMEMBER}
    )
)
SELECT
{MEASURES.CUSTOMERSTODATE}
ON COLUMNS,
[Date].[Calendar].[Date].MEMBERS
ON ROWS
FROM
[Adventure Works]

On my laptop it executes in 35 seconds on a cold cache. We can optimise the calculation simply by wrapping the set of all members on the Customer level of the Customer hierarchy in the Unorder() function:

UNORDER([Customer].[Customer].[Customer].MEMBERS)

The following query now executes in 27 seconds on a cold cache:

WITH
MEMBER MEASURES.CUSTOMERSTODATE AS
COUNT(
    NONEMPTY(
        UNORDER([Customer].[Customer].[Customer].MEMBERS),
        {[Measures].[Internet Sales Amount]}
        *
        {NULL : [Date].[Calendar].CURRENTMEMBER}
    )
)
SELECT
{MEASURES.CUSTOMERSTODATE}
ON COLUMNS,
[Date].[Calendar].[Date].MEMBERS
ON ROWS
FROM
[Adventure Works]

As far as I can tell, Unorder() only makes a difference to calculations when used in combination with NonEmpty(), and when it is used over a large set (here the set of customers has around 18,000 members). If you have calculations like this, I would recommend testing to see whether Unorder() makes a difference – if it does, please leave a comment and let me know what you find!
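If you want to reproduce this kind of test yourself, remember to clear the Storage Engine and Formula Engine caches between runs so you are always measuring cold-cache performance. One common way to do this (a sketch; the DatabaseID must match the ID of your own Adventure Works database, which varies by sample version) is to execute an XMLA ClearCache command from an SSMS XMLA query window:

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <ClearCache>
    <Object>
      <DatabaseID>Adventure Works DW</DatabaseID>
    </Object>
  </ClearCache>
</Batch>

Run this, then run the MDX query and compare the durations, either in Profiler or simply from the query timings reported in SSMS.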

7 thoughts on “Optimising MDX Calculations With The Unorder() Function”

  1. I had the chance to test this optimisation today on a MOLAP dimension with 21 million members on the key attribute. On a fairly simple query with the calculation
    count(existing mydim.myhier.mylevel.members)
    the execution time was 180 seconds; changing the calculation to
    count(existing unorder(mydim.myhier.mylevel.members))
    lowered the execution time to 30 seconds.

  2. Chris,

    I tried experimenting with this approach to see if it could at least help alleviate the slow response from SSAS MD when dealing with DistinctCount measures, and the cold cache performance was far worse than using a DISTINCTCOUNT measure. My AdventureWorks Internet and Reseller Sales fact tables each had 35+ million records. I tested the Order Count measure (DISTINCTCOUNT) against a COUNT around the Order Number attribute dimension members, and even after adding the UNORDER function around the set, the query still had not completed after 5 minutes. The DISTINCTCOUNT on OrderNumber on the fact table completed in a minute and a dime. My AdventureWorks fact tables have 6+ million distinct order numbers. I used this to test a more real-life scenario for a larger company. I guess there’s no silver bullet for DISTINCTCOUNT in SSAS MD.

    1. Hi James,

      I wouldn’t expect MDX to ever be faster than the built-in distinct count measure aggregation type. Unfortunately distinct count will always be slow in SSAS MD; you have read the CAT team’s white paper on performance tuning distinct count though, haven’t you?

      Chris

      1. Yes the SSD approach but I haven’t had a chance to test it on a large scale. I have tested the approach of Hashing the DISTINCTCOUNT key field and partitioning accordingly as explained in an earlier whitepaper (http://www.microsoft.com/en-us/download/details.aspx?id=891) on a larger scale SSAS implementation. This did slightly improve the performance of the DISTINCTCOUNT aggregation if well partitioned but I must say that it heavily complicates ETL. If more customers would embrace the value of adding SSDs into their infrastructure, it may help alleviate the problem. Alternatively, Microsoft can come out with a version of Tabular that doesn’t drop all the nice functionality the MD one has or force us to use not so friendly workarounds while exploiting its super fast DISTINCTCOUNT native function.
