Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Optimising NON EMPTY filtering in queries

leave a comment »

Recently, on a few engagements, I’ve found myself optimising queries which do non-empty filtering and thought I’d share some tips (apologies if some of these seem a bit obvious to the more advanced MDXers among you). This will probably turn into an ongoing series – I’ve got some other ideas that need a bit more testing/research before I write them up…
 
The first thing to do is to set the NON_EMPTY_BEHAVIOR property on your calculated members, if you can. Take the following query on the Adventure Works cube, which runs in around 1 minute 20 seconds on my laptop:

WITH MEMBER MEASURES.DEMO AS [Measures].[Internet Sales Amount] * 2
SELECT {[Measures].[Internet Sales Amount], MEASURES.DEMO} ON 0,
NON EMPTY
[Product].[Category].[Category].MEMBERS
*
[Product].[Subcategory].[Subcategory].MEMBERS
*
[Product].[Product].[Product].MEMBERS
*
[Customer].[Postal Code].[Postal Code].MEMBERS
ON
1
FROM [Adventure Works]
WHERE([Date].[Calendar Year].&[2004])

Simply by adding the NON_EMPTY_BEHAVIOR property to the calculated member, so:

WITH MEMBER MEASURES.DEMO AS [Measures].[Internet Sales Amount] * 2
,NON_EMPTY_BEHAVIOR=[Measures].[Internet Sales Amount]
SELECT {[Measures].[Internet Sales Amount], MEASURES.DEMO} ON 0,
NON EMPTY
[Product].[Category].[Category].MEMBERS
*
[Product].[Subcategory].[Subcategory].
MEMBERS
*
[Product].[Product].[Product].
MEMBERS
*
[Customer].[Postal Code].[Postal Code].
MEMBERS
ON
1
FROM [Adventure Works]
WHERE([Date].[Calendar Year].&[2004])

The query returns the same results in just 6 seconds. Analysis Services can do non-empty filtering very efficiently on regular measures, but has to use a more time-consuming algorithm on calculated measures which don’t have NON_EMPTY_BEHAVIOR set; this property just tells AS that it can use the more efficient algorithm when doing non-empty filtering on the calculated measure and treat it eaxactly the same as the given regular measure. Clearly, we can say that whenever [Measures].[Internet Sales Amount] is null then [Measures].[Internet Sales Amount]*2 will be null.

But what about calculated measures where we can’t set NON_EMPTY_BEHAVIOR? Take the following query, similar to the one we’ve just been looking at, which again executes in 1 minute 20 seconds on a cold cache:

WITH MEMBER MEASURES.DEMO AS ([Measures].[Internet Sales Amount], [Date].[Calendar Year].PREVMEMBER)SELECT {[Measures].[Internet Sales Amount], MEASURES.DEMO} ON 0,
NON EMPTY
[Product].[Category].[Category].MEMBERS
*
[Product].[Subcategory].[Subcategory].MEMBERS
*
[Product].[Product].[Product].MEMBERS
*
[Customer].[Postal Code].[Postal Code].MEMBERS
ON
1
FROM
[Adventure Works]
WHERE([Date].[Calendar Year].&[2004])

The first thing to note is that if you can do away with the calculated measure altogether, then do so. The following query returns the same values but by crossjoining the years we’re interested in with [Measures].[Internet Sales Amount]:

SELECT

{[Measures].[Internet Sales Amount]}
*
{[Date].[Calendar Year].&[2004], [Date].[Calendar Year].&[2003]}
ON 0,
NON EMPTY
[Product].[Category].[Category].MEMBERS
*
[Product].[Subcategory].[Subcategory].MEMBERS
*
[Product].[Product].[Product].MEMBERS
*
[Customer].[Postal Code].[Postal Code].MEMBERS
ON
1
FROM
[Adventure Works]

If you’re generating the entire query dynamically, then this might be more work but is the cleanest option. But often we can’t get rid of the calculated measure, or don’t want to for reasons of simplicity. We can’t set NON_EMPTY_BEHAVIOR on the calculated measure on our calc because we can’t say that it will be null whenever [Measures].[Internet Sales Amount] or any other regular measure will be null but we can do something similar. The following query returns the same results as the preceding query but again brings the performance back down to 6 seconds

WITH

MEMBER MEASURES.DEMO AS
([Measures].[Internet Sales Amount], [Date].[Calendar Year].PREVMEMBER)
SELECT {[Measures].[Internet Sales Amount], MEASURES.DEMO} ON 0,
NON EMPTY
NONEMPTY(
[Product].[Category].[Category].MEMBERS
*
[Product].[Subcategory].[Subcategory].MEMBERS
*
[Product].[Product].[Product].MEMBERS
*
[Customer].[Postal Code].[Postal Code].MEMBERS
, ([Measures].[Internet Sales Amount],[Date].[Calendar Year].[All Periods]) )
ON 1
FROM
[Adventure Works]
WHERE([Date].[Calendar Year].&[2004])

What we’re doing here is a two-step filtering operation: first remove the tuples on rows which would be empty for all years using the NONEMPTY function (this allows us to use the more efficient algorithm again, because we’re filtering by a regular measure) and then getting rid of the empty rows that are still left using NON EMPTY.

Written by Chris Webb

August 13, 2006 at 9:21 am

Posted in MDX

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 3,146 other followers

%d bloggers like this: