Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Arbitrary-shaped sets and the Storage Engine cache

with 3 comments

Here’s a companion-piece to my post last week about query context and the formula engine cache – another scenario where you can easily stop caching taking place without knowing it, which has already been documented (although there is at least one important point to add) but again needs more visibility. This time the problem is that when you write an MDX query with an arbitrary-shaped set in the Where clause it stops Analysis Services using the storage engine cache. Queries that suffer from this will always read from disk and always perform as well or as badly as they did the first time they were run – so if cold cache performance is a problem for you, then this is an issue you need to understand and avoid.

Rather than repeat the information, let me direct you to the blog entry where I first found out about this problem, on Thomas Keyser’s blog from 2006:
http://schastar.spaces.live.com/blog/cns!12BCB785A5D8B3D4!135.entry

I can confirm that everything he says is still relevant on SSAS2008 except for the last query, where he has the whole of the Product.[Product Categories] hierarchy in the Where clause – run it twice and the second time you run it you’ll see it does hit the storage engine cache. One other point I picked up on Mosha’s MDX seminar in November is that it is possible for Analysis Services to think a set is arbitrary-shaped when it really isn’t. Take the following query:

SELECT [Measures].[Internet Sales Amount] ON 0,
[Date].[Calendar Year].MEMBERS ON 1
FROM [Adventure Works]
WHERE
({[Product].[Category].&[1], [Product].[Category].&[3]}
,[Customer].[Country].&[Australia])

This does not have an arbitrary-shaped set in the Where clause, and as a result the second time you run it you’ll see it hit the storage engine cache. However, if you rewrite the query so you have a set of tuples in the Where clause as follows:

SELECT [Measures].[Internet Sales Amount] ON 0,
[Date].[Calendar Year].MEMBERS ON 1
FROM [Adventure Works]
WHERE
({([Customer].[Country].&[Australia], [Product].[Category].&[1]),
([Customer].[Country].&[Australia], [Product].[Category].&[3])})

Even though you might think this query is equivalent to the first one, you’ll see that it does not use the storage engine cache.

What can we do about this then? Not a lot with most client tools; I’ve not checked, but I’d be surprised if any of them generated their MDX to avoid this situation. If your users frequently use certain arbitrary-shaped sets the only thing you could maybe do is hack your dimension data to make them non-arbitrary – but that would almost certainly end up being a bad compromise; otherwise you’d just have to build aggregations to make cold cache queries fast.

However, if you’re using SSRS then of course you can rewrite the MDX yourself. Let’s build a quick report on AdventureWorks that displays this problem:

image

As you can see, I’ve got a multiselect parameter on the slicer that has a default selection of members from two different levels from [Product].[Product Categories] – an arbitrary shaped set. Here’s the MDX that gets generated:

SELECT
NON EMPTY
{ [Measures].[Internet Sales Amount] }
ON COLUMNS,
NON EMPTY
{ ([Date].[Calendar Year].[Calendar Year].ALLMEMBERS ) }
DIMENSION PROPERTIES MEMBER_CAPTION, MEMBER_UNIQUE_NAME
ON ROWS
FROM (
SELECT ( STRTOSET(@ProductProductCategories, CONSTRAINED) )
ON COLUMNS FROM [Adventure Works])
WHERE (
IIF( STRTOSET(@ProductProductCategories, CONSTRAINED).Count = 1,
STRTOSET(@ProductProductCategories, CONSTRAINED),
[Product].[Product Categories].currentmember ) )
CELL PROPERTIES VALUE, BACK_COLOR, FORE_COLOR, FORMATTED_VALUE,
FORMAT_STRING, FONT_NAME, FONT_SIZE, FONT_FLAGS

And here’s how I would rewrite it:

SELECT
{ [Measures].[Internet Sales Amount] }
ON COLUMNS,
NON EMPTY
{ ([Date].[Calendar Year].[Calendar Year].ALLMEMBERS ) }
ON ROWS
FROM
[Adventure Works]
WHERE(
DESCENDANTS(
STRTOSET(@ProductProductCategories, CONSTRAINED),
[Product].[Product Categories].LEVELS(
MAX(STRTOSET(@ProductProductCategories, CONSTRAINED),
[Product].[Product Categories].CURRENTMEMBER.LEVEL.ORDINAL))
)
)
CELL PROPERTIES VALUE

What I’ve done here is:

  • Get rid of the completely useless Non Empty on columns (Why is it there? We don’t even want to remove empty columns – surely doing that would break the report?)
  • Removed the subselect and used a Where clause instead to do the slicing, so if we needed it we could use the formula engine cache (see here for why).
  • Removed any cell or dimension properties that aren’t used by the report (see here for why, although it’s only relevant for really big reports).
  • Used an expression in the Where clause to find the descendants of all members in the parameter set at the level of the lowest member of the set, using the Descendants, Levels and Max functions. This I think will turn all arbitrary-shaped set selections into non-arbitrary-shaped sets.

Written by Chris Webb

February 4, 2009 at 11:56 am

Posted in MDX

3 Responses

Subscribe to comments with RSS.

  1. nice post, Chris! –thx, Cos

    cosmini

    February 4, 2009 at 4:22 pm

  2. [...] 1) Microsoft SQL Server 2008 Analysis Services Unleashed (Book) 2) SSAS 2008 Performance Guide (Whitepaper) 3) Identifying and Resolving MDX Bottlenecks (Whitepaper) 4) Blogs by Mosha Pasumansky particularly       a) Inside OLAP Engine: SE cache sharing       b) Inside OLAP Engine: Cache Prefetching       c) Take advantage of FE caching to optimize MDX performance 5) Blogs by Chris Webb particularly       a) Formula Caching and Query Scope       b) Reporting Services-generated MDX, Subselects and Formula Caching       c) Missing Members and the Formula Engine Cache       d) Now() and the Formula Cache       e) Arbitrary-shaped sets and the Storage Engine cache [...]

  3. [...] CacheEntries: the number of entries in the measure group cache after the subcube request has been executed. This value will continue to increment and only be reset back to zero after the measure group cache has been cleared (either as a result of an XMLA clear cache or some other event, like processing, that clears the cache). The execution of the subcube request will itself usually result in an entry being made into the cache, so it’s unlikely you’ll see this value showing as zero. If you don’t see this value incrementing you might be running into this issue. [...]


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 3,309 other followers

%d bloggers like this: