Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Many-to-Many Relationships and Partition Slices

with 4 comments

As you’re probably aware one of the benefits of partitioning a measure group is that it can improve query performance: for example, if you have partitioned your measure group by year and you run a query that only requests data for a particular year, then SSAS should only query the partition that holds the data for the year you’ve requested. There are however several scenarios where SSAS will scan partitions you would not expect it to scan – the ones I knew about up to now are all listed in the “Unexpected Partition Scans” section in this excerpt from “Expert Cube Development” – and I’ve just come across another scenario where this happens.

I was working with a customer the other week that had implemented Dave Fackler’s approach to handling multiple calendars using many-to-many relationships, as detailed in these two blog posts:
http://davefackler.blogspot.com/2008/05/handling-multiple-calendars-with-m2m.html
http://davefackler.blogspot.com/2008/06/handling-multiple-calendars-with-m2m.html

Now there’s absolutely nothing wrong with the design that Dave describes here – it’s a classic application for many-to-many relationships – and there aren’t any better ways of meeting this requirement. However, what I noticed when looking in Profiler at the customer’s cube was that all of the queries I was writing were resulting in reads on all of the partitions in the main measure group, even when I expected only one partition should be hit, and this was slowing the queries down a lot.

You can recreate the same problem in the version of the Adventure Works that Dave posts on his blog. Take the following query that uses the regular Adventure Works Date dimension and does not use the m2m relationship:

select {[Measures].[Reseller Sales Amount]} on 0,
{[Date].[Calendar].[Month].&[2004]&[3]}
on 1
from [Adventure Works]

When run on a cold cache, in Profiler you can see that it only results in the Reseller Sales 2004 partition being hit in the Reseller Sales measure group:

image

This is exactly as you’d expect. However if you request the same month (March 2004) via the Calendar dimension, which is connected to the Reseller Sales measure group via a m2m relationship and the Date dimension, you see all the partitions on Reseller Sales are hit:

select {[Measures].[Reseller Sales Amount]} on 0,
{[Calendar].[Year-Quarter-Month].[Month].&[2000200409]}
on 1
from [Adventure Works]

image

Not so good. But it turns out that this is an unavoidable side-effect of using many-to-many relationships: a filter on a many-to-many dimension (in this case the Calendar dimension) does not result in a filter being applied on the intermediate dimension (the Date dimension) to the measure group (Reseller Sales). Greg Galloway came across this some time ago and filed an issue on Connect which confirms that this is by design.

Given that I doubt this behaviour will be changed any time soon, the takeaway is that when you’re planning your partitioning strategy you should think twice about partitioning using a dimension that is used as an intermediate dimension in a many-to-many relationship. Otherwise you will lose all of the performance benefits of partitioning when your queries use that many-to-many relationship…

Written by Chris Webb

January 24, 2012 at 8:46 pm

Posted in Analysis Services

4 Responses

Subscribe to comments with RSS.

  1. Did you consider denormalizing DimCalendar (many-to-many) dimension? it will definately not give as much performance improvement as partition filtering but could have been some relief.

    Parikshit

    January 26, 2012 at 2:16 pm

    • Yes that is one option to avoid this problem, but in order to do this you’d need to add new columns to your date dimension table every time you wanted to add a new hierarchy, which isn’t great.

      In any case, the real point of the post was to show the behaviour of m2m relationships with partitions so I didn’t discuss workarounds for this specific scenario.

      Chris Webb

      January 26, 2012 at 2:20 pm

      • Makes Sense…Thanks for the post..it is indeed helpful !!!

        Parikshit

        January 26, 2012 at 2:42 pm

  2. As the customer concerned – thanks for spotting this before we went any further with it.

    I can confirm that we reverted to a single Date Dim with many columns. We only foresee using 3 or 4 calendars max so not too traumatic. We are making use of DisplayFolder both as a visual tidy and a grouping ID for our applications but attribute names are still untidy, eg. YrWk – Cal 1, YrWk – Cal 2. Partitioning by some other dimension does not tie in with the ETL in our warehouse.

    Harvey

    February 2, 2012 at 3:40 pm


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 3,220 other followers

%d bloggers like this: