Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Archive for January 2012

Many-to-Many Relationships and Partition Slices

with 4 comments

As you’re probably aware one of the benefits of partitioning a measure group is that it can improve query performance: for example, if you have partitioned your measure group by year and you run a query that only requests data for a particular year, then SSAS should only query the partition that holds the data for the year you’ve requested. There are however several scenarios where SSAS will scan partitions you would not expect it to scan – the ones I knew about up to now are all listed in the “Unexpected Partition Scans” section in this excerpt from “Expert Cube Development” – and I’ve just come across another scenario where this happens.

I was working with a customer the other week that had implemented Dave Fackler’s approach to handling multiple calendars using many-to-many relationships, as detailed in these two blog posts:
http://davefackler.blogspot.com/2008/05/handling-multiple-calendars-with-m2m.html
http://davefackler.blogspot.com/2008/06/handling-multiple-calendars-with-m2m.html

Now there’s absolutely nothing wrong with the design that Dave describes here – it’s a classic application for many-to-many relationships – and there aren’t any better ways of meeting this requirement. However, what I noticed when looking in Profiler at the customer’s cube was that all of the queries I was writing were resulting in reads on all of the partitions in the main measure group, even when I expected only one partition should be hit, and this was slowing the queries down a lot.

You can recreate the same problem in the version of the Adventure Works that Dave posts on his blog. Take the following query that uses the regular Adventure Works Date dimension and does not use the m2m relationship:

select {[Measures].[Reseller Sales Amount]} on 0,
{[Date].[Calendar].[Month].&[2004]&[3]}
on 1
from [Adventure Works]

When run on a cold cache, in Profiler you can see that it only results in the Reseller Sales 2004 partition being hit in the Reseller Sales measure group:

image

This is exactly as you’d expect. However if you request the same month (March 2004) via the Calendar dimension, which is connected to the Reseller Sales measure group via a m2m relationship and the Date dimension, you see all the partitions on Reseller Sales are hit:

select {[Measures].[Reseller Sales Amount]} on 0,
{[Calendar].[Year-Quarter-Month].[Month].&[2000200409]}
on 1
from [Adventure Works]

image

Not so good. But it turns out that this is an unavoidable side-effect of using many-to-many relationships: a filter on a many-to-many dimension (in this case the Calendar dimension) does not result in a filter being applied on the intermediate dimension (the Date dimension) to the measure group (Reseller Sales). Greg Galloway came across this some time ago and filed an issue on Connect which confirms that this is by design.

Given that I doubt this behaviour will be changed any time soon, the takeaway is that when you’re planning your partitioning strategy you should think twice about partitioning using a dimension that is used as an intermediate dimension in a many-to-many relationship. Otherwise you will lose all of the performance benefits of partitioning when your queries use that many-to-many relationship…

Written by Chris Webb

January 24, 2012 at 8:46 pm

Posted in Analysis Services

SQLBits X Agenda Published

with one comment

The agenda for the Friday and Saturday of SQLBits X is now published, and you can see it here:
http://sqlbits.com/information/Agenda.aspx?20120330
http://sqlbits.com/information/Agenda.aspx?20120331

If you were wondering whether to come hopefully this will make up your mind – but given that almost 1000 SQL Server professionals from all round the world have already registered, so you’d better book fast if you want to be sure of a space. There are loads of SSAS-related sessions to choose from, including ones from Cathy Dumas, Marco Russo, Alberto Ferrari, John Tunnicliffe, Erika Bakse, Akshai Mirchandani, Stephan Stolze, Bob Duffy and, errr, me. Trust me, it’s going to be great!

Written by Chris Webb

January 17, 2012 at 8:38 pm

Posted in Events

Tagged with

PASS: Time to do a lot more than change the by-laws

with 7 comments

I read the following posts by Andy Warren and Steve Jones today, and before you hear what my thoughts on the subjects they discuss you should probably read them too:
http://www.sqlandy.com/index.php/2012/01/pass-time-to-change-the-by-laws/
http://voiceofthedba.wordpress.com/2012/01/13/ethics-and-power

(In the interests of full disclosure let me state that I know James Rowland-Jones pretty well because we’re both involved in running SQLBits, but the first I knew about his being appointed to a voting position on the PASS board were the above posts. What follows are my opinions alone, written without consultation with James or anyone else).

I completely understand why Andy and Steve are upset. I have an awful lot of respect for them and their views. From their – perfectly valid – perspective, what has happened is unfair and undemocratic. However I suspect there’s a reason for what has happened, and that reason can be found elsewhere on Andy’s blog:

For too long PASS (and SQLSaturday) have been US centric.

In my opinion the big problem with PASS is that on one hand it’s a self-described international organisation (number #3 on its list of current and future strategic objectives is to “Focus on International Growth and Consolidation”) but it is, in effect a North American user group with a North American focus, run by North Americans. As far as I know the vast majority the membership of PASS is in North America and therefore it’s not surprising that North Americans dominate the board: PASS members vote for candidates they know and can relate to, and who address their concerns. And that also, to me, explains why despite its good intentions PASS has had such problems with its international role: the only way it can be successful internationally will be if it has a genuinely international element in its leadership, but that’s clearly never going to happen while membership is so skewed towards North America. It’s a vicious circle.

You could argue that as PASS, and especially SQL Saturday, expands internationally it will attract more members from outside the US then more international candidates will make it onto the board through democratic means. Indeed, Rob Farley’s success in the recent elections could be a sign that this is happening. I certainly voted for Rob not only because I know, like and respect him but because of his stance on international issues. The problem is that even if the level of participation in PASS elections everywhere in the world reached the same level as it has reached in the US today (and even an optimist would accept that we’re still a long time away from this happening) the PASS board would still be dominated by North Americans simply because it’s the biggest single market for SQL Server. Moreover some countries and territories would never have any effective representation because they would be simply too small. The world isn’t divided into the US and ‘everywhere else’ but each SQL community has its own identity and its own challenges and deserves representation. Not every community is as lucky as Australia to have someone like Rob who is a native English speaker, has an impressive technical reputation and can spend the time and yes the money (those plane tickets to the PASS conference aren’t cheap) to become sufficiently well-known in the North American community to win a PASS election.

So why should, say, the UK SQL Server community get a voice on the board that is out of proportion with its relative size? That is what I’m arguing for and it might seem fundamentally undemocratic. However the challenge of effectively representing different, geographically distributed subcommunities inside a larger community is one that all large, diverse democracies have to deal with. Why should Rhode Island have the same number of seats in the US Senate as California? The answer is that this has to happen for the larger community to gain democratic legitimacy within the smaller subcommunities. If this doesn’t happen in PASS it risks disinterest, disengagement and at worse resentment in its non-US chapters – the smaller communities will quite rightly realise they are better off on their own because their interests are not being considered. Unfortunately I’ve already encountered a certain amount of resentment towards PASS for exactly this reason over the years I’ve been involved in the SQL community in Europe and I would hate to see that grow; it’s certainly the reason for the historic weakness of PASS in the UK.

The appointment of James and Raoul Illyés as international advisers to the PASS Board of Directors was clearly an attempt to address the problems of international engagement and break the vicious circle I describe above, and I’m sure the same motives were behind James’s appointment to a voting position. I strongly believe that this move was the only way that any traction on the international issue could be gained. But as I said I can understand why Steve and Andy are upset and that’s not acceptable either. What can be done? I suggest that if PASS is serious about becoming an international organisation then wholesale constitutional reform is necessary to ensure that international chapters have a voice and a vote. To use a US-friendly analogy, PASS needs its own Connecticut Compromise: alongside the board we need another, parallel body with equal representation for each chapter in the world. I would like to see discussion on how such reforms could take place as soon as possible, otherwise there’s a risk each SQL community in each country will end up going its own separate way and that won’t be good for anyone. Whatever our nationalities and whatever our cultural differences we all have SQL Server in common and working together will bring benefits to all of us.

Written by Chris Webb

January 14, 2012 at 11:36 pm

Posted in PASS

Aliasing Columns in DAX

leave a comment »

Creating a copy of a column with a new name is pretty simple in DAX: you can just use the AddColumns() function. For example if we take a model with the DimProductCategory table from Adventure Works in, we could create a copy of the EnglishProductCategoryName column like so:

evaluate
addcolumns(
DimProductCategory
, "Copy of Category Name"
, DimProductCategory[EnglishProductCategoryName])

image

However, in some calculations and queries I’ve been playing around with, this isn’t enough: I’ve not only needed to create a copy of the column but also to remove the original. So for example to crossjoin the DimProductCategory table with itself and get a cartesian product of all the possible combinations of Category name you can’t just do this:

evaluate
crossjoin(
summarize(
DimProductCategory
, DimProductCategory[EnglishProductCategoryName])
,
summarize(
DimProductCategory
, DimProductCategory[EnglishProductCategoryName])
)

Because you’ll get the following error:

 Function CROSSJOIN does not allow two columns with the same name ‘DimProductCategory’[EnglishProductCategoryName].

What you need to do is add the new column with AddColumns() and then use Summarize() to get a new table that only contains the values in this new column, like so:

evaluate
crossjoin(
summarize(
DimProductCategory
, DimProductCategory[EnglishProductCategoryName])
,
summarize(
addcolumns(
DimProductCategory
, "Copy of Category Name"
, DimProductCategory[EnglishProductCategoryName])
, [Copy of Category Name])
)

image

Written by Chris Webb

January 12, 2012 at 12:01 pm

Posted in DAX

The DAX DateAdd function and missing dates

with 10 comments

A little while ago I was working with a date column in a Tabular model and wanted to create a calculated column that returned a date 100 days after the original date. Without thinking too much about it, I started off by using the DAX DateAdd() function – after all, I wanted to add some days to a date so it seemed like the obvious function to use. You can recreate this scenario in Adventure Works by importing the DimDate table into a new model in SSDT (SQL Server Data Tools, not BIDS – don’t forget to use the new name!) and then adding a new calculated column with the following expression:

=DATEADD(DimDate[FullDateAlternateKey], 100, day)

image

What I noticed was that it only worked for some dates and not all; for example, in the screenshot above it works up to September 22nd and not afterwards. This threw me for a few minutes, and then I realised what the problem was. In fact, the reason is mentioned (though maybe not explained as well as it should be) in the documentation for the DateAdd function:

The result table includes only dates that exist in the dates column.

The reason why DateAdd was returning a value for September 22nd 2006 was that the value December 31st 2006 existed in the FullDateAlternateKey column; it was not returning anything for September 23rd 2006 because the date January 1st 2007 did not exist in FullDateAlternateKey.

In fact there was an even easier way to do what I wanted to do, since the DAX date type is really a decimal value where the integer portion is the number of days – I could just add 100 to the date, as follows:

=DimDate[FullDateAlternateKey]+100

image

So, not a great discovery and certainly nothing that wasn’t known about before, but I thought it was worth mentioning because I’m sure other people (as here, for example) will fall into the same trap as me.

Written by Chris Webb

January 6, 2012 at 12:05 pm

Posted in DAX

Follow

Get every new post delivered to your Inbox.

Join 3,082 other followers