Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Processing Dimensions By Attribute and By Table

with 3 comments

It was recently pointed out to me (by Hugo Lopes and Paulo Faria of Enabler in Portugal, who I had the pleasure of working with recently) that setting the ProcessingGroup property of a dimension can have quite a big impact on the amount of time it takes to process a dimension. By default this property is set to By Attribute, which means that each attribute in the dimension is processed separately using its own SQL Select statement which returns the distinct values for each member on the attribute. Here’s an example query used to process an attribute based on the Adventure Works Product dimension:

SELECT
DISTINCT
[dbo_DimProduct].[Color] AS [dbo_DimProductColor0_0]
FROM [dbo].[DimProduct] AS [dbo_DimProduct]

If you set it to By Table, however, in some circumstances (when your dimension is built from a single table?) only one SQL Select statement per table in the dimension structure is issued, reading all the data in, with AS presumably finding the distinct values itself.

When do you use which setting? I suppose it depends on a lot of things (performance of your data source, size and structure of your dimension, whether you’re processing other objects at the same time) so it’s going to be one of those things you’ll have to experiment with. Obviously most dimensions process extremely quickly anyway, but if you’ve got a large dimension which you need to process as quickly as possible this might come in handy.

Written by Chris Webb

November 21, 2006 at 9:44 pm

Posted in Uncategorized

3 Responses

Subscribe to comments with RSS.

  1. Ive noticed that for new dimensions the default seems to be \’by attribute\’ but for dimensions converted from AS2000 using the upgrade wizard the setting seems to be \’by table\’.
     
    The side effect of this became visible in one large dimension (4 million rows) with many attributes 30+ which fails to process when set to \’by table\’.  Some time into the processing it suffers a write error in one of the temp files its processing and judging by the file growth of these it may be hitting an OS limition on file size.
     
    Processing these large dimensions with many attributes \’by attribute\’ does not appear to suffer this same limitation.
     
     

    David

    November 23, 2006 at 3:42 am

  2. I assume this is on a 64-bit box? Interesting…

    Chris

    November 23, 2006 at 5:53 pm

  3. Trying to resurrect an old post, but be careful when you set your processing by table if your dimension is build using a snowflake schema. Your processed dimension will not have any missing members, i.e. the table query will have inner joins only.
    Processing by attributes is essentially a logical equivalent to left outer joins in a table query.

    Pasha

    April 13, 2012 at 2:59 pm


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 2,868 other followers

%d bloggers like this: