Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

What happens when you do a Process Update on a dimension?

with 13 comments

Over the last few days I’ve been involved in an interesting thread on the SSAS forum regarding what happens when you do a Process Update on a dimension. It’s a topic that is not widely understood, and indeed I’ve not known all the details until today, but it’s nonetheless very important: one of the commonest performance-related problems I see in my consultancy work is partitions that have aggregations designed for them, but where those aggregations aren’t in a processed state because a Process Update has been run on one or more dimensions. Anyway, just now Akshai Mirchandani from the dev team posted a really good overview of what actually happens when you run a Process Update on that thread, so I thought I’d copy here to ensure it gets a wider audience:

Here is a quick summary of what happens when you do ProcessUpdate:

1. After the dimension has been updated, the server analyzes the changes that occurred to the dimension. In 2005, this analysis was pretty simple and would often incorrectly detect that major changes had occurred that required clearing of indexes and aggregations. In 2008, this code was improved such that it more often would realize that nothing significant has occurred. It’s a fairly small (but useful) optimization — I guess nobody thought it was worth documenting!

2. Based on this analysis, the server will decide whether or not indexes and aggregations need to be cleared. If no (e.g. because records were only added and not deleted/updated), then the partitions won’t be affected.

3. If indexes/aggregations need to be cleared, then the server will check if ProcessAffectedObjects was enabled — if yes, then instead of clearing the indexes/aggregations it will rebuild the indexes/aggregations.

4. The act of clearing the indexes/aggregations also shows up as "partition processing operations" in Profiler — that’s one of the things that has been confusing some of you.

5. When aggregations are cleared, only the flexible aggregations need to be cleared because we’re guaranteed by the rigid relationships that the members cannot have moved and therefore the rollups cannot have changed. However, indexes can still have changed and therefore you may still see the partition processing jobs kick off to clear the indexes.

6. ProcessIndexes and ProcessClearIndexes take care of building both bitmap indexes (aka map) and aggregations — the context is that both aggregations and bitmap indexes are generically considered "indexes".

Really the main takeaway here is that if you ProcessUpdate a dimension, you should strongly consider either doing ProcessAffectedObjects or an explicit ProcessIndexes on the affected partitions so that bitmap indexes and flexible aggregations get rebuilt. The advantage of explicitly doing ProcessIndexes is that you can bring your cube online earlier and have the indexes/aggregations get processed more lazily in the background — a number of customers prefer to do that because their processing windows are too small to wait for the indexes to get processed.

Also related to this topic, I thought I’d also highlight a great post by Darren Gosbell where he shows how to find out if your aggregations are processed or not:

http://geekswithblogs.net/darrengosbell/archive/2008/12/02/ssas-are-my-aggregations-processed.aspx

Written by Chris Webb

May 12, 2010 at 10:11 pm

Posted in Analysis Services

13 Responses

Subscribe to comments with RSS.

  1. Hi I have deployed PPS2007 in a test lab. All went well and without issues. From the Administration Console I configured everything without issues. However, when I went to Excel and tried to use the Report Wizard to create a report, I am asked to select the required model which I do without issues. Then I click Next, where I am asked to select the dimension and Hierarchy which I select OK. Then when I click the memeber selection field I get an error as follows: "Error getting the database context for the model site[2035d974-eeea-4143-9231-7d76089fb68f]can someone help please?regards

    Hassan

    May 13, 2010 at 8:31 am

  2. Hi Hassan,This isn\’t really the right place for this question. Try posting on http://social.technet.microsoft.com/Forums/en-US/ppsmonitoringandanalytics/threadsChris

    Chris

    May 13, 2010 at 11:19 am

  3. Hi, Chris. I can not understand what will happen if ProcessUpdate mode selected when a fact table contains missed dimension members.

    Dmitriy Tataurov

    May 12, 2014 at 3:08 pm

  4. Would setting the processing mode to lazyaggregations for measure groups related to a dimension that gets the processupdate command be helpful?

    Chris Ross

    September 3, 2014 at 2:25 am

    • It would mean that there’s one less thing to have to do explicitly, but in general I don’t like the idea of lazy aggregations – I like to be in control of what’s happening and I don’t like the idea of unpredictable performance.

      Chris Webb

      September 3, 2014 at 8:20 am

  5. Can you run Process Cube Default to rebuild any of the aggregations or indexes that were dropped as a result of the dimension update process?

    sqlcoop

    November 12, 2014 at 10:41 pm

  6. Hi Chris,

    First of all, thank you for all your very helpfull articles! Yours as well as Marco Russo’s and many other’s have been life savers many many times for me.

    I have a question regarding specifically the subject of this post.

    I have a cube with 3 measure groups, all partitioned by a “snapshot” dimension.
    My fact data is versionned and so have EFFECTIVE_DATE and EXPIRY_DATE.
    Snapshots are set dates of effectivity for my fact data. This way the user can compare what was effective today (“CurrentSnapshot”) versus a week ago (“CurrentSnapshot-7″)

    The 3 measure groups are Facts for “Reservation Sales” (really an aggregated view of detailed sales), “Reservation Item Sales” (Sales for individual Reservation Items) and “Reservation Detailed Sales” (Sales for each reservation – no Reservation Item Breakdown – by type of sales: Airport Taxes, Commission, etc)

    Partitions on snapshots are made so that I have x amount of “Moving Snapshots”: CurrentSnapshot (today), CurrentSnapshot-1 (yesterday), CurrentSnapshot-7(last week) and so on.

    I also have “fixed” snapshots which contain fact data that is effective at the 1st of each month. It is concerning what is happening to these partitions that I have questions.

    The data in these fixed partitions never change and so we do not process them in our SSIS Script (based on one of your post on processing through Script Task and AMO :) )

    However, even if we do not process these partitions, they still get “processed” in some form or another. I’m guessing they have their indexes dropped since the time taken for each fixed partition is small (between 5 and 15 seconds). The thing is that we now have 22 fixed partitions x 3 measure groups making 66 partitions. If each of these take 10 seconds, we easily have over 10 minutes of processing on these partitions.

    Key members for each dimension related to these measure groups do not change and do not get deleted. However, it is very possible that another attribute in the dimension gets changed (All dimensions are SCD1).

    And my question :)
    Since the key member attributes do not change and do not get deleted, aren’t the fixed partitions supposed to stay exactly the same when we “ProcessUpdate” the cube’s dimensions? Are are the indexes being dropped (assuming that this is what’s happenning)? Or is it normal that ALL partitions get touched when a process update occurs on a dimension ?

    Tanks in advance for taking the time to reply! Let me know if you need more details

    David Laplante

    March 3, 2015 at 3:31 pm

    • I forgot to mention that the fixed partitions do not have aggregations defined.

      David Laplante

      March 3, 2015 at 5:26 pm

      • If there are no aggregations, then it must be the indexes of the attributes where data has changed that are being rebuilt. Some things to try to improve processing performance would be to set the RelationshipType property of your attribute relationships to Rigid where appropriate, setting the AttributeHierarchyOptimizedState to NotOptimized on attributes that won’t be used for querying often (or even better, setting AttributeHierarchyEnabled to false or even deleting the attributes), and everything Andrew Calvett recommends here:
        http://blog.calvett.co.uk/2013/02/09/the-anatomy-of-a-process-update-part-1/
        http://blog.calvett.co.uk/2013/02/21/the-anatomy-of-a-process-update-part-2/

        Chris Webb

        March 4, 2015 at 9:13 am

      • *Curiously, I could not reply to your reply so I am doing it here*

        Thanks! That helps alot!

        We have dimensions that come from data in MDS and I know that one user is changing some attributes in a few entities and eventually those are brought back to our dimensions so this makes a lot of sense to me now. I was Under the impression that indexes were only for leaf levels attributes and then any index for rollups were managed from inside the dimensions. I never really looked at all the internal files for SSAS. I see that I have a lot of .map files (81 000) so as I understand it now, I have a great potential of having to recalculate the indexes. Especially since the dimensions are SCD1.

        I have also come upon the “Delete Unused Indexes” feature from BIDS Helper that I never gave much attention to. I am running a SSSAS Query Trace for a week and then I will run it to see what that wizard suggests I set NotOptimized.

        Thanks again for replying so quickly!

        David Laplante

        March 4, 2015 at 5:43 pm


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 3,867 other followers

%d bloggers like this: