Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Things I’d like to see in Analysis Services 2011

with 19 comments

It’s that time in the release cycle. I know what the new features are going to be for Analysis Services (for a full list see here) and it’s rather like Boxing Day: I’ve just got my presents and I’m starting to wonder to what Santa will bring on his next visit. And to a certain extent I’m feeling as though I should have been a bit more specific when I was writing my Christmas list last time, given that I’m a teeny bit underwhelmed by the features I did get in AS2008. So here, for the benefit of the elves working away in Redmond, is a list of things I’d love to see in Analysis Services 2011 or whatever it will be called:

  • Auto-partitioning
    Building and maintaining partitions is a job that almost everyone needs to do and yet it’s something that involves way too much hard work at the moment. As such it’s an ideal candidate for some form of automation, and indeed a while back Jamie Thomson posted the following suggestion on Connect: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=127378. Other possibilities include the use of data mining to determine how you should set the slices for your partitions (which I blogged about a long time ago) and something like the MDX driven auto-partitioning that Greg Galloway implemented for the Analysis Services Stored Procedure Project.
  • Do away with the arbitrary shaped sets error
    This is a particular bugbear of mine. The more I work on cubes with complex calculations, the more I find that I want to scope calculations on arbitrary shaped sets, which of course I’m not allowed to do. That means I end up making several identical assignments to get the same result, which leads to way too much duplicate code and a maintenance nightmare. If I can break up an arbitrary shaped set into something acceptable to Analysis Services, why can’t Analysis Services do this itself? Here’s the Connect I logged on this: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=339861
  • Fix parent-child hierarchies
    Parent-child hierarchies are a pain. For a start they cause performance problems so what I would want to see is the ability for AS to create aggregations at intermediate levels in a parent-child hierarchy at the very least; I’m sure there are a lot of other potential engine improvements that could be made for them. Secondly, scoping calculations on them is a real nightmare and on a related note trying to use dimension security with them is an equally thankless task. Given that there are some situations where you have no option but to use them, I think some time needs to be invested in making them better.
  • Build parallelism into the calculation engine
    What with all the block computation improvements in AS2008, I’m finding that many calculations are performing a lot better than in AS2005. However I’m still finding scenarios where the calculation engine is the bottleneck and just one CPU is being used on the server when I’m running a query (something I discussed here). Surely it’s possible to get the calculation engine to do some calculations in parallel when a query is being run?
  • Be able to rename attributes in role-playing dimensions
    If there was one feature I wanted to see in AS2008 it was this. It would have been so easy to do, I know loads of other people wanted it, and I did log it in Connect ages ago: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=144500. And it wasn’t included. All I want to do is to be able to rename the individual attributes of a role-playing dimension instead of just being able to change the dimension name – for example if I have Order Date and Ship Date role-playing dimensions based on a single Date dimension, then I want my ‘Calendar Date’ hierarchy in Date to appear as ‘Calendar Order Date’ and ‘Calendar Ship Date’ to my end users. Not being able to do this has stopped me using role-playing dimensions on so many occasions simply because seeing multiple hierarchies with the same name is too confusing.
  • Support for cross-dimensional user hierarchies
    Sometimes I find myself in the situation where I’ve got a large dimension, such as a Customer dimension, where the lower-level attributes are queried much less often than the higher level attributes. Now AS2005 handles large dimensions pretty well, in my experience, but it would still be useful to be able to store less useful lower-level attributes in ROLAP mode and more useful higher-level attributes in MOLAP mode. One solution to this could be the ability to set the storage mode on individual attributes rather than the whole dimension but I’m not sure that’s practical; instead it would be great if you could split the attributes up into a ROLAP dimension and a MOLAP materialised reference dimension and be able to build a user hierarchy that spans both. So you’d be able to drill down from Country to City to Post Code to Customer, and when you were querying the first three levels you’d be hitting a MOLAP dimension and when you queried Customer you’d be hitting a different ROLAP dimension. In fact I’m sure the ability to set up cross-dimensional drill paths would be useful in other scenarios too and make cubes much easier to use.
  • Support for partitioned dimensions
    Dimensions are getting bigger and bigger, and I wonder if at some stage it would be useful to be able to partition dimensions as well as measure groups to get better performance and manageability? Just a thought.
  • Be able to manage unary operators and custom member formulas from the MDX Script
    Perhaps this is just me, but sometimes I find it’s a bit of a pain when you’re using unary operators and custom member formulas that their values have to be stored in the dimension table. Can’t we have the option to be able to set or override a unary operator or custom member formula from the MDX Script – after all, that’s meant to the one central repository for all my calculations. I wonder if this is possible anyway with an ALTER CUBE statement? Hmm, I don’t think so.
  • New features for calculated members
    As I’ve mentioned here before, quite often I see people creating real members in dimensions then overwriting their values with scoped assignments when I’m sure they would be better off with using calculated members. So to meet that and other challenges, here’s a list of things it would be cool to allow with calculated members:
    • Allow calculated members to have other calculated members as children
    • Be able to control where calculated members appear in a hierarchy
    • Allow calculated members to have member properties, either returning static values or the results of MDX expressions
    • Following on from that, if calculated members can have calculated member property values, real members should have that too. For example, on a Customer dimension I might know a date of birth but I might also want to be able to calculate a customer’s age and show that as a property
    • Following on from that, it would be great to be able to create entire hierarchies based on calculated members or properties. So if I can calculate a customer’s age, and then have another calculated property that placed that customer into a bucket based on their age (eg 0-9, 10-19, 20-29 and so on), I’d also like to be able to determine the number of buckets based on some MDX as well (maybe using server parameters – see below) and create a user hierarchy where the top level would consist of calculated members representing these buckets and where I could drill down from these buckets to the individual ages and then down to the real customer members themselves.
  • Server parameters for MDX calculations
    The ability to parameterise an MDX query is cool, but it would also be great if you could also parameterise calculations on the server. For example you could set up a kind of server-wide variable which was the rate of Income Tax, do calculations using that value and then allow users to have the option of changing its value and explore ‘what-if’ scenarios. This would clearly have an impact on AS’s ability to cache calculations but so long as people knew about this then I’m sure it would be ok. These parameters would also be visible through the metadata so client tools would know to show users they could change them.
  • Be able to define either whole queries or ‘selectable chunks’ of MDX on the server
    After my posting on how dynamic sets actually work in AS2005 a few months back, I got to thinking about why I was disappointed and what I really wanted to see. The problem is that however well you model your cube, and however easy-to-use your client tool is, there’s always a gap between what the tools are capable of and what the user is able to accomplish with them. For example, you might know that your end users want to be able to create reports showing the top 10 customers with an extra value underneath that represents the aggregated values of all other customers outside the top ten. This is certainly possible in MDX and perhaps there are client tools out there that support this, but wouldn’t it be cool if you could create this as a kind of pre-canned selection that was parameterisable (eg show the top 20 instead of the top 10, or show products not customers) and visible through the metadata so any client tool could display it to the end user and allow it to be used? This would be a selection that would be designed to work wherever the user placed it in his or her query; it’s not the same thing as a dynamic set, more of a parameterisable custom set function. The concept could be extended from pre-canned selections to entire queries too – and if I haven’t convinced you (or been entirely successful in explaining) the concept of pre-canned selections then I think the case for entire queries stored on the server is much easier to make. Again these would be parameterisable and discoverable through metadata, so that any client tool would (hopefully) be able to use them; for Reporting Services alone this would be extremely useful, so you could have one query displayed in many reports with a single point of maintenance. I’m sure there would also be an opportunity to introduce some performance benefits for these stored queries too, perhaps in terms of being able to cache entire cellsets rather than just the values of individual cells.
  • Better management tools for security
    As I mentioned before here, the tools we’ve got at the moment are not up to the task of managing large numbers of security roles and some serious work needs to be done in the UI. It would also be great if the tools supported dynamic security and the creation of all the supporting objects, perhaps through a wizard.
  • Better support for Analysis Services in SQL Management Studio
    This is an easy one to specify: I want all of the functionality in MDX Studio put into SQL Management Studio. And a hundred other little things, like in the metadata pane I want to be able to open a level and expand a member to drill down as well instead of always having to start at the All Member of a hierarchy; oh, and I’d like to have a Reconnect button for when I’ve got a query and I’ve killed my session because of cube changes, instead of having to disconnect and reconnect. And the ability to display…
  • Query execution plans
    A lot of people have been asking for this already; here’s the Connect: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=321161. This would be soooo useful in performance tuning.
  • Support for other RDBMSs
    Analysis Services is useful not only as a database engine in its own right, but as a query/calculation layer on top of other database engines. OK no-one uses ROLAP on SQL Server because performance is rubbish compared to MOLAP, but for years people have been experimenting with using AS in ROLAP mode on top of Teradata (see here) and I’m sure that it could be used in a similar way on top of other data warehouse appliances or the new generation of COP databases like Vertica. You’d be selling it as a way of OLAP-ifying these databases, getting the incredible scalability and performance they (allegedly) offer in combination with the power of MDX for calculations and access to the data via Excel. Writing new cartridges for these databases should be pretty easy; I suppose the problem comes with trying to generate SQL optimised for each of these platforms.
  • Writeback with no allocation
    MOLAP writeback in AS2008 certainly does improve performance, but the need always to allocate your values down to the granularity of the fact table is always going to lead to a performance hit. Wouldn’t it be useful if you could write a value back to a cell without having to allocate it down always? You’d get instant writeback. I’m sure that would be useful in many cases, and in fact it would work in the same way as if you’d assigned a value to that cell in the MDX Script. I wrote some code in the Analysis Services Stored Procedure Project that did this, in fact, although it wasn’t really anything more than a proof-of-concept exercise.

Anyway, this post has gone on quite long enough now, although I’m sure if I thought about it I could go on for even longer. I need to do some work! But what would you like to see in Analysis Services 2011? Answers in a comment, please…

Written by Chris Webb

April 30, 2008 at 4:33 pm

Posted in Analysis Services

19 Responses

Subscribe to comments with RSS.

  1. Related to your "cross-dimensional user hierarchies" (which is a good suggestion), I wish you could choose to make certain high cardinality member properties ROLAP to save cube processing time. It seems like the ROLAP SQL it would have to generate to retrieve member properties is cheap compared to making the key attribute of a dimension ROLAP (which will often cause the entire dimension to be processed at runtime). Anyway, here\’s my connect item on that:

    https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=307664

     
    Also, related to your "new calculated member features" comment, I wish you could flip a flag on a calculated member and have assignments aggregate. I often create physical measures based on a null DSV calculated column, then do assignments in the calc script to that physical measure. It\’s unfortunate that that requires physical storage.

    Greg

    April 30, 2008 at 4:54 pm

  2. How about fixing the whole subcube/current context mess?

    Kevin

    April 30, 2008 at 5:54 pm

  3. Well, although I suspect some people would agree with you that it needs fixing, I\’m not so sure. I\’m actually quite happy with the way things work now, post SP2, and I\’m not sure I do want subcubes to affect currentmember.

    Chris

    April 30, 2008 at 9:12 pm

  4. Add a vote for the new calculated member features. 
     
    Not being able to define buckets of values and the lack of support for hierarchies fo calculated members has resulted in many convoluted solutions being adopted.
     
    All of the other suggestions would certainly make SSAS a more flexible and robust product.

    Philip

    May 1, 2008 at 8:57 am

  5. Great suggestions, Chris.
    Adding a few…
     
    – Support for ROALP and MOLAP Attributes within a single dimension
     
    – Calculated Member
                — Dynamic Names (Expression driven)
                — Short Name, Long Name property (Display name)
                — Key property
     
    – default set per Hierarchy (just like a default member)
    – default dimension per cube (for Proclarity)
     
    – Display Folders for Dimensions

    Harsh

    May 1, 2008 at 9:56 am

  6. Actually, the short description/long description thing has come up with me a few times, usually from customers who are used to other OLAP tools that have this functionality. It would be a good thing to see on real members as well as calculated members.
     

    Chris

    May 1, 2008 at 2:46 pm

  7. Nice list, Chris.
     
    My suggestions:
     
    – A Divide By Zero setting that would avoid the need to test for that condition in each measure.
     
    – A simpler way to test for nonexistent time values. It\’s a pain to have to test to see if a time exists before you can ask for the measure value (as in change measures).
     
    – Have a Display Order property for measures. Here\’s my Connect entry on this: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=294907
     
    – For calculated members, a property to see the contents of it. I wrap time sets in calculated members and need to refer to the set to handle measures calculations differently (e.g., Avg instead of Sum over weeks in set).
     
    David
     

    David

    May 1, 2008 at 5:32 pm

  8. I agree that it is a great list f improvements and since we are writing an early wish list to SSAS Santa I can add the following:
     
    Recursive hierarchies: Fix the bugs and improvements that Chris mentions but also add a new BI-wizard that can flatten out a recursive hierarchy to a natural hierarchy and add dummy members for missing levels.
     
    Calculated members: If you create a calculated member from one or several dimensions members the performance should be the same when you slice on the member and the calculated member.
     
    DSV: You should be able to script the named queries and named calculations to TSQL DDL:s as a start. You should also be able to use a DVS between different cube projects.
     
    Autoexist: I would like to extend this, as an option, between dimensions and fact tables in order to choose between a cross join between dimension and facts(like today) or  an inner join. This will be a true UDM.
     
    Measures. I would like to have member properties for measures so that I can add definitions for measures and calculated members in the cube.

    Thomas

    May 1, 2008 at 5:38 pm

  9. David – some great points there.
    – Division by zero – I agree, someone told me that in TM1 there\’s a special division operator that handles division by zero automaticaly and that would be great to have in MDX. It would improve readability and make code more concise.
    – Contents of a calculated member. Again, definitely very useful. I blogged about something similar here:

    http://cwebbbi.spaces.live.com/blog/cns!7B84B0F2C239489A!115.entry

    http://cwebbbi.spaces.live.com/Blog/cns!7B84B0F2C239489A!124.entry

    …and in fact, rereading those entries I see you commented on them back then!

    Chris

    May 1, 2008 at 9:56 pm

  10. I wish I could use roles on perspectives ( some time I need to show difrent data to the same persom dipending on his role play…)
    I with I could have more control over the cash, I would make alot of customers happer if I could load the daily partition to memoryby default
    I wish the report module will be the same as BO or cognus
    I with that there were a normal tool to explore the cube (better then panorama or ProClarity)

    eldad

    May 5, 2008 at 10:32 pm

  11. Just remembered, on the subject of security the issue I described here:http://cwebbbi.spaces.live.com/blog/cns!7B84B0F2C239489A!1563.entry
    where you get MDX Script failures when using dimension security needs to be dealt with. I don\’t like setting ScriptErrorHandlingMode to IgnoreAll for the obvious reason that I might not see errors I did want to see; neither do I like the workaround of creating named sets and then using them in SCOPE statements. AS should be clever enough to work out which errors are real errors, and which ones are the result of security being applied.
     
    Another security point: why isn\’t it easier to deny access to a whole dimension in a role?

    https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=233410

     
    And while we\’re on the subject of ScriptErrorHandlingMode, see:https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=144248
    Am I the only person who reprocesses a cube where the data has changed and, a couple of hours later after the main processing has completed, gets incredibly frustrated when the overall process fails because of an MDX Script error? It would be great if, at the end of processing in SQLMS or VS, if processing failed with a Script error you had the option of ignoring this and using an empty Script so you could go on and fix the bug in your own time? I guess the point I\’m making is that we need greater control here, either ignoring all errors or ignoring no errors is not satisfactory.

    Chris

    May 6, 2008 at 10:10 am

  12. Hi Crhis,
     
    I\’m unable to find out your sample code about the Writeback allocation in the AS Storered Procedure Project. What\’s the name of the sp that do that?
     
    Thank you in advance
     
    Riccardo

    Riccardo

    May 10, 2008 at 9:37 pm

  13. It\’s in the file WritebackWithAssignments.cs
     

    Chris

    May 12, 2008 at 10:15 pm

  14. Overhaul MDXNo kidding.  I do not want to spend hours looking for a reason why averages are wrong because I wroteAvg ([Product].[Product].MEMBERS, [Measures].[sale])instead ofAvg ([Product].[Product].[Product].MEMBERS,  [Measures].[sale])I am sure that MDX can be made more readable.  Maybe by getting rid of the dotted notation ?  By adding more text ?Dimension(Product).Hierarchy(Product).Level(Product).Members is easier to read, and my brain can spend less time pre-processing but can actually try to understand the MDX itself…

    Tom VdP

    May 20, 2008 at 3:56 pm

  15. I agree with many of the suggestions in the post and the comments.  Especially "Do away with the arbitrary shaped sets error".  I also think that scope statements in MDX script should allow calculated members – i.e. do away with the "a set has been encountered that cannot contain calculated members” error.  This would vastly improve readaility and maintainability.  I don\’t want to specify my calculated members (e.g. a bunch of calculated measures) over and over again in my 30 cell assignments; and I don\’t want to copy and paste all my cell assignments into 30 separate Scope statments.  As Chris said, if I can break down the scope statements into something that is acceptable to Analysis Services, why can\’t Analysis Services do this itself?
     
    Another item for the list is here: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=345884 "A linked dimension cannot be a reference dimension on top of another reference dimension".  The primary use case for linked cubes is to break down complex cubes.  In our case, this would allow commonality and reuse across projects in a large organization.  But we can\’t use them because, guess what?, complex cubes are the primary use case for reference dimensions.
     

    Christian

    May 24, 2008 at 3:06 am

  16. Now I\’ve started I can\’t stop.
     
    Analysis Services can\’t leverage a fraction of the maintenance features that SQL Server relational databases can.  Imagine what we could do with a schema diff generation tool like that provided by Datadude (but for cubes of course).  Here are a couple of use cases …
    – Enable common schema elements in distributed cubes across large organizations.
    – Incremental deployments of new cube objects where schema diff scripts are associated with a release.
     

    Christian

    May 24, 2008 at 4:02 am

  17. Here is the Connect item for my previous comment: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=347255

     
    Here is another item: I’d like to use MDX script cell assignments to populate custom properties.

     
    Similarly to
        FORMAT_STRING( This ) = "#,#";

     
    I’d like to do something like
        MY_PROPERTY( This ) = "MDX Expression";

     
    I would of course be able to access the assigned value for the cell through an ADOMD.NET.

     
    This would allow, for example, a UI allowing custom drillthrough.  A possible implementation might have a comma delimited list of Sales Order IDs for sales within the given period / customer slicing for the cell and when the user clicks on a cell, they are presented with a list of descriptions for those Sales Orders.

     
    And here is the Connect item for this one: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=350555
     

    Christian

    June 12, 2008 at 12:13 am

  18. Another thing: fix solve order. The current situation is a mess!
     
    First, I think the basic assumption that the order of calculations in the script shuld dictate solve order is plain wrong. It\’s ok as a default but there are lots of situations I\’ve found where you want to group calculations together because they logically belong together for maintenance purposes, but where the calculations have different solve orders.
     
    Second, the whole issue of calculations in the WITH clause having lower solve orders except when the Aggregate function is used or SCOPE_ISOLATION=CUBE is set is a hack, and way too confusing.
     
    Why can\’t we go back to numbered solve orders?

    Chris

    June 25, 2008 at 7:11 am

  19. This is an obscure one – sometimes, especially when I\’m working with many-to-many relationships, it would be handy to have role-playing measure groups. That\’s to say that I could have the same measure group appear twice in my cube with different dimension relationships, but only process it once.

    Chris

    June 26, 2008 at 8:45 pm


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 3,144 other followers

%d bloggers like this: