Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Archive for the ‘MDX’ Category

MDX Cell Properties Supported By Excel

with 5 comments

I was wondering the other day (as you do) which of the MDX cell properties Excel PivotTables actually support. This page has all the details on the cell properties that are available in an MDX query but most client tools don’t bother retrieving all of them and Excel is no different. Of course it retrieves the most important properties and it retrieves one or two others, but I couldn’t resist doing a bit of research to find out the exact situation with Excel 2013.

The first thing to note is that you can control which cell properties Excel retrieves for a given connection in the connection properties dialog, in the OLAP Server Formatting section:

[Screenshot: the OLAP Server Formatting section of the Connection Properties dialog]

If you can live with not getting all of the cell properties back from SSAS there are some scenarios where unchecking all of the boxes in the OLAP Server Formatting section can improve performance:

  • When you have a large number of databases and cubes on your server, and/or complex security, because a side effect of the way Excel retrieves properties is that it causes all cubes in all databases to be loaded, their MDX Scripts executed and security evaluated. See here for more details.
  • When your queries return a lot of data and your network is slow. See page 61 of the SQLCAT Guide to BI and Analytics for more details.

You can also quite easily see which cell properties Excel is retrieving by looking at the MDX queries it generates (thank you OLAP PivotTable Extensions). Here’s an example of a simple PivotTable query run on a connection which has all of the boxes checked in the OLAP Server Formatting section:

SELECT
{[Measures].[D1],[Measures].[D2]}
DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME
ON COLUMNS
FROM
[Adventure Works DW2008]
CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS

In the cell properties clause of this query you can see the six properties returned. Here’s a breakdown of each of them.

VALUE and FORMAT_STRING

Excel doesn’t actually retrieve the FORMATTED_VALUE cell property, which gives you the measure value with formatting applied by SSAS. Instead, assuming you have the Number Format box checked in the Connection Properties dialog, it retrieves the VALUE property from SSAS (which contains the raw, unformatted measure value) and the FORMAT_STRING property (which contains the format string you defined on the server). It then tries to convert the format string into an Excel format for the PivotTable. Unfortunately it can’t always do the conversion successfully – I blogged about a problem with the Percent format some time ago and this is still a problem with Excel 2013. Excel also doesn’t support formats defined in the fourth section (see here for more details) of the FORMAT_STRING property for null values, and there are probably lots of other relatively obscure types of formatting it doesn’t support too.
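To see the difference between these properties for yourself, you can request all three in a hand-written query. This is only a sketch: it assumes the Adventure Works sample cube and its Internet Sales Amount measure, neither of which is part of the Excel-generated query above.

```mdx
// Return the raw value, the server-formatted value and the format string
// for the same cell so that they can be compared side by side
SELECT
{[Measures].[Internet Sales Amount]} ON COLUMNS
FROM [Adventure Works]
CELL PROPERTIES VALUE, FORMATTED_VALUE, FORMAT_STRING
```

In SQL Server Management Studio, double-clicking a cell in the results grid should show the properties that were requested for that cell.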

LANGUAGE

The language property of a cell controls things like the currency symbol displayed when you are using the “Currency” built-in format string and the thousands and decimal separators used. Whether Excel retrieves the LANGUAGE property or not is also linked to the Number Format checkbox in the Connection Properties dialog. I strongly recommend that you do not use the “Currency” format string and the LANGUAGE property if you are working with multiple currencies, for reasons I outlined here.

BACK_COLOR and FORE_COLOR

I’ve never particularly liked using the BACK_COLOR and FORE_COLOR properties on a cell to do traffic light-style reporting, to be honest, although I know some people love it. However I do use these properties a lot when debugging scoped assignments as seen here. BACK_COLOR is retrieved if you have the Fill Color box checked on the connection properties dialog; FORE_COLOR is retrieved if you have the Text Color box checked on the connection properties dialog.
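As a rough illustration of the debugging use, a scoped assignment in the MDX Script can colour every cell it covers, which makes the extent of the scope visible in any client that retrieves BACK_COLOR. The measure and hierarchy names here are assumptions borrowed from the Adventure Works sample cube.

```mdx
// Colour every cell covered by this scope so the affected area is easy to spot
SCOPE([Measures].[Internet Sales Amount], [Date].[Calendar].[Month].MEMBERS);
    // 255 is red when colours are encoded as R + (G * 256) + (B * 65536)
    BACK_COLOR(THIS) = 255;
END SCOPE;
```

Remember to remove the assignment once you have finished debugging.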

FONT_FLAGS

Probably the only surprise of this whole exercise was that Excel retrieves the FONT_FLAGS property if you have the Font Style box checked in the connection properties dialog; alas, it doesn’t support FONT_SIZE or FONT_NAME. For example, you could put the following calculated members on your cube:

CREATE MEMBER CURRENTCUBE.MEASURES.BOLD as "This is BOLD", FONT_FLAGS=1;
CREATE MEMBER CURRENTCUBE.MEASURES.ITALIC as "This is ITALIC", FONT_FLAGS=2;
CREATE MEMBER CURRENTCUBE.MEASURES.UNDERLINE as "This is UNDERLINE", FONT_FLAGS=4;
CREATE MEMBER CURRENTCUBE.MEASURES.STRIKEOUT as "This is STRIKEOUT", FONT_FLAGS=8;

If you drop them into a PivotTable, you will see the following returned:

[Screenshot: the PivotTable showing the four calculated measures]
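FONT_FLAGS is a bitmask, so the four values used above can be added together to combine styles. The member below is a hypothetical extra example rather than one of the members tested above, and I have not confirmed how Excel renders combined flags.

```mdx
// 1 (bold) + 2 (italic) = 3
CREATE MEMBER CURRENTCUBE.MEASURES.BOLDITALIC as "This is BOLD and ITALIC", FONT_FLAGS=3;
```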

Written by Chris Webb

April 25, 2014 at 11:02 am

Free Video On Subselects In MDX


Yet more videos have been added to my online MDX training course including a new, free video on subselects in MDX that you can see here:

https://projectbotticelli.com/knowledge/what-is-a-subselect-mdx-video-tutorial?pk_campaign=tt2014cwb


If you’d like to subscribe to my MDX course then you can get a 10% discount by using the code TECHNITRAIN2014 when registering. When you subscribe you also get access to the entire Project Botticelli course catalogue including videos on Power Pivot, DAX, SSAS data mining, Excel and much more, recorded by the likes of Marco Russo, Alberto Ferrari and Rafal Lukawiecki.

Don’t forget, if you are looking for classroom training in London for MDX, DAX, Analysis Services Multidimensional or Tabular, Power BI, Integration Services or the SQL Server engine check out http://www.technitrain.com/courses.php

Written by Chris Webb

April 7, 2014 at 9:46 am

Posted in MDX, Video Training

CREATE SESSION CUBE and disk usage

with 6 comments

You learn all kinds of interesting things on the forums! Recently I saw this post from Nils Rottgardt about a problem with the CREATE SESSION CUBE statement and SSAS Multidimensional:

http://social.msdn.microsoft.com/Forums/sqlserver/en-US/175fc61f-676e-4e3d-bed4-283f151641ec/create-session-cube-command-by-excel-grouping-creates-a-shadow-copy-on-the-server-disk-storage?forum=sqlanalysisservices

Here’s a related Connect:

https://connect.microsoft.com/SQLServer/feedback/details/822778/excel-grouping-create-session-cube-breaks-ssas-envirounment-because-of-phyically-copy-the-cube-data-for-every-pivot-table

Basically, when you run a CREATE SESSION CUBE statement SSAS creates a shadow copy of the cube on disk – and if you have a very large cube, with very large dimensions, this could use up a lot of storage space on the server. When you use the custom grouping functionality in a PivotTable Excel generates a CREATE SESSION CUBE statement behind the scenes, so it’s possible to cause a lot of mischief just by using custom grouping in Excel.

Incidentally, this isn’t a problem for SSAS Tabular because it doesn’t support the CREATE SESSION CUBE statement.

I always knew this particular functionality could be very slow and always warned people against using it, but I didn’t know about the disk space issue… if your users haven’t found out about custom grouping yet, I recommend you don’t tell them about it!

Written by Chris Webb

April 1, 2014 at 9:36 am

New Free Video On The MDX CurrentMember Function


New modules are being added to my MDX online training course all the time, and now there’s another free video available: a short introduction to the CurrentMember function. You can view it here (registration required):
https://projectbotticelli.com/knowledge/what-is-mdx-current-member-function-video-tutorial?pk_campaign=tt2013cwb


The previous free video on MDX SELECT statements is also still available here, now with no registration required. If you’d like to subscribe to this course you can get a 10% discount by using the code TECHNITRAIN2014 when registering.


Written by Chris Webb

February 6, 2014 at 11:02 am

Semi-Additive Measures, Unary Operators, Custom Rollups And The Calculate() Statement


Last week I was doing some performance tuning on SSAS Multidimensional and saw something very strange: when the MDX Script of the cube was executed (as always happens after the cache has been cleared, or processing has taken place – you can monitor this via the Execute MDX Script Begin/End events in Profiler) the Calculate() statement was taking just over four seconds. I’d never seen this before so I asked the nice people on the SSAS dev team what was going on, and Akshai Mirchandani very kindly filled me in on the details.

There are two types of calculation on an SSAS cube: those explicitly defined in the MDX Script (i.e. those seen on the Calculations tab of the cube editor); and semi-additive measures, unary operators and custom rollups, which are defined in the model itself. This second type of calculation is added to the cube when the Calculate() statement fires, and the more of them there are the longer it takes SSAS to work out where they should be applied in the space of the cube. In my customer’s case there were several large (80,000+ members) parent/child hierarchies with unary operators as well as a few semi-additive measures, and this was the reason why Calculate() was so slow. Up to now I had only known that Calculate() triggers the aggregation of data up through the cube, which is why if you delete it the cube seems to contain no data.
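For reference, this is roughly where the statement sits in a typical MDX Script. The calculated measure that follows it is a hypothetical example, included only to show the first type of calculation; the second type does not appear in the script at all.

```mdx
// Fires when the MDX Script is executed; semi-additive measures, unary operators
// and custom rollups defined in the model are applied at this point
CALCULATE;

// A calculation explicitly defined in the MDX Script (hypothetical example)
CREATE MEMBER CURRENTCUBE.MEASURES.[Sales Doubled] AS
    [Measures].[Internet Sales Amount] * 2;
```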

Written by Chris Webb

December 16, 2013 at 4:05 pm

My Online MDX Training Course Is Now Live!

with 2 comments

Recently I spent a few days with Rafal Lukawiecki recording videos of my MDX training course for Project Botticelli. I’m pleased to announce that the first two videos from the course (to add to the free video I recorded last year) are now live, with more to follow soon. You can find the main course page here; the two videos available as of today are:

The MDX SELECT Statement

This video covers writing simple MDX queries and looks at the SELECT clause, the FROM clause and the WHERE clause. The cool thing is that it’s free to view – although you will need to register on the site first.


MDX Calculated Members

This video covers the basics of creating calculated members in MDX in the WITH clause and using the CREATE MEMBER statement. It’s available to subscribers only.


Apart from my course there’s a lot of other great Microsoft BI video training available via Project Botticelli, including several DAX videos by my old friends Marco Russo and Alberto Ferrari. Subscriptions to the site are very reasonably priced, but if you register before the end of December 2013 you can get a 20% discount by using the following promotion code:

TECHNITRAIN20HOLS2013

Of course, if you prefer your training in a classroom, you can always attend one of my Technitrain courses in London next year.

Written by Chris Webb

December 4, 2013 at 10:08 pm

Caching The Rows Returned By An MDX Query

with 2 comments

Here’s another tip for those of you struggling with the performance of SSRS reports that run on top of an Analysis Services Multidimensional cube. Quite often, SSRS reports require quite complex set expressions to be used on the rows axis of an MDX query, and one of the weaknesses of SSAS is that while it can (usually) cache the values of cells returned by a query it can’t cache the structure of the cellset returned by the query. What does this mean exactly? Well, consider the following query:

SELECT
{[Measures].[Internet Sales Amount]} ON 0,
NONEMPTY(
    GENERATE(
        [Date].[Calendar].[Month].MEMBERS,
        {[Date].[Calendar].CURRENTMEMBER}
        *
        HEAD(ORDER([Customer].[Customer].[Customer].MEMBERS,
            [Measures].[Internet Sales Amount],
            BDESC),2)
    ),
    [Measures].[Internet Sales Amount])
ON 1
FROM [Adventure Works]
WHERE([Product].[Category].&[3])

Here I’m taking every month on the Calendar hierarchy of the Date dimension and finding the top two customers by Internet Sales Amount for each Month; notice also that I’m slicing the query by a Product Category. The results look like this:

[Screenshot: the query results]

On my laptop this query takes just over three seconds to run however many times you run it (and yes, I know there are other ways this query can be optimised, but let’s imagine this is a query that can’t be optimised). The reason it is consistently slow is that the vast majority of the time taken for the query is spent evaluating the set used on rows: even when the Storage Engine has cached the values for Internet Sales Amount for all combinations of month and customer, it still takes the Formula Engine a long time to find the top two customers for each month. Unfortunately, once the set of rows has been found it is discarded, and the next time the query is run it has to be re-evaluated.

How can we improve this? SSAS can’t cache the results of a set used on an axis in a query, but SSAS can cache the result of a calculated measure, calculated measures can return strings, and these strings can contain representations of sets. Therefore, you can go into Visual Studio and add the following calculated measure to the MDX Script of the cube, on the Calculations tab of the Cube Editor:

CREATE MEMBER CURRENTCUBE.MEASURES.REPORTROWS AS
SETTOSTR(
    NONEMPTY(
        GENERATE(
            [Date].[Calendar].[Month].MEMBERS,
            {[Date].[Calendar].CURRENTMEMBER}
            *
            HEAD(ORDER([Customer].[Customer].[Customer].MEMBERS,
                [Measures].[Internet Sales Amount],
                BDESC),2)
        ),
        [Measures].[Internet Sales Amount])
);

You can then use this calculated measure in your query as follows:

SELECT
{[Measures].[Internet Sales Amount]} ON 0,
STRTOSET(MEASURES.REPORTROWS)
ON 1
FROM [Adventure Works]
WHERE([Product].[Category].&[3])

Having done this, on my laptop the query is just as slow as before the first time it is run but on subsequent executions it returns almost instantly. This is because the first time the query is run the set expression used on rows is evaluated inside the calculated measure ReportRows and it is then turned into a string using the SetToStr() function; this string is then returned on the rows axis of the query and converted back to a set using the StrToSet() function. The second time the query is run the string returned by the ReportRows measure has already been cached by the Formula Engine, which explains why it is so fast.

Couldn’t I have used a static named set declared on the cube to do this instead? I could, if I knew that the Where clause of the query would never change, but if I wanted to change the slice and look at a different Product Category I would expect to see a different set of rows displayed. While in theory I could create one gigantic named set containing every set of rows that ever might need to be displayed and then display the appropriate subset based on what’s present in the Where clause, this set could take a very long time to evaluate and thus cause performance problems elsewhere. The beauty of the calculated measure approach is that if you change the Where clause the calculated measure will cache a new result for the new context.
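To illustrate, here is the same query sliced by a different Product Category. It assumes that a category with the key 1 exists in your copy of Adventure Works, and that the ReportRows measure has been declared on the cube as described above.

```mdx
SELECT
{[Measures].[Internet Sales Amount]} ON 0,
STRTOSET(MEASURES.REPORTROWS)
ON 1
FROM [Adventure Works]
// Only the slicer has changed: ReportRows is evaluated in this new context
WHERE([Product].[Category].&[1])
```

The first execution with the new slice should be slow again, because ReportRows has to be evaluated for that context, and repeat executions should then be served from the cache.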

There are some things to watch out for if you use this technique, however:

  • It relies on Formula Engine caching to work. That’s why I declared the calculated measure on the cube – it won’t work if the calculated measure is declared in the WITH clause. There are a lot of other things that you can do that will prevent the Formula Engine cache from working too, such as declaring any other calculated members in the WITH clause, using subselects in your query (unless you have SSAS 2012 SP1 CU4), using non-deterministic functions and so on.
  • Remember also that users who are members of different roles can’t share formula engine caches, so if you have a lot of roles then the effectiveness of this technique will be reduced.
  • There is a limit to the size of strings that SSAS calculated measures can return, and you may hit that limit if your set is large. In my opinion an SSRS report should never return more than a few hundred rows at most for the sake of usability, but I know that in the real world customers do love to run gigantic reports…
  • There is also a limit to the size of the Formula Engine flat cache (the cache that is being used here), which is 10% of the TotalMemoryLimit. I guess it is possible that if you run a lot of different queries you could hit this limit, and if you do then the flat cache is completely emptied.
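To make the first point concrete, here is a sketch of the version that is not expected to benefit from caching, because the calculated measure is declared in the WITH clause of the query. The name REPORTROWSQUERY is hypothetical, chosen so that it doesn’t clash with the measure declared on the cube.

```mdx
// Anti-pattern: a query-scoped calculated measure, so its result is not
// expected to be reused from the Formula Engine cache between executions
WITH MEMBER MEASURES.REPORTROWSQUERY AS
SETTOSTR(
    NONEMPTY(
        GENERATE(
            [Date].[Calendar].[Month].MEMBERS,
            {[Date].[Calendar].CURRENTMEMBER}
            *
            HEAD(ORDER([Customer].[Customer].[Customer].MEMBERS,
                [Measures].[Internet Sales Amount],
                BDESC),2)
        ),
        [Measures].[Internet Sales Amount])
)
SELECT
{[Measures].[Internet Sales Amount]} ON 0,
STRTOSET(MEASURES.REPORTROWSQUERY)
ON 1
FROM [Adventure Works]
WHERE([Product].[Category].&[3])
```

This should return the same rows as the cube-scoped version, but I would expect it to take the full evaluation time on every execution.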

Written by Chris Webb

November 4, 2013 at 9:30 am
