Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Archive for May 2011

Session recordings from SQLBits 8 and TechEd

with 2 comments

The recordings of the sessions from SQLBits 8 in Brighton are just starting to come online, so if you’re bored at work you may want to take a look at them here:
http://sqlbits.com/content/

We’ve now got over 200 hours of great SQL Server video content on the site, including lots of SSAS-related sessions. It is, of course, free to view and download with no registration required…

Similarly, the session recordings for TechEd are also available, and again there’s lots of interesting BI content there too. If you’re interested in learning more about the future of SSAS and BISM, the session on “What’s new in Denali for Analysis Services and PowerPivot” is a must-see.

Written by Chris Webb

May 27, 2011 at 12:44 pm

Posted in On the internet


Pyramid Analytics, XLCubed & Panorama Necto

with 5 comments

I’m always curious to see what’s new in the world of SSAS client tools, and quite frequently get demos of the latest releases. Here’s a brief summary of three tools I’ve looked at recently…

First of all, bioXL from Pyramid Analytics. It’s a very nice-looking Silverlight cube browser with several very interesting features. However, the main reason it’s worth looking at is that if you’re lumbered with a large Proclarity installation and no obvious way of migrating, it could be the answer to your prayers. It’s designed with existing Proclarity customers in mind: it’s almost completely backwards compatible with existing content stored in PAS, and equally importantly the UI follows the Proclarity look and feel very closely, so existing Proclarity users will feel very comfortable. In fact, looking at it you’d almost believe you were in a parallel universe where Microsoft hadn’t made that crazy decision to kill off Proclarity, and had instead rebuilt it in Silverlight.

Next up, XLCubed. I’ve blogged about them here before and Marco is also a fan, so I won’t say much, but version 6 has just been released and they’re working on mapping too. I remain a big fan: I think it’s one of the best tools on the market for the sophisticated SSAS user, both for Excel-based analysis and for creating web dashboards.

Thirdly, Panorama Necto (see here as well), which aims to bring the benefits of social media to BI. The thinking here is that adoption of BI tools has stalled because the tools themselves are too difficult to use, and also that it’s too difficult to share and discuss the information found using these tools with a wider audience. Once you get past the fact that someone at Panorama really, really needs to read up on dashboard design (3D charts! Gauges! Arghhh, call Stephen Few!) before doing any more demos, I think they’re on to something. It’s still early days but I’ll be keeping an eye on how their functionality develops and integrates with different media.

Written by Chris Webb

May 25, 2011 at 10:27 pm

Replacing Linkmember with a Many-to-Many Relationship

with 11 comments

Linkmember is one of those MDX functions that I feel like I should never have to use – I always wonder whether I’ve modelled my data correctly when I do use it – even though it does come in very handy from time to time. There are two main problems with it:

  • Using it in calculations can cause performance problems, although its impact is much less with Analysis Services 2008 compared to what it was with 2005. My post on optimising the currency conversion calculations generated by the Add Business Intelligence wizard is a good example of this.
  • It’s not very flexible – it links members on one hierarchy to another hierarchy, whereas what you often want to be able to do (especially if you’re using role-playing dimensions) is to map the currentmember on any hierarchy of a dimension to the currentmember on the equivalent hierarchy on the other dimension.

Take the following example. Here’s a very simple cube using the Adventure Works database, with one measure group based on the Internet Sales fact table, one measure called Sales Amount and with three dimensions: Order Date, Due Date and Customer.

[screenshot]

If you want to be able to compare the sales on a given Order Date with the sales on the same day on the Due Date dimension, you might write a query something like this:

WITH
MEMBER MEASURES.DUEDATESALES
AS
([Measures].[Sales Amount], [Order Date].[Date].[All Periods],
LINKMEMBER([Order Date].[Date].CURRENTMEMBER, [Due Date].[Date]))
SELECT
{MEASURES.DUEDATESALES,[Measures].[Sales Amount]} ON 0,
[Order Date].[Date].[Date].MEMBERS ON 1
FROM [LinkmemberTest]

[screenshot]

It’s a bit contrived, I’ll admit, but you get the idea. Performance here isn’t much of a problem, but flexibility is: if you put Years on rows instead of Dates, you get meaningless results. So:

WITH
MEMBER MEASURES.DUEDATESALES
AS
([Measures].[Sales Amount], [Order Date].[Date].[All Periods],
LINKMEMBER([Order Date].[Date].CURRENTMEMBER, [Due Date].[Date]))
SELECT
{MEASURES.DUEDATESALES,[Measures].[Sales Amount]} ON 0,
[Order Date].[Calendar Year].[Calendar Year].MEMBERS ON 1
FROM [LinkmemberTest]

…gives us the sales for the Year by Order Date in both columns, because in the calculation the Linkmember function returns the All Member on the [Order Date].[Date] hierarchy:

[screenshot]

Fortunately, we can get around this by replacing the use of Linkmember with a many-to-many relationship. For this particular example, we need to do the following:

1) Create a new utility dimension that will allow us to switch between showing Order Date values and Due Date values. We can do that using a view or named query something like this:

SELECT        0 AS DateTypeKey, 'Order Date' AS DateTypeDesc
UNION ALL
SELECT        1 AS DateTypeKey, 'Due Date' AS DateTypeDesc

From this, build a dimension (which I’ve called Date Type) with a single hierarchy whose IsAggregatable property is set to False.

2) Build a new bridge table to use as the intermediate measure group in our many-to-many relationship that gives us all the distinct combinations of Order Date and Due Date, and Due Date and Order Date, that exist in our fact table. I used the following SQL to do this:

SELECT DISTINCT 0 AS DateTypeKey, OrderDateKey AS DateKey, OrderDateKey, DueDateKey
FROM            dbo.FactInternetSales
UNION ALL
SELECT DISTINCT 1 AS DateTypeKey, DueDateKey AS DateKey, OrderDateKey, DueDateKey
FROM            dbo.FactInternetSales AS FactInternetSales_1

Luckily, in the Internet Sales Fact table there’s a 1:1 relationship between Order Dates and Due Dates, but that’s because the data’s fake.
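
If you want to check this for yourself, a query along these lines against the same dbo.FactInternetSales table should return no rows if every Order Date really does map to exactly one Due Date:

-- List any OrderDateKeys that map to more than one DueDateKey;
-- no rows returned means the relationship really is 1:1
SELECT OrderDateKey, COUNT(DISTINCT DueDateKey) AS DueDateCount
FROM dbo.FactInternetSales
GROUP BY OrderDateKey
HAVING COUNT(DISTINCT DueDateKey) > 1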

3) Add a new measure group to the cube based on this fact table (I’ve called it Date Type Switch), add a new role-playing instance of the Date dimension (which I’ve just called Date) to the cube as well as the Date Type dimension created above, and give these dimensions as well as the Order Date and Due Date dimensions regular relationships to this new measure group. You can then create many-to-many relationships between the Date Type and Date dimensions and the Internet Sales measure group.

[screenshot]

You can then use the Date dimension in your queries, and use the Date Type dimension to choose between showing Order Date values or Due Date values. For example:

SELECT
[Date Type].[Date Type].[Date Type].MEMBERS
ON 0,
[Date].[Date].[Date].MEMBERS
ON 1
FROM [M2MTest]
WHERE([Measures].[Sales Amount])

[screenshot]

And this time, we can use any hierarchy on the Date dimension and get the correct results:

SELECT
[Date Type].[Date Type].[Date Type].MEMBERS
ON 0,
[Date].[Calendar Year].[Calendar Year].MEMBERS
ON 1
FROM [M2MTest]
WHERE([Measures].[Sales Amount])

[screenshot]

How does this work, then? You can think of the many-to-many relationship here as working like a switch. In most cases, when there are two dimensions between the intermediate measure group and the main measure group in a many-to-many relationship, the relationship is resolved through both dimensions. In this case, however, the relationship is only ever resolved through one dimension or the other, and the Date Type dimension controls which of the dimensions is used. When the Due Date member on Date Type is selected, for example, whatever is selected on the Date dimension is related to the equivalent members on the Due Date dimension, while on the Order Date dimension it’s as if you’re not selecting anything (or, strictly speaking, it’s as if you’re selecting all the members on each hierarchy that have data associated with the Due Dates you’ve selected).
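
To see the switch in action, a query along the following lines pins the Date Type dimension to its Due Date member in the WHERE clause, so whatever Date hierarchy is on rows shows values aggregated by Due Date (the member names here assume the Date Type attribute uses the DateTypeDesc column from the view above, so adjust them to match your own dimension):

-- Member names assume the Date Type attribute uses the DateTypeDesc column
-- from the view above
SELECT
[Measures].[Sales Amount] ON 0,
[Date].[Calendar Year].[Calendar Year].MEMBERS ON 1
FROM [M2MTest]
WHERE([Date Type].[Date Type].[Due Date])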

In this particular case, because there’s a relatively small number of rows in the intermediate measure group, performance is marginally better than the Linkmember approach. However this is not always going to be the case – with a larger intermediate measure group you’ll find performance may be significantly slower than using Linkmember. Therefore, you’ll need to test thoroughly if you do decide to use this approach and your main reason for using it probably should be for the flexibility it provides.

Written by Chris Webb

May 24, 2011 at 11:31 pm

Posted in Analysis Services, MDX

Google Docs gets pivot tables

with one comment

Not much to say, really, apart from the fact that you can now use pivot tables in Google Docs:
http://googledocs.blogspot.com/2011/05/summarize-your-data-with-pivot-tables.html

Yet another small step closer to Google having something resembling a BI solution… integrate Fusion Tables, Public Data Explorer, BigQuery and this and we’ll have something interesting, although I’m not convinced Google has a coherent BI strategy or thinks it needs one.

Written by Chris Webb

May 18, 2011 at 6:51 am

Posted in Google

Good news on the future of Analysis Services

with 15 comments

If you read my blog, you’re probably aware of a post from last November I wrote about the future of Analysis Services while I was at the PASS Summit. It caused an awful lot of fuss and ever since I’ve not been able to go for more than 5 minutes at a SQL Server event without somebody stopping me and asking “So, Chris, is SSAS really dead?” (the answer is an emphatic NO, by the way). Anyway, as you may or may not have already seen, there’s been a new announcement on the future of Analysis Services at this year’s TechEd and an accompanying blog post from TK Anand here:
http://blogs.msdn.com/b/powerpivot/archive/2011/05/16/analysis-services-vision-amp-roadmap-update.aspx

…and as a result of my involvement in the public debate on this subject, I feel obliged to add my comments on it. I’ll do that by extracting and discussing the main points made in TK’s post, but before I start let me state clearly that this new announcement is extremely positive in my opinion and contains news that I am very glad to hear.

In the past six months, we have been talking to many people in the Microsoft BI community – customers, partners, developers, and MVPs – to get feedback on the roadmap.

The first thing I’d like to say is how much my respect for the SSAS development team has grown over the last six months, as a result of how they’ve handled the debate over the future of their product. There were some heated discussions going on in November but once everyone had calmed down, all of the misunderstandings had been cleared up and we had apologised to each other – and I certainly had as much to apologise for as anyone else – there were some very fruitful discussions about how to move forward. I don’t expect Microsoft to develop products to my precise specifications, and nor should they; back in November I thought I’d made it clear that I understood there were good reasons for going down the BISM route and why those decisions were made. However, today’s announcement does address all of the genuine (as opposed to the purely emotional) reasons I had for being upset. It is not a U-turn, more of a modification to what was presented before, but it is nonetheless significant and shows that the SSAS team have taken on board the feedback they’ve received.

The BI Semantic Model is one model for all end user experiences – reporting, analytics, scorecards, dashboards, and custom applications

So what is the new direction, then? Reading the quote above, those of you with long memories will remember the promise of the UDM back in SSAS 2005: it too was meant to be a single reporting model for all purposes. It didn’t work out that way, of course, despite the massive success of SSAS 2005; furthermore I suspect there will be plenty of people who read TK’s blog post and are still confused about the exact story because it’s quite complicated (again, I’m reminded of the confusion around what the UDM exactly was in the 2005 release). But the overall aim of BISM in Denali is to make good on this old promise of ‘one model to rule them all’ and I think Microsoft has a much, much better chance of succeeding this time.

In purely technical terms the story is almost, but not quite, the same as it was in November. The BI Semantic Model (BISM) is the new name for Analysis Services. In Denali, when you develop with BISM you will have a choice of two types of project which represent two design experiences: use the tabular model to get all the advantages of a table-based approach, DAX queries and calculations, and Vertipaq storage; use the multidimensional model to get the rich functionality of the cubes we have today, MDX and MOLAP storage (I would urge you to read and reread TK’s post to get the full details on this). What is new is the emphasis that these are two complementary models, each with its own strengths and weaknesses, appropriate in different circumstances; it’s not a case of old versus new or deprecated versus the future: they are two sides of the same coin. What’s more, there’s now firm evidence that MS will be paying more than lip service to this idea:

The multidimensional project lets model developers use the multidimensional modeling experience along with MDX and MOLAP/ROLAP (this is what existing UDM projects get upgraded to).  The tabular project lets model developers use the tabular modeling experience along with DAX and VertiPaq/DirectQuery.  It is important to note that these restrictions in the two projects are not rigid or permanent; they could very well change in future releases based on customer requirements.  For example, we could offer model developers VertiPaq as a storage option for multidimensional projects or MDX scripts for tabular projects. Another limitation in the upcoming CTP release is that models built using the multidimensional project will not support DAX queries (and thereby Crescent, which uses DAX to retrieve data from the model). We recognize that removing this restriction is very important for customers with existing Analysis Services solutions to be able to upgrade to SQL Server “Denali” and leverage Crescent. The product team is actively working on it and is committed to making this functionality available to customers.

This is the important section of the post, and it addresses my two main complaints from last year. The first is that the multidimensional way of modelling should not be jettisoned because it has inherent advantages for certain types of BI application, such as financial applications with complex calculations. Let me be clear: I’m not talking about the underlying technology here (I couldn’t care less if MOLAP storage disappeared tomorrow and was replaced by Vertipaq if it meant my queries ran faster), but the basic concept of modelling data as cubes, dimensions and hierarchies instead of as a series of tables and relationships. The tabular way of modelling data has many advantages of its own of course, and in the long run I think that for something like 30% of BI projects the tabular model will be the correct choice, for 10% the multidimensional model will be the correct choice and for the remaining 60% either will work just as well. The point is, though, that we will have a choice between two different but equally valuable ways of modelling data and that by committing to the delivery of new functionality in the multidimensional model in the future, choosing to use it today on a project no longer looks like a bet on a defunct technology.

My second complaint was that there was no good migration roadmap for existing customers who have invested heavily in SSAS over the last few years, beyond telling them to redevelop their applications from scratch in the tabular model. As I said in the previous paragraph, it’s now clear that there will be no need to redevelop using the tabular model in the long term if there’s no desire to do so because, even if there’s not much new for them in the Denali release, there will be new functionality coming at some point as the technologies underpinning the two models come closer together. For example, now we know that we will one day have DAX support for the multidimensional model, we know that people will be able to use Crescent on top of their existing cubes just by upgrading them and without completely redeveloping them. The cool new stuff will not be restricted to users of the tabular model.

Overall, then, I’m very happy with what’s been said. I’ve been working with Analysis Services since the very beginning, when it was OLAP Services, and I’m now getting ready to learn BISM and prepare for the future – a future that looks a lot brighter to me now. Bring on the next CTP of Denali!

Written by Chris Webb

May 17, 2011 at 10:19 pm

Posted in Analysis Services, BISM

PASS Summit Sessions–Vote for Me!

with 2 comments

I’m not usually one of those people who blog about the sessions they’ve submitted to conferences, or the feedback they get afterwards (why would you be interested in that?), but since this year PASS are letting prospective attendees vote on which sessions they’d like to see at the Summit, and everyone else is plugging the sessions they’ve submitted, I feel obliged to plug mine too. I’ve submitted two session abstracts: one on scoped assignments in MDX, and another on SSAS/BISM security; please vote for them here:
http://www.sqlpass.org/summit/2011/Speakers/SessionPreferencing.aspx?spid=306&p=1&preferred=False

I’ve really enjoyed speaking at the Summit for the last few years, and I hope to be there again this October…

Written by Chris Webb

May 16, 2011 at 10:24 pm

Posted in Events

Why not to use the external assembly approach for dynamic security

with 7 comments

If you’re familiar with dynamic security in SSAS, you probably know there are two basic ways of implementing it: the measure group approach, where security credentials are stored in a measure group inside the cube and the MDX set expression in your role queries this measure group to work out which members each user has access to; and the external assembly approach, where the role uses a custom MDX function from an external assembly to get the same information. Both approaches are covered in some detail in “Expert Cube Development” (still available in all good bookstores, BTW) but I’ve always preferred to use the measure group approach because a) I’m a rubbish .NET developer, and much more at ease modelling data into a measure group, and b) I’ve heard lots of complaints about opening a connection being really slow when you’re using the external assembly approach.
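
To give a rough idea of what the measure group approach looks like in practice, here’s a sketch of the kind of allowed member set expression you’d put in a role; the [Users] dimension, its Login attribute, the [Product] attribute being secured and the [Security Count] measure are all hypothetical names rather than objects from any particular cube:

// Sketch of an allowed member set for the measure group approach;
// all object names here are hypothetical
NONEMPTY(
[Product].[Product].[Product].MEMBERS,
(
// Map the Windows login of the connecting user to a member on the Users dimension
STRTOMEMBER("[Users].[Login].&[" + USERNAME() + "]"),
// A row exists in the security measure group only where the user has access
[Measures].[Security Count]
)
)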

Anyway, to cut a long story short, I had the pleasure of meeting Gerhard Brückl (you may know him from the SSAS MSDN Forum) at SQLBits the other month and he had an interesting question for me: he’d been using the external assembly approach and found that once a user had connected to the cube, their security credentials were cached and the assembly wasn’t called again, even if the user disconnected and reconnected, until the cube’s cache was cleared. While this is a good thing from a performance point of view, it has one big drawback, as Gerhard pointed out: if the user’s security rights subsequently change, the cube will not pick this up immediately. And of course this means that a user may have access to data they really shouldn’t have access to – which is a potentially big problem.

This initiated an interesting discussion with Akshai Mirchandani and various other SSAS obsessives about whether this could be worked around. The short answer is that it can’t be, at least not easily. You can execute an XMLA ClearCache on the whole cube but this might affect query performance quite significantly; you can reduce the impact by clearing the cache of just one measure group in your cube (you could even create a dummy measure group for this purpose) since this will clear the scope cache where the permissions are stored, without affecting the SE caches of the other measure groups. But you’re still left with the problem of knowing when to clear the cache: you’d need to write some code somewhere that checks to see if permissions have changed and runs a ClearCache if they have. As a result of all this, I’d say that if you’re using dynamic security, your permissions change regularly and you need the cube to reflect these changes as soon as possible, you should avoid using the external assembly approach altogether and use the measure group approach instead. 
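
For reference, a ClearCache command scoped to a single measure group looks something like the following; the database, cube and measure group IDs are placeholders that you’d replace with the IDs from your own solution:

<ClearCache xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <!-- Placeholder IDs: replace with your own database, cube and (dummy) measure group -->
    <DatabaseID>MyDatabase</DatabaseID>
    <CubeID>MyCube</CubeID>
    <MeasureGroupID>MyDummyMeasureGroup</MeasureGroupID>
  </Object>
</ClearCache>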

Written by Chris Webb

May 9, 2011 at 10:02 pm

Posted in Analysis Services
