Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Archive for January 2010

PASS European Conference and the Microsoft Architect Conference

leave a comment »

Two new speaking engagements to mention: firstly, I’m happy to announce I’ll be doing two sessions at the PASS European Conference, which is taking place on April 21st-23rd in Dusseldorf, on DAX and SSAS cache-warming. You can view the whole agenda here:
http://www.sqlpass.org/summit/eu2010/Agenda.aspx

I’m also going to be speaking on PowerPivot and how it fits into the wider corporate BI picture at the Microsoft Architect Insight Conference in London on the 31st of March. You can see the agenda for that event here:
http://msdn.microsoft.com/en-gb/architecture/ee959262.aspx

As always, say hello if you see me…

Written by Chris Webb

January 29, 2010 at 11:07 pm

Posted in Events

Connecting to SQL Azure from Excel 2007

with 8 comments

Sometimes I have an idea, spend a bit of time testing it out, and it ends up being a dead end. That’s what happened this evening but there’s at least one interesting bit of info that resulted so I thought I’d blog anyway…

My original thought was this:
* Excel 2007 can connect to SQL Server to retrieve data for use in reports in a worksheet
* SQL Azure is SQL Server in the cloud
* Office Web Apps gives us the ability to view Excel spreadsheets in the cloud
* So can I create an Excel spreadsheet that connects to SQL Azure, build a report using data from there, upload it to the Excel Web App and then refresh the connection so that my spreadsheet in the cloud displays live data from the cloud?

The short answer is no, at least not at the moment. But after a bit of trial-and-error I did get halfway there and manage to hook Excel 2007 up to SQL Azure (nb I’m not talking about using PowerPivot, which does work with SQL Azure, but the built-in Excel functionality). Here’s how:

  • None of the built-in functionality for connecting to SQL Server from Excel 2007 seems to work. However it is possible to connect to ODBC data sources from Excel and SQL Azure can be exposed as an ODBC data source.
  • So, in Excel, go to the Data tab and click on the “From Other Data Sources” and “From Data Connection Wizard”.
  • Select “Other/Advanced” and then the Microsoft OLE DB Provider for ODBC Data Sources and then click Next
  • Select the ‘Use Connection String’ option and paste the ODBC connection string that SQL Azure generates for you into the box.
  • Paste the value of the Uid property in the User name box, and put your password in the Password box. Delete the Uid and Pwd properties from the connection string.
  • Type the name of your database into the Initial Catalog box, then click OK
  • Finally a list of tables in your SQL Azure database appears; choose one, create an Excel data source and then create either a table or pivot table from the data. Click OK, enter your password one last time, and bingo!

Unfortunately, as I said, when I uploaded the resulting spreadsheet to the Excel Web App, I got the following error message:

image

Shame – I can understand why it makes sense for most external data connections not to be supported, but in this case, when the external data you’re connecting to is also in the cloud, it would be nice if an exception could be made.

One day, though, I’m sure a scenario like this will work. When I think about what Microsoft’s story for cloud BI might be like, the Excel Web App is the obvious candidate for the reporting tool. Whether you’re reporting direct from relational data stored in SQL Azure, or from some kind of cube (PowerPivot in the cloud is another obvious direction), Excel is going to be the easiest way to do it for the largest number of people. I do see a role for some kind of SSRS in the cloud too, but even in the Microsoft BI stack at the moment there’s a lot of overlap between SSRS and Excel/Excel Services for reporting; I wonder if this will be rationalised at some point? For example Report Builder has never really caught on as a way of letting end-users build their own reports, so why not forget it, develop Excel for this purpose and somehow extend SSRS’s rich functionality for managing and scheduling reports to work with Excel-based reports? Just a thought.

Written by Chris Webb

January 27, 2010 at 11:14 pm

Posted in Excel

Cell Security and Calculated Members Defined in the WITH Clause or Session

with 21 comments

I was asked an interesting question today about cell security – how can you get it to work with calculated members defined in the WITH clause or the session? If, for example, you create a role on the Adventure Works database with the following expression in the Read Permissions box for cell security:

[Geography].[Geography].currentmember is [Geography].[Geography].[Country].&[Australia]
AND
( [Measures].currentmember IS [Measures].[Reseller Sales Amount]
OR
[Measures].currentmember IS [Measures].[Reseller Order Count])

If you then run the following query when connecting via the role:

WITH
MEMBER MEASURES.TEST AS
[Measures].[Reseller Sales Amount] * [Measures].[Reseller Order Count]
SELECT
{MEASURES.TEST, [Measures].[Reseller Sales Amount], [Measures].[Reseller Order Count], [Measures].[Reseller Gross Profit]}
ON 0,
[Geography].[Geography].MEMBERS
ON 1
FROM [Adventure Works]

You’ll see the the following results:

image

This is pretty much what you’d expect – you only see values for Australia for Reseller Sales Amount and Reseller Order Count. The problem here is how you can grant access so the user can see the measure TEST which has been defined in the WITH clause. If you try the following expression in the role:

[Geography].[Geography].currentmember is [Geography].[Geography].[Country].&[Australia]
AND
( [Measures].currentmember IS [Measures].[Reseller Sales Amount] OR [Measures].currentmember IS [Measures].[Reseller Order Count] OR [Measures].currentmember IS [Measures].TEST)

You see the following error in a few cells in the query above:
#Error CellPermission (3, 162) The member ‘[TEST]‘ was not found in the cube when the string, [Measures].[TEST], was parsed.

And in any case, how can you know in advance what the name of the calculated measures you want to grant access to are going to be anyway?

What we actually want to do is retain control of all measures, calculated or otherwise, that are defined on the cube, but allow access to any calculated measures defined in the WITH clause or the session. The way to do it is to rely on the fact that cell security is evaluated after the MDX Script and do the following. First define a named set on the cube at the end of the MDX Script something like this:

CREATE SET CURRENTCUBE.ALLMEASURES AS MEASURES.ALLMEMBERS;

This set contains all of the measures, including calculated measures, that have been created on the MDX Script – except those that have their Visible property set to False (these you’d have to add to the set manually).

Then use an expression such as this in your role:

[Geography].[Geography].currentmember is [Geography].[Geography].[Country].&[Australia]
AND
( [Measures].currentmember IS [Measures].[Reseller Sales Amount] OR [Measures].currentmember IS [Measures].[Reseller Order Count] OR
Count(Intersect({[Measures].currentmember},{ALLMEASURES}))=0)

The last line is the new part: it checks to see whether the currentmember on Measures is included in the set we’ve just defined. If it is, we can be sure it’s defined on the cube. If it isn’t, it must be defined either in the WITH clause or in the session so we can grant access to it. If you then run the query again, you get the following results:

image

As you can see, we can now see the value for TEST. Of course you also need to be extremely careful you’re not using a security hole here by doing this, test thoroughly and use Read Contingent permissions as appropriate.

Written by Chris Webb

January 26, 2010 at 10:53 pm

Posted in Analysis Services

1 Day “Introduction to MDX Course”, Switzerland

leave a comment »

Continuing my travels, I can also announce now that I’ll be running a 1-day version of my “Introduction to MDX” course at Microsoft Switzerland’s offices in Wallisellen, near Zurich, on March 16th. Full details and registration can be found here:
http://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032437369&culture=de-ch

Despite what it says at the top of the page, the event will be in English – sadly, my German isn’t quite good enough to explain the concepts of MDX yet. And it’ll be nice to go back to the MS offices in Wallisellen: I used to work there, you know…

Written by Chris Webb

January 20, 2010 at 8:50 pm

Posted in Events

PASS Germany (Rhein/Main Chapter) Meeting on PowerPivot

leave a comment »

I was just thinking the other day how long it’s been since I’ve done a user group meeting in the UK… the reason being, unfortunately, that I’m hardly ever at home except at weekends. Blame the rubbish state of the UK economy, the weak pound and the strong Euro. Anyway, I am pleased to announce that I’ll be speaking at PASS Germany’s Rhein/Main chapter on the 9th of February, doing a session called “Introduction to Self-Service BI with Office 2010 and PowerPivot”. More details can be found here:
http://www.sqlpass.de/Regionalgruppen/RheinMain/tabid/83/Default.aspx

Hope to see some of you…

Written by Chris Webb

January 20, 2010 at 8:38 pm

Posted in Events

Business Analysis Tool Desktop

with 3 comments

Continuing my occasional series of reviews of SSAS client tools, I recently took a quick look at Business Analysis Tool Desktop from BIT Impulse, a company based in the Ukraine. It’s aimed at the power-user market, which Proclarity desktop used to dominate (and still does, to be honest, despite the fact it died several years ago, but I won’t go off on that rant again), and offers advanced analysis functionality for people who find Excel pivot tables too basic and restrictive.

It makes a good first impression – a nice, modern UI, with a look-and-feel that will be very familiar to users of Proclarity and also Tableau. To start you need to create a ‘workbook’, which contains multiple ‘pages’, which can contain several different types of analysis.

The first page type can contain either a table, a table and a chart, or just a chart. Query building is accomplished by dragging and dropping hierarchies either onto the rows and columns of a pivot table, or onto a ‘shelf’ on the top of the pivot table, and this works very smoothly and intuitively. All of the advanced selection mechanisms you’d expect are present: you can select individual members, entire hierarchies or levels, descendants, and so on.

image

Complex filters of either the Rows or Columns axis, or specific hierarchies that have been selected, can also be built up using one or more conditions; similarly you can sort axes and hierarchies easily too, and do Office 2007-like cell highlighting to create heatmaps. Once the query has been executed, there’s a nice feature whereby you can hide some or all of the real-estate connected to query building such as the lists of dimensions and measures and the ‘shelves’; I also liked the way it was possible to resize rows and columns in the grid to make the layout clearer.

image

The other page types include a treemap:

image

…and a rather cool scatter graph that can be animated to display changes in data over time, and which I spent quite a bit of time playing with:

image

Overall, it’s certainly a strong competitor in its sector and worth checking out if you’re in the market for this type of tool – I liked it. I wouldn’t say it was miles better than any of the other tools like it that I’ve reviewed in the last year, but it’s definitely no worse and has its own particular strengths.

Written by Chris Webb

January 19, 2010 at 9:09 pm

Posted in Client Tools

Solving the ‘Events in Progress’ Problem in DAX, Part Two: Aggregating Dates and Multiselect

with 9 comments

You may recall that by the end of last week’s post on solving the ‘events in progress’ problem in DAX, I’d managed to get the right values out of PowerPivot at the Date level with good performance. BUT when, for example, I selected a Calendar Year I got no values out, and I hadn’t even thought of how to handle situations such as when Date was on filter and the user had selected three different, non-consecutive dates. Well, that was last week when I was a DAX newbie – this week, well, I’m slightly less of a newbie and I’ve got the solution.

First, let’s restate the problem a bit more clearly. What I want is this: when I have a date on columns or rows, I want my calculated measure to display the number of rows in the FactInternetSales table that have an OrderDate less than or equal to that date, and have a ShipDate that is greater than or equal to that date. This way I see the number of rows in the table where something has been ordered but not yet shipped. When I select a year, I want to see the number of rows where that OrderDate-ShipDate range intersects with the dates in that year. So if I selected the year 2002, I want all the rows where either the OrderDate is in 2002, or the ShipDate is in 2002, or when the OrderDate is before 2002 and the ShipDate is after 2002. Finally if I make a multiselect when Date is on a filter, for example if I multiselect the 1st January 2002, the 5th of January 2002 and the 10th January 2002, I want to count the number of rows where the OrderDate-ShipDate range includes any of those three dates.

I spent several hours thinking how I could do this in DAX using just the columns I had and I came up with the following formula, which gave me the results I was expecting… when it ever returned, because it was horrendously slow:

=CALCULATE(
COUNTROWS(‘FactInternetSales’)
, FILTER(‘FactInternetSales’
, COUNTROWS(
FILTER(
VALUES(‘DimDate’[DateKey]),
(‘FactInternetSales’[OrderDateKey]<=’DimDate’[DateKey])
&&
(‘FactInternetSales’[ShipDateKey]>=’DimDate’[DateKey])
))>0))

It’s using the Calculate function to evaluate the expression CountRows(FactInternetSales); what it’s trying to do in the outermost filter is to filter all of FactInternetSales, and find all the rows where there is at least one Date in the current selection that is between the OrderDateKey and the ShipDateKey. But, as we learned last week, doing a filter on all the columns in a table is very inefficient, and what we actually want to do is to filter on just the columns we’re interested in: OrderDateKey and ShipDateKey. However what I found was that I really needed all of the distinct combinations of OrderDateKey and ShipDateKey, but I could only get either all of the columns in a table using the table name, or the distinct set of values in a single column using the Values() function. Therefore I couldn’t actually filter on the distinct combinations of OrderDateKey and ShipDateKey where OrderDateKey>=DateKey>=ShipDateKey. This seems like a limitation of DAX from what I can see, though it might turn out that there is a way of doing it – if there is I will of course blog about it.

Anyway, the solution was to create a new column that did contain the distinct combinations of OrderDateKey and ShipDateKey. I went into the PowerPivot designer and created a new column called OrderShip on FactInternetSales with the following expression:

=CONCATENATE(‘FactInternetSales’[OrderDateKey],’FactInternetSales’[ShipDateKey])

As you can see, it simply concatenates the OrderDateKey and ShipDateKey values into a 16 character string:

image

I was then able to use that column in a filter as follows, extracting the OrderDateKey and the ShipDateKey from the string using the Left and Right functions, and turning them back into numbers using the Value function:

=Calculate(
CountRows(FactInternetSales)
, FILTER(
VALUES(‘FactInternetSales’[OrderShip])
, COUNTROWS(
FILTER(
VALUES(‘DimDate’[DateKey]),
(‘DimDate’[DateKey]>=VALUE(LEFT(‘FactInternetSales’[OrderShip], 8)))
&&
(‘DimDate’[DateKey]<=VALUE(RIGHT(‘FactInternetSales’[OrderShip],8)))
))>0))

The logic is the same as with the previous expression, with the all-important exception that the outer Filter only works on the distinct values of OrderShip. It works and it’s fast, so I’m relatively happy although I would like to find a more elegant solution.

It not only gives the correct values for individual dates:

image

but also for Years (and other aggregations of Dates – also notice how the sum of the years is greater than the same as the Grand Total, because of course an order can be in progress in two different years and so needs to be counted in each):

image

and also for multiselect (where again the value is not the sum of the values of the three individual dates):

image

image

I think this illustrates one area where DAX is superior to MDX. MDX is notoriously bad at handling multiselect, but as Marius explained to me in DAX multiselect support is built-in: the Values() function is rather like a CurrentSet function, returning the set of all selected (‘existing’ in MDX terms) values in a column in a particular context.

So, another problem solved… time to think up the next challenge!

Written by Chris Webb

January 12, 2010 at 10:47 pm

Posted in DAX

Follow

Get every new post delivered to your Inbox.

Join 3,144 other followers