Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Archive for June 2005

Measure Expressions

with 6 comments

I know this blog has been rather light on useful content (as opposed to links) recently, and I’m sorry – it’s not that I’m not learning anything new, I am, but much of it is going into my chapters for MDX Solutions 2E rather than on here. I’ll try to get round to blogging about stuff like the HAVING clause in MDX and the MemberValue() function sometime soon, if I can do so without giving away all the juicy information that you’ll want to buy the book for, and if Mosha (or even BOL) doesn’t beat me to it…
 
But here’s a cool cube design feature that doesn’t cause any conflict of interest for me – measure expressions. The easiest way to explain what they are is to describe one scenario where they’re useful. In AS2K, if you’ve tried to model currency conversion, you’re probably aware that you need to multiply the measure values in your main fact table by the currency rate before any aggregation of the resulting values takes place. You then have two choices for your cube: either precalculate the values in the fact table itself or in a view, which leads to much faster queries but also much bigger cubes, and which means the rates can’t then be altered without reprocessing; or do the currency conversion at runtime using MDX, which generally leads to slower queries but which allows users to change the currency rates dynamically (for example using writeback). Neither of these options is exactly ideal, so in AS2005 measure expressions offer a kind of halfway house – they are calculated at query time and yet are much faster than the equivalent MDX, but the price you pay is that they are nowhere near as flexible as calculated members in terms of the calculations you can define.
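
Just to illustrate the second option, here’s a minimal sketch of the kind of AS2K calculated member you might write to do the conversion at query time – the measure, dimension and level names are made up, it assumes rates are stored at the month level, and real-world versions get a lot messier than this:

WITH MEMBER MEASURES.[Converted Sales] AS
'SUM(DESCENDANTS([Time].CURRENTMEMBER, [Time].[Month]),
[Measures].[Sales] * [Measures].[Currency Rate])'
SELECT MEASURES.[Converted Sales] ON 0,
[Time].MEMBERS ON 1
FROM [Sales]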
 
So what does a measure expression look like and where do you set it? In Visual Studio, in the cube tab of the cube editor, you’ll see the MeasureExpression property in the properties pane when you highlight a measure. Click on this and you get a text box where you can enter your measure expression. This has to be of the form
M1 op M2
where M1 is a measure from the same measure group as the measure you’re currently editing, M2 is a measure from a different measure group which shares at least one dimension with the current measure group, and op is either the * or / operator. Here’s an example:
[Measures].[Sales]*[Measures].[Currency Rate]
In previous builds AS has been very picky about having square brackets around the measure names and things like that, so if you’re trying this yourself make sure you copy the above example very closely. There are several other restrictions too, such as the fact that neither of the measures can be a distinct count measure, so you can see that your options here are quite limited. That said, for most scenarios they should offer all the flexibility you need to do the job.
 
How much faster are they than normal MDX, then? That’s the million dollar question, and I have to say that I don’t know yet. Even if I did, I’m sure there would be some clause in the beta licence that stopped me from publishing performance data, and anyway it’s not fair to test performance on a product that hasn’t been released yet. Hopefully it will make a big enough improvement to make the headaches that accompanied this kind of requirement in AS2K a thing of the past. If anyone else out there has used this functionality and would care to share their experiences, I’d be interested in hearing from them…
 
Measure expressions also allow you to do some really cool things with many-to-many dimensions too, but I’m going to save that for a future blog entry; this one’s already long enough.
 
UPDATE: No need to blog on measure expressions and m2m dimensions, actually, because Richard Tkachuk covers the subject in his white paper on m2m dimensions here.
 

Written by Chris Webb

June 30, 2005 at 1:32 pm

Posted in Analysis Services

Denny Lee’s blog

leave a comment »

Courtesy of Technorati, I’ve discovered Denny Lee’s (new?) blog, at:
http://spaces.msn.com/members/denster

Only a few postings so far but some good stuff on his presentation list, especially the ‘Enterprise OLAP 32-bit Analytics’ ppt and the ‘Troubleshooting OLAP’ document. I remember Denny from the early days of microsoft.public.sqlserver.olap and he really knows what he’s talking about, so this should be one to watch…

Written by Chris Webb

June 23, 2005 at 12:10 pm

Posted in On the internet

“A Jump Start to SQL Server BI” ebook

with one comment

An ebook called "A Jump Start to SQL Server BI" is available at
http://www.windowsitlibrary.com/Ebooks/sqlserverbi/Index.cfm

I’ll admit that I’ve not read it thoroughly, but after a quick skim the content seems to be a bit all over the place; still, there’s some interesting stuff in there if you look.

Written by Chris Webb

June 22, 2005 at 10:35 am

Posted in On the internet

MDX Tuning Webcast

leave a comment »

I’ve just come across this new webcast on MDX tuning, covering the following topics:

  • Non Empty Behavior property
  • Remoting queries using Execution Location=3;Default Isolation Mode=1
  • Cache Policy=7
  • Cache Ratio and Cache Ratio2
  • Cache Warming

Even though it’s only 20 minutes long, it is very, very useful. It contains lots of detailed information on how each of these optimisation tricks works, some of which I’ve never seen discussed publicly before.
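
For anyone who hasn’t seen them before, the remoting and cache settings listed above are connection string properties. Purely as an illustration – the server and database names here are invented – a client connection string using the values mentioned in the webcast might look something like this:

Provider=MSOLAP;Data Source=MyServer;Initial Catalog=FoodMart 2000;Execution Location=3;Default Isolation Mode=1;Cache Policy=7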

My only grumble is that this stuff hasn’t been released before! Some of this functionality has been around for a few years now and yet it has never been documented properly. This stuff should be included in KB articles whenever a service pack is released.

Written by Chris Webb

June 21, 2005 at 10:47 am

Posted in On the internet

LinkShare: my idea for a $50000-prize winning app

leave a comment »

As I said, I don’t have the time to enter the Connected Systems Developer competition that I blogged about the other week, but that hasn’t stopped me thinking about what I might build if I did enter. The following idea came to me at around 2am this morning when I was desperately trying to get my 22-month-old daughter to go back to sleep, and having nothing better to do this afternoon I thought I’d bounce it off anyone reading my blog. So comments are invited – even if they are just to say that it’s a rubbish idea and/or someone’s thought of it before and/or it’ll never work. I have after all categorised this post under ‘Random Thoughts’!

Business Case:

In the modern office everyone does a lot of web surfing; some of it might even be business-related. And whenever we see something interesting we typically copy the link into a mail, add a few words of explanation and send it on to a few people who might also want to have a look. I send at least two or three such emails a day. For the typically lazy web surfer, though, this process is a bit of a hassle so we only bother to do it when we think the link is really interesting and (because we don’t want to get a reputation as the office spammer) we only send it to a small number of people we know who we think are going to find it interesting too. It’s my contention that it would be cool if we could share more of these links with more people.

So, we need to solve three problems in our quest to share the interesting links we find during our daily surfing:
1) It needs to be easier to share the link once we’ve found it
2) We don’t want to send stuff on to people who aren’t going to be interested, and we want to receive only the links that we’re going to be interested in.
3) We’d like to be able to share links with people who we don’t actually know well enough to contact directly.

Of course there are plenty of existing ways that people share links, such as newsgroups, email discussion lists and blogs, but they typically only address the third of the above problems fully, the second only partially and the first not very well at all. For instance, anyone reading my blog is presumably doing so because they’re interested in Microsoft’s BI tools, and they’re going to be interested in any links to webcasts, articles etc. that I post up; but if they’re like me they subscribe to upwards of a hundred RSS feeds – and that’s only on subjects they’re really interested in – so we still have the proverbial information overload. The same goes for email discussion lists and newsgroups. And in all these cases, in order to share information you have to open an email, write a blog post etc., which all requires effort.

Let me give you an idea of the kind of scenario I want to tackle. This morning I was reading this story on the Register, and followed a link on a whim to this page, a set of pictures of Cybermen with funny captions. It brought a smile to my face but I didn’t send it on to anyone else because a) it didn’t seem worth the bother, and b) I didn’t know whether any of the people I usually send stuff on to were at least mildly into Dr Who in the way I am. I’m not going to blog about it because it’s not relevant to BI, and I don’t subscribe to any Dr Who blogs, discussion lists or newsgroups because I’m not that much of a Dr Who fan, so no-one else is going to see it. Which is a shame.

Functional Spec:

Anyway, enough waffle about the theory. The solution I’m thinking of would consist of something like the following:

  • An IE toolbar with only two controls on it: a button saying "This is a cool page" which you hit whenever you find an interesting link (regardless of whether you’ve found it yourself or it has been recommended to you by the system), and a textbox which allows you to add a short commentary on the contents of the page if you want. Whenever you hit the button, it sends the current url in your browser plus any comments to a web service which…
  • …Puts the information in a queue on a server. There’s an app which gradually works its way through every link submitted, retrieves the page, strips the text from the html and does some funky text mining on this and the comments you’ve submitted, and classifies it. This is then used by…
  • …Another server app which looks at your tastes (based on pages you’ve submitted in the past and perhaps other users who you’ve said you’re interested in seeing links from) and then using some more data mining gives you a short list of recently submitted links that you might be interested in, along with the comments of the people who have recommended them. This could either come in the form of a web page, a customised rss feed or a regular email newsletter.

So, in practice, let’s imagine it working as follows. Chris, Jon and Colin all work in a large corporation, in the same team doing the same kind of BI stuff. During his morning surfing, Chris submits 5-10 links; one, on a new feature of MDX, gets recommended automatically to Jon and Colin because everyone in the team works with MDX and has submitted MDX-related pages in the past. One, containing pictures of Cybermen with amusing comments, gets recommended only to Colin and only appears about halfway down his list because he’s a bit of a sci-fi fan and has submitted a few sci-fi links in the past. Meanwhile, David, who works in a different team and doesn’t know Chris, Jon or Colin finds a cool article on C-Omega and submits it so it gets recommended to the rest of his team; they all in turn click their buttons and so it eventually appears at the top of Jon’s list (because he’s really into coding) and somewhere down the list for Chris (because he’s not so into coding, but this is a really cool article nonetheless).

The larger the number of users with similar taste, the better it should work – more links submitted plus more people voting on the same links, and so the mining models can get to know people’s tastes much more quickly. I could imagine it doing well as an intranet app at a large tech company. It would probably need to give more priority to newer links (people want the latest stuff, and you don’t want old but popular links clogging up your recommendations) and maybe have some way of removing links you’ve already seen from your list of recommendations. One other extra feature that occurred to me was that the app could also generate a report showing the users who submitted the most interesting links, so as to generate a bit of rivalry and encourage future usage.

The key to it all though is the fact that all you need to do to submit a page is click a button in IE – the absolute minimum effort possible – and the fact that the job of the mining model is clear – recommend a page which will make you click your button in turn.

Technology:

It should be fairly straightforward to build the toolbar and the web service. Qualification for the competition comes with the use of SQL 2005 for storing all the data, SQLIS to do the processing, AS to do the data mining, and RS to do the web-based reports, daily email, even the rss feed (maybe as a custom rendering extension?). I’ll admit that I don’t know enough about data mining to know whether that bit will really work, but hey, it might.

 

OK, enough fantasising. If anyone does implement this and enters the competition, please can I have a share of the winnings?

 

Written by Chris Webb

June 17, 2005 at 4:22 pm

Posted in Random Thoughts

MDX Scripts and Calculated Members

with one comment

The other thing that struck me about the Time Intelligence article that I was talking about in my last post (read it here) is the way that Rob and Mosha have used MDX Scripts and calculated members together to solve a problem that, traditionally, I would have solved using calculated members alone – and the fact that this new approach is more efficient. The key paragraphs are on the second page:

To determine which state the current coordinate falls into, you’d typically use the IIF function (in Analysis Services 2000), or you could use the CASE operator (new to Analysis Services 2005). However, this approach results in dynamic checks during runtime evaluation of each cell, and the resulting MDX expressions tend to quickly become cumbersome and difficult to debug as the levels of nesting increase.

The Business Intelligence Wizard provides a much more efficient and elegant approach. The wizard uses the new MDX scripting syntax constructs to define specific scopes to which each calculation applies. Because Analysis Services evaluates SCOPE statements statically (once) when the MDX script executes, this technique inherently avoids unnecessary per-cell runtime checks.

Let’s take a look at one of the calculations they use in their article, the 12 Month Moving Average, to understand exactly what this means. In AS2K, you’d typically write this as something like:

WITH MEMBER MEASURES.MOV AS 'IIF(TIME.CURRENTMEMBER.LEVEL IS TIME.MONTH,
AVG(TIME.CURRENTMEMBER.LAG(11) : TIME.CURRENTMEMBER, MEASURES.[UNIT SALES])
, NULL)'
SELECT {MEASURES.[UNIT SALES], MEASURES.MOV} ON 0,
TIME.MEMBERS ON 1
FROM SALES

The important thing to note here is that because we only want this value to appear at the month level, we have to include the IIF(TIME.CURRENTMEMBER.LEVEL IS TIME.MONTH… check in the calculated member definition. And of course, this gets evaluated every time the calculated member is evaluated.

What Mosha and Rob are saying is that MDX Scripting allows you to avoid this unnecessary overhead. If you look at their version (at the bottom of this page) they still create a calculated member but it doesn’t do the work – this is done later by setting the scope to that calculated member and the month level, and putting the formula there. As they say in the second paragraph I quote above, because the scope is evaluated only once rather than on a per-cell basis, it’s more efficient.
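
To make that concrete, here’s a simplified sketch of what the scoped version of the same calculation looks like – this is my own illustration with made-up dimension and measure names, not the exact script from their article:

CREATE MEMBER CURRENTCUBE.MEASURES.MOV AS NULL;
// the real formula is scoped to the month level only, so there's no per-cell level check
SCOPE(MEASURES.MOV, [Time].[Month].[Month].MEMBERS);
THIS = AVG([Time].[Month].CURRENTMEMBER.LAG(11) : [Time].[Month].CURRENTMEMBER, [Measures].[Unit Sales]);
END SCOPE;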

The big question is, how much more efficient is this? I have no idea at this stage, but my guess is that for time intelligence-type calculations you’re probably only going to notice the difference on very, very big queries. However this approach doesn’t just apply to time intelligence-type calculations, it applies to lots of others too; and I bet for complex financial calculations that really hammer your CPU you’ll see a much bigger improvement on even the most basic queries. I’ll have to do some testing to find out.

Just goes to show how much we’re going to have to change the way we think when designing apps for AS2005…

 

Written by Chris Webb

June 10, 2005 at 1:29 pm

Posted in MDX

Null and the colon operator

with 2 comments

I spotted another minor, but interesting, new bit of functionality in AS2005 MDX while reading the new article on time intelligence that Mosha links to here – when using the colon operator to get a range of members, you can now use null on one side to denote either the first or last member on the same level as the member explicitly mentioned. So, for instance:

select [Measures].[Internet Sales Amount] on 0,
null : [Date].[Day of Week].[Day of Week].&[3]
on 1
from
[Adventure Works]

returns the members from Day 1 up to and including Day 3 on the Day of Week attribute; similarly:

select [Measures].[Internet Sales Amount] on 0,
[Date].[Day of Week].[Day of Week].&[3] : null
on 1
from
[Adventure Works]

returns the members from Day 3 up to and including Day 7. This will definitely make many calcs a lot more readable.

 

Written by Chris Webb

June 9, 2005 at 3:21 pm

Posted in MDX

Release date for SQL2005 announced

leave a comment »

Just got the email through – SQL2005 will be released in ‘the week of November 7’. Oh, and the June CTP has been released too.

Written by Chris Webb

June 7, 2005 at 3:30 pm

Posted in Analysis Services

Attributes and Relationships

with 3 comments

I was just thinking today that I ought to cannibalise the presentation I gave at PASS Europe for a blog posting, and now I find that Steve McHugh has beaten me to it:
http://stevemchugh.blogspot.com/2005/05/sql-pass-2005-new-cube-design-features.html
http://stevemchugh.blogspot.com/2005/06/new-cube-design-features-part-2-more.html

Well, it saves me the effort! The original presentation is available to download for PASS members at
http://www.sqlpass.org/

But just to reiterate the most important point – build those relationships between your attribute hierarchies, because they’re the only way that AS can make use of aggregations at ‘lower’ levels. Not having them there could impact query performance in a big way. And one more thing - when Dave Wickert covered some of the same subject matter in one of his presentations he made a point I didn’t know about: that when the aggregation design wizard runs it favours long, complete chains of relationships. I guess this is because there’s more chance that a given aggregation can be reused by queries higher up the relationship chain. 

If it’s not too late it would be great if the dev team could insert a message in the dimension wizard reminding people to build these relationships, and maybe even make it a bit more obvious where to do this in the Edit Dimension tab (it’s a bit hidden away if you ask me). I’m sure that otherwise people will forget to do it and wonder why their queries are crawling.

Written by Chris Webb

June 7, 2005 at 3:23 pm

Posted in Analysis Services

CASE statements

with one comment

One of the first things I noticed when I got hold of AS2005 for the first time, well over a year ago now, was that we now had a CASE statement in MDX. The joy! And yet it’s one of those small improvements which so often gets ignored, so I thought I’d mention it. Here’s an example:

WITH
MEMBER MEASURES.MYTEST AS
CASE
WHEN [Measures].[Internet Sales Amount] > 3780000 THEN "HIGH"
WHEN [Measures].[Internet Sales Amount] > 3700000 THEN "MEDIUM"
ELSE "LOW"
END

SELECT {[Measures].[Internet Sales Amount],MEASURES.MYTEST} ON 0,
[Date].[Day of Week].[Day of Week].MEMBERS ON 1
FROM [Adventure Works]
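
For comparison, here’s roughly what the same calculated member would look like if you had to fall back on nested IIFs (the rest of the query stays the same):

WITH
MEMBER MEASURES.MYTEST AS
IIF([Measures].[Internet Sales Amount] > 3780000, "HIGH",
IIF([Measures].[Internet Sales Amount] > 3700000, "MEDIUM", "LOW"))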

No more nested IIFs! One of the biggest contributions to improved productivity in AS2005 in my opinion…

Written by Chris Webb

June 7, 2005 at 12:01 pm

Posted in MDX
