Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Archive for the ‘Random Thoughts’ Category

Google Wave, Google Squared and Thinking Outside the Cube

with 2 comments

So, like everyone else this week I was impressed with the Google Wave demo, and like everyone else in the BI industry had some rudimentary thoughts about how it could be used in a BI context. Certainly a collaboration/discussion/information sharing tool like Wave is very relevant to BI: Microsoft is of course heavily promoting Sharepoint for BI (although I don’t see it used all that much at my customers, and indeed many BI consultants don’t like using it because it adds a lot of extra complexity) and cloud-based BI tools like Good Data are already doing something similar. What it could be used for is one thing; whether it will actually gain any BI functionality is another and that’s why I was interested to see the folks at DSPanel not only blog about the BI applications of Wave:
http://beyondbi.wordpress.com/2009/06/01/google-wave-the-new-face-of-bi/
…but also announce that their Performance Canvas product will support it:
http://www.dspanel.com/2009-jun-02/dspanel-performance-canvas-adds-business-intelligence-to-google-wave/
It turns out that the Wave API (this article has a good discussion of it) makes it very easy for them to do this. A lot of people are talking about Wave as a Sharepoint-killer, and while I’m not sure that’s a fair comparison I think it’s significant that DSPanel, a company that has a strong history in Sharepoint and Microsoft BI, is making this move. It’s not only an intelligent, positive step for them, but I can’t help but wonder whether Microsoft’s encroachment onto DSPanel’s old market with PerformancePoint has helped spur them on. It’s reminiscent of how Panorama started looking towards SAP and Google after the Proclarity acquisition put them in direct competition with Microsoft…

Meanwhile, Google Squared has also gone live and I had a play with it yesterday (see here for a quick overview). I wasn’t particularly impressed with the quality of the data I was getting back in my squares though. Take the following search:
http://www.google.com/squared/search?q=MDX+functions#
The first results displayed are very good, but then click Add Next Ten Items and take a look at the description for the TopCount function, or the picture for the VarianceP function:
squared

That said, it’s still early days and of course it does a much better job with this search than Wolfram Alpha, which has no idea what MDX is and won’t until someone deliberately loads that data into it. I guess tools like Google Squared will return better data the closer we get to a semantic web.

I suppose what I (and everyone else) like about both of these tools is that they are different, they represent a new take on a problem, unencumbered by the past. With regard to Wave, a lot of people have been pointing out how Microsoft could not come up with something similar because they are weighed down by their investment in existing enterprise software and the existing way of doing things; the need to keep existing customers of Exchange, Office, Live Messenger etc happy by doing more of the same thing, adding more features, means they can’t take a step back and do something radically new. Take the example of how, after overwhelming pressure from existing SQL Server users, SQL Data Services has basically become a cloud-based, hosted version of SQL Server with all the limitations that kind of fudge involves. I’m sure cloud-based databases will one day be able to do all of the kind of things we can do today with databases, but I very much doubt they will look like today’s databases just running on the cloud. It seems like a failure of imagination and of nerve on the part of Microsoft.

It follows from what I’ve just said that while I would like to see some kind of cloud-based Analysis Services one day, I would be more excited by some radically new form of cloud-based database for BI. With all the emphasis today on collaboration and doing BI in Excel (as with Gemini), I can’t help but think that I’d like to see some kind of hybrid of OLAP and spreadsheets – after all, in the past they were much more closely interlinked. When I saw the demos of Fluidinfo on Robert Scoble’s blog I had a sense of this being something like what I’d want, with the emphasis more on spreadsheet than Wiki; similarly when I see what eXpresso is doing with Excel collaboration it also seems to be another part of the solution; and there are any number of other tools out that I could mention that do OLAP-y, spreadsheet-y type stuff (Gemini again, for example) that are almost there but somehow don’t fuse the database and spreadsheet as tightly as I’d like. Probably the closest I’ve seen anyone come to what I’ve got in mind is Richard Tanler in this article:
http://www.sandhill.com/opinion/daily_blog.php?id=45
But even then he makes a distinction between the spreadsheet and the data warehouse. I’d like to see, instead of an Analysis Services cube, a kind of cloud-based mega-spreadsheet, parts of which I could structure in a cube-like way, that I could load data into, where only I could modify the cube-like structures containing the data, where I could define multi-dimensional queries and calculations in an MDX-y but also Excel-y  and perhaps SQL-y type way – where a range or a worksheet also behaved like a table, and where multiple ranges or worksheets could be joined, where they could be stacked together into multidimensional structures, where they could even be made to represent objects. It would also be important that my users worked in essentially the same environment, accessing this data in what would in effect be their own part of the spreadsheet, entering their own data into other parts of it, and doing the things they love to do in Excel today with data either through formulas, tables bound to queries, pivot tables or charts. The spreadsheet database would of course be integrated into the rest of the online environment so users could take that data, share it, comment on it and collaborate using something like Wave; and also so that I as a developer could suck in data in from other cloud-based data stores and other places on the (semantic) web – for example being able to bind a Google Square into a range in a worksheet.

Ah well, enough dreaming. I’m glad I’ve got that off my chest: some of those ideas have been floating around my head for a few months now. Time to get on with some real work!

Written by Chris Webb

June 4, 2009 at 3:58 pm

Posted in Random Thoughts

Fourth Blog Birthday

leave a comment »

For the second year running I’m late celebrating my blog birthday (it was yesterday); my only excuse is that I’m still reeling from the amount I’ve eaten over the last two weeks. But four years of blogging… wow… it feels like ages.

And to a certain extent I feel that, after all this time, I’m running out of things to say here. The actual writing of blog entries isn’t a problem, it’s more the problem of having something to write about. Part of the problem is me: I don’t want to write about things I don’t find interesting so I haven’t gone down the route of turning the blog into a MDX tutorial (Bill Pearson does that much better than I ever could, and only Mosha could ever cover the advanced stuff properly), but at the same time I’m not coming across so many MDX/SSAS issues or obscure features as I used to. Part of the problem is, too, that SSAS2008 was so light on new features that it didn’t provide me with much new to write about. So I’m hoping that Gemini, Kilimanjaro, Excel 14, Azure etc will give me something to get my teeth into in 2009; I’m sure they will. If not, well, I’ve always wanted to spend some time getting into Mondrian and other open source BI technologies. And with the economy the way it is I suppose I’ll have a lot more spare time for learning new stuff in the coming twelve months…

But anyway, bear with me and keep reading! For those of you who have stuck with me for the last four years, thanks, and best wishes for 2009.

Written by Chris Webb

December 31, 2008 at 10:50 pm

Posted in Random Thoughts

Why can’t we just draw our own reports?

with one comment

Here’s a way-out thought I had over the Xmas break for a new approach to building BI reports….

Have you, when you’ve asked a typical non-technical business user what they want a report to look like, asked them to draw a quick sketch? I do all the time – I find seeing what the user wants the report layout to look like is much the best way to understand what they want and for the user it’s the best way to express their requirements. So on the back of the proverbial envelope you’d get something like this:

handwrittenreport

…and then go back to your desk and write the query and design the report layout in something like SSRS. So – why can’t we cut out a step and go direct from the sketch to the report design? I can see two options for the first step here:

  • Using a tablet PC you write some software that works a bit like OneNote, but where the user can draw the outline of a report freehand. Unfortunately tablet PCs just aren’t that common and the really tech-phobic end user wouldn’t feel comfortable using one.
  • The user draws the report on paper and then the drawing gets scanned; definitely something the most computer illiterate manager would be comfortable with.

You would then take the freehand drawing and:

  • Interpret the freehand lines into the borders of a table, and
  • Interpret the text on columns and rows as either
    • Explicit selections of members, or
    • Set expressions
  • Apply a lot of smarts to format the report in ways that conform to the best practices laid down by the likes of Stephen Few et al. Almost no business or IT people (me included) have any idea on how to format reports properly, and while you could argue that this might be intrusive I think users would appreciate a tool that did this for them.
  • Apply a standard corporate template, with the appropriate logos etc in place.

Working out what the borders of a table should be from a freehand drawing must be possible (although implementation would be well beyond me). Interpreting what the user has written they want on columns and rows would present more problems:

  • Handwriting recognition is notoriously difficult to do well, and usually the software needs to have a bit of practice to get good. On the other hand, in this particular scenario it should be easier because the user won’t be writing just any old text. For instance, if we assume that we’re working with an Analysis Services data source, we know that any text is either going to be the name of a member or something that will resolve to a set expression; we also know, for instance, if there are what look like multiple member names on the same position on the same axis they will all have to come from the same hierarchy.
  • Resolving text to member names is all very well, but turning sentences into set expressions would be trickier. It seems reasonable to think that phrases like "Products where sales is greater than £100" could be interpreted effectively, but whether your average business user can write something as clear as that is debatable. Any tool would certainly have to prompt the user to confirm that the interpretations it has made was correct, and do to that it would have to resort to the kind of techy interface that the tool is trying to get away from.
  • Similarly there are going to be ambiguities that need to be resolved when looking at a the design. For example, the drawing might have the years 2006, 2007 and 2008 on columns. But does this mean the report should always have these three years on columns, or the last three years with data, or something else?

So it certainly wouldn’t work like magic, but at the same time I think it would offer some advantages over current report design tools, the designers of which have fallen into the trap of building a UI on top of the functionality they’ve got  available in MDX or SQL, rather than building a UI for what the user actually wants to do. After all, don’t you think that it’s actually very difficult to lay out anything other than the most simplistic reports in most report design tools, compared to how easy it would be to draw the report?

Written by Chris Webb

December 30, 2008 at 4:50 pm

Posted in Random Thoughts

Interesting stuff coming from Microsoft soon

with one comment

A couple of interesting (and possibly BI-related) technologies are coming soon from Microsoft:

  • I see today via Nick Carr that soon we’ll be able to run Windows and SQL Server on Amazon Elastic Compute Cloud (EC2). I wonder if that includes Analysis Services too? If so, that would be handy.
  • Also announced today, the new MS cloud operating system, coming within the month
  • Windows High Performance Computing (HPC) server is due to launch at the beginning of November (see here for a news report on it, here and here for some details). Hmm, I see SQL Server is listed on the ‘supported applications’ page… surely there’s got to be some kind of tie-in here with the whole MatrixDB/DATAllegro MPP stuff?

Written by Chris Webb

October 1, 2008 at 5:16 pm

Posted in Random Thoughts

Enterprise Search and BI

with one comment

I notice from various sources (for example, Don Dodge) that Microsoft have released a free version of their Enterprise Search product, Microsoft Search Server 2008 Express. The thing that caught my eye was the list of federated connectors:

http://www.microsoft.com/enterprisesearch/connectors/federated.aspx

…which includes Business Objects, Cognos, SAS, but there’s no mention of Reporting Services or Analysis Services anywhere. As I think I’ve said here before, I’m not convinced that a search interface on top of a BI platform is going to be useful in the real world (though I bet you could do some cool demos with it): I suppose if you have hundreds of SSRS reports for example you might want to look for the ones that contain figures for a particular Product or Customer, but I would have thought that it’s just as likely that you’d do a search, find a report and then find you don’t have permission to view it. As for using a search interface as a way of querying a cube, all I have to say about that is two words: English Query.

But I think there’s a more interesting application for BI here: what if you could build a cube off the index this thing creates? You could have dimensions like Date Updated, Keyword, File Type and Path, and measures like Count of Files and File Size; you’d be able to do things like create reports which tracked the overall space taken by mp3 files on your network and where these files were, the number of emails with the phrase "new job" in; even just browsing ad hoc in Excel you’d have a new way of searching for files: for example, you could slice on File Type=Word doc, Keyword="CV" or "Resume" and then put the Path dimension on rows and drill down to find all the CVs on your network.

Written by Chris Webb

November 8, 2007 at 10:06 pm

Posted in Random Thoughts

Adapting SQLIS to work with tuples and sets as a data source

with 3 comments

It’s been a long time since I posted in my ‘random thoughts’ category… but I just had such an interesting idea I thought I’d post it up (even if there’s 0% chance I’ll ever get round to implementing this).
 
I was looking at a (non-Microsoft platform) BI tool today and got thinking about MDX, how people find it hard to work with, and how most client tools don’t really expose the power of MDX sets, and how handy it would be to be able to do some procedural things in MDX too. This particular tool had some cool set-based selection functionality and I reflected that even though I’d seen similar set-based selection tools, some on AS (didn’t Proclarity have something in this area?), they’d never really taken off; I also thought about the much-missed MDX Builder tool which had a similarly visual approach to building MDX expressions. I started thinking about whether it would be worth building another client tool which took this approach but quickly came to the conclusion that the world needed another AS client tool like a hole in the head, but realised that if I was going to build this kind of tool how much it would resemble Integration Services. And then I had my idea: why not extend Integration Services so it can treat MDX sets and tuples as a data source, and then use its existing functionality and create new transformations to implement MDX set-based operations?
 
Let me explain in more detail. I’m not talking about simply getting data out of AS in the same way you’d get it out of a SQL Server table, using an MDX query. What I’m saying is that what would be flowing though the IS data flow tasks would be members, sets and tuples: each ‘row’ of data would be an MDX expression returning member, or tuple, or set. So you’d create a custom data source where you could define a set as your starting point – probably at this point you’d just select a whole level, or the children of a member, or some such simple set of members. For example you might select the [Customers].[Customer].[Customer] level in your Customer dimension; the output from this would be a single text column and a single row containing the set expression [Customers].[Customers].[Customers].Members. You could then put this through an Exists() transform to return only the customers in the UK and France, the output from which would be the set expression Exists([Customer].[Customer].[Customer].Members, {[Customer].[Country].&[United Kingdom], [Customer].[Country].&[France]}). Similarly then you could put this through a Crossjoin() transform to crossjoin this set with the set of all your Products, then put the result through a NonEmpty() transform to remove all non empty combinations from the set. At this point your output would still be a single row and column, consisting of the MDX expression:

NonEmpty(
Crossjoin(
Exists(
[Customer].[Customer].[Customer].Members
, {[Customer].[Country].&[United Kingdom], [Customer].[Country].&[France]})
, [Product].[Product].[Product].Members)
, [Measures].[Internet Sales Amount])

So far, so dull though. All we’ve got is a way of building up a string containing an MDX set expression and SQLIS brings little to the party. But the real fun would start with two more custom transformations: SetToFlow and FlowToSet. The former would take an input containing MDX set expressions (and conceivably there could be more than one row, although we’ve only got one so far) and would output a flow containing all the tuples in the set(s) we’ve passed in. Taking the set above, the output would be the contents of measures.outputdemo in the following query on AdventureWorks:

with member measures.outputdemo as TupleToStr(
([Customer].[Customer].Currentmember, [Product].[Product].Currentmember)
)
select {measures.outputdemo} on 0,
NonEmpty(
Crossjoin(
Exists(
[Customer].[Customer].[Customer].Members
, {[Customer].[Country].&[United Kingdom], [Customer].[Country].&[France]})
, [Product].[Product].[Product].Members)
, [Measures].[Internet Sales Amount])
on 1
from
[Adventure Works]

The FlowToSet transform would do the opposite, ie take an input containing tuples and return a single row containing the set represented by the entire input. For the above example, this would be a big set:
{([Customer].[Customer].&[12650],[Product].[Product].&[214]), ([Customer].[Customer].&[12650],[Product].[Product].&[225]),…}
But the point of this would be that you could then apply more MDX set expressions efficiently, although of course there’s no reason why you can’t apply MDX set expressions to individual tuples in a data flow. The final important
custom transform you’d need would be an Evaluate transform, which would append one or more numeric or text columns to a tuple or set dataflow: each of these columns would be populated by evaluating an MDX expression which returned a value against the set or tuple for each row. So, for example, if a row contained a the set we’ve been using we could apply a the Count function to it and get the value 12301 back; if a row contained the tuple ([Customer].[Customer].&[12650],[Product].[Product].&[214]) we could ask for the value of this tuple for the measure [Internet Freight Cost] and get the value 0.87 back; or to the same tuple we could ask for the value of [Customer].[Customer].CurrentMember.Name and get back the value "Aaron L. Wright".
 
Of course the beauty of this is that once you’ve got a flow containing sets, tuples and numeric values retrieved from the cube for them then you can use all the cool existing SQLIS functionality too, like multicasts, lookups, UnionAlls, Aggregates etc to do stuff with your sets that is hard in pure MDX; and of course you can easily integrate other forms of data such as relational or XML, and do useful things at the end of it all like send an email to all your male customers in the UK who bought three or more products in the last year, or who live in London and have incomes in excess of £50000 and have averaged over £50 per purchase, or who have been identified as good customers by a data mining model, and who aren’t on the list of bad debtors that you’ve got from the Accounts department’s Excel spreadsheet.
 
Now of course all of this is possible with using only relational data with SQLIS, or even without using SQLIS and just using pure MDX. I guess the point of this is, as always, that it provides an easier way to do stuff: build MDX expressions without having to know much MDX, integrate AS data with other data and other applications without doing (much) coding, and so on.
 
So, as ever, I’d be interested in your comments on this. I have the distinct feeling that this is a solution in search of a problem… but if you can think of some problems it might solve, then let me know!

Written by Chris Webb

August 25, 2006 at 3:52 pm

Posted in Random Thoughts

Microsoft, BI and Search

with one comment

It’s inevitable, when you get a whole bunch of new functionality as we have with SQL2005, that you start thinking of the new types of applications that become possible. One of the things I’ve been thinking about for a while is how you could take the results of an RSS feed or a search engine search, do text mining on the results and build a cube to analyse what comes back. Quite an interesting idea, I think, and I know plenty of other people have been thinking along the same lines too, eg
And it’s not just in the Microsoft world that these ideas are cropping up. For example, only today I saw a reference to a (non-Microsoft) OLAP solution which built cubes from the results of text mining:
 
Anyway, on a different note, one of the fun things about blogging is all the rumours and snippets of information about new solutions coming soon, most of which I’m not really at liberty to discuss (not that I know much anyway). You get to put these snippets, rumours and other stuff you read on the web and put them together in a 1+1=3 operation… Here, for example, is a link that Jon-who-sits-next-to-me just sent which he saw on Slashdot:
How can Microsoft beat Google in the search game? There are some interesting hints on the second page of this article, for example:

He said that Microsoft’s goal — but not its initial offering — would go beyond finding URLs and instead focus in on the specific information sought by Internet users.

"Generally these days what you get back is URLs, and based upon research 50 percent of the time you do a search you don’t get the URL you’re looking for," he said.

Holloway said that the promise of Microsoft’s search capability is to dig down.

For example, he said, potential home-buyers might find a group of houses in the price range and with the precise amenities they are seeking.

Or a surfer might find a restaurant with the kind of menu a diner wants in a particular geographic area.

 
Hmm, is it me or is there a potential BI angle here? Dig down == drill down, perhaps? Slice, dice and analyse your resultset rather than just get a flat list of links? I wonder… 
 
UPDATE: Jon, bless his heart, has come up with another interesting link on this topic:
Don’t you just love wild speculation? The whole Origami thing is so last week… 
 
UPDATE#2: Now this could just be me reading way too much into something, but here’s another relevant link:
There’s a coincidence here that’s too good to be true…

Written by Chris Webb

March 2, 2006 at 2:24 pm

Posted in Random Thoughts

Follow

Get every new post delivered to your Inbox.

Join 3,240 other followers