Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Archive for the ‘Excel’ Category

Bringing It All Together In The Cloud and Excel

with 2 comments

A few things got released in the past week or so that I would normally have devoted a short blog post to; now I’ve finally made it onto Twitter I tend to just tweet about them instead, but I still think a blogging is the best way for me to get my thoughts together about what they actually mean. Let’s consider this ‘stuff’ in isolation first:

  • http://www.excelmashup.com/ went live. It’s a JavaScript API for the Excel Web App which is useful, but as Jamie Thomson notes here it’s not the API he and I have been asking for for a long time, alas.
  • The first CTP of Hadoop on Azure went live. Denny Lee’s blog post is probably the best place to start to learn about it; the video in the post is a very useful quick overview of what it does too. I noticed that it supports importing data from the Windows Azure Marketplace (what used to be called the Azure Datamarket).
  • The Azure Marketplace also got a speed upgrade, as Boyan Penev notes here. This is important because whenever I’d tried to use it in the past its appalling performance had stopped me doing anything interesting with it at all. This, plus the fact that you can now publish your own data there, turns what was something of a white elephant into what could be an integral part of the Microsoft cloud BI platform.
  • Version 2 of Google BigQuery went live, although it’s still in beta.
  • The CTP of Data Explorer got released, which of course I’ve blogged about already here and which Jamie blogged about here and here.
  • Microsoft announced Office 365 compliance with leading EU and US standards for data protection and security, which means less of those legal worries about whether you’re allowed to put all that interesting data you want to use for BI into the cloud.

From this it’s clear that Microsoft’s cloud BI platform is beginning to take shape, as are competing cloud BI platforms (if we assume that Google actually has a cloud BI strategy, and I think it has), and I think it’s fair to say Microsoft is well placed. There’s also yet more evidence, as if it was not blindingly obvious already, that Excel is at the heart of Microsoft’s BI strategy. Look at how data from Hadoop on Azure can be imported directly into Excel, and how this is flagged up as an important selling point, in the video on Denny’s blog post. However I think Microsoft needs to make even more of an effort to make everything Excel-like: now that it’s building a new BI platform from scratch it has a very rare opportunity to do this and to produce a truly coherent set of tools rather than the traditional grab-bag of technologies that make up the current Microsoft BI stack and which the cloud platform could also end up as too. Actually I’d like to go further and say that rather than have a bunch of separate cloud BI tools MS should make everything BI a feature of Excel, whether it be Excel on the desktop or Excel in the cloud. This might seem a controversial thing to say, but if MS is committed to true mass-market, self-service BI then Excel has to be the platform and MS needs to base its cloud BI strategy on it 100%.

Here are a couple of the things I’d like to see happen to achieve this:

  • Data Explorer’s expression language needs to be made consistent with Excel syntax, in the way that DAX is consistent with Excel syntax. While I like what Data Explorer can do at the moment I hate having to learn a new syntax and a new set of functions for each tool I use (think SSRS expressions, SSIS expressions, MDX and so on) and it makes me much less productive when I do. I want to use the same syntax to write formulas in my spreadsheet, calculations in DAX and calculations/transformations in Data Explorer too – after all, we want to do a lot of the same things in ETL and reporting like lookups, aggregation and pivoting.
  • Hadoop on Azure is an important tool to have, not least because Hadoop is already so widely used in the wider world so it adds some credibility to the MS BI stack, but I’d like to be able to control massively parallel calculations from inside Excel and not just consume the results of them there. I’m thinking something like DataScope or the way you can scale out Excel calculations on Windows HPC, maybe driven though a PowerPivot-like interface with calculations expressed in DAX, or Data Explorer (which links back to my last point, because if Data Explorer expressions were DAX it needn’t be an either/or choice). It could of course still be Hadoop at the back-end with a layer on top to make it consistent with Excel.
  • We desperately need a proper API for the Excel Web App (yes, here we go again…). An OData API for importing and exporting data from spreadsheets on the web is only the first step; in the long term I’d want the Excel Web App to have capabilities like Project Dirigible, so that Excel on the desktop could become a kind of client tool for a massively scalable Excel Server in the cloud (note that I didn’t say Excel Services, which is more like Excel-on-the-server than a real Excel Server in my opinion). I’d want to be able to store data in the cloud and automatically synchronise it with Excel on multiple desktops or elsewhere in the cloud. I’d also want to create Excel spreadsheets in the cloud that acted like functions (similar to Project Dirigible), where these functions could be called from other Excel spreadsheets which again could be on the desktop or be in the cloud.
  • We need Analysis Services in the cloud. We’ve known it’s been coming for a long time, but not what form it will take. Again, I’d like to see tight integration with Excel similar to how PowerPivot works on the client or tighter, where Analysis Services would not be a separate service but just be the functionality within Excel for manipulating tables of data for reporting purposes. I’d want Data Explorer to be able to create and load these tables.
  • We also need the full BI functionality of desktop Excel – pivot tables and cube formulas – in the Excel Web App. I would guess this is in the pipeline anyway since it’s a pretty obvious requirement if Microsoft’s cloud BI strategy is going to work. I also don’t see the point of Power View being a separate app – I’d like to see it become a data visualisation feature of Excel.
  • Finally, I’d like to see some way of tying all the data held in this gigantic data store together. You’d want to be able to search it, certainly, but also understand where it’s come from and what it actually represents. It sounds like maybe this is what Project Barcelona is aiming for

Coordinating the work of multiple teams at Microsoft in the way this would demand is a particularly thankless task, I know. But maybe, just maybe, someone senior (maybe a Technical Fellow like Amir…?) could pull off something this ambitious?

Written by Chris Webb

December 16, 2011 at 9:55 pm

Posted in Cloud, Excel

Sparklines for Excel

with 12 comments

While the introduction of native support for sparklines and other microcharts in Excel 2010 was welcome, Excel is still lacking more advanced visualisation features. I came across Sparklines for Excel – a free Excel addin that gives you a lot of extra charting options, not just sparklines – a while ago but I’ve only just got round to playing with it and I have to say it’s a lot of fun. I’m not much of a data visualisation expert (I’ll leave that to the likes of Jen) but it’s a subject that every BI professional needs a passing knowledge of and in any case it’s a shiny new toy to play with, so it’s worth a blog post.

What I like most of all about Sparklines for Excel is that everything is driven from Excel formulas, and no VBA is required. That means you can make every aspect of the charts you create data-driven, and this holds a fundamental appeal for the data geek in me.  Let’s take creating a treemap as an example, and start with an Excel 2010 worksheet hooked up to the Adventure Works cube using some Excel cube functions plus some thresholds telling us whether the values for Gross Profit Margin are good or bad:

image

We can then simply click on an empty cell and then click on the Treemap button in the ribbon, fill in some ranges, and we get the following formula:

=Treemap(D5:D10,C15:I29,,,E5:E10,G5:H9,C5:C10)

And this treemap in the worksheet (I won’t even try to apologies for the colour scheme):

image

Cool, eh? And of course, as soon as you change the dropdown filter to select another year, or change any of the threshold values, the treemap updates too. Even the position, length and width of the treemap itself can be parameterised.

You can see the full list of chart types – including heat maps, cascade charts and Pareto charts – in the manual here. It’s definitely worth checking out if you’re an SSAS or PowerPivot user who’s into data visualisation and on a tight budget.

Written by Chris Webb

November 30, 2011 at 5:52 pm

Posted in Excel, Visualisation

Excel DataScope

with one comment

Jamie Thomson just tipped me off about something new and very interesting – one week after I had a moan about Microsoft doing nothing about Excel and the cloud, here comes Excel DataScope:
http://research.microsoft.com/en-us/projects/azure/datascope.aspx

Here’s the blurb from the site:

From the familiar interface of Microsoft Excel, Excel DataScope enables researchers to accelerate data-driven decision making. It offers data analytics, machine learning, and information visualization by using Windows Azure for data and compute-intensive tasks. Its powerful analysis techniques are applicable to any type of data, ranging from web analytics to survey, environmental, or social data.

There are yet more tantalising details in the video and the two pdfs here:

http://research.microsoft.com/apps/video/?id=149888
http://research.microsoft.com/en-us/projects/azure/exceldatascope_11_poster1.pdf
http://research.microsoft.com/en-us/projects/azure/exceldatascope_11_poster2.pdf

I’m currently trying to find out more about all this, but there’s clearly a  ton of cool stuff here:

  • You can use the Live Labs Pivot control for visualisation.
  • It does data mining in the cloud. Is this the successor to the old data mining addin? The functionality is clearly the same.
  • There’s a section on Map Reduce running on Windows Azure on one of the posters. Is this using Dryad?

Is this a first glimpse of a new cloud-based BI platform from Microsoft? Will SSAS in the cloud form part of it? Before we all get too excited (or at least I get too excited) it’s worth noting that this is coming from the eXtreme Computing Group and not the SQL Server team, it’s clearly aimed at scientific rather than business users, and is described as “an ongoing research and development project”, ie it is not a commercial product. The potential is obvious though, and I hope it becomes something significant.

Written by Chris Webb

June 13, 2011 at 8:31 pm

Posted in Cloud, Data Mining, Excel

Tagged with

Excel 2010, Subselects, Named Sets and the Formula Cache

with one comment

Continuing the theme of the Formula Cache, you may remember a post from a while ago where I showed how using a subselect in a query forced query scope – so that SSAS was unable to cache the results of calculations for more than the lifetime of a single query. Now this is very significant if you have calculations that take a long time to evaluate and you’re using Excel as a client tool, because Excel makes extensive use of subselects in its queries.

For example, if we take the calculation ‘ExpensiveCalc’ from that previous post and use it in an Excel pivot table as below:

image

We’ll find that every time we refresh the pivot table it’s painfully slow. This is because we’ve selected just one Year on columns and Excel has generated the following MDX query, using a subselect, as a result:

SELECT
NON EMPTY
Hierarchize({DrilldownLevel({[Date].[Calendar Year].[All Periods]},,,INCLUDE_CALC_MEMBERS)})
DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME
ON COLUMNS 
FROM (SELECT ({[Date].[Calendar Year].&[2001]}) ON COLUMNS 
FROM [Adventure Works])
WHERE ([Measures].[EXPENSIVECALC])
CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS

Not good. Luckily, we can avoid this happening in Excel 2010 by using the new named set functionality. If you go to the Pivot Table Tools/Options tab on the ribbon, and select ‘Create Set Based On Column Items’ from the Fields, Items & Sets menu:

image

…and create a new named set:

image

You’ll find that the MDX generated by Excel changes and there’s no subselect:

SELECT
NON EMPTY
{[Year 2001]}
DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME ON COLUMNS 
FROM [Adventure Works]
WHERE ([Measures].[EXPENSIVECALC])
CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS

This means that although the pivot table will be slow to refresh when you click OK, on subsequent refreshes you will be able to benefit from the FE cache and the refresh will be practically instant. This is a very useful trick if your users have a number of Excel pivot tables they open on a regular basis; it won’t cure all performance problems but it’ll cure some at least.

Written by Chris Webb

April 6, 2011 at 12:58 pm

Excel 2010 and Windows HPC Server 2008 R2… and PowerPivot?

with 4 comments

Something I first heard about last year was the ability to run Excel calculations in parallel on a Windows HPC cluster; it’s been in the news again recently, with the release of Windows HPC Server 2008 R2:
http://www.theregister.co.uk/2010/09/20/microsoft_hpc_server_r2/

I did some digging and found the following white papers that give more details on how exactly you can do this:
Accelerating Excel 2010 with Windows HPC Server 2008 R2: Technical Overview
Accelerating Excel 2010 with Windows HPC Server 2008 R2: Building VBA applications and workbooks for a Windows HPC Cluster

Here’s a summary of what you can do, taken from the first paper above:

As models grow larger and workbooks become more complex, the value of the information generated increases. However, more complex workbooks also require more time to calculate. For complex analyses, it is not uncommon for users to spend hours, days, or even weeks completing such complex workbooks. The problem this white paper addresses is how organizations can reduce the calculation time required for long-running workbooks to give users faster access to business-critical information.

One solution is to use Windows® HPC Server 2008 R2 to scale Office Excel 2010 calculations across multiple nodes in a Windows high-performance computing (HPC) cluster in parallel. This paper presents three methods for running Office Excel 2010 calculations in a Windows HPC Server 2008 R2 based cluster: running Office Excel 2010 workbooks in a cluster, running Office Excel 2010 user-defined functions (UDFs) in a cluster; and using Office Excel 2010 as a cluster service-oriented architecture (SOA) client.

Windows HPC Server 2008 R2 now enables running multiple instances of Office Excel 2010 in a Windows HPC cluster, where each instance is running an independent calculation or iteration of the same workbook with a different dataset or parameters. This solution allows near-linear performance increases for iterative spreadsheets, such as those running a Monte Carlo algorithm.

Running Office Excel 2010 UDFs in a cluster is a new ability of Office Excel 2010 for running complex or time-consuming UDFs—functions contained in Excel link libraries (XLLs)—in a Windows HPC Server 2008 R2–based cluster. If a workbook includes long-running UDFs, moving calculations to the cluster can result in significant performance improvements.

Using the Windows HPC Pack software development kit (SDK), Office Excel 2010 can function as a cluster SOA client to run complex and time-consuming calculations across a set of servers in a Windows HPC cluster. Any Microsoft .NET or Component Object Model (COM) application can use the Windows HPC Pack SDK: This paper uses Microsoft Visual Studio® Tools for Office (VSTO) to construct an Office Excel 2010 add-in that connects to the cluster as a cluster SOA client.

From a purely Excel point of view, this is very interesting – and it’s no wonder that Excel power users everywhere are drooling over it. The obvious next question is: can I use PowerPivot with this as well? From reading the papers I don’t see why not. I’m not talking about simply taking a pivot table connected to a single PowerPivot model and somehow making it all run faster – I don’t think that would work (yet?) – but I can imagine a scenario where you used VBA and the Excel cube functions, which in turn got the value of DAX calculated measures in PowerPivot models distributed over a cluster, as part of a larger solution for example. Now how cool would that be? Sadly I don’t have a HPC cluster to test this on, but if anyone does have one and has tried this, please let me know…!

Written by Chris Webb

September 22, 2010 at 9:30 pm

Posted in Excel

The Excel Web App and its missing API

with 13 comments

A few weeks ago I was playing around with the Excel Web App and thinking that, while it’s better than the Google equivalent in some ways, the lack of an API seriously limited its usefulness. So I posted a question on the Excel Web App forum asking what the plans for an API were and got the following answer:

Currently there are no APIs exposed for Excel Web App, and we are not sure if this will be available in the future.

This was more than a little disappointing, to say the least… So I replied with reasons why an API would be a good idea and got Jamie to join in (he and I have very similar views on this type of subject) as well. You can read the whole thread here:
http://social.answers.microsoft.com/Forums/en-US/officewebapps/thread/eaa565f1-83b6-4e7c-a1ec-af31fc3d7a73

In summary, what I’d like to see is the Excel Web App be able to do the following:

  • Consume data from multiple data source types, such as OData, and display that data in a table
  • Expose the data in a range or a table as an OData feed

I think it would enable all kinds of interesting scenarios where you need data to be both human-readable and also, at the same time, machine-readable: for example, imagine being able to publish data in a spreadsheet online and then have your business partners consume that data in PowerPivot at the click of a button. The posts on the thread go into a lot more detail so I’d encourage you to read it; also Jon Udell picked up the issue and blogged about it here:
http://blog.jonudell.net/2010/06/30/web-spreadsheets-for-humans-and-machines/ 

And now I need your help: the Excel Web App dev team asked for specific scenarios where an API would prove useful and both Jamie and I provided some, but I think the more we get (and the more people show that they want an API) the better. So if you’ve got some ideas on how you would use an API for the Excel Web App then please post them on the thread! The more noise we make, the more likely it is we can change the dev team’s mind.

Written by Chris Webb

July 2, 2010 at 11:21 pm

Posted in Excel

NodeXL, Network Graphs and the Eurovision Song Contest

with 3 comments

Via the Perceptual Edge blog, earlier this week I came across a cool new open-source Excel addin, developed by Microsoft Research and various people at universities around the world, called NodeXL. You can download it from Codeplex here:
http://nodexl.codeplex.com/
Marc Smith’s blog also has a lot of information on it here:
http://www.connectedaction.net/

Basically it’s a tool for displaying and analysing network graphs. That sounds a lot more complicated than it actually is – really all it means is that you can use it for analysing the relationships between things. For example, if you had a list of people who were registered on a social networking site like Facebook, you could use NodeXL to display which people were friends with with other people; in a business setting you could use it to display which products were bought together in the same ‘basket’. Although it’s somewhat buggy and crashed on me a few times, it’s a lot of fun to use and I can see that there are a lot of potential uses in the BI space, especially now that Excel is being pushed as Microsoft’s BI client tool of choice.

So anyway, after I downloaded it I had a think about what data I could with it. There are built-in options to import data from Outlook (to analyse which email contacts appear together on the To and CC lines of emails) and social networking sites like Twitter, Flickr and YouTube. But I wanted something a bit more fun – and then it suddenly occurred to me, why not use it to analyse voting patterns in the Eurovision Song Contest? Apologies to readers outside Europe who don’t know what this is: basically it’s an annual competition where each country in Europe enters the worst pop song they can possibly come up with, and the winner is chosen by a vote; what everyone knows, of course, is that countries vote for the other countries they are most friendly to rather than on the basis of the songs themselves, for example with Greece and Cyprus always voting for each other (see here and here for some examples of detailed analyses of voting patterns).

I downloaded the raw data for voting in the 2009 competition from here, and with a bit of hacking got it into the template that’s bundled with NodeXL. Here’s an example of the output, using the impressively-named Fruchterman-Reingold layout, with the UK highlighted to show who voted for the UK and who the UK voted for:

image 

I’ll admit this particular graph is a bit busy, but in NodeXL itself you can zoom in and out, filter and analyse the relationships very easily; even here, though, we can see here things like the fact that the UK voted for Switzerland but Switzerland didn’t vote for the UK. Here’s a circular layout with the winners, Norway, highlighted:

image

Last of all, here’s the data filtered to show only Greece, with the points that Greece awarded to each other country shown on the vertices:

image

There are a lot of different options and this is a very complex product, so the fact it’s free is particularly amazing. It’s this kind of thing that makes Excel such a rich platform for data analysis – imagine using this with PowerPivot or the data mining addin, or other open-source tools like Sparklines for Excel.

Written by Chris Webb

February 22, 2010 at 2:23 pm

Posted in Excel

Connecting to SQL Azure from Excel 2007

with 8 comments

Sometimes I have an idea, spend a bit of time testing it out, and it ends up being a dead end. That’s what happened this evening but there’s at least one interesting bit of info that resulted so I thought I’d blog anyway…

My original thought was this:
* Excel 2007 can connect to SQL Server to retrieve data for use in reports in a worksheet
* SQL Azure is SQL Server in the cloud
* Office Web Apps gives us the ability to view Excel spreadsheets in the cloud
* So can I create an Excel spreadsheet that connects to SQL Azure, build a report using data from there, upload it to the Excel Web App and then refresh the connection so that my spreadsheet in the cloud displays live data from the cloud?

The short answer is no, at least not at the moment. But after a bit of trial-and-error I did get halfway there and manage to hook Excel 2007 up to SQL Azure (nb I’m not talking about using PowerPivot, which does work with SQL Azure, but the built-in Excel functionality). Here’s how:

  • None of the built-in functionality for connecting to SQL Server from Excel 2007 seems to work. However it is possible to connect to ODBC data sources from Excel and SQL Azure can be exposed as an ODBC data source.
  • So, in Excel, go to the Data tab and click on the “From Other Data Sources” and “From Data Connection Wizard”.
  • Select “Other/Advanced” and then the Microsoft OLE DB Provider for ODBC Data Sources and then click Next
  • Select the ‘Use Connection String’ option and paste the ODBC connection string that SQL Azure generates for you into the box.
  • Paste the value of the Uid property in the User name box, and put your password in the Password box. Delete the Uid and Pwd properties from the connection string.
  • Type the name of your database into the Initial Catalog box, then click OK
  • Finally a list of tables in your SQL Azure database appears; choose one, create an Excel data source and then create either a table or pivot table from the data. Click OK, enter your password one last time, and bingo!

Unfortunately, as I said, when I uploaded the resulting spreadsheet to the Excel Web App, I got the following error message:

image

Shame – I can understand why it makes sense for most external data connections not to be supported, but in this case, when the external data you’re connecting to is also in the cloud, it would be nice if an exception could be made.

One day, though, I’m sure a scenario like this will work. When I think about what Microsoft’s story for cloud BI might be like, the Excel Web App is the obvious candidate for the reporting tool. Whether you’re reporting direct from relational data stored in SQL Azure, or from some kind of cube (PowerPivot in the cloud is another obvious direction), Excel is going to be the easiest way to do it for the largest number of people. I do see a role for some kind of SSRS in the cloud too, but even in the Microsoft BI stack at the moment there’s a lot of overlap between SSRS and Excel/Excel Services for reporting; I wonder if this will be rationalised at some point? For example Report Builder has never really caught on as a way of letting end-users build their own reports, so why not forget it, develop Excel for this purpose and somehow extend SSRS’s rich functionality for managing and scheduling reports to work with Excel-based reports? Just a thought.

Written by Chris Webb

January 27, 2010 at 11:14 pm

Posted in Excel

What’s new for Analysis Services users in Excel 2010?

with 7 comments

I downloaded the Technical Preview for Office 2010 a few days ago, and was intending to blog about it as soon as possible but a few things made me wait a bit. First of all, there was the problem of whether I was in fact allowed to blog about it at all (which a lot of other people also seem confused about – which explains the strange silence on the web about it, perhaps); then there was the problem of actually finding what the new functionality was, since the Office team had neglected to provide any kind of detailed list of what has been added. Hmm. I think they’re planning to blog about everything that’s new soon though.

But anyway, now I’ve been given the go-ahead to blog I thought I’d list all of the new features I’ve found that are relevant to Analysis Services users. This does not include anything to do with Gemini, because Gemini isn’t part of the Technical Preview and I don’t have it yet unfortunately. I’m also not going to comment on bugs or things that don’t work in the way I’d want because, after all, this is not released software and things can and hopefully will change before RTM.

  • Slicers. If you’ve seen any of the Excel 2010 or Gemini demos you’ve probably seen that pivot tables can now have large slicer windows which make it a lot easier to select members (and see what has been selected) on the slice axis of a query. In terms of query functionality, as far as I can see they do exactly the same thing as the old single-cell dropdown-box-based slicers, created when you dragged a hierarchy into the Report Filter box, do. You can control their position, size and shape and almost every other aspect of their appearance.
    2010slicer   
  • Named sets. Ideally I’d have liked to be able to specify any MDX query I wanted for use in a pivot table, but this is the next best thing: it allows you to create your own named sets either using a simple UI or by (hooray!) entering your own MDX set expression; you can then use these sets wherever you want in your pivot table, for example on the rows or columns axis. This then means you can set up much more advanced selections than you ever could before, and is going to be incredibly useful for BI developers when creating dashboards in Excel. It might also allow for integration between Excel and other SSAS client tools.
    Here’s the simple set UI:
    SimpleSet
    Here’s the MDX set UI:
    MDXSets
  • Show as. A lot more calculation options have been added to the ‘Show as’ feature in pivot tables; Thomas Ivarsson has already blogged about this here
  • Writeback support. According to the Excel blog we’ll get support for writeback in Excel, at last. Not tested this yet though.  
  • Dynamic named sets. Again, according to the Excel blog dynamic named sets will work with 2010 (they didn’t with Excel 2007).
  • Search. There’s now a Search feature that allows you to find specific members quickly (although again I’ve not tested this on a really big hierarchy) in the slice dialog. It’s not available in the Slicers mentioned above yet, but I assume it will be.
  • Sparklines. Again, if you’ve seen any of the Excel 2010 demos you’ll have seen that at last they’ve implemented sparklines. This isn’t of course an SSAS-specific feature but anyone who’s creating BI dashboards in Excel will want to use them. Here’s a screenshot of what they look like:
    2010sparklines

On a related note, I see Panorama have put out a press release saying how closely they’ll be working with MS, Office 2010 and SQL 2008 R2. It doesn’t say anything meaningful directly, but it’s nice to see that MS and Panorama are friends again and that MS sees value in what Panorama have to offer beyond the pure-MS BI stack; I wonder if MS are coming to regret what they did to Proclarity?

Written by Chris Webb

July 17, 2009 at 12:14 pm

Posted in Excel

Excel 2007 Web Data Addin, and some thoughts on SSRS and Excel

with 2 comments

Following on from my blog entry on Kapow the other week (and Jamie’s post on the same subject), I’ve just stumbled on something called the Excel 2007 Web Data Addin, something that Microsoft Research put together. It’s basically an addin that improves on Excel’s built-in functionality for importing data from web pages; it falls a long, long way short of what can be done with Kapow – and it’s a bit buggy – but it’s still interesting. There’s not much to it, or indeed much information out there about it, but here’s the link to download it and a pair of blog entries announcing its release from 2007:
http://research.microsoft.com/en-us/downloads/db5286b6-0bb0-4668-9ebc-c3e9b43a0683/default.aspx
http://blogs.msdn.com/xaw/
http://blogs.msdn.com/excel/archive/2007/10/16/excel-2007-web-data-add-in.aspx

There’s also a short demo video:

I wonder if this kind of functionality will be built into Excel 2010? It would be useful from a Gemini point of view if it was.

Another thought I had when looking at this was that SSRS reports would be the obvious source of data for this kind of functionality (although the addin refuses to work with Report Viewer, I guess it would work if the report was addressed directly via its URL). It should be pretty straightforward to suck data out of an HTML report with a tool like this, and indeed we’ve been told that SSRS 2008 R2 will be able to expose report data as a feed, but thinking about this it struck me that that’s not how I’d really want to work with SSRS data at all.

Rather than Excel linking to a table in an SSRS report, or SSRS rendering a report to Excel, or even what the OfficeWriter functionality we may get in Excel 2010 does, what I’d really want is an Excel addin that works in the same way as Report Viewer: as a sophisticated client to SSRS, pulling data into a worksheet rather than having data pushed to it. I’d want to be able to connect to a report from a worksheet, then be able to enter parameters from within the worksheet (either using dropdown boxes, like filters in a pivot table, or by entering values directly into cells, with me being able to choose which cells held the parameter values), and then when I clicked Refresh have the data from the report brought straight into one or more Excel tables or graphs, with pagination working too. All in all it would work in a similar way to the Excel Cube functions and make it much easier to build applications in Excel based on SSRS data. SSRS would be reduced to the role of running queries, handling some calculations, and doing caching; the actual layout of the report would be controlled from within Excel. Maybe it’s something that could be built by the community, or by MS as a sample app? It wouldn’t be much work to develop, I think, although the problem would be that you’d want to be able to use the RPL rendering format that Report Viewer users (mentioned by Robert Bruckner here) and that’s not publicly documented.

 

Written by Chris Webb

July 13, 2009 at 11:08 am

Posted in Excel

Follow

Get every new post delivered to your Inbox.

Join 703 other followers