Archive for the ‘BI’ Category
Let me start this post by saying that I am a long way away from being a Visio expert – I’ve used Visio, of course, to create diagrams and I’ve also played with its BI capabilities in the past, but nothing more than that. A recent post on the Visio team blog reminded me about Visio’s BI capabilities and Jen Underwood then mentioned that Visio 2013 has some new functionality for BI, so I thought I’d check it out in more detail and blog about what I found. I’ve never seen Visio used to build dashboards or reports in the real world, but a quick search shows that the Visio pros out there have been doing this for years, so maybe it’s time us BI folks learned a few tricks from them? Visio 2013 is by no means a perfect tool for BI but I was pleasantly surprised at what it can do: you can create data-linked diagrams/dashboards in Visio on the desktop very easily, and then publish them to Visio Services in Sharepoint where they can be viewed in the browser and where the data can be refreshed.
First of all, the PowerPoint deck here is a good place to start to learn about Visio and Visio Services 2013 dashboards, as is the Visio team blog and Chris Hopkins’ blog. There’s also a walkthrough of how to link data to shapes here, and a lot of other good posts out there on creating charts and graphs in Visio such as this one.
Here’s what I did. To start, I created a few tables of data in Excel to act as a data source, then published the workbook up to Excel Services in Sharepoint Online (I have an Office 365 E4 subscription). The data looked like this:
I then opened up Visio 2013, connected to the workbook in Excel Services and imported the data from these two tables. With that done, I was able to select a shape, and then drag a row of data from the External Data pane onto the sheet, which gave me a data-linked shape. It was then fairly easy to configure the data graphic associated with each shape – for example, in the diagram below, I selected a City shape, then dragged the row containing sales for London onto the sheet, which gave me a City with the data for London linked to it, and next to the City shape I had an associated data graphic which I configured as a Data Bar of type Multi-bar Graph.
The text next to the frowning face is also linked to data from Excel. I could then publish this to Sharepoint Online, and view the diagram in the browser just by opening it from the document library:
All very straightforward. In Visio Services you can add comments, and also refresh the data. Data can be refreshed manually or on a schedule; I used Excel Services data in this demo because in Office 365 only Excel Services and Sharepoint List data sources can be refreshed but the story is much better in on-prem Sharepoint (the PowerPoint deck I linked to above has all the details). Weirdly, I found that if I modified my Excel data source in the Excel Web App it took a few minutes for the new data to come through in Visio even with me clicking Refresh, although it did eventually.
Obviously this is a very basic (and badly designed) dashboard that works within the limitations of Visio and Visio Services, and if you want to learn how to do this properly I suggest you check out the links above. But there are two important questions that now need to be answered:
Why, as a BI pro, would I want to create a dashboard in Visio rather than, say, Excel or Power View?
I suspect the fact that Visio isn’t used more widely in MS BI circles is 20% down to ignorance of what it can do, 30% the cost of licensing and the Sharepoint dependency, and 50% the fact that there are only a limited number of scenarios where it is the right tool to use. So when would you actually want to use it? The risk of using Visio is that you end up with a visually appealing infographic that is actually very bad at conveying the information you want to convey, the kind of thing Stephen Few is complaining about here. You’d probably only want to use it if the nature of the diagram contributed to your understanding of the data. For example, if you wanted to look at which seats were filled more frequently than others in a theatre or an aeroplane, it might be useful to have a diagram showing the seat layout and colour the seats that get filled. I guess these scenarios are very similar to the kind of scenarios where it makes sense to plot geographical data on a map.
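To make the seat example a little more concrete, here is a minimal Python sketch of the calculation that would sit behind such a diagram: work out how often each seat was filled, then turn that rate into a colour that a data-linked shape could pick up. The seat labels, the sample bookings and the colour thresholds are all made up for illustration, and none of this is Visio functionality.

```python
from collections import Counter

def seat_fill_rates(bookings, total_performances):
    """Return the fraction of performances in which each seat was filled.

    `bookings` is a list with one set of occupied seat labels per performance.
    """
    counts = Counter(seat for performance in bookings for seat in performance)
    return {seat: n / total_performances for seat, n in counts.items()}

def colour_for(rate):
    """Map a fill rate to a colour; the thresholds are arbitrary assumptions."""
    if rate >= 0.75:
        return "red"
    if rate >= 0.4:
        return "amber"
    return "green"

# Hypothetical data: four performances and the seats occupied at each one.
bookings = [{"A1", "A2"}, {"A1"}, {"A1", "B1"}, {"A1", "A2"}]
rates = seat_fill_rates(bookings, len(bookings))
colours = {seat: colour_for(rate) for seat, rate in rates.items()}
```

The resulting seat-to-colour table is the kind of thing you would keep in the Excel data source, with one row per seat, so that each seat shape in the diagram could be linked to its own row.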
What functionality is Visio missing for it to be a serious BI tool?
Quite a lot. Leaving aside PivotDiagrams, there is no proper support for SSAS or PowerPivot for data linking and that’s a big problem in these days of self-service BI. I also don’t like the way you have to import data into Visio before it can be used: I’d want to be able to select the data I want using a PivotTable-like interface (generating MDX or DAX queries) and then bind it to a shape, so that I could slice and filter my data inside Visio without having to keep on importing it; I imagine it being very similar to Power View today, but where you could drag data-driven shapes onto a canvas instead of tables and graphs. Maybe Power View and Visio need to get together and have children?
I don’t want to finish this post on a critical note, though, because I’ve had a lot of fun learning more about Visio and its BI capabilities, and I hope to use it on a project soon. Now that Sharepoint and especially Office 365 are being pushed so heavily for BI (and being used more widely), maybe we’ll see a lot more of Visio dashboards?
You may have already seen the announcement about Windows Azure Virtual Machines today; what isn’t immediately clear (thanks to Teo Lachev for the link) is that Analysis Services 2012 Multidimensional and Reporting Services are installed on the new SQL Server images. For more details, see:
SSAS 2012 Tabular is also supported but not initially installed.
Visual Studio LightSwitch has been on my list of Things To Check Out When I Have Time for a while now; my upcoming session on the uses of OData feeds for BI at the PASS BA Conference (which will be a lot more exciting than it sounds – lots of cool demos – please come!) has forced me to sit down and take a proper look at it. I have to say I’ve been very impressed with it. It makes it very, very easy for people with limited coding skills like me to create data-driven line-of-business applications, the kind that are traditionally built with Access. Check out Beth Massi’s excellent series of blog posts for a good introduction to how it works.
How does LightSwitch relate to self-service BI though? The key thing here is that aside from its application-building functionality, LightSwitch 2012 automatically publishes all the data you pull into it as OData feeds; it also allows you to create parameterisable queries on that data, which are also automatically published as OData. Moreover, you can publish a LightSwitch app that does only this – it has no UI, it just acts as an OData service.
This is important for self-service BI in two ways:
- First of all, when you’re a developer building an app and need to provide some kind of reporting functionality, letting your end users connect direct to the underlying database can cause all kinds of problems. For example, if you have application level security, this will be bypassed if all reporting is done from the underlying database; it makes much more sense for the reporting data to come from the app itself, and LightSwitch of course does this out of the box with its OData feeds. I came across a great post by Paul van Bladel the other day that sums up these arguments much better than I ever could, so I suggest you check it out.
- Secondly, as a BI Pro setting up a self-service BI environment, you have to solve the problem of managing the supply of data to your end users. For example, you have a PowerPivot user that needs sales data aggregated to the day level, but only for the most recent week, plus a few other dimension tables to go with it, but who can’t write the necessary SQL themselves. You could write the SQL for them but once that SQL is embedded in PowerPivot it becomes very difficult to maintain – you would want to keep as much of the complexity out of PowerPivot as possible. You could set up something in the source database – maybe a series of views – that acts as a data supply layer for your end users. But what if you don’t have sufficient permissions on the source database to go in and create the objects you need? What if your source data isn’t actually in a database, but consists of other data feeds (not very likely today, I concede, but it might be in the future)? What if you’re leaving the project and need to set up a data supply layer that can be administered by some only-slightly-more-technical-than-the-rest power user? LightSwitch has an important role to play here too I think: it makes it extremely easy to create feeds for specific reporting scenarios, and to apply security to those feeds, without any specialist database, .NET coding or SQL knowledge.
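As a rough illustration of the "most recent week of sales" scenario, here is a small Python sketch that builds the kind of OData URL a client such as PowerPivot could be pointed at. The service root, the `Sales` entity set and the `SaleDate` field are hypothetical names, and the `datetime'…'` literal assumes the feed follows OData v3 URI conventions; this is not taken from a real LightSwitch app. In practice you would more likely wrap the filter in a LightSwitch query so the end user never sees it.

```python
from datetime import date, timedelta
from urllib.parse import quote

def recent_sales_url(service_root, entity_set, date_field, today, days=7):
    """Build an OData URL that filters an entity set to the last `days` days.

    All names passed in are illustrative; nothing here calls a real service.
    """
    start = today - timedelta(days=days)
    # OData v3-style datetime literal (assumed), e.g. datetime'2013-03-04T00:00:00'
    filter_expr = f"{date_field} ge datetime'{start.isoformat()}T00:00:00'"
    # Percent-encode spaces and colons but leave the quote characters readable.
    query = "$filter=" + quote(filter_expr, safe="'")
    return f"{service_root.rstrip('/')}/{entity_set}?{query}"

# Hypothetical service root and entity names, for illustration only.
url = recent_sales_url(
    "https://example.com/ApplicationData.svc",
    "Sales",
    "SaleDate",
    today=date(2013, 3, 11),
)
```

The point of the design is that the filter lives in the feed (or in the URL that defines it) rather than in SQL embedded inside the workbook, so it can be changed in one place.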
These are just thoughts at this stage – as I said, I’m going to do some demos of this in my session at the PASS BA Conference, and I’ll turn these demos into blog posts after that. I haven’t used LightSwitch as a data provisioning layer in the real world, and if I ever do I’m sure that will spur me into writing about it too. In the meantime, I’d be interested in hearing your feedback on this…
By now you’re probably aware that Office 2013 is in the process of being officially released, and that Office 365 is a very hot topic. You’ve probably also read lots of blog posts by me and other writers talking about the cool new BI functionality in Office 2013 and Office 365. But which editions of Office 2013 and Office 365 include the BI functionality, and how does Office 365 match up to plain old non-subscription Office 2013 for BI? It’s surprisingly hard to find out the answers…
For regular, non-subscription, Office 2013 on the desktop you need Office Professional Plus to use the PowerPivot addin or to use Power View in Excel. However there’s an important distinction to make: the xVelocity engine is now natively integrated into Excel 2013, and this functionality is called the Excel Data Model and is available in all desktop editions of Excel. You only need the PowerPivot addin, and therefore Professional Plus, if you want to use the PowerPivot Window to modify and extend your model (for example by adding calculated columns or KPIs). So even if you’re not using Professional Plus you can still do some quite impressive BI stuff with PivotTables etc. On the server, the only edition of Sharepoint 2013 that has any BI functionality is Enterprise Edition; there’s no BI functionality in Foundation or Standard Editions.
[For those of you thinking of upgrading from Excel 2010 PowerPivot to Office 2013, Marco has all the details on compatibility of PowerPivot workbooks across different versions here.]
Office RT, which runs on Windows RT, has several limitations on its BI functionality: there’s no PowerPivot, Power View or Excel Data Model. Luckily, Kasper has summarised what it does do in a blog post here, so I won’t repeat what he says.
Moving on to 2013 functionality in Office 365, and specifically BI in Sharepoint Online, things get more complicated. Although feature support information for Office 365 is on Technet here, the best place to start is Andrew Connell’s blog post and corresponding feature matrix that is viewable through (appropriately enough) the Excel Web App. The feature matrix makes it very easy to filter Office 365 features by workload so you only see the BI-related ones:
As you can see, the short answer is that you need either Office 365 E3 or E4, or SharePoint Online Plan 2 to get BI functionality. The Office Professional Plus, E3 and E4 plans are also the only plans to include subscriptions to the desktop versions of Office Professional Plus, and they allow it to be installed on up to 5 machines per user. The other thing you’ll notice is that PerformancePoint is not available at all in Office 365 (read into that what you will); it is of course available in Sharepoint 2013 Enterprise Edition on-premises.
There are other functionality differences between Sharepoint Online in Office 365 and on-premises Sharepoint too. The details are here, but the important ones are:
- At least for the moment, Excel workbooks can be no larger than 10MB
- The Excel Data Model will only refresh successfully if it sources data from the workbook itself; no external data sources are supported (though again I’d be surprised if that restriction isn’t lifted in the future)
- There is no PowerPivot for Sharepoint functionality such as the Gallery, usage reporting or scheduled data refresh.
These are quite significant restrictions, it’s true, but if you are a purely self-service BI shop and you just want to use Sharepoint Online to publish PivotTable or Power View reports that don’t need to be refreshed (or can be refreshed manually on the desktop and then uploaded) then this functionality should be sufficient. This is the kind of scenario I showed here, and I think a lot of customers with no existing BI will still be impressed with it; obviously it’s a problem if you want to do any kind of corporate BI.
BUT. At the time of writing the Enterprise plans for Office 365 haven’t been fully updated for Office 2013 functionality, so all this BI functionality isn’t actually available yet to most subscribers. This means that the desktop versions of Office you can download are still 2010 and not 2013; online, while you can get the latest Sharepoint features if you’re part of the Office 365 Preview, if you’re currently an Office 365 subscriber you’re probably still on Sharepoint 2010. The official line on when the upgrade to 2013 functionality will take place is a bit vague – it will take place “in the course of 2013” – and there seem to be a few upset customers out there (see here for example). February 27th seems to be a significant date.
Finally, apart from Office 365 it’s also possible to view Excel workbooks via SkyDrive. However pretty much no BI functionality is available when you do this: no Excel Data Model, no external connections, no Power View, just the ability to view (and not alter) PivotTables. These restrictions seem to be more or less the same if you use just the Office Web Apps server on-premises without Sharepoint 2013 – see the relevant table here for details.
In summary: my head hurts! All these editions and licences… it would be nice if it was less complicated.
UPDATE: some new information, and some clarifications, since I first wrote this post
1) Office Professional Plus 2013 will be available via Office 365 on February 27th 2013. The cheapest subscription option that includes Excel on the desktop with PowerPivot and Power View is, as far as I can see, this one, an Office Professional Plus subscription, that is included in the E3 and E4 plans.
2) Office Professional Plus is only available via Open, Select or EA licensing (see http://www.microsoft.com/en-gb/licensing/default.aspx for more details on what these options are). Excel 2013 standalone is only available via Open or Select. This means that no regular retail editions of Excel include PowerPivot or Power View; you can only get them through a Volume Licence Agreement or Office 365 (i.e. you need to be working for a big company with deep pockets unless you buy yourself an Office Professional Plus, E3 or E4 Office 365 subscription); compare this with PowerPivot for Excel 2010, which worked with any edition of Excel. Existing PowerPivot users are not particularly happy about this when they find out: see here and here for example. Is this a good strategy? Hmm…
3) Right now, I’m told there is a problem with how the addins are packaged with Excel 2013 standalone which will be addressed in a future update.
UPDATE 2: I’ve just found out that standalone Power View is not supported at all in Sharepoint Online/Office 365. Only Power View sheets inside Excel workbooks are supported.
UPDATE 3: Power Pivot is now available in standalone versions of Excel too as of August 2013 - http://www.powerpivotblog.nl/power-pivot-and-power-view-now-available-in-excel-stand-alone
I was chatting to a friend of mine a few days ago, and the conversation turned to Microsoft’s bizarre decision to make two big BI-related announcements (about Mobile BI and GeoFlow) at the Sharepoint Partner Conference and not at PASS the week before. I’d been content to write this off as an anomaly but he put it to me that it was significant: he thought it was yet more evidence that Microsoft is abandoning ‘corporate’ BI and that it is shifting its focus to self-service BI, so that BI is positioned as a feature of Office and not of SQL Server.
My first response was that this was a ridiculous idea, and that there was no way Microsoft would do something so eye-poppingly, mind-bogglingly stupid as to abandon corporate BI – after all, there’s a massive, well-established partner and customer community based around these tools. I personally don’t think it would ever happen and I don’t see any evidence of it happening. My friend then reminded me that the Proclarity acquisition was a great example of Microsoft making an eye-poppingly, mind-bogglingly stupid BI-related decision in the past and that it was perfectly capable of making another similar mistake in the future, especially when Office BI and SQL Server BI are fighting over territory. That forced me to come up with some better arguments about why Microsoft should not, and hopefully would not, ever abandon corporate, SQL Server BI in favour of an exclusively Office-BI approach. Some of these might seem blindingly obvious, and it might seem strange that I’m taking the time to even write them down, but conversations like this make me think that the time has come when corporate BI does need to justify its continued existence.
- From a purely technical point-of-view, while most BI Pros have been convinced that the kind of self-service BI that PowerPivot and Excel 2013 enable is important, it’s never going to be a complete replacement for corporate BI. PowerPivot might be useful in scenarios where power users want to build their own models but the vast majority of users, even very sophisticated users, are not interested in or capable of doing this. This is where BI Pros and SSAS are still needed: centralised models (whether built in SSAS Tabular or Multidimensional) give users the ability to run ad hoc queries and build their own reports without needing to know how to model the data they use.
- Even when self-service BI tools are used it’s widely accepted (even by Rob Collie) that you’ll only get good results if you have clean, well-modelled data – and that usually means some kind of data warehouse. Building a data warehouse is something that you need BI Pros for, and BI Pros need corporate BI tools like SSIS to do this. Self-service BI isn’t about power users working in isolation, it’s really about power users working more closely with BI Pros and sharing some of their workload.
- Despite all the excitement around data visualisation and self-service, the majority of BI work is still about running scheduled, web-based or printed reports and sending them out to a large user base who don’t have the time or know-how to query an SSAS cube via a PivotTable, let alone build a PowerPivot model. Microsoft talks about bringing BI to the masses – well, this is what the masses want for their BI most of the time, however unsexy it might seem. This is of course what SSRS is great for and this is why SSRS is by far the most widely used of Microsoft’s corporate BI tools; you just can’t do the same things with Excel and Sharepoint yet.
- Apart from the technical arguments about why corporate BI tools are still important, there’s another reason why Microsoft needs BI Pros: we’re their sales force. One of the ways in which Microsoft is completely different from most other technology companies is that it doesn’t have a large sales force of its own, and instead relies on partners to do its selling and implementation for it. To a certain extent Microsoft software sells itself and gets implemented by internal IT departments, but in a lot of cases, especially with BI, it still needs to be actively ‘sold’ to customers. The BI Partner community have, for the last ten years or so, been making a very good living out of selling and implementing Microsoft’s corporate BI tools but I don’t think they could make a similar amount of money from purely self-service BI projects. This is because selling and installing Office in general and Sharepoint in particular is something that BI partners don’t always have expertise in (there’s a whole different partner community for that), and if self-service BI is all about letting the power users do everything themselves then where is the opportunity to sell lots of consultancy and SQL Server licenses? If partners can’t make money doing this from Microsoft software they might instead turn to other BI vendors; I’ve seen some evidence of this happening recently. And then there’ll be nobody to tell the Microsoft BI story to customers, however compelling it might be.
These are just a few of the possible reasons why corporate BI is still necessary; I know there are many others and I’d be interested to hear what you have to say on the matter by leaving a comment. As I said, I think it’s important to rehearse these arguments to counter the impression that some people clearly have about Microsoft’s direction.
To be clear, I’m not saying that it should be an either/or choice between self-service/Office BI and corporate/SQL Server BI, I’m saying that both are important and necessary and both should and will get an equal share of Microsoft’s attention. Neither am I saying that I think Microsoft is abandoning corporate BI – it isn’t, in my opinion. I’m on record as being very excited about the new developments in Office 2013 and self-service but that doesn’t mean I’m anti-corporate BI, far from it – corporate BI is where I make my living, and if SSAS died I very much doubt I could make a living from PowerPivot or Excel instead. Probably the main reason I’m excited about Office 2013 is that it finally seems like we have a front-end story that’s as good as our back-end, corporate BI story, and the front-end has been the main weakness of Microsoft BI for much too long. If Microsoft went too far in the direction of self-service we would end up with the opposite problem: a great front-end and neglected corporate BI tools. I’m sure that won’t be the case though.
The call for speakers for the new PASS Business Analytics Conference (to be held April 10-12 next year in Chicago) is now live here:
Since I think this conference is a Very Good Thing, and because I’ve been asked to help shape the agenda in an advisory capacity, I thought I’d do a little bit of promotion for it here.
The important thing I’d like to point out is that this is not just a SQL Server BI conference: it covers the whole SQL Server BI stack, certainly, but really it aims to cover any Microsoft technology that can be used for any kind of business analytics. Which other technologies actually get covered depends a lot on who submits sessions but there are no end of possibilities if you think about it. I’d love to see sessions on topics such as F#, Cloud Numerics, Sharepoint, NodeXL, GeoFlow and especially non-BI Excel topics such as array formulas, Solver and techniques like Monte Carlo simulation, for example.
This brings me to the point of this post. Obviously I’d like all the SQL Server BI Pros out there who read my blog to consider submitting a session (or if you can’t travel to Chicago, the call for speakers for SQLBits is open too) and to attend. However what I’d really like is if the SQL Server BI community could reach out to the wider Microsoft Business Analytics community to encourage them to submit sessions and to attend too. This is where your help is needed! Who do you think should be speaking at the PASS BA Conference? Do you know experts outside the realms of SQL Server BI who you could persuade to come? What topics do you think should be covered? If you’ve got any ideas or feedback, please leave a comment…
It didn’t get much attention at the time (maybe because it was done at the Sharepoint Conference, and not at PASS… why?) but last week Microsoft gave the first public demos of its Mobile BI solution. I wasn’t there to see it but Just Blindbaek was and this morning he tweeted some pictures of what he saw. Some of you Microsoft BI enthusiasts might be interested to see them:
The codename seems to be ‘Project Helix’.
Normally I’d rush to blog about the announcements made in the keynotes each day at the PASS Summit, but this year I had a session to deliver immediately afterwards and once I’d done that I saw Marco had beaten me to it! So, if you want the details on what was announced in today’s keynote I’d advise you to read his post here:
I can’t not comment on some of these announcements though, so here (in no particular order) are some things that occurred to me:
- The first public sighting of Power View on Multidimensional raised the biggest cheer of the morning, which surprised even me – I didn’t realise there were so many SSAS fans in the audience. I’m certainly very pleased to see it, even if it isn’t shipping right now (it’s not in SP1 either). Part of why I’m pleased is that all too often Microsoft BI has been good at building amazing new products but then forgetting about the migration path for its existing customers: think of the Proclarity debacle, and more recently I’ve heard a lot of complaints about the abandonment of Report Models. I suspect this is because Microsoft is not like most other software companies in that it doesn’t do much direct selling itself, but lets partners do the selling for it, and when partners get stick from customers over issues like Proclarity migration then the partners have no leverage over Microsoft to make it deal with the problem. Power View on Multidimensional is a welcome exception to this pattern, and I’d like to see more consideration given to this issue in the future even if it comes at the expense of developing cool new features.
- The PDW V2 news is interesting too. It was clearly stated that Polybase will initially allow TSQL to query data in Hadoop but that other data sources might be supported in the future. I wonder what they will be? DAX/Tabular perhaps? Or something more exotic – wouldn’t it be cool if you could query the Facebook graph or Twitter or even Bing directly from TSQL? I’m probably letting my imagination run away with me now…
- The other thing that popped into my mind when hearing about Polybase was that it might be possible, one day, to use SSAS Tabular in DirectQuery mode on top of PDW/Polybase to query data in Hadoop interactively. I know Hadoop isn’t really designed for the kind of response times that SSAS users expect but I’d still like to try it.
- It hardly seems worth repeating the fact that Mobile BI is very, very late but again it was good to get some details on what is coming. As partners we can deal with the criticism we get from customers and plan better if we have some idea of what will be delivered and the timescales involved, something that has been conspicuously lacking with Mobile BI up to today. To use a current phrase, Microsoft and its partners are “all in this together”, so please, Microsoft, let us help you!
You may have seen the news late last week that Office 2013 has RTMed, which in itself isn’t that significant – it’s not going to be until mid-November that the likes of you or I can download it. But it’s a milestone and therefore a good time to think about what Office 2013 means for Microsoft BI as a whole.
Let me start by saying that I’ve spent a lot of time playing with Office 2013, especially Excel 2013, over the last few months and I’ve been very impressed with it. I think it’s a great product and also that it represents a significant turning point for Microsoft BI. I won’t summarise everything I’ve said in previous blog posts about new functionality (you can read those yourself!), but here are what I consider some of the important points to consider when assessing its impact:
- Number 1 on the list of new features for BI has to be the way PowerPivot has been integrated into Excel. Indeed, although PowerPivot still exists as a separate addin, I’m not sure it’s particularly helpful to think of PowerPivot and DAX as something distinct from Excel any more – we should think of them as the native Excel functionality that they are. Maybe we shouldn’t even use the names PowerPivot and DAX at all any more? And of course, now that users will get it by default, it will open the way to much, much wider adoption. I’m working on a PowerPivot/Excel 2010 project at the moment where the customer’s desktops are locked down and it took several weeks to get PowerPivot installed on even a few desktops; with Excel 2013 those problems won’t occur.
- The integration of Power View into Excel comes a close second in terms of significant new functionality. Like a lot of people I was impressed by the technology when I first saw Power View in Sharepoint last year, but frankly the Sharepoint dependency meant none of my customers were even vaguely interested in using it and I thought it was stillborn. Putting Power View into Excel changes all this – it’s effectively giving it away to all corporate customers and, as with PowerPivot, this will remove a lot of barriers to adoption. It might not be as good at data visualisation as something like Tableau, but it doesn’t need to be – you’re going to get it anyway, it will do most of what these other tools do, so why bother looking at anything else?
- The way PivotTables and Power View reports now work so well in the browser with Excel Services and the Excel Web app means that Excel should now be considered the premier web reporting and dashboarding solution in the Microsoft BI stack, and not just as something for the desktop. I’ve never been fond of PerformancePoint (and again I never saw significant uptake amongst my customers – indeed, over the years, I’ve seen it used only very rarely) and I see less and less reason to use it now when Power View does something similar. SSRS still has its own niche but even it will start to decline slowly because it will be so much easier for BI pros and end-users to build reports in Excel. This in turn will make the whole Microsoft BI stack much more comprehensible to customers and a much easier sell – Excel will be the answer to every question about reporting, data analysis, data visualisation and dashboards.
- Office 365 will help overcome the problems customers have with the Sharepoint dependency in the Microsoft BI stack. I discussed this problem at length here; having now used Office 365 on the Office Preview myself, I’m a convert to it. I’ve had Sharepoint installed on various VMs for years but it’s only now with Office 365 and freedom from the pain of installation and maintenance that I can start to appreciate the benefits of Sharepoint. For small companies it’s the only way Sharepoint can be feasible. More important than anything else, though, is the subscription pricing that has just been announced: Office 365 is a no-brainer from a cost point of view. I saw recently that Toyota Motor Sales in the US have just decided to go to Office 365 and I wouldn’t be surprised if other, larger enterprises do the same; this isn’t just something for SMEs.
- The ability to stream Excel 2013 to desktops means that yet more barriers to deployment will be removed.
- We’re still waiting for Microsoft’s mobile BI solution, of course. I hope it’s coming soon! Whatever form it takes, I would expect it to be very closely linked to Office 2013.
What do you think, though? I’m interested in hearing your comments – have I drunk too much Microsoft Kool-Aid?
I read an interesting article by Stephen Swoyer on the TDWI site today, about a new Gartner report that suggests that companies should start selling the data they collect for BI purposes to third parties via public data marketplaces. This is a subject I’ve seen discussed a few times over the last year or so – indeed, I remember at the PASS Summit last year I overheard a member of the Windows Azure Marketplace dev team make a similar suggestion – and I couldn’t resist the opportunity to weigh in with my own thoughts on the matter.
The main problem that I had with the article is that it didn’t explore any of the reasons why companies would not want to sell the data they’re collecting in a public data marketplace. Obviously there are a lot of hurdles to overcome before you could sell any data: you’d need to make sure you weren’t selling your data to your competitors, for example; you’d need to make sure you weren’t breaking any data privacy laws with regard to your customers; and of course it would have to be financially worth your while to spend time building and maintaining the systems to extract the data and upload it to the marketplace – you’d need to be sure someone would actually want to buy the data you’re collecting at a reasonable price. Doing all of this would take a lot of time and effort. The main hurdle though, I think, would be lack of interest: why would a company whose primary business is something else start up a side-line selling its internal data? It has better things to be spending its time doing, like focusing on its core business. If you sell cars or operate toll roads why are you going to branch out into selling data, especially when the revenue you’ll get from doing this is going to be relatively trivial in comparison?
What’s more, I think it’s a typical piece of tech utopianism to think that data will sell itself if you just dump it on a public data marketplace. Maybe apps on the Apple App Store can be sold in this way, but just about everything else in the world, whether it’s sold on the internet or face-to-face, needs to be actively marketed and this is something that the data generators themselves are not going to want to make the effort to do. As I said earlier, those companies that are interested in selling their data will still need to be careful about who they sell to, and the number of potential buyers for their particular data is in any case going to be limited. Someone needs to think about what the data can be used for, target potential customers and then show these potential customers how the data can be used to improve their bottom line.
For example, imagine if all the hotels around the Washington State Convention Centre were to aggregate and sell information on their bookings for the next six months to all the nearby retailers and restaurants, so it was possible for them to predict when the centre of Seattle would be full of wealthy IT geeks in town for a Microsoft conference and therefore plan staffing and purchasing decisions appropriately. In a case like this a middle man would be required to seek out the potential buyer and broker the deal. The guy that owns the restaurant by the convention centre isn’t going to know about this data unless someone tells him it’s available and convinces him it will be useful. And just handing over the data isn’t really good enough either – it needs to be used effectively to prove its value, and the only companies who’ll be able to use this data effectively will be the ones who’ll be able to integrate it with their existing BI systems, even if that BI system is the Excel spreadsheet that the small restaurant uses to plan its purchases over the next few weeks. Which of course may well require outside consultancy… and when you’ve got to this point, you’re basically doing all of the same things that most existing companies in the market research/corporate data provider space do today, albeit on a much smaller scale.
I don’t want to seem too negative about the idea of companies selling their data, though. I know, as a BI consultant, that there is an immense amount of interesting data now being collected that has real value to companies other than the ones that have collected it. Rather than companies selling their own data, however, what I think we will see instead is an expansion in the number of intermediary companies who sell data (most of which will be very small), and much greater diversity in the types of data that they sell. Maybe this is an interesting opportunity for BI consultancies to diversify into – after all, we’re the ones who know which companies have good quality data, and who are already building the BI systems to move it around. Do public data marketplaces still have a role to play? I think they do, but they will end up being a single storefront for these small, new data providers to sell data in the same way that eBay and Amazon Marketplace act as a single storefront for much smaller companies to sell second-hand books and Dr Who memorabilia. It’s going to be a few years before this ecosystem of boutique data providers establishes itself though, and I suspect that the current crop of public data marketplaces will have died off before this happens.