Archive for the ‘BI’ Category
Jamie Thomson and I share a number of… obscure enthusiasms. For instance, last week when he spotted the new forms/surveys feature in the Excel 2013 Web App (see here for a mention) he knew I’d be excited. And I was. Excited enough to devote a whole blog post to them.
What is this feature? Basically an homage to (or, less charitably, a rip-off of) the Google Docs functionality I mentioned here that allows you to create simple questionnaires and save the data back to a spreadsheet. To use it you need to create a new Excel spreadsheet in the Excel Web App (I can’t seem to find it in desktop Excel, and it may not even exist there), then click on Form/New Form in the ribbon:
This opens a new dialog where you can create your form/survey:
It’s all pretty self-explanatory from there, you just enter a title and description and then some questions, which can be various types (returning text, numbers, multiple choices etc):
You can then answer the questions yourself or send a link out to other people so they can too:
If you’d like to take the survey you can do so here btw.
The data then lands in a table in the original Excel spreadsheet, ready for you to do something useful with it:
For my next trick, and to go back to another issue that Jamie and I have been moaning about for years, I would have liked to consume the data in this table via an OData feed as detailed here:
Unfortunately I couldn’t get it to work. Whether this is a temporary problem or a limitation with Office 365 (as opposed to on-prem Sharepoint) I don’t know… if someone knows how to make it work, though, I’d be much obliged if you could leave a comment.
UPDATE: First of all, if you can’t see the survey don’t worry – the service seems to be very unreliable. Secondly I’ve got the OData feed working now and will blog about it later.
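In case it’s useful, here’s a very rough sketch in Python of what a consumer of a feed like this might look like. The feed URL below is a made-up placeholder (you’d copy the real one from the SharePoint/Office 365 UI, not from here), and the two response shapes handled are just the common OData JSON conventions:

```python
import json
import urllib.request

# Hypothetical placeholder - not the real Excel Services endpoint URL,
# which you'd copy from the SharePoint/Office 365 UI.
FEED_URL = ("https://example.sharepoint.com/_vti_bin/ExcelRest.aspx"
            "/Shared%20Documents/Survey.xlsx/OData/Table1")

def parse_odata_json(payload):
    """Pull the rows out of an OData JSON response.

    Older OData services wrap results as {"d": {"results": [...]}};
    OData v4 uses {"value": [...]}. Handle both shapes."""
    doc = json.loads(payload)
    if "value" in doc:                              # OData v4
        return doc["value"]
    d = doc.get("d", {})
    if isinstance(d, dict) and "results" in d:      # OData v2/v3
        return d["results"]
    return d if isinstance(d, list) else []

def fetch_survey_rows(url=FEED_URL):
    # Ask for JSON instead of the default Atom XML representation
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return parse_odata_json(resp.read().decode("utf-8"))
```

Once the rows are back as plain dictionaries you can of course do whatever you like with them – though if the destination is PowerPivot, it can consume the feed directly anyway.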
As you may already have read, the first public preview for Office 2013 is now available and there’s lots of new BI functionality in there – see here for an overview. Here’s a quick summary of what the two really important changes are:
- PowerPivot has been integrated into Excel, kind of. This means that the xVelocity (aka Vertipaq) engine is now native to Excel, and you can do all the basic PowerPivot stuff like loading vast amounts of data from multiple data sources and querying it via PivotTables directly in Excel, without installing any extra addins. PowerPivot does still exist as an optional extra however: you need it if you want to use the more advanced functionality that exists in PowerPivot today, such as filtering data before import, using diagram view, defining hierarchies and perspectives and so on.
- Power View has also been integrated into Excel: Power View reports become a new type of sheet inside a workbook, and you can use it against data held in the integrated xVelocity/PowerPivot engine; I’m not clear yet whether it will work on an SSAS 2012 Tabular model (and at some point an SSAS Multidimensional model, once support for DAX on Multidimensional models arrives) but I hope it does. No more need to moan about Power View being tied to Sharepoint!
There are a whole bunch of other BI-related changes in Excel which I’ll try to summarise in another post soon (stuff like the suggestions for charts and PivotTables, flash fill, and the timeline slicer). However I think that the two changes above represent a master-stroke on the part of Microsoft: they make Excel 2013 a serious contender in the self-service BI tool stakes. Certainly, other vendors will be quick to point out the features they have that Excel doesn’t, and dedicated BI vendors will always be able to add new features faster and more frequently than Excel, but that’s not the point. It won’t happen overnight, but at some point every company will upgrade to Office 2013 and when they do, all users will have a BI tool on their desktops that is vastly more capable than today’s Excel and good enough for the majority of BI scenarios – at which point the need even to look at third-party tools will disappear.
UPDATE A few clarifications:
- The PowerPivot addin, while still an addin, comes bundled with Excel – there’s no separate download
- As the comments below show, and I can confirm, Power View does work with SSAS 2012 Tabular models
I’ll be posting more details on Twitter throughout the day as I play with the new preview, and will post something more substantial here later.
In my last post I showed how to load data from Excel into Layerscape, the new data visualisation tool from Microsoft Research; in the post before that I showed how to load UK weather data from Windows Azure Datamarket into PowerPivot. Now let’s take that weather data and plot it on a map!
The first decision I had to make with this data was what to show. The main problem is that the data volumes involved are very close to the upper limit of what Layerscape can handle: I crashed Layerscape a few times, but that was when I was trying to load around 150,000 rows of data into it; just over 110,000 rows was fine. As a result I decided to show data only for cloud or rain, not sun (which is OK – if you want to visualise a clear day you don’t want to show anything on the map anyway) or mist or fog (which I was less happy about).
To achieve this I created a calculated column on my Three Hourly Forecast table in PowerPivot called Cloud with the following definition:
=IF([SignificantWeatherId]>=7, TRUE, BLANK())
Then, after having created a flattened PivotTable with the data I wanted to display, I dropped the Cloud field into the Values box in my PivotTable and was able to filter it to only show Weather Stations and time periods where there was cloud:
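To make the BLANK() trick clearer, here’s a rough Python analogue using some invented sample rows: returning a blank (None) rather than FALSE is the point, because rows whose Cloud value is blank simply disappear when you filter the PivotTable on the Cloud field:

```python
# Made-up sample rows standing in for the Three Hourly Forecast table.
rows = [
    {"Station": "Heathrow",  "SignificantWeatherId": 8},   # cloudy
    {"Station": "Edinburgh", "SignificantWeatherId": 1},   # sunny
    {"Station": "Cardiff",   "SignificantWeatherId": 12},  # rainy
]

def cloud_flag(significant_weather_id):
    # None plays the role of DAX's BLANK(): blank rows drop out of the
    # filtered PivotTable, so only cloudy/rainy rows survive.
    return True if significant_weather_id >= 7 else None

cloudy = [r for r in rows if cloud_flag(r["SignificantWeatherId"])]
print([r["Station"] for r in cloudy])  # ['Heathrow', 'Cardiff']
```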
I also created a few other calculated columns:
- ActualPredictionForStart and ActualPredictionForEnd: the former is mentioned in my previous post, and the latter returns a value three hours after the former:
=[Date] + (([TimeStep]+3)/24)
These two values represent the start time and the end time for each row in the Three Hourly Forecast table.
- WeatherDepth: in the Significant Weather table there’s a distinction made between low-altitude cloud and high-altitude cloud, and in Layerscape when you’re plotting data you can control how high off the ground a point is displayed, so the WeatherDepth column contains some fairly arbitrary numbers for cloud altitudes based on the [Code] column. Here’s the definition:
=SWITCH([Code], 5, 150, 6, 150, 7, 100, 8, 120, 10)
- WeatherColor: again, in Layerscape you can control the colour of your points and their opacity, which is useful for displaying dark/light and thin/thick clouds. The only compromise I had to make was to display dark clouds (ie where the clouds are described as black, or where the weather is rainy or snowy) in blue rather than black, because I couldn’t get Layerscape to display black; white clouds are shown in white. To calculate WeatherColor I created two intermediate calculated columns on the Significant Weather table. The first, InitialColor, contains colours for the weather types (including some I’m not displaying in this demo – I thought dust storms should be shown in yellow, for instance):
=SWITCH([Code], 1, "yellow", 2, "white", 3, "yellow", 4, "yellow", 5, "white", 6, "white", 7, "white", 8, "blue", "blue")
The second, Opacity, returns an opacity percentage for each weather code:

=SWITCH([Code], -99, 0, 0, 0, 1, 10, 2, 25, 3, 10, 4, 100, 5, 25, 6, 75, 7, 50, 8, 50, 9, 60, 10, 60, 11, 60, 12, 65, 13, 75, 14, 75, 15, 80, 90)
…making the definition of WeatherColor:
=[Opacity] & "% " & [InitialColor]
- Depth and Color: calculated columns on the Three Hourly Forecast table that simply copy the WeatherDepth and WeatherColor values down to the main fact table.
This screenshot gives you some idea of the values that the depth and colour calculations return:
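For illustration, here’s a rough restatement of these calculated columns in Python. The assumption that ActualPredictionForStart is just [Date] + ([TimeStep]/24) is mine (its definition is in the previous post), and the dictionaries simply transcribe the SWITCH definitions above, with the last SWITCH argument as the default:

```python
from datetime import datetime, timedelta

# ActualPredictionForStart/End: in the PowerPivot model dates are stored
# as day-number serials, so adding ([TimeStep]+3)/24 adds TimeStep+3
# hours; timedelta makes the same arithmetic explicit here.
def prediction_window(forecast_date, time_step_hours):
    start = forecast_date + timedelta(hours=time_step_hours)
    end = forecast_date + timedelta(hours=time_step_hours + 3)
    return start, end

# WeatherDepth, InitialColor and Opacity, transcribed from the SWITCHes.
DEPTHS = {5: 150, 6: 150, 7: 100, 8: 120}                    # default 10
COLORS = {1: "yellow", 2: "white", 3: "yellow", 4: "yellow",
          5: "white", 6: "white", 7: "white", 8: "blue"}     # default "blue"
OPACITY = {-99: 0, 0: 0, 1: 10, 2: 25, 3: 10, 4: 100, 5: 25, 6: 75,
           7: 50, 8: 50, 9: 60, 10: 60, 11: 60, 12: 65, 13: 75,
           14: 75, 15: 80}                                   # default 90

def weather_depth(code):
    return DEPTHS.get(code, 10)

def weather_color(code):
    # WeatherColor = [Opacity] & "% " & [InitialColor]
    return "{}% {}".format(OPACITY.get(code, 90), COLORS.get(code, "blue"))

start, end = prediction_window(datetime(2012, 4, 1), 6)
print(start, end)
print(weather_depth(7), weather_color(7))    # 100 50% white
print(weather_depth(12), weather_color(12))  # 10 65% blue
```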
With this all done I was able to load the data into Layerscape in the way I showed in my previous post, tweak some of the settings for the markers and time decay, and come up with a nice-looking visualisation. The big difference in this case compared to my previous examples is that here we have time series data and Layerscape is able to show values changing over time. The only thing I needed to do to make this happen was to check the Time Series box in the Layers pane on the main screen; with this done I could show the data for a particular point in time or let Layerscape cycle through time showing how the forecast weather changed.
I also created a video showing the tour:
I’m quite happy to admit that this tour doesn’t show off the full capabilities of Layerscape (I’m not sure my graphics card is up to the job, frankly) and I’ve not taken much time to ensure that the visualisation is as accurate as it could be, but I’m still quite proud of it!
I had an interesting chat with Gary Short at SQLBits last week about social network analysis, and he pointed out that Microsoft Research’s graph database, Project Trinity, was now available to download. You can find out more about it here:
Gary suggested that Trinity might, at some point in the future, follow F# in being an MS Research project that becomes a commercial product and to be honest I think he might be right. If Analysis Services is a database that is optimised for multidimensional data, why not add Trinity to the SQL Server stack as a specialised database for graph data? After all, social network analysis is only going to become more and more important in corporate BI in the future, and that’s just one use case for a graph database. And if Trinity becomes a product, why not include features of a tool like NodeXL in Project Crescent?
It was only a matter of time, but here’s what I think is the first use of Kinect for Microsoft BI:
OK so it’s very, very basic and only works on Dynamics Business Analyzer, but it shows what could be done. Maybe something to think about for a BI Power Hour presentation on Crescent? Although until the day comes that we all work in CSI-style labs with large screens to wave our arms around in front of, I seriously doubt it would have any practical use. Anyway, if you wanted to do something truly cool you wouldn’t bother moving at all – you’d build your reports using the power of your mind. I saw Guy Smith-Ferrier do a very entertaining session called “Mind Control Your Computer in C#” recently, so the idea isn’t as far-fetched as it seems!
My post about Sharepoint and the MS BI strategy last month generated a lot of debate; an equally hot topic is that of Microsoft’s mobile BI strategy, or lack of it. It’s something I’ve heard a lot of people sounding off about recently, which is why I found this article interesting:
Apart from a diplomatic quote from Donald Farmer in his current guise as QlikView evangelist-in-chief, the article picks up on a thread from the Microsoft BI Facebook page and perhaps tries to read too much into it…
It’s been clear for a while now that Sharepoint is at the heart of Microsoft’s BI strategy. The first sign was the way PerformancePoint was touted as a replacement for Proclarity. Then came the news that if you wanted to share models between multiple users in PowerPivot, you needed Sharepoint 2010 Enterprise Edition. In Denali, if you want to use cool new stuff like Crescent or the new SSRS alerting functionality you need Sharepoint. But is this a good thing for Microsoft BI? Behind the scenes a lot of people have been debating this question for a long time, so I thought it was an appropriate subject for a blog post – and you know how I like controversial topics…!
Let me start by saying that I have an opinion on this, but not one that I feel 100% sure of asserting: I have the nagging feeling that my own view of the market is too limited to know whether the Sharepoint strategy is good or not, so my mind isn’t completely made up (and in fact the more I think about this issue, the more unsure about my opinion I am). Also, I don’t think anyone has objections to the purely technical reasons for the Sharepoint strategy – after all, why should the various Microsoft BI teams be in the business of building portals when Microsoft has its own, extremely successful portal they can integrate with, which gives them a lot of rich functionality for free? The question is essentially a commercial one: will more customers buy Microsoft BI as a result of increased integration with Sharepoint (in turn leading to Microsoft and its partners making more money, which is all anyone really cares about), or will a Sharepoint dependency actually put customers off and drive them into the arms of Microsoft’s competitors?
The argument in favour of the Sharepoint strategy goes something like this:
- Microsoft’s BI products need portal functionality. Time and money for development of BI products is limited, so if the portal functionality can be got from Sharepoint then it can be delivered quicker, at a lower cost, and with time and money left over for other new functionality that would not be possible otherwise. More and better functionality means the customer is more likely to buy.
- Integrating with Sharepoint also gives the wider Microsoft BI offering a coherence it wouldn’t otherwise have (and something it has historically lacked), and the whole ends up being greater than its constituent parts. This lack of overlapping functionality looks good in front of the customer, and also increases the opportunity to cross-sell BI to existing Sharepoint customers and vice versa.
- Sharepoint is massively successful, one of Microsoft’s leading server products, so most customers you want to sell BI to will have Sharepoint anyway; therefore there will be little resistance to buying new tools that have a dependency on Sharepoint. The Sharepoint market is so large that even if only a small percentage of it is interested in or able to use MS BI, that’s still a massive potential market.
- Presumably, at some point there will be a fully-featured “Sharepoint in the cloud” with all the BI features baked in, which means that it will be even easier for companies to adopt it.
- Microsoft is well aware of the arguments against Sharepoint that are listed below, and because it wants the Sharepoint strategy to work it is taking action to address these problems of cost, complexity and uptake. One example is the increasing number of Microsoft BI appliances that are available, where all of the tough configuration decisions are made for you.
The argument against is this:
- Sharepoint is expensive (or at least perceived as expensive) in terms of license costs, infrastructure and administration, so it makes the overall MS BI solution more expensive to have a dependency on it.
- Sharepoint is a complex product (or at least perceived as complex), and Microsoft’s BI tools are pretty complex as well; integrating the two makes something even more complex. As a result, whereas in the past a single BI guy could just install SSAS, SSRS and so on on a server, now you need a BI guy and a Sharepoint guy to do all the setup and admin, which doubles the cost of labour; the added complexity also makes it more likely that the setup and admin will take longer. Microsoft BI products have traditionally seen a lot of their adoption come from internal IT departments taking the ‘it’s effectively free, so let’s install it somewhere and see what it does’ path, and this will become much less common because of the added overhead of Sharepoint.
- The added dependencies between Sharepoint and BI could actually make it slower to deliver new features because now there are multiple MS dev teams that need to work together, co-ordinate functionality and release cycles, and deal with conflicting priorities. History has shown that MS dev teams don’t always do this well (think of Excel and SSAS support), and even when they do some compromises are inevitable.
- Many customers do have Sharepoint, but not all of them have the editions or versions that the MS BI stack requires. And very often due to political divisions, an internal corporate Sharepoint team have their own agenda to follow which has no place for BI, and aren’t interested in upgrading to a certain version or otherwise accommodating the BI team when it might impact on their own goals.
- Some customers do not have Sharepoint and have made a conscious decision not to have it; these customers include not only the die-hard anything-but-Microsoft shops but also some who would be interested in a solution with fewer dependencies. For these customers, a Sharepoint dependency removes all question of the use of MS BI.
- The MS partner ecosystem, at least at the mid-level, is segregated into BI partners and Sharepoint partners, and while there’s a certain amount of convergence you still tend to find that many consulting companies are BI partners who do a bit of Sharepoint on the side or Sharepoint partners who do a bit of BI on the side, so not all of them are capable of selling or implementing an overarching BI-Sharepoint solution.
The nature of my work means that I get to see a lot of different Microsoft BI implementations, probably more than the average consultant. I reckon I work with around 30-40 different customers every year, ranging in size from one-man-bands to the biggest enterprises, and in the five or so years I’ve been in business I’ve only ever seen a relatively small number who actively use Sharepoint in combination with the rest of the Microsoft BI stack. If you work for a large partner that specialises in and actively sells Microsoft BI and Sharepoint you may have seen much greater use of Sharepoint than I have, and if you work for a specialist Sharepoint partner I dare say you only ever work with customers who are very committed to Sharepoint, so I’ll admit my point of view is biased. On the other hand I can’t deny the evidence of my own experience, and as a result my natural inclination is to be slightly sceptical about the Sharepoint BI strategy, because I don’t see any basis for the claims that Sharepoint is a ubiquitous platform and one that users actively want to integrate with BI. I’d also add that a couple of years ago I was equally sceptical about Excel’s central role in Microsoft’s SSAS and wider BI strategy, but now I see Excel used successfully on a wide range of SSAS projects and I’m very much more pro-Excel (although I’m not blind to Excel’s continuing shortcomings as an SSAS client tool for more advanced users). Maybe in a year or two’s time all my customers really will have Sharepoint, the Sharepoint strategy will bear fruit, and my fears will have been proved groundless.
So… what do you think about all this? What are your experiences with Sharepoint, do you have it in-house already or (if you’re a consultant) will you be able to sell a BI platform based on it? Please leave a comment…
So day one of the PASS Summit is drawing to a close, and what a day it’s been. I did a session myself today, which went well (I think), but the real news is all the cool new stuff that was announced at the keynote and at sessions throughout the day and that’s what I’d like to reflect on here. A lot of questions about what we’ve seen today remain unanswered (more will be revealed tomorrow, I’m told) but I thought I’d blog about what interested me and was relevant to BI.
All in all, I’m very excited. Let’s face it – over the last five years, in SSAS and in the wider BI stack, there’s been a distinct lack of anything really radical and new. But today saw several announcements that will completely change the Microsoft BI stack:
- There will be column-store indexes, provided by the Vertipaq engine, inside the SQL Server relational database. Simon Sabin has a link to more details here.
- There’s a new, corporate BI version of PowerPivot, BISM
- Project Crescent is a new ad hoc query tool coming from the Reporting Services team
Let’s consider what each of these means in turn. Based on the limited information we’ve got so far, column-store indexes in SQL Server massively increase the performance of reporting/OLAP style queries inside the relational engine; if we get column-store indexes in Parallel Data Warehouse, well, I can imagine we’ll get astounding performance over huge data volumes. The pdf linked to in Simon Sabin’s blog says:
Users who were using OLAP systems only to get fast query performance, but who prefer to use the T-SQL language to write queries, may find they can have one less moving part in their environment, reducing cost and complexity. Users who like the sophisticated reporting tools, dimensional modeling capability, forecasting facilities, and decision-support specific query languages that OLAP tools offer can continue to benefit from them. Moreover, they may now be able to use ROLAP against a columnstore-indexed SQL Server data warehouse, and meet or exceed the performance they were used to in the past with OLAP, but save time by eliminating the cube building process.
To paraphrase, if your data’s in SQL Server that’s where it should stay for querying and reporting – MOLAP no longer provides any performance benefit, so the concept of ‘processing’ a cube to get data into a different, OLAP database is gone. If you still want to do OLAP on SQL Server it will be a form of ROLAP, and to me this makes a lot of sense.
Very few details about BISM, the ‘corporate’ version of PowerPivot/Vertipaq, were released, but we saw that we could develop BISM models in Visual Studio and the end result was exactly what PowerPivot creates when you deploy a PowerPivot model to Sharepoint – I guess it’s basically a different type of Analysis Services database. BISM can either hold the data itself in its own Vertipaq store (useful when the data comes from anywhere other than SQL Server, eg Excel, Oracle, text files and so on) or it can act purely as a semantic layer and allow querying data in SQL Server in a ROLAP, multidimensional way. So we’ll be able to create an Excel pivot table, connect to BISM and use that as a thin layer to query data that’s directly in SQL Server. This is clearly what the future of Analysis Services is going to be – while the Analysis Services we know and love today might hang around for a few versions, it’s not got a long-term future in my opinion.
There was a full session on Project Crescent later on today, which I went to, so I have more details on this. It’s essentially a new ad hoc query tool being built by the SSRS team – interestingly it seems to have nothing to do with the rest of Reporting Services, and it doesn’t create rdl files that can be edited in Report Builder or Report Designer. It’s all about flashy visualisation and interactivity and reminds me a lot of Tableau and other fashionable BI tools; I can also see how it could be a replacement for PerformancePoint for creating dashboards. It is only surfaced in Sharepoint (boo! yet another Sharepoint dependency!) and is built in Silverlight; also, it can only work with data sourced from BISM/PowerPivot models. Once you get past the flashy stuff it does much the same thing that every other ad hoc query tool has been doing since the year dot (Marco, sitting next to me, commented that it was doing much the same thing that Data Analyzer was doing 10 years ago), but the flashy stuff is very flashy indeed, and very impressive – for instance the ability to export views out to PowerPoint slides looks cool. I do wonder whether it will be as practically useful as something like Tableau, though, so we’ll have to wait and see. I’m pleased to see that someone at MS has finally woken up to the fact that end users might want to use something other than Excel for ad hoc querying.
OK, time for a few drinks and to have some fun! There’ll be another post with more news tomorrow…
I’m in the middle of my summer holiday at the moment, hence the relative silence here, but I just noticed this article by Mary Jo Foley suggesting that Microsoft might be turning Dryad into a commercial product next year and couldn’t resist posting it:
There was a flurry of interest in it a few years ago but I thought it had all gone quiet; maybe this is the source of the Microsoft/Hadoop rumours too? I also notice that SQL Server/SSIS has disappeared from the ‘software stack’ diagram; is there going to be any tie-in with the rest of Microsoft’s BI efforts?
I first came across OData last year when Jamie Thomson blogged about it; since then I hadn’t really thought about it much until I came across Douglas Purdy’s blog and specifically his post from yesterday which really brought home how important it is to the future of Microsoft BI. I would urge you to watch the video “OData: The Movie” that he mentions in his post because it gives a really good introduction to the way that OData can be used in BI scenarios; if you don’t have the time, all you really need to know is that it’s OData that makes it possible for PowerPivot to consume data from SSRS reports and Sharepoint lists.
Just watching this video made my mind boggle with the possibilities of OData, although since I’m a long way from being an expert in this area I won’t bother to detail all of these fantasies as they’re probably rubbish. However, just consider how much easier life would be for the PowerPivot user of the future if the internet was full of sites that supported OData; certainly, when I’ve looked at the new UK government websites such as the recently-launched http://data.gov.uk/ and http://data.london.gov.uk/, or even sites like the Guardian Data Store that I blogged about last year, I’ve felt that the lack of a standard format to consume this data (a badly-formatted Google spreadsheet is not ideal) has seriously limited these sites’ usefulness. And what if Bing came up with a service like the sadly-useless Google Squared, where you could search for data on the web and return it in a structured, OData format?
I’d also love to see Analysis Services support OData, in some shape or form. Perhaps it could be used to solve some of the same problems with XMLA that Julian Hyde lists in his recent post on xmla4js. Wouldn’t it also be cool if there was a standard interface for publishing an MDX query up to SSAS from any client tool, and then being able to consume it via OData (similar to what I suggested here)? You’d then enable scenarios like this: a power user creates a query using an advanced SSAS client tool, then publishes it up to SSAS, whereupon it becomes something similar to a SQL stored proc with the results available as an OData feed, and therefore can be consumed either in your own code (eg in a website), by PowerPivot users, or by something like SSRS (which would know nothing about the query used but would just be used to format the resultset).