Archive for the ‘Cloud’ Category
You may have already seen the announcement about Windows Azure Virtual Machines today; what isn’t immediately clear (thanks to Teo Lachev for the link) is that Analysis Services 2012 Multidimensional and Reporting Services are installed on the new SQL Server images. For more details, see:
SSAS 2012 Tabular is also supported but not initially installed.
Wait, I know what you’re thinking: Access, isn’t that dead yet? Well, no – and if you’ve been reading the blogs about Access 2013 you’ll know that it’s undergone something of a transformation, one that’s very interesting from a BI point of view. The key change is mentioned here:
One of the biggest improvements in Access 2013 is one you may not even notice—except that your whole app will be faster, more reliable, and work great with large amounts of data. When Access databases are published to SharePoint—whether on-premise or through Office 365—a full-fledged SQL Server database is automatically created to store the data. Advanced users who are already familiar with SQL Server will be able to directly connect to this database for advanced reporting and analysis with familiar tools such as Excel, Power View, and Crystal Reports. Everyday users can rest assured that their apps are ready for the future if they ever need to enhance them with advanced integrations or migrations.
So while Access 2013 is still a desktop database, the Access Web App is essentially a power-user-friendly tool for creating SQL Server/Azure SQL database applications. As Tim Anderson points out here (in a post that’s also worth a read) there seems to be a bit of an overlap with LightSwitch; but that’s incidental here. The real point I wanted to make is that this is another key piece of the Microsoft self-service BI stack falling into place. By the time users are working with Office 2013 for real, I can imagine some quite sophisticated self-service BI solutions being built where data is loaded into a SQL Server database designed in Access (maybe using Data Explorer?) before it gets to Excel/PowerPivot, a much more robust approach than loading data direct from the original source into Excel/PowerPivot. I’m sure there’ll still be plenty of opportunity for SQL Server DBAs to look down on the work of Access developers, but it looks like this will give Access a new lease of life.
Unfortunately it looks like Access 2013 Web Apps won’t support OData just yet. Here’s a comment from Todd Haugen, a program manager on the Access team, on the first blog post referenced above:
Sorry to say we did not get to enable support for OData at RTM. This is a key area we are looking at for the next release. In the near-term SQL Azure will be turning on ODBC access which will allow you to hook Excel and PowerPivot together with Access. This feature will be available by RTM.
I had hoped to be able to write up a demo of PowerPivot connecting to a database created with the Access Web App, but this comment (and my inability to get it working, even though I can see the server name and database name I’d need to connect to in Access) means you’ll just have to imagine what might be possible…
Jamie Thomson and I share a number of… obscure enthusiasms. For instance, last week when he spotted the new forms/surveys feature in the Excel 2013 Web App (see here for a mention) he knew I’d be excited. And I was. Excited enough to devote a whole blog post to them.
What is this feature? Basically an homage to (okay, a rip-off of) the Google Docs functionality I mentioned here that allows you to create simple questionnaires and save the data back to a spreadsheet. To use it you need to create a new Excel spreadsheet in the Excel Web App (I can’t seem to find it in desktop Excel and it may not even exist there), then click on Form/New Form in the ribbon:
This opens a new dialog where you can create your form/survey:
It’s all pretty self-explanatory from there, you just enter a title and description and then some questions, which can be various types (returning text, numbers, multiple choices etc):
You can then answer the questions yourself or send a link out to other people so they can too:
If you’d like to take the survey you can do so here btw.
The data then lands in a table in the original Excel spreadsheet, ready for you to do something useful with it:
For my next trick, and to go back to another issue that Jamie and I have been moaning about for years, I would have liked to consume the data in this table via an OData feed as detailed here:
Unfortunately I couldn’t get it to work. Whether this is a temporary problem or a limitation with Office 365 (as opposed to on-prem Sharepoint) I don’t know… if someone knows how to make it work, though, I’d be much obliged if you could leave a comment.
UPDATE: First of all, if you can’t see the survey don’t worry – the service seems to be very unreliable. Secondly I’ve got the OData feed working now and will blog about it later.
In part 1 of this series I showed how to build a BI dashboard with UK weather data in Excel 2013. What I’m going to do now is show, very quickly, how this dashboard can be shared with other people via a web browser.
Before I do that, though, an aside: I found a bug in the preview version of Excel 2013 which means that the dashboard I built in the previous post won’t work with what I’m going to describe here. It’s related to how I imported the data – in my last post I imported data from the Azure DataMarket from the Data tab, and what I’ve had to do for this post is import data from inside the PowerPivot window instead. In a way it’s a good thing I did this because the UI for importing data from the Azure DataMarket in PowerPivot is much better than what I showed before (although it too is quite buggy at the moment). No more fumbling around for account keys and OData queries in the way that Rob Collie complains about here, it’s all handled from within the wizard. All you need to do is pick a data set:
And then check the tables you want and optionally apply filters to certain columns that support it:
Anyway, with my workbook rebuilt, the next thing I’m going to do is save it. There’s actually a lot more to saving in Office 2013 than you might expect, and certainly lots of different options for places to save stuff to, so I recommend you read this post to get a feel for what’s possible. I’ve signed up for the Office 365 Professional Plus Preview which means I have access to all kinds of cloud-based services including Sharepoint in the cloud, so I can save my workbook up to Sharepoint:
I can then go to my Sharepoint site on another computer and view and interact with my dashboard using Office Web Apps in the browser:
Yes, that’s my Power View dashboard in a web browser – and I can change filters and it all works! Here’s my PivotTable:
I can not only drill down, but I can even change what’s on rows and columns by dragging and dropping fields in the Field list in the usual way. It’s a bit slow and (again) a bit buggy at the moment, but by RTM I can see this being a big selling point: anyone with Office 365 will have in place everything they need not only for BI on the desktop, but for sharing their reports over the web (though data refresh may be a problem here). I’m very impressed, and it’s good to see such a strong answer to the question I asked here last year.
Finally, the question I know you’re all dying to ask. Does this work on an iPad? Well, yes and no. The PivotTable works properly and there’s no loss of interactivity, although I’d say that the touch interface doesn’t work all that smoothly yet:
Unfortunately the Power View sheet doesn’t work at all – no Silverlight!
Oh well, maybe that was too much to ask for… this is clearly not the Microsoft mobile BI solution that was promised at PASS last year. That’s not to say it isn’t useful though – there’s still some good stuff you can do with PivotTables and regular Excel sheets. I’m still very happy with what’s been delivered so far!
Unsurprisingly there’s been a lot of interest in Office 2013 since it was announced yesterday, and I’m certainly very excited by all of the new BI features in it. Luckily I wasn’t working today so I had the chance to have a proper play with it, and I thought it would be useful to walk through the process of building a simple BI solution in Excel 2013 to show off some of the new features.
Let’s start with a blank workbook:
Some people don’t like the new look for Office, but I quite like it. You can even set custom backgrounds: you may just be able to see some wispy clouds in the top right-hand corner of the screenshot above. But anyway, to business. To import some data, I first need to go to the Data tab as normal:
As you can see there are some new options available here, and I’m going to go to the Windows Azure Marketplace. Actually, I’m going to cheat a little and just say that I’m going to import the UK weather forecast from 12:00am today (July 17th) along with some related tables as described in this blog post. The UI for the import wizard is plain but functional:
Interestingly, the data is always imported at the end of the wizard even if I check the ‘Only Create Connection’ option on the last step of the wizard.
Once the data from all the tables has been imported, I need to specify some relationships. I can either do this by clicking on the Relationships button in the Data tab or (better still) by going to the PowerPivot window and using the Diagram View. Now, as I said yesterday, the xVelocity in-memory database is integrated into Excel, but I still need to use the PowerPivot addin in some cases. The PowerPivot addin comes bundled with Excel 2013 but isn’t enabled by default, so in order to use it I first need to enable it; the steps to do this are detailed here. I can then click on the Manage button in the PowerPivot tab to open the PowerPivot window:
And then build some relationships between my tables in Diagram View, which can be reached by clicking on the small Diagram button in the very bottom right hand corner of the PowerPivot window. Once in the Diagram View, building relationships is simply a matter of dragging one column onto another:
I can also build hierarchies here; Duncan Sutcliffe shows how to do this here, and also how to use hierarchies in Power View which is something new. I’ve created a hierarchy going from Region to Weather Station.
With that done I can of course build a PivotTable. The first thing I’ve noticed is that there isn’t now a separate type of PivotTable for PowerPivot, which is kind of a shame because this means that you don’t seem to be able to create Slicers in the Field List any more, and have to go up to the Ribbon. It’s good for consistency though I suppose. Measure aggregation and number formats are all handled in the Value Field Settings dialog:
There’s a new type of Slicer available too for dates called the TimeLine which Raphael has a lot of good detail on here.
Here’s what the new Quick Explore option, which provides some options on where to drill to and for charts to create, looks like when I hover over a Region on Rows:
Really, though, the thing to do is to create a Power View sheet. This can be accomplished by going to the Insert tab and clicking the Power View button:
This is where things get exciting. Power View in Excel now supports maps and because my weather data contains the latitude and longitude of all of the weather stations in the UK it’s very easy to plot each weather station on a map and visualise the temperature and predicted weather for each station. To do this I just need to drop the Region Name column onto the Filters selection (choosing a region reduces the number of Weather Stations displayed down to a manageable number for the purposes of this demo), then drop the Latitude and Longitude columns onto the main canvas, turn the resulting table into a map, and then set the size of the markers to show temperature and the colour to show the type of weather:
I’ll be speaking at SQL South West this week so what’s the weather like down there at the moment?
From this I can see that today there’s fog in the Scilly Islands (shown by the green marker in the bottom left hand corner) and that it’s very cold and foggy in Liscombe (shown by the small red marker in the middle of the map). This dashboard was ridiculously easy to create, by the way, and I am already completely in love with Excel 2013 simply because of Power View – it’s a killer feature in my opinion.
At this point we’ve got a working self-service BI solution, made very quickly in Excel 2013 and it’s something that every competent Excel user would be able to achieve.
In Part 2, you’ll see how you can share this solution with your co-workers in the Office Web App.
I was just reading the following blog post on the OASIS OData Technical Committee Call for Participation:
…when I saw this:
In addition to the core OData version 3.0 protocol found here, the Technical Committee will be defining some key extensions in the first version of the OASIS Standard:
- OData Extensions for Data Aggregation – Business Intelligence provides the ability to get the right set of aggregated results from large data warehouses.
- OData Extensions for Analytics enable OData to support Business Intelligence by allowing services to model data analytic "cubes" (dimensions, hierarchies, measures) and consumers to query aggregated data.
Follow the link in the quoted text – it’s very interesting reading! Here’s just one juicy quote:
OData services expose a data model that describes the schema of the service in terms of the Entity Data Model (EDM), an Entity-Relationship model that describes the data and then allows for querying that data. The responses returned by an OData provider are based on that exposed data model and retain the relationships between the entities in the model. Adding the notion of aggregation to OData, without changing any of the base principles in OData as is, has two sides to it:

1. Means for the server to describe an “analytical shape” of the data represented by the service
2. Means for the client to query an “analytical shape” on top of any given data model (for sufficiently capable servers/services)

It’s important to notice that, while each of these two sides might be valuable in its own right and can be used independently of the other, their combination provides additional value for clients. The descriptions provided by the server will help a consumer understand more of the data structure looking at the service’s exposed data model from an analytics perspective, whereas the query extensions allow the clients to express an “analytical shape” for a particular query. The query extensions will also allow clients to refer to the server-described “analytical shape” as shorthand.
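To make this a bit more concrete, here’s a sketch of the kind of query a client might send once servers support aggregation. A word of warning: the $apply query option and its groupby/aggregate grammar are taken from the draft extension and were still subject to change at the time of writing, and the service root, entity set and column names below are entirely made up for illustration:

```python
from urllib.parse import quote

def build_aggregation_url(service_root, entity_set, group_by, measure, agg, alias):
    """Build an OData query URL asking the server to group and aggregate for us."""
    apply_expr = "groupby(({0}),aggregate({1} with {2} as {3}))".format(
        group_by, measure, agg, alias)
    # The expression goes in the (draft) $apply query option, URL-encoded
    return "{0}/{1}?$apply={2}".format(service_root, entity_set, quote(apply_expr))

# Hypothetical service and names, purely for illustration
url = build_aggregation_url(
    "https://example.com/odata", "Sales", "Region", "Amount", "sum", "TotalAmount")
print(url)
```

The interesting thing is that the server does the aggregation, so a client like PowerPivot would only ever see the summarised rows rather than having to pull down the whole fact table.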
I know one of the authors, Siva Harinath, from the SSAS dev team, and it’s easy to see how this extension to OData would be incredibly useful for Microsoft BI. Not only could you imagine cloud-based SSAS (when it finally arrives) being queryable via OData, but it would also allow an OData provider to supply not only raw data to PowerPivot but the design of the PowerPivot model itself.
I don’t always agree with everything Rob Collie says, much as I respect him, but his recent post on the Windows Azure Marketplace (part of which used to be known as the Azure Datamarket) had me nodding my head. The WAM has been around for a while now and up until recently I didn’t find anything much there that I could use in my day job; I had the distinct feeling it was going to be yet another Microsoft white elephant. The appearance of the DateStream date dimension table (see here for more details) was for me a turning point, and a month ago I saw something really interesting: detailed weather data for the UK from the Met Office (the UK’s national weather service) is now available there too. OK, it’s not going to be very useful for anyone outside the UK, but the UK is my home market and for some of my customers the ability to do things like use weather forecasts to predict footfall in shops will be very useful. It’s exactly the kind of data that analysts want to find in a data market, and if the WAM guys can add other equally useful data sets they should soon reach the point where WAM is a regular destination for all PowerPivot users.
Importing this weather data into PowerPivot isn’t completely straightforward though – the data itself is quite complex. The Datamarket guys are working on some documentation for it but in the meantime I thought I’d blog about my experiences; I need to thank Max Uritsky and Ziv Kaspersky for helping me out on this.
The first step in the process of importing this data is to go to the Azure Marketplace and construct a query to get the slice of data that you want – this is a big dataset and you won’t want to get all of it. Once you’ve signed in, go to https://datamarket.azure.com/dataset/0f2cba12-e5cf-4c6d-83c9-83114d44387a, subscribe to the dataset and then click on the “Explore this Dataset” link:
This takes you to the query builder page, where you get to explore the data in the different tables that make up this dataset:
You choose the table you want to explore in the confusingly-named ‘Query’ dropdown box on the right-hand side of the screen. The available tables are:
- ThreeHourlyForecast, a fact table containing three-hourly weather forecasts
- ThreeHourlyForecastArchive, a fact table containing aggregated, averaged values for the various forecasts for a given date and time
- SignificantWeather, a dimension table containing the different types of weather that can be forecast
- Visibility, a dimension table containing the different levels of visibility
- DailyForecast, a fact table containing daily weather forecasts
- Observations, a fact table containing observed weather
- Site, a dimension table containing all the UK’s weather stations
As far as I can tell, this data is more or less the same as what’s available through the Met Office’s own DataPoint service, and the documentation for this is here: http://www.metoffice.gov.uk/public/ddc/datasets-documentation.html
Once you’ve selected a table you can construct a filter by entering values in the Optional Parameters boxes below the query dropdown. These changes are then reflected in the URL shown at the top of the screen:
This URL represents an OData query. One thing I didn’t notice initially is that the query that is generated here includes a top 100 filter in it which you’ll need to remove (by deleting &$top=100 from the end of the URL) if you want to do anything useful with the data; you might also want to build a more complex query than is supported by the query builder, and you can learn how to do this by reading this article.
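Removing that row limit is just a bit of string surgery on the generated URL; here’s a small sketch (the dataset path is abbreviated and the filter shown is only a placeholder, not the exact query I used):

```python
def remove_top_filter(url):
    """Strip the default $top=100 row limit that the query builder appends."""
    for suffix in ("&$top=100", "?$top=100"):
        if url.endswith(suffix):
            return url[: -len(suffix)]
    return url

# Abbreviated, placeholder URL for illustration only
url = "https://api.datamarket.azure.com/.../ThreeHourlyForecast?$filter=StartTime%20eq%200&$top=100"
print(remove_top_filter(url))
```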
In my case I decided to look at the full three hourly forecast data. As I said, this is a big dataset – initially I thought I’d download the whole thing, but 18 million rows and several hours later I cancelled the import into PowerPivot. Instead I opted to look at data for the whole of the UK for just one forecast made on one day, which worked out at a more manageable 250000 rows. What’s not clear from any of the current documentation is what all of the columns in the three hourly forecast fact table represent:
- Date is the date the forecast is issued
- StartTime is the time the forecast is issued and is either 0, 6, 12 or 18, representing midnight, 06:00, 12:00 and 18:00 – new forecasts are issued every 6 hours
- PredictionTime is the time that an incremental update to a forecast is issued; these incremental updates appear every hour. PredictionTime is an actual time value going from 0 to 23 representing the hour the incremental update was issued.
- TimeStep is an offset in hours from the StartTime, and represents the time that the forecast is predicting the weather for. It ranges in value from 0 to 120, going up in 3s (so the values go 0, 3, 6, 9… 120), meaning we have weather predictions for 5 days into the future for each forecast.
Therefore, for any given row in the ThreeHourlyForecast table, if the Date is April 10th 2012, StartTime is 6, PredictionTime is 8 and TimeStep is 9, then this is data from a forecast that was issued on April 10th 2012 at 8am (the second incremental update to the 6am forecast) and this row contains the prediction for the weather for the time StartTime+TimeStep = 6 + 9 = 15:00 on April 10th 2012.
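Since this arithmetic is easy to get wrong, here’s the worked example above expressed as a small sketch, using the column meanings described in this post (TimeStep can run up to 120, so the predicted time can fall several days after the forecast date):

```python
from datetime import datetime, timedelta

def decode_forecast_row(date, start_time, prediction_time, time_step):
    """Work out when a forecast row was issued and which time it predicts for."""
    issued = date + timedelta(hours=prediction_time)
    predicted_for = date + timedelta(hours=start_time + time_step)
    return issued, predicted_for

# The worked example from the text: Date=April 10th 2012, StartTime=6,
# PredictionTime=8, TimeStep=9
issued, predicted_for = decode_forecast_row(datetime(2012, 4, 10), 6, 8, 9)
print(issued)         # issued at 08:00 (second incremental update to the 6am forecast)
print(predicted_for)  # the prediction is for 15:00 on April 10th
```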
Here’s the OData URL I used to grab data for the three hourly forecast issued on April 10th at midnight (StartTime=0 and PredictionTime=0) for all weather stations and all time steps:
To use this URL in PowerPivot, you need to create a new PowerPivot workbook, open the PowerPivot window and then click the From Azure DataMarket button:
Then enter your query URL and Account Key (which you can find on the Query Explorer page by clicking on the Show link, as indicated in the screenshot above):
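PowerPivot handles authentication for you once you’ve pasted the key in, but for the curious: as far as I can tell the DataMarket accepts the account key as the password in a standard HTTP Basic Authorization header with an empty user name. Treat the details below as an assumption rather than gospel:

```python
import base64

def datamarket_auth_header(account_key):
    """Build the HTTP Basic Authorization header value for a DataMarket account key."""
    # Empty user name, account key as password: "Basic base64(':' + key)"
    token = base64.b64encode((":" + account_key).encode("utf-8"))
    return "Basic " + token.decode("ascii")

print(datamarket_auth_header("MyAccountKey"))  # substitute your real key here
```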
Having imported this data I also imported the whole of Site (renamed here to Weather Stations) and SignificantWeather tables to give the following PowerPivot model:
Here are the joins I used:
I also created a few calculated columns, including one called ActualPredictionForStart which added the TimeStep to the Start Time and the Date to get the actual date and time that the prediction is for:
=[Date] + (([StartTime] + [TimeStep])/24)
With this all done, I was able to find out what the predicted weather for the current time and my home town was in this (as of the time of writing) two-day old forecast:
…and do all the usual PivotTable-y and chart-y things you can do with data once it’s in Excel:
Incidentally, the forecast is wrong – it’s not raining outside right now!
PivotTables and Excel charts are all very well, but there’s a better way of visualising this data when it’s in Excel – and in my next post I’ll show you how…
UPDATE: First of all, I owe an apology to the Met Office – as soon as I hit publish on this post it started pouring with rain, so they were right after all. Secondly, in a weird coincidence, Rob Collie posted about using the other weather dataset in the DataMarket on his blog: http://www.powerpivotpro.com/2012/04/download-10000-days-of-free-weather-data-for-almost-any-location-worldwide/
A few things got released in the past week or so that I would normally have devoted a short blog post to; now I’ve finally made it onto Twitter I tend to just tweet about them instead, but I still think blogging is the best way for me to get my thoughts together about what they actually mean. Let’s consider this ‘stuff’ in isolation first:
- The first CTP of Hadoop on Azure went live. Denny Lee’s blog post is probably the best place to start to learn about it; the video in the post is a very useful quick overview of what it does too. I noticed that it supports importing data from the Windows Azure Marketplace (what used to be called the Azure Datamarket).
- The Azure Marketplace also got a speed upgrade, as Boyan Penev notes here. This is important because whenever I’d tried to use it in the past its appalling performance had stopped me doing anything interesting with it at all. This, plus the fact that you can now publish your own data there, turns what was something of a white elephant into what could be an integral part of the Microsoft cloud BI platform.
- Version 2 of Google BigQuery went live, although it’s still in beta.
- The CTP of Data Explorer got released, which of course I’ve blogged about already here and which Jamie blogged about here and here.
- Microsoft announced Office 365 compliance with leading EU and US standards for data protection and security, which means fewer of those legal worries about whether you’re allowed to put all that interesting data you want to use for BI into the cloud.
From this it’s clear that Microsoft’s cloud BI platform is beginning to take shape, as are competing cloud BI platforms (if we assume that Google actually has a cloud BI strategy, and I think it has), and I think it’s fair to say Microsoft is well placed. There’s also yet more evidence, as if it was not blindingly obvious already, that Excel is at the heart of Microsoft’s BI strategy. Look at how data from Hadoop on Azure can be imported directly into Excel, and how this is flagged up as an important selling point, in the video on Denny’s blog post. However I think Microsoft needs to make even more of an effort to make everything Excel-like: now that it’s building a new BI platform from scratch it has a very rare opportunity to do this and to produce a truly coherent set of tools rather than the traditional grab-bag of technologies that make up the current Microsoft BI stack and which the cloud platform could also end up as too. Actually I’d like to go further and say that rather than have a bunch of separate cloud BI tools MS should make everything BI a feature of Excel, whether it be Excel on the desktop or Excel in the cloud. This might seem a controversial thing to say, but if MS is committed to true mass-market, self-service BI then Excel has to be the platform and MS needs to base its cloud BI strategy on it 100%.
Here are a couple of the things I’d like to see happen to achieve this:
- Data Explorer’s expression language needs to be made consistent with Excel syntax, in the way that DAX is consistent with Excel syntax. While I like what Data Explorer can do at the moment I hate having to learn a new syntax and a new set of functions for each tool I use (think SSRS expressions, SSIS expressions, MDX and so on) and it makes me much less productive when I do. I want to use the same syntax to write formulas in my spreadsheet, calculations in DAX and calculations/transformations in Data Explorer too – after all, we want to do a lot of the same things in ETL and reporting like lookups, aggregation and pivoting.
- Hadoop on Azure is an important tool to have, not least because Hadoop is already so widely used in the wider world so it adds some credibility to the MS BI stack, but I’d like to be able to control massively parallel calculations from inside Excel and not just consume the results of them there. I’m thinking something like DataScope or the way you can scale out Excel calculations on Windows HPC, maybe driven though a PowerPivot-like interface with calculations expressed in DAX, or Data Explorer (which links back to my last point, because if Data Explorer expressions were DAX it needn’t be an either/or choice). It could of course still be Hadoop at the back-end with a layer on top to make it consistent with Excel.
- We desperately need a proper API for the Excel Web App (yes, here we go again…). An OData API for importing and exporting data from spreadsheets on the web is only the first step; in the long term I’d want the Excel Web App to have capabilities like Project Dirigible, so that Excel on the desktop could become a kind of client tool for a massively scalable Excel Server in the cloud (note that I didn’t say Excel Services, which is more like Excel-on-the-server than a real Excel Server in my opinion). I’d want to be able to store data in the cloud and automatically synchronise it with Excel on multiple desktops or elsewhere in the cloud. I’d also want to create Excel spreadsheets in the cloud that acted like functions (similar to Project Dirigible), where these functions could be called from other Excel spreadsheets which again could be on the desktop or be in the cloud.
- We need Analysis Services in the cloud. We’ve known it’s been coming for a long time, but not what form it will take. Again, I’d like to see tight integration with Excel similar to how PowerPivot works on the client or tighter, where Analysis Services would not be a separate service but just be the functionality within Excel for manipulating tables of data for reporting purposes. I’d want Data Explorer to be able to create and load these tables.
- We also need the full BI functionality of desktop Excel – pivot tables and cube formulas – in the Excel Web App. I would guess this is in the pipeline anyway since it’s a pretty obvious requirement if Microsoft’s cloud BI strategy is going to work. I also don’t see the point of Power View being a separate app – I’d like to see it become a data visualisation feature of Excel.
- Finally, I’d like to see some way of tying all the data held in this gigantic data store together. You’d want to be able to search it, certainly, but also understand where it’s come from and what it actually represents. It sounds like maybe this is what Project Barcelona is aiming for…
Coordinating the work of multiple teams at Microsoft in the way this would demand is a particularly thankless task, I know. But maybe, just maybe, someone senior (maybe a Technical Fellow like Amir…?) could pull off something this ambitious?
I’ve just seen there’s a new release on the SQL Azure Labs site – Codename “Social Analytics” (that’s a bad codename by the way – did MS finally run out of place names in Washington state?). Full details are here:
It’s a service that allows you to collect social web data and then either use it for business analysis or automate responses; it has an OData endpoint so the obvious tool for doing the analysis is PowerPivot, and I guess it could also be a data source for Data Explorer when that comes along. At the moment it’s very limited in that you can only analyse the results of one of two feeds that aggregate data about either Bill Gates or Windows 8, which is a shame, but it should still be fun to play with.
Jamie Thomson just tipped me off about something new and very interesting – one week after I had a moan about Microsoft doing nothing about Excel and the cloud, here comes Excel DataScope:
Here’s the blurb from the site:
From the familiar interface of Microsoft Excel, Excel DataScope enables researchers to accelerate data-driven decision making. It offers data analytics, machine learning, and information visualization by using Windows Azure for data and compute-intensive tasks. Its powerful analysis techniques are applicable to any type of data, ranging from web analytics to survey, environmental, or social data.
There are yet more tantalising details in the video and the two pdfs here:
I’m currently trying to find out more about all this, but there’s clearly a ton of cool stuff here:
- You can use the Live Labs Pivot control for visualisation.
- It does data mining in the cloud. Is this the successor to the old data mining addin? The functionality is clearly the same.
- There’s a section on Map Reduce running on Windows Azure on one of the posters. Is this using Dryad?
Is this a first glimpse of a new cloud-based BI platform from Microsoft? Will SSAS in the cloud form part of it? Before we all get too excited (or at least I get too excited) it’s worth noting that this is coming from the eXtreme Computing Group and not the SQL Server team, it’s clearly aimed at scientific rather than business users, and is described as “an ongoing research and development project”, ie it is not a commercial product. The potential is obvious though, and I hope it becomes something significant.