Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Importing UK Weather Data from Azure Marketplace into PowerPivot

with 18 comments

I don’t always agree with everything Rob Collie says, much as I respect him, but his recent post on the Windows Azure Marketplace (part of which used to be known as the Azure Datamarket) had me nodding my head. The WAM has been around for a while now and up until recently I didn’t find anything much there that I could use in my day job; I had the distinct feeling it was going to be yet another Microsoft white elephant. The appearance of the DateStream date dimension table (see here for more details) was for me a turning point, and a month ago I saw something really interesting: detailed weather data for the UK from the Met Office (the UK’s national weather service) is now available there too. OK, it’s not going to be very useful for anyone outside the UK, but the UK is my home market and for some of my customers the ability to do things like use weather forecasts to predict footfall in shops will be very useful. It’s exactly the kind of data that analysts want to find in a data market, and if the WAM guys can add other equally useful data sets they should soon reach the point where WAM is a regular destination for all PowerPivot users.

Importing this weather data into PowerPivot isn’t completely straightforward though – the data itself is quite complex. The Datamarket guys are working on some documentation for it but in the meantime I thought I’d blog about my experiences; I need to thank Max Uritsky and Ziv Kaspersky for helping me out on this.

The first step in the process of importing this data is to go to the Azure Marketplace and construct a query to get the slice of data that you want – this is a big dataset and you won’t want to get all of it. Once you’ve signed in, go to https://datamarket.azure.com/dataset/0f2cba12-e5cf-4c6d-83c9-83114d44387a, subscribe to the dataset and then click on the “Explore this Dataset” link:

SNAGHTML190a951

This takes you to the query builder page, where you get to explore the data in the different tables that make up this dataset:

image

You choose the table you want to explore in the confusingly-named ‘Query’ dropdown box on the right-hand side of the screen. The available tables are:

  • ThreeHourlyForecast, a fact tables containing three hourly weather forecasts
  • ThreeHourlyForecastArchive, a fact table containing aggregated, averaged values for the various forecasts for a given date and time
  • SignificantWeather, a dimension table containing the different types of weather that can be forecast
  • Visibility, a dimension table containing the different levels of visibility
  • DailyForecast, a fact table containing daily weather forecasts
  • Observations, a fact table containing observed weather
  • Site, a dimension table containing all the UK’s weather stations

As far as I can tell, this data is more or less the same as what’s available through the Met Office’s own DataPoint service, and the documentation for this is here: http://www.metoffice.gov.uk/public/ddc/datasets-documentation.html

Once you’ve selected a table you can construct a filter by entering values in the Optional Parameters boxes below the query dropdown. These changes are then reflected in the URL shown at the top of the screen:

image

This URL represents an OData query. One thing I didn’t notice initially is that the query that is generated here includes a top 100 filter in it which you’ll need to remove (by deleting &$top=100 from the end of the URL) if you want to do anything useful with the data; you might also want to build a more complex query than is supported by the query builder, and you can learn how to do this by reading this article.

In my case I decided to look at the full three hourly forecast data. As I said, this is a big dataset – initially I thought I’d download the whole thing, but 18 million rows and several hours later I cancelled the import into PowerPivot. Instead I opted to look at data for the whole of the UK for just one forecast made on one day, which worked out at a more manageable 250000 rows. What’s not clear from any of the current documentation is what all of the columns in the three hourly forecast fact table represent:

  • Date is the date the forecast is issued
  • StartTime is the time the forecast is issued and is either 0, 6, 12 or 18, representing midnight, 06:00, 12:00 and 18:00 – new forecasts are issued every 6 hours
  • PredictionTime is the time that an incremental update to a forecast is issued; these incremental updates appear every hour. PredictionTime is an actual time value going from 0 to 23 representing the hour the incremental update was issued.
  • TimeStep is an offset in hours from the StartTime, and represents the time that the forecast is predicting the weather for. It ranges in value from 0 to 120, going up in 3s (so the values go 0, 3, 6, 9… 120), meaning we have weather predictions for 5 days into the future for each forecast.

Therefore, for any given row in the ThreeHourlyForecast table, if the Date is April 10th 2012, StartTime is 6, PredictionTime is 8 and TimeStep is 9, then this is data from a forecast that was issued on April 10th 2012 at 8am (the second incremental update to the 6am forecast) and this row contains the prediction for the weather for the time StartTime+TimeStep = 6 + 9 = 15:00 on April 10th 2012.

Here’s the OData url I used to grab data for the three hourly forecast issued on April 10th at midnight (StartTime=0 and PredictionTime=0) for all weather stations and all time steps:

https://api.datamarket.azure.com/Data.ashx/DataGovUK/MetOfficeWeatherOpenData/ThreeHourlyForecast?$filter=Date%20eq%20datetime%272012-04-10T00%3a00%3a00%27%20and%20PredictionTime%20eq%200

To use this URL in PowerPivot, you need to create a new PowerPivot workbook, open the PowerPivot window and then click the From Azure DataMarket button:

image

Then enter your query URL and Account Key (which you can find on the Query Explorer page by clicking on the Show link, as indicated in the screenshot above):

image

Having imported this data I also imported the whole of Site (renamed here to Weather Stations) and SignificantWeather tables to give the following PowerPivot model:

image

Here are the joins I used:

image

I also created a few calculated columns, including one called ActualPredictionForStart which added the TimeStep to the Start Time and the Date to get the actual date and time that the prediction is for:

=[Date] + (([StartTime] + [TimeStep])/24)

With this all done, I was able to find out what the predicted weather for the current time and my home town was in this (as of the time of writing) two-day old forecast:

image

…and do all the usual PivotTable-y  and chart-y things you can do with data once it’s in Excel:

image

Incidentally, the forecast is wrong – it’s not raining outside right now!

PivotTables and Excel charts are all very well, but there’s a better way of visualising this data when it’s in Excel – and in my next post I’ll show you how…

 

UPDATE: First of all, I owe an apology to the Met Office – as soon as I hit publish on this post it started pouring with rain, so they were right after all. Secondly, in a weird co-incidence, Rob Collie posted about using the other weather dataset in the DataMarket on his blog: http://www.powerpivotpro.com/2012/04/download-10000-days-of-free-weather-data-for-almost-any-location-worldwide/

Written by Chris Webb

April 12, 2012 at 4:12 pm

Posted in Cloud, PowerPivot

Tagged with

18 Responses

Subscribe to comments with RSS.

  1. Great read; I used your post as much as a Marketplace tutorial as I did anything else. The current Marketplace weather offerings for the U.S. are predictably paltry (and expensive!) — though I look forward to this evolving as time goes on.

    Please keep your ear to the Marketplace ground for us! Let us know if you hear of any new or interesting datasets that might be useful to the BI community at large.

    Shawn A

    April 12, 2012 at 7:32 pm

    • Chris, I’ve been experimenting with Marketplace (and a lot of the new 2012 tools) all day, using your blog post as a reference, and I’m trying to determine how one might circumvent PowerPivot and import this information directly into SQL Server. Am I thinking about things wrong? Do I need to have an Azure account to proceed in that direction? If so, it would surprise me, considering I’m able to access this UK weather data so easily via PowerPoint.

      Shawn A

      April 12, 2012 at 10:59 pm

      • Hi Shawn,

        You can also export data from the Datamarket in CSV format – just click on the Export button in the Explore screen. That should be the easiest way of getting the data into SQL Server. You don’t need an Azure account to use this data, you just need a Windows Live ID.

        Chris

        Chris Webb

        April 13, 2012 at 9:43 am

      • Hi Chris,

        (I’m replying to my own question, as it appears nested comments stop three levels down.)

        Going the CSV route makes sense for data such as DateStream, which would be a one-off pull — but when with forecast data, which by definition is quite volatile and must be pulled often, I would very much want to automate that feed for use in downstream scorecards.

        Might I then infer that I, the BI developer, am not the primary demographic for WAM? Or is it more accurate to assume they are forcing my hand to sign up for an Azure account?

        -Shawn

        Shawn A

        April 13, 2012 at 7:05 pm

      • Hi Shawn,

        I don’t know what you mean exactly by an ‘Azure account’ – the only account you need is a Windows Live ID. For developers it’s pretty easy to use OData in code for automating updates – there are lots of examples here: http://www.odata.org/

        Chris

        Chris Webb

        April 13, 2012 at 7:42 pm

  2. Chris, interesting arcticle. I wanted to play around with it a little bit without using PowerPivot. Is it possible to retrieve the raw OData by calling the url? If I log in to the marketplace with my WindowsLiveID and then call the url “https://api.datamarket.azure.com/Data.ashx/DataGovUK/MetOfficeWeatherOpenData/Observation?$top=100″ it does not work. I’m asked for a Login. Perhaps I have to add the Account key to the query in some way?

    Frank

    Frank

    April 13, 2012 at 10:40 am

    • Probably you do need to include the account key somehow, but I don’t know OData well enough to be able to tell you how. Maybe this will help? http://msdn.microsoft.com/en-us/library/gg193412.aspx

      Chris Webb

      April 13, 2012 at 10:58 am

      • Thank You for the link. One has to set the credentials correctly.

        A small program works like this:

        1. Create a C# console application and add a service reference for the Service-Uri “https://api.datamarket.azure.com/DataGovUK/MetOfficeWeatherOpenData/”. In my example the name of this reference is “ServiceReference1″.

        2. This code reads the first 100 observations:

        Uri ServiceUri = new Uri(“https://api.datamarket.azure.com/DataGovUK/MetOfficeWeatherOpenData/”);

        ServiceReference1.DataGovUKMetOfficeWeatherOpenDataContainer Context;

        Context = new ServiceReference1.DataGovUKMetOfficeWeatherOpenDataContainer(ServiceUri);

        Context.Credentials = new NetworkCredential(MY_WINDOWS_LIVE_ID, PRIMARY_ACCOUNT_KEY_OF_THE_QUERY);

        Uri QueryUri = new Uri(“https://api.datamarket.azure.com/Data.ashx/DataGovUK/MetOfficeWeatherOpenData/Observation?$top=100″);

        IEnumerable Query;

        Query = Context.Execute(QueryUri);

        IList List = Query.ToList();

        Frank

        April 13, 2012 at 2:08 pm

  3. [...] is that you won’t be able to do much outdoors if it’s raining. I wonder where I can get some weather data and add that to my map? We’ll find out how in part 2… Share this:FacebookTwitterLike [...]

  4. [...] Importing UK Weather Data from Azure Marketplace into PowerPivot by Chris Webb. [...]

  5. [...] Importing UK Weather Data from Azure Marketplace into PowerPivot [...]

  6. [...] data from Excel into Layerscape, the new data visualisation tool from Microsoft Research; in the post before that I showed how to load UK weather data from Windows Azure Datamarket into PowerPivot. Now let’s [...]

  7. How is Powerpivot different than Datascope?

    Cary Walker

    May 10, 2012 at 1:42 pm

    • In a couple of ways. First and foremost, PowerPivot is a fully supported product. Secondly, PowerPivot stores all of its data in-memory locally; DataScope, as I understand it, is all about keeping data in the cloud.

      Chris Webb

      May 10, 2012 at 8:10 pm

  8. [...] UK weather forecast from 12:00am today (July 17th) along with some related tables as described in this blog post. The UI for the import wizard is plain but [...]

  9. [...] of posts explaining how to use the Azure Marketplace website to import data into PowerPivot (see here for example) so I won’t go into too much detail at this point, but here’s an example Url that [...]

  10. I’m preparing for demo and thought it would be nice to embed external source on share point with weather info… and here I am ;) Thanks

    Emil Glownia

    December 4, 2012 at 2:07 pm

  11. [...] can be easily consumed using Microsoft Powerpivot for Excel. Here is an article by Chris Webb on Importing UK Weather Data from Azure Marketplace into PowerPivot. Once the weather data is imported into Excel, you can then import your sales data into excel and [...]


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 3,302 other followers

%d bloggers like this: