Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Analysing Voting In the Power BI Competition – With Power BI!

with 8 comments

As I mentioned the other day, the public voting round of the Power BI competition has started and the entrants are now competing for a place in the top ten and a whole bunch of cool prizes including XBoxes and Surface Pros. Having entered a demo myself I’m naturally very keen to see how I’m doing, but unfortunately checking is a bit of a pain: you have to go to www.powerbicontest.com, click on the Entries tab, sort them, find your entry, and so on… So I thought to myself, isn’t there a better way? And of course there is… using Power BI!

Power Query is the tool to use to scrape the voting data from the Power BI site. It wasn’t straightforward, because there’s no single table of results and the results are spread over four different pages, but it wasn’t impossible either. To load the data into the Excel Data Model I:

  • Created a Power Query function to scrape the data from the first page
    • First I created a query to scrape the data from the first page using the “From Web” button
    • I then did a *lot* of searching around in the HTML to get the data (for this reason I’m not going to paste the code of the function here because it’s ugly)
    • I then deleted a lot of columns, created some new custom columns, and pivoted the data until it was in a nice tabular form
    • Finally, once I was sure the query was working properly I turned it into a function that could get data from any page
  • Next I created a table with five rows in it and called the function once for each row to get the data from each page
  • Then I merged all the data into a single table of results
  • I created another function to calculate the rank of an entry (which itself was an interesting challenge) and added a rank column to the merged table
  • Last of all, I created some Power View sheets to analyse the data
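
The steps above can be sketched outside Power Query too. What follows is not the code from the workbook; it is a minimal Python sketch of the same shape: one function per page, one call per page, a merge, and a rank column. The names `merge_pages`, `add_rank`, `fake_get_page` and `SAMPLE_PAGES` are hypothetical, the vote and view counts are made up, and the tie-handling rule (tied entries share a rank and the next rank is skipped) is an assumption, since the post doesn't say how ties were treated.

```python
from typing import Callable, Dict, List

# Made-up sample data standing in for the scraped pages (not real contest figures).
SAMPLE_PAGES: Dict[int, List[dict]] = {
    1: [
        {"entrant": "Entry A", "votes": 50, "views": 400},
        {"entrant": "Entry B", "votes": 30, "views": 900},
    ],
    2: [
        {"entrant": "Entry C", "votes": 30, "views": 250},
        {"entrant": "Entry D", "votes": 10, "views": 120},
    ],
}


def fake_get_page(page_number: int) -> List[dict]:
    """Stand-in for the per-page scraping function; reads from SAMPLE_PAGES."""
    return SAMPLE_PAGES[page_number]


def merge_pages(get_page: Callable[[int], List[dict]], page_count: int) -> List[dict]:
    """Call the page function once per page and append the results into one list."""
    merged: List[dict] = []
    for page_number in range(1, page_count + 1):
        merged.extend(get_page(page_number))
    return merged


def add_rank(entries: List[dict]) -> List[dict]:
    """Add a 'rank' key: 1 + the number of entries with strictly more votes."""
    ranked = []
    for entry in entries:
        higher = sum(1 for other in entries if other["votes"] > entry["votes"])
        ranked.append({**entry, "rank": higher + 1})
    # sorted() is stable, so tied entries keep their original page order.
    return sorted(ranked, key=lambda e: e["rank"])


results = add_rank(merge_pages(fake_get_page, page_count=2))
for row in results:
    print(row["rank"], row["entrant"], row["votes"])
```

Counting the entries with strictly more votes is a simple way to get a rank without relying on sort position, which is one reason ranking can be fiddly in a query language that works on whole tables.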

You can download the entire workbook here. I’ll warn you, the code isn’t pretty but it does the job and it’s got a few interesting features in for all you Power Query fans.

Now let’s have a look at what the data shows.

At the time of writing I’m in 16th place and, alas, not in line for an XBox. You can of course change this by

CLICKING HERE AND PRESSING THE ‘VOTE FOR THIS ENTRY’ BUTTON

and that’s even if you’ve voted for me already – you can vote once every 24 hours! But I digress…

The obvious first thing to do was to create a leader board sorted by rank:

image

As you can see, at the moment Carlos Costa is way out in front with a massive 180 votes; after that there’s a group of people fighting for a top five finish; and after that there’s a large group of people who might just squeeze into the lower reaches of the top ten.

Here’s a column chart showing this more clearly:

image

What are the top-ranked guys doing to get so many votes? Is it something to do with the number of people seeing their entry? If you put votes and views next to each other, it’s not easy to say what’s going on:

image

I’ve got the second largest number of views, but I’m in the middle of the pack. The problem for me is that I was one of the first people to enter and most of my views were well before voting opened; unfortunately there’s no way of knowing how many views each entry had after voting opened so I can’t say for sure what’s going on here. Certainly Alexander Pinkus submitted his demo “Few Facts About Dinosaurs” around the same time as me, has around the same number of views, and he’s currently ranked number 5. From that I can only deduce that his demo is better than mine (and having watched it I can say it is very good); he certainly has more exciting subject matter. Who doesn’t love dinosaurs? There are several other demos that have got particularly eye-catching subject matter and/or titles and they’re also doing well.

Here are views and votes plotted on a scatter chart, with the top 10, 20, 30 and so on shown in different colours:

image

I think this makes it a little easier to disregard outliers like me, and it’s probably fair to say that there is some kind of link between views and votes, at least for those in the top 20.
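
The colour grouping on the scatter chart boils down to putting each rank into a band. Here is a small Python sketch of one way to do that, assuming the bands don't overlap (ranks 1 to 10 are "Top 10", ranks 11 to 20 are "Top 20", and so on); the function name `rank_band` and the label format are hypothetical and not taken from the workbook.

```python
def rank_band(rank: int, band_size: int = 10) -> str:
    """Return the label of the band a rank falls into, e.g. 16 -> 'Top 20'."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    # Integer division finds the zero-based band; the band's upper bound names it.
    upper = ((rank - 1) // band_size + 1) * band_size
    return f"Top {upper}"


for example_rank in (1, 10, 16, 21):
    print(example_rank, rank_band(example_rank))
```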

Now here’s one last column graph, showing the average number of views per vote for the top 20 ranked entrants:

image

You can see how badly I’m doing in this respect, despite my begging and pleading both here and on Twitter; conversely you can see that Mike Tetreault at the other end is obviously very good at getting his vote out.
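
The measure behind that last chart is just views divided by votes, so a high value means a lot of people looked and few voted. A minimal Python sketch, assuming an entry with no votes should be left out rather than cause a divide-by-zero error; the name `views_per_vote` is hypothetical and the numbers in the example are made up.

```python
from typing import Optional


def views_per_vote(views: int, votes: int) -> Optional[float]:
    """Average number of views per vote; None when there are no votes yet."""
    if votes == 0:
        return None
    return views / votes


# Made-up figures: 300 views and 20 votes is 15 views for every vote.
print(views_per_vote(300, 20))
```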

I would like to say at this point that it’s tempting to moan and complain that people have ‘cheated’ and got their friends, family and colleagues to vote for them. To be honest, I think that everyone who’s got any significant number of votes will have ‘cheated’ like this to some extent. I certainly have – I’ve used my blog and other social media to try to get as many votes as possible. Indeed this post itself is a ruse to try to get more votes! At the end of the day the whole point of this competition is for Microsoft to get as many people as possible to see Power BI, so more people want to buy it. Therefore this mad scramble by entrants to get as many votes as possible will benefit all of us in the end.

Anyway, by the time you read this post there’s a strong chance that the patterns here will have changed completely. It’s a shame there isn’t more data available to play with – it would be great to have the time and date of each vote cast, and even the location of the person casting the vote. Given that you have to have a Facebook account to vote I suspect that someone somewhere does have all of this data, and more… so I wonder if they’re using Power BI too?

Written by Chris Webb

January 22, 2014 at 10:18 pm

Posted in Power BI

8 Responses

  1. I’m regular reader of your blog. It’s valuable and pleasure to read. You have my (polish) vote!

    Michał

    January 23, 2014 at 7:46 am

  2. Having looked at a few of your competitors’ presentations I have to pay you respect that you haven’t used any annoying background music. In my opinion data analysis and jingle music does not fit together well. If I had a facebook account you would have my vote.

    Martin Guth

    January 24, 2014 at 12:59 pm

  3. I voting for your entry and entry about dinosaurs every day )))

    Vitaly Popov

    January 25, 2014 at 11:42 am

  4. The truth is, Chris, if many of your blog visitors were ardent facebookers, you would have got many more votes. Let me speak for myself, I hardly visit Facebook so I find it awkward that MS has decided to use it.
    I will still go there and cast my vote, for you of course. Your work deserves my vote, I’ve seen it and it’s great.

    Enemona

    January 30, 2014 at 6:10 am

    • Thanks! The need for a FB account has cost me a lot of votes, I know. I’m in the top 10 now and it’s so close to the end….

      Chris Webb

      January 30, 2014 at 9:41 am

