Power BI Review, Part 2.5: Q&A (Now I’ve Used It With My Data)
A few months ago I posted a review of Q&A, the natural language query functionality in Power BI, based on the sample data sets that were then available. Last week, finally, we got the news that we could enable Q&A on our own Power Pivot models, and having played with this new release I thought it was a good idea to post an update to my original thoughts.
The first thing to point out is that even if you’ve got a Power BI Preview tenant you will need the latest version of Power Pivot for Excel to be able to get the best out of Q&A. This latest release contains some new functionality to add ‘Synonyms’ to the model – what this means is that it allows you, as a model creator, to tell Power BI about other names that end users might use when querying your model. For example on a Geography dimension you might have a column called State but if you are a multinational company you may find that while your State column contains the names of states in the USA, it might contain the names of cantons in Switzerland, counties in the UK, departments in France and so on. As a result you will want Power BI to know that if a user asks for sales by county in the UK that it should actually look in the State column. Devin Knight has already written a good post showing how synonyms work with Q&A which you can see here.
Another complication is that, at the time of writing, the Synonym functionality is only available to users who have installed the streamed version of Office 2013 from Office 365. I have an Office 365 subscription but I had installed Office from an msi before that, so I had to uninstall Office and reinstall the streamed version to be able to see Synonyms – I assume that support for Synonyms in the non-streamed version of Excel will come at some point soon in the future, but in general I would expect that new Power BI functionality will appear first in the streamed version of Office first so if you’re serious about BI you should change over to it as soon as you can. Melissa Coates has a lot more detail on this issue here.
But enough about setup, what about Q&A? The data that I tested it on was a model I’ve been using for user group and conference demos for about six months now, which contains data from the UK’s Land Registry and details all residential property transactions in England and Wales in 2013. It’s fairly simple – two tables, a date table and a transactions table containing around 0.5 million rows – so probably a lot simpler than the average Power Pivot model, but nonetheless real data and one which had been polished for demo purposes. The Excel file holding it is around 25MB so I was well within the Power BI file size limits.
My initial impression after I had added my existing model (with no synonyms etc) to Q&A was that while it worked reasonably well, it worked nowhere near as well as the demo models I had seen. I then set about making changes to the model and re-uploading it, and these changes made all the difference. Some examples of the things I did are:
- Changed table and column names. In my model I had already taken the trouble to make them human readable, but this did not necessarily mean they were suitable for Q&A. For example, my main fact table was called ‘Land Registry’, so at first Q&A kept suggesting questions like “How many land registries were there in June…” which clearly makes no sense. Renaming the fact table to ‘Sales’ fixed this.
- Setting synonyms. Unsurprisingly, this had a big impact on usability in the same way that changing the table and column names did. I found that I had to go through several iterations of uploading the data, writing questions, seeing what worked and what didn’t, and adding more synonyms before I had a set that I was happy with; I can imagine that in the real world you’d need to round up several end users and lock them in a room to see how they phrased their questions so as to get a really good list of synonyms for them.
- Setting Power View-related properties. This included setting the Default Field Set on a table, so I only saw a few important fields in a meaningful order when Q&A returned a table result; and also Summarize By so that Q&A didn’t try to aggregate year values. All of this makes sense given how closely-related Q&A and Power View are, but even though I had a reasonably ‘finished’ model to start off with I still hadn’t set all of these properties because I knew I was never going to try to sum up a year column.
- Adding new columns. There were a number of cases where I realised that I, as a human user, was able to make assumptions about the data that Q&A could not. For example the source data records sales of four different types of residential property: terraced, detached, semi-detached and flat. The first three are types of house, but the source data doesn’t actually state that they are types houses anywhere so in order to see the total number of sales of houses I had to add another column to explicitly define which property types were houses.
- Disambiguation. Probably the most irritating thing about the Bing geocoding service that Power View and Q&A use is the way it always chooses a US location when you give it an ambiguous place name. Therefore when looking at sales by town I would see the town name “Bristol” show up on the map as Bristol, Tennessee (population 24,821) rather than Bristol in England (population 416,400). Creating a new column with town name and country concatenated stopped this happening.
The Microsoft blog post I referenced above announcing Q&A promises that a more detailed guide to configuring models for Q&A will be published soon, which is good news. The important point to take away from this, though, is that even the most polished Power Pivot models will need additional tweaks and improvements in order to get the best out of Q&A.
The big question remains, though, whether Q&A will be something that end users actually get some value from. As a not-very-scientific test of this I handed my laptop over to my wife (who has no experience of BI tools but who has a healthy interest in property prices) to see how easy it was for her to use, and straight away she was able to write queries and find the information she was looking for, more or less. There were a still few cases where Q&A and/or my model failed, such as when she searched for “average house price in Amersham” – the model has a measure for “average price”, it knows about the property type “house” and the town “Amersham”, but “average house price” confused it and the query had to be rewritten as “average price of a house in Amersham”. Overall, though, I was pleasantly surprised and as a result I’m rather less sceptical than I was about Q&A’s usefulness, even if I’m still not 100% convinced yet.