Archive for the ‘Power BI’ Category
Anyone who has tried to do any serious work with Power Pivot and Power Query will know about this problem: you use Power Query to load some tables into the Data Model in Excel 2013; you make some changes in the Power Pivot window; you then go back to Power Query, make some changes there and you get the dreaded error
We couldn’t refresh the table ‘xyz’ from the connection ‘Power Query – xyz’. Here’s the error message we got:
COM Error: Microsoft.Mashup.OleDbProvider; The query ‘xyz’ or one of its inputs was modified in Power Query after this connection was added. Please disable and re-enable loading to the Data Model for this query..
This post has a solution for the same problem in Excel 2010, but it doesn’t work for Excel 2013 unfortunately. There is a lot of helpful information out there on the web about this issue if you look around, though, and that’s why I thought it would be useful to bring it all together into one blog post and also pass on some hints and tips about how to recover from this error if you get it. This is the single biggest source of frustration among the Power Query users I speak to; a fix for it is being worked on, and I hope it gets released soon.
Why does this problem occur?
Let’s take a simple repro.
1. Import the data from a table in SQL Server using Power Query. Load it into the Excel Data Model.
2. Open the PowerPivot window in Excel, then create measures/calculated fields, calculated columns, relationships with other tables as usual.
3. Go back to the worksheet and build a PivotTable from data in this table, using whatever measures or calculated columns you have created.
4. Go back to the PowerPivot window and rename one of the columns there. The column name change will be reflected in the PivotTable and everything will continue to work.
5. Re-open the Power Query query editor, and then rename any of the columns in the table (not necessarily the one you changed in the previous step). Close the query editor window and when the query refreshes, bang! you see the error above. The table in the Excel Data Model is unaffected, however, and your PivotTable continues to work – it’s just that now you can’t refresh the data any more…
6. Do what the error message suggests and change the Load To option on the Power Query query, unchecking the option to load to the Data Model. When you do this, on the very latest build of Power Query, you’ll see a “Possible Data Loss” warning dialog telling you that you’ll lose any customisations you made. Click Continue, and the query will be disabled. The destination table will be deleted from your Excel Data Model and your PivotTable, while it will still show data, will be frozen.
7. Change the Load To option on the query to load the data into the Excel Data Model again. When you do this, and refresh the data, the table will be recreated in the Excel Data Model. However, your measures, calculated columns and relationships will all be gone. What’s more, although your PivotTable will now work again, any measures or calculated columns you were using in it will also have gone.
8. Swear loudly at your computer and add all the measures, calculated columns and relationships to your Data Model all over again.
So what exactly happened here? The important step is step 4. As Miguel Llopis of the Power Query team explains here and here, when you make certain changes to a table in the Power Pivot window the connection from your Power Query query to the Excel Data Model goes into ‘read-only’ mode. This then stops Power Query from making any subsequent changes to the structure of the table.
What changes put the connection to the Excel Data Model in ‘read-only’ mode?
Here’s a list of changes (taken from Miguel’s posts that I linked to above) that you can make in the PowerPivot window that put the connection from your query to the Data Model into ‘read-only’ mode:
- Edit Table Properties
- Column-level changes: Rename, Data type change, Delete
- Table-level changes: Rename, Delete
- Import more tables using Power Pivot Import Wizard
- Upgrade existing workbook
How can you tell whether your connection is in ‘read-only’ mode?
To find out whether your connection is in ‘read-only’ mode, go to the Data tab in Excel and click on the Connections button. Then, in the Workbook Connections dialog you’ll see the connection from Power Query to the Data Model listed – it will be called something like ‘Power Query – Query1’ and the description will be ‘Connection to the Query1 query in the Data Model’. Select this connection and click on the Properties button. When the Connection Properties dialog opens, go to the Definition tab. If the connection is in read-only mode the properties will be greyed out, and you’ll see the message ‘Some properties cannot be changed because this connection was modified using the PowerPivot Add-In’. If you do see this message, you’re already in trouble!
How to avoid this problem
Avoiding this problem is pretty straightforward: if you’re using Power Query to load data into the Excel Data Model, don’t make any of the changes listed above in the PowerPivot window! Make them in Power Query instead.
How to recover from this problem
But what if your connection is already in ‘read-only’ mode? Unfortunately there is no magic solution: you are going to have to rebuild your model. However, there are two things you can do to reduce the amount of pain you have to go through to recreate your model.
First, you can use the DISCOVER_CALC_DEPENDENCY DMV to list out all of your measure and calculated column definitions to a table in Excel. Here’s some more information about the DMV:
To use this, all you need to do is to create a DAX query table in the way Kasper shows at the end of this post, and use the query:
select * from $system.discover_calc_dependency
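The DMV typically returns one row per dependency, so a measure that references several columns shows up several times. If you export the query table and want a de-duplicated list of definitions to re-create, something like the following Python sketch would do it. The sample rows are made up, and the column names (OBJECT_TYPE, TABLE, OBJECT, EXPRESSION) and type values (MEASURE, CALC_COLUMN) are assumptions that you should check against what the DMV actually returns in your workbook.

```python
# Hypothetical rows, shaped like an export of the DMV output.
# Column names and OBJECT_TYPE values are assumptions; verify them against your own results.
rows = [
    {"OBJECT_TYPE": "MEASURE", "TABLE": "Sales", "OBJECT": "Total Sales",
     "EXPRESSION": "SUM(Sales[Amount])", "REFERENCED_OBJECT": "Amount"},
    {"OBJECT_TYPE": "CALC_COLUMN", "TABLE": "Sales", "OBJECT": "Margin",
     "EXPRESSION": "Sales[Amount] - Sales[Cost]", "REFERENCED_OBJECT": "Amount"},
    {"OBJECT_TYPE": "CALC_COLUMN", "TABLE": "Sales", "OBJECT": "Margin",
     "EXPRESSION": "Sales[Amount] - Sales[Cost]", "REFERENCED_OBJECT": "Cost"},
    {"OBJECT_TYPE": "RELATIONSHIP", "TABLE": "Sales", "OBJECT": "Sales_Product",
     "EXPRESSION": "", "REFERENCED_OBJECT": "ProductID"},
]


def list_definitions(rows, wanted=("MEASURE", "CALC_COLUMN")):
    """Return one (type, table, name, expression) tuple per calculated object."""
    seen = {}
    for row in rows:
        if row["OBJECT_TYPE"] not in wanted:
            continue  # skip relationships and anything else we did not ask for
        key = (row["OBJECT_TYPE"], row["TABLE"], row["OBJECT"])
        # Keep the first expression seen for each object; repeats are dependency rows
        seen.setdefault(key, row["EXPRESSION"])
    return [key + (expression,) for key, expression in seen.items()]


definitions = list_definitions(rows)
for object_type, table, name, expression in definitions:
    print(f"{object_type}: {table}[{name}] = {expression}")
```

This only gives you a checklist to work from; the definitions still have to be typed back into the Power Pivot window by hand.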
Secondly, before you disable and re-enable your Power Query query (as in step 6 above), install the OLAP PivotTable Extensions add-in (if you don’t already have it) and use its option to disable auto-refresh on all of your PivotTables, as described here:
Doing this prevents the PivotTables from auto-refreshing when the table is deleted from the Data Model when you disable the Power Query query. This means that they remember all of their references to your measures and calculated columns, so when you have recreated them in your Data Model (assuming that all of the names are still the same) and you re-enable auto-refresh the PivotTables will not have changed at all and will continue to work as before.
[After writing this post, I realised that Barbara Raney covered pretty much the same material in this post: http://www.girlswithpowertools.com/2014/06/power-query-refresh-fails/ . I probably read that post when it was published and then forgot about it. I usually don't blog about things that other people have already blogged about, but since I'd already done the hard work and the tip on using OLAP PivotTable Extensions is new, I thought I'd post anyway. Apologies...]
You probably know that, when you are importing data from multiple tables in SQL Server into the Excel Data Model in Excel 2013 using Power Query, Power Query will automatically create relationships between those tables in the Data Model. But did you know that you can get Power Query to do this for other data sources too?
Now wait – don’t get excited. I’ve known about this for a while but not blogged about it because I don’t think it works all that well. You have to follow some very precise steps to make it happen and even then there are some problems. That said, I think we’re stuck with the current behaviour (at least for the time being) so I thought I might as well document it.
Consider the following Excel worksheet with two tables in it, called Dimension and Fact:
If you were to load these two tables into the Excel Data Model, you would probably want to create a relationship between the two tables based on the FruitID column. Here are the steps to use Power Query to create the relationship automatically:
- Click inside the Dimension table and then, on the Power Query tab in the Excel ribbon, click the From Table button to create a new query.
- When the Query Editor window opens, right click on the FruitID column and select Remove Duplicates.
Why are we doing this when there clearly aren’t any duplicate values in this column? The new step contains the expression
…and one of the side-effects of using Table.Distinct() is that it adds a primary key to the table. Yes, tables in Power Query can have primary keys – the Table.AddKey() function is another way of doing this. There’s a bit more information on this subject in my Power Query book, which I hope you have all bought!
- Click the Close & Load to.. button to close the Query Editor, and then choose the Only Create Connection option to make sure the output of the query is not loaded anywhere and the query is disabled, then click the Load button. (Am I the only person that doesn’t like this new dialog? I thought the old checkboxes were much simpler, although I do appreciate the new flexibility on where to put your Excel table output)
- Click inside the Fact table in the worksheet, click the From Table button again and this time do load it into the Data Model.
- Next, in the Power Query tab in the Excel ribbon, click the Merge button. In the Merge dialog select Dimension as the first table, Fact as the second, and in both select the FruitID column to join on.
- Click OK and the Query Editor window opens again. Click the Close & Load to.. button again, and load this new table into the Data Model.
- Open the Power Pivot window and you will see that not only have your two tables been loaded into the Data Model, but a relationship has been created between the two:
What are the problems I talked about then? Well, for a start, if you don’t follow these instructions exactly then you won’t get the relationship created – it is much harder than I would like. There may be other ways to make sure the relationships are created but I haven’t found them yet (if you do know of an easier way, please leave a comment!). Secondly, if you delete the two tables from the Data Model and delete the two Power Query queries, and then follow these steps again, you will find the relationship is not created. That can’t be right. Thirdly, I don’t like having to create a third query with the Merge, and would prefer it if I could just create two queries and define the relationship somewhere separately. With all of these issues I don’t think there’s any practical use for this functionality right now.
I guess the reason I think the ability to create relationships automatically is so important is because the one thing that the Excel Data Model/Power Pivot/SSAS Tabular sorely lacks is a simple way to script the structure of a model. Could Power Query and M one day be the modelling language that Marco asks for here? To be fair to the Power Query team this is not and should not be their core focus right now: Power Query is all about data acquisition, and this is data modelling. If this problem was solved properly it would take a lot of thought and a lot of effort. I would love to see it solved one day though.
You can download the sample workbook for this post here.
Seems like another new bit of Power BI functionality got released today: the ability to optimize your data model for Q&A in the browser. Here’s the link to the docs:
Previously, the ability to add synonyms to your model to improve the results you got from Q&A was only available in Excel on the desktop, inside the Power Pivot window. Now you can do this, as well as new stuff like add phrasings (described here) and view usage reports, in your Power BI site.
I won’t repeat what the docs say about the actual functionality, but this seems to be yet more evidence that Excel on the desktop is no longer the central hub for Power BI. If this is the case, this is a massive strategic change, and I can understand why it has happened: the need for the ‘right’ version of Excel on the desktop is a massive roadblock for Power BI adoption, especially in enterprise accounts (see also Jen Underwood’s comments on this from yesterday). Maybe now it’s BI in the browser instead?
OK, so I’m not at WPC this year but I have just watched this video of Scott Guthrie’s session “The Cloud for Modern Business”. If you’re interested in seeing some new Power BI features take a look at the demo by James Phillips, general manager for Power BI, starting at 21:20:
Some of the new things I noticed:
- 21:40 – a nice shot of one of the new Power BI dashboards first announced at the PASS BA Conference earlier this year. You can see several new types of visualisation such as treemaps, radar charts and gauges (gauges? GAUGES? Shhh, don’t tell Stephen Few).
- 22:33 – a list of out-of-the-box data sources is shown from which new models can be created. They include: Salesforce, MS Dynamics, Facebook, Google Analytics, Twitter, and Upload Excel.
- 22:50 – data is imported from Salesforce in the browser. This isn’t happening in Excel on the desktop, folks, it’s in the browser. This is significant!
- 23:10 – another new visualisation shown, a doughnut chart (if that’s the right term). I see names of people from the Power Query team in the data.
- 24:50 – a Q&A analysis is pinned to the dashboard
- 25:50 – much is made of the fact that the dashboard is touch-enabled
- 25:55 – “Partner Solution Packs” are announced. This sounds important! It seems to be referring to the Salesforce demo earlier, and these solution packs are said to include: data, connectivity to the data sources, visualisation and interactive reports. So it sounds like Microsoft are going to encourage data vendors (or other sources of data) to build these solution packs on top of Power BI as pre-packaged analytical apps. Probably a good idea.
- 26:15 – editing a dashboard in the browser and swapping one visualisation for another. Again, the HTML 5 browser based editing experience – we haven’t seen Excel once in this demo.
- 27:55 – “If there was ever a partner opportunity, this is it”. Again much emphasis here. Seems like these new Power BI features, especially the solution packs, are aimed at giving partners incentives to sell and customise Power BI (something which they have not had up to now, to be honest).
Oh, and you probably already heard that Azure Machine Learning is now in public preview. Check out the docs and samples here. I wouldn’t be surprised if there was some integration between this and Power BI to come too.
Looking for some summer holiday (or winter holiday, depending on which hemisphere you live in) reading? If so, may I suggest my new Power Query book? “Power Query for Power BI and Excel” is available now from the Apress site, Amazon.com, Amazon.co.uk and all good bookstores.
It’s an introductory level book. It covers all of the stuff you can do in the UI, it has a chapter on M, and it goes into a reasonable amount of detail on more advanced topics; it is not a 500-page exhaustive guide to the product. I’ve focused on readability and teaching the fundamentals of Power Query rather than looking at every obscure M function, but at the same time if you’ve already used Power Query I think there’ll be plenty of material in there you’ll find interesting.
Now for the bad news: the book is out-of-date already, although not by much. One of the best things about Power Query is the monthly release cycle; unfortunately that makes writing a book on it a bit of a nightmare. I started off writing in January and had to deal with lots of added functionality and changes to the UI over the next few months; I had to retake pretty much all of the screenshots as a result. The published version of the book is based on the version of Power Query that was released in early June rather than the current version. Hopefully you can forgive this – the differences are minor – but it’s a good reason to buy the book as soon as you can! I want to do a second edition in a year’s time once (if?) the release cycle slows down.
I’ve been teased a bit for blogging and teaching so much about Power Query recently, so the final thing I want to say here is why an old corporate BI/SSAS guy like me is getting so excited about a self-service ETL tool. Well, the main reason is that Power Query is a great piece of software. It does what it does very well; it does useful things rather than what the marketing guys/analysts/journalists think is hot in BI; it is easy to use but at the same time is flexible enough for the advanced user to do really complex stuff; it is updated regularly based on feedback from its users. I only wish all Microsoft software was this good… Honestly, I wouldn’t be able to motivate myself to blog and write about Power Query if I didn’t think it was cool, and even though it hasn’t been hyped in the same way as other parts of the Power BI stack it is nonetheless the part that people get excited about when I show them Power BI. It’s not just me either – every day I see positive comments like Greg Low’s here. I think it is as important as, if not more important than, Power Pivot and I think it will be a massive success.
Oh, and did I mention that I’m also teaching a Power Query course in London later this year….?
Last week someone asked me whether it was possible to do the equivalent of a SQL LIKE filter in Power Query. Unfortunately there isn’t a function to do this in the standard library but, as always, it is possible to write some M code to do this. Here’s what I came up with while I was waiting around at the stables during my daughter’s horse-riding lesson. At the moment it only supports the % wildcard character; also I can’t guarantee that it’s the most efficient implementation or indeed 100% bug-free, but it seems to work fine as far as I can see…
Using the following test data:
I can run the following query:
And get this output:
You can download the sample workbook here.
I know the Power Query team have been asked for this several times already, but it would be really useful if we could package up functions like this and make it easy to share them publicly with other Power Query users…
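For anyone who wants to see the matching logic without opening the workbook, here is a minimal sketch in Python of one way to handle a LIKE pattern that only supports the % wildcard. It is not the M function from the workbook, just an illustration of the general approach: split the pattern on %, require the first piece at the start and the last piece at the end, and look for the pieces in between in order. The function name `like` is made up for this sketch, and it assumes case-sensitive matching (real SQL LIKE behaviour depends on collation).

```python
def like(text: str, pattern: str) -> bool:
    """Case-sensitive LIKE-style match supporting only the % wildcard."""
    parts = pattern.split("%")
    if len(parts) == 1:
        # No wildcard at all: the pattern must match the whole text
        return text == pattern

    first, *middle, last = parts

    # Anything before the first % must appear at the very start
    if not text.startswith(first):
        return False
    pos = len(first)

    # Pieces between wildcards must appear in order, left to right
    for piece in middle:
        found_at = text.find(piece, pos)
        if found_at == -1:
            return False
        pos = found_at + len(piece)

    # Anything after the last % must appear at the very end,
    # without overlapping the characters already matched
    return len(text) - pos >= len(last) and text.endswith(last)


fruit = ["Apples", "Oranges", "Pears", "Grapes", "Pineapple"]
print([f for f in fruit if like(f, "%ap%")])
print([f for f in fruit if like(f, "P%s")])
```

Taking the leftmost occurrence of each middle piece is safe here because % can absorb any number of characters; supporting the _ single-character wildcard would need a different approach.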
My last post on the new Power BI features announced at the PASS BA Conference was not much more than a list of bullet points written during the keynote. Now that the conference is over and I’ve had a bit more time to reflect, I thought I would try to come to some conclusions about what they all mean. And, of course, indulge in some wild speculation too.
Microsoft’s new-found enthusiasm for working on multiple platforms is clear from two things: first, the announcement that the iOS Power BI app will be coming soon (though we should not forget that this was promised a long, long time ago and is very late), followed by native apps for other mobile platforms; and secondly the work that has gone into the HTML 5 version of Power View. Indeed, the latter was demonstrated using Google Chrome to underline the point. Of course this is the only commercially sane direction to take but it’s very welcome nonetheless.
Power View new features
The cool new Data Exploration features in Power View, which allow you to edit existing Power View reports – creating new graphs and tables and merging existing ones – are I think only going to be available in HTML 5 Power View running inside a Power BI site. The same goes for the new time series forecasting functionality. The demos also seemed to make a point of the touch-friendly interface. Now I can’t imagine that Excel Power View will move from Silverlight to HTML 5 any time soon (the Office team are notoriously conservative when it comes to big changes like this) so maybe we should assume that, going forward, the main focus will be the use of Power View inside a Power BI site to build reports and dashboards rather than Power View inside Excel on the desktop? Maybe the HTML 5 version of Power View will be what is used in the touch-optimised version of Office that is slated to appear this summer? Who knows. I liked what I saw though and these additions will go a long way towards boosting Power View’s credibility as a client tool.
Time series forecasting
It seemed like Microsoft had abandoned ‘data mining for the masses’ after SSAS data mining failed to take off, but clearly not. The time series forecasting functionality seems very easy to use (you can read more about it here and here) and I got the impression that other algorithms might be added soon – maybe Project Sage is relevant here? Another question to ask is whether a lack of ease-of-use was the real reason why ‘data mining for the masses’ failed to take off the first time around. It might have been part of the reason, but another factor must surely be that business users don’t trust predictions when they don’t understand how they are made, while the data scientists and statisticians are already using other tools that give them a lot more control for forecasting.
The first reaction of several people at the BA Conference (me included) to the new dashboard and KPI creation features in Power BI sites was that this was the replacement for PerformancePoint. Personally I never liked PerformancePoint much and rarely used it, and it doesn’t seem that MS has had much enthusiasm for it in recent years. I don’t think we’ll ever see it in the cloud either. Killing it off would have the added benefit of removing a client tool from a stack that has a confusing number of client tool options right now. However I got the impression that the dashboard and KPI creation features in Power BI sites, as demoed, were fairly basic and they may not be much more than widgets that can be placed on the Power BI home screen with nothing to tie them together, but I don’t know enough to judge properly. I think it might be better to think of Power View as the place to build dashboards.
Integration with old-school SQL Server BI
The ability of Power BI sites to host SSRS reports, and for Power View to connect back to SSAS on-prem, is an important bridge between what most of Microsoft’s BI customers are actually using and the new world of cloud-based Power BI that Microsoft is promoting. For me this was the biggest announcement of all. Will these customers be interested in buying Power BI licences for their users if SSRS and on-prem SSAS is all they want to use though? I don’t know, but I assume that this will also enable mobile access to SSRS and SSAS via the Power BI mobile apps, so that will be a plus for some customers, and all the new Power View functionality makes it quite an attractive web-based reporting tool for SSAS users. The per user cost of a Power BI licence might make it too expensive to buy if all you want are web-based dashboards and mobile support but customers who are already checking out Power BI for self-service will be more likely to buy because of this.
New features in Q&A were not mentioned in the keynote on Thursday, but on Friday afternoon I saw an interesting session that detailed some of the new functionality coming in Q&A later this year including the support for phrasing that is mentioned in this blog post. It’s clear that Q&A is getting a lot of love at MS and technically it is very impressive. I’m still not totally convinced that this is something people actually want or will use but I’m less cynical than I was. I also smell a lot of consultancy money in building and tuning models for Q&A if it does get popular (the session showed that there are a lot more features coming that will help with tuning). If I understand correctly, the output from Q&A is basically a Power View report which can then be edited manually if you wish; this means that Q&A should not be thought of as a standalone tool but as one of the ways that you can start to build a Power View report, and Q&A will benefit from all of the new features that are going to be added to Power View.
Apart from the time series forecasting, which is available now, ‘this summer’ was the most common response to all questions about when the new stuff would be available. Maybe all of these features will be released when the introductory pricing period for Power BI ends? Or in time for the Worldwide Partner Conference in mid July, as Jen Underwood suggests? Hopefully it won’t be in the middle of my summer holidays.