Archive for the ‘Excel’ Category
Microsoft has published documentation for the new DAX functions in the Excel 2016 preview here:
There’s a lot of detail, including examples (although the ConcatenateX() page isn’t live at the time of writing – but I’ve blogged about that already), so it’s well worth reading through.
Continuing my series on new DAX functions in Excel 2016, here are two more: NaturalInnerJoin() and NaturalLeftOuterJoin(). Both do pretty much what you’d expect.
Consider the following two tables in an Excel worksheet, called ColourFruit and FruitPrice:
With these tables loaded into the Excel Data Model as linked tables, the next step is to create a relationship between the tables on the Fruit column:
Both functions only work with two tables that have an active relationship between them, and both take two tables from the Excel Data Model as parameters. Once you’ve done that you can use these functions in a DAX query.
…both perform an inner join between the two tables on the Fruit column and both return the same table:
For NaturalLeftOuterJoin() the table given in the first parameter is on the left-hand side of the left outer join, so all rows from it are returned, whereas the table in the second parameter is on the right-hand side of the join so only the matching rows are returned.
This is the first of many posts on the new DAX functions that have appeared in Excel 2016 (for a full list see this post). Today: the ConcatenateX() function.
The mdschema_functions schema rowset gives the following description of this function:
Evaluates expression for each row on the table, then return the concatenation of those values in a single string result, separated by the specified delimiter
Its signature is:
CONCATENATEX(Table, Expression, [Delimiter])
It’s easier to understand what it does using a simple example though. Consider the following table on a worksheet in Excel 2016:
When you add this table to the Excel Data Model (I called the table Sales) you can add the following measure:
Purchasing Customers:= CONCATENATEX( VALUES(Sales[Customer]), Sales[Customer], "," )
If you then use this measure in a PivotTable, you see the following:
As you can see, the measure returns a comma-delimited list of all of the customers who have bought each product. Very useful…
Following on from my recent post on Power BI and Excel 2016 news, here are some more details about the new BI-related features in the Excel 2016 Preview. Remember that more BI-related features may appear before the release of Excel 2016, and that with Office 365 click-to-run significant new features can appear in between releases, so this is not a definitive list of what Excel 2016 will be able to do at RTM but a snapshot of functionality available as of March 2015 as outlined in this document and which I’ve found from my own investigations. When I find out more, or when new functionality appears, I’ll either update this post or write a new one.
Yesterday, in the original version of my post, I mistakenly said that Power Query was a native add-in in Excel 2016: that’s not true, it’s not an add-in at all, it’s native Excel functionality. Indeed you can see that there is no separate Power Query tab any more, and instead there is a Power Query section on the Data tab instead:
Obviously I’m a massive fan of Power Query so I’m biased, but I think this is a great move because it makes all the great Power Query functionality a lot easier to discover. There’s nothing to enable – it’s there by default – although I am a bit worried that users will be confused by having the older Data tab features next to their Power Query equivalents.
There are no new features for Power Query here compared to the latest version for Excel 2013, but that’s what I expected.
Excel Forecasting Functions
I don’t pretend to know anything about forecasting, but I had a brief play with the new Forecast.ETS function and got some reasonable results out of it as seen in the screenshot below:
There’s a new hammer icon on a slicer, which, when you click it, changes the way selection works. The default behaviour is the same as Excel 2013: every time you click on an item, that item is selected and any previous selection is lost (unless you were holding control or shift to multiselect). However with the hammer icon selected each new click adds the item to the previously selected items. This is meant to make slicers easier to use with a touch-screen.
Time Grouping in PivotTables
Quite a neat feature this, I think. If you have a table in the Excel Data Model that has a column of type date in it, you can add extra calculated columns to that table from within a PivotTable to group by things like Year and Month. For example, here’s a PivotTable I built on a table that contains just dates:
Right-clicking on the field containing the dates and clicking Group brings up the following dialog:
Choosing Years, Quarters and Months creates three extra fields in the PivotTable:
And these fields are implemented as calculated columns in the original table in the Excel Data Model, with DAX definitions as seen here:
Power View on SSAS Multidimensional
At-bloody-last. I haven’t installed SSAS on the VM I’m using for testing Excel 2016, but I assume it just works. Nothing new in Power View yet, by the way.
Power Map data cards
Not sure why this is listed as new in Excel 2016 when it seems to be the same feature that appeared in Excel 2013 Power Map recently:
There isn’t any obvious new functionality in the Power Pivot window, but it’s clear that the UI in general and the DAX formula editor experience in particular has been improved.
When you use fields from two Excel Data Model tables that have no relationship between them in a PivotTable, you get a prompt to either create new relationships yourself or let Excel detect the relationships:
Renaming Tables and Fields in the Power Pivot window
In Excel 2013 when you renamed tables or fields in the Excel Data Model, any PivotTables that used those objects had them deleted. Now, in Excel 2016, the PivotTable retains the reference to table or field and just displays the new name. What’s even better is that when you create a measure or a calculated column that refers to a table or column, the DAX definition of the measure or calculated column gets updated after a rename too.
There are lots of new DAX functions in this build. With the help of the mdschema_functions schema rowset and Power Query I was able to compare the list of DAX functions available in 2016 with those in 2013 and create the following list of new DAX functions and descriptions:
FUNCTION NAME DESCRIPTION DATEDIFF Returns the number of units (unit specified in Interval) between the input two dates CONCATENATEX Evaluates expression for each row on the table, then return the concatenation of those values in a single string result, separated by the specified delimiter KEYWORDMATCH Returns TRUE if there is a match between the MatchExpression and Text. ADDMISSINGITEMS Add the rows with empty measure values back. CALENDAR Returns a table with one column of all dates between StartDate and EndDate CALENDARAUTO Returns a table with one column of dates calculated from the model automatically CROSSFILTER Specifies cross filtering direction to be used in the evaluation of a DAX expression. The relationship is defined by naming, as arguments, the two columns that serve as endpoints CURRENTGROUP Access to the (sub)table representing current group in GroupBy function. Can be used only inside GroupBy function. GROUPBY Creates a summary the input table grouped by the specified columns IGNORE Tags a measure expression specified in the call to SUMMARIZECOLUMNS function to be ignored when determining the non-blank rows. ISONORAFTER The IsOnOrAfter function is a boolean function that emulates the behavior of Start At clause and returns true for a row that meets all the conditions mentioned as parameters in this function. NATURALINNERJOIN Joins the Left table with right table using the Inner Join semantics NATURALLEFTOUTERJOIN Joins the Left table with right table using the Left Outer Join semantics ROLLUPADDISSUBTOTAL Identifies a subset of columns specified in the call to SUMMARIZECOLUMNS function that should be used to calculate groups of subtotals ROLLUPISSUBTOTAL Pairs up the rollup groups with the column added by ROLLUPADDISSUBTOTAL SELECTCOLUMNS Returns a table with selected columns from the table and new columns specified by the DAX expressions SUBSTITUTEWITHINDEX Returns a table which represents the semijoin of two tables supplied and for which the common set of columns are replaced by a 0-based index column. The index is based on the rows of the second table sorted by specified order expressions. SUMMARIZECOLUMNS Create a summary table for the requested totals over set of groups. GEOMEAN Returns geometric mean of given column reference. GEOMEANX Returns geometric mean of an expression values in a table. MEDIANX Returns the 50th percentile of an expression values in a table. PERCENTILE.EXC Returns the k-th (exclusive) percentile of values in a column. PERCENTILE.INC Returns the k-th (inclusive) percentile of values in a column. PERCENTILEX.EXC Returns the k-th (exclusive) percentile of an expression values in a table. PERCENTILEX.INC Returns the k-th (inclusive) percentile of an expression values in a table. PRODUCT Returns the product of given column reference. PRODUCTX Returns the product of an expression values in a table. XIRR Returns the internal rate of return for a schedule of cash flows that is not necessarily periodic XNPV Returns the net present value for a schedule of cash flows
Plenty of material for future blog posts there, I think – there are lots of functions here that will be very useful. I bet Marco and Alberto are excited…
It looks like we have support for the Excel Data Model (aka Power Pivot) in VBA at last. I need to do some research here, but I get the distinct feeling that the only things that are possible through VBA are the things you can do in the Excel ribbon, such as creating connections, tables and relationships. I can’t see any support for creating measures, calculated columns or hierarchies…? I can’t see anything relating to Power Query either. Maybe I’m not looking in the right place; maybe something will come in a later build?
UPDATE: I’m an idiot – there is one minor change to the VBA support for the Excel Data Model, but actually almost everything that I see in 2016 is also present in 2013. Sorry…
There have been quite a few Power BI and Office BI-related announcements over the last few weeks, and while I’ve tweeted about them (I’m @Technitrain if you’re not following me already) I though it would be a good idea to summarise them all in one post.
Power BI Announcements at Convergence and SQLBits
You’ve probably already seen the announcement today on the Power BI blog that Power BI is FINALLY available to those of us outside the USA:
At last! I’m sure MS had very good reasons why they couldn’t make the Power BI Preview available worldwide back in December, but this decision caused a lot of frustration in the MS BI community and I hope it’s not something that happens again. I can also confirm that the Power BI iPhone app is now available in the UK as well. The new data sources for Power BI that are coming soon – especially Google Analytics – will be very popular I think.
While I’m on the topic of Power BI, a few interesting nuggets about upcoming functionality emerged at SQLBits last week. Kasper mentioned that there will be some new DAX functions appearing in Power BI soon: Median, Percentile, DateDiff and XPNV. Presumably they will appear when we get the ability to create DAX measures and calculated columns in the Power BI Dashboard Designer. Also, following on from the bidirectional relationships functionality I blogged about earlier this year, there was the news that Power BI will also understand 1:1 relationships as well as 1:many, many:1 and many:many.
Office 2016 Preview BI Features
The Office 2016 preview went public today too:
There’s a great overview of what’s new for BI in Office 2016 here:
The main points are:
- Power Query is now a native feature of Excel 2016.
- Power View works on SSAS Multidimensional (this is only going to work on the versions of SSAS Multidimensional that support DAX queries, ie SSAS 2014 or SSAS 2012 SP2)
- New Excel forecasting functions
- Time grouping functionality in PivotTables
I’ll be writing a more detailed blog on all of this at some point soon, once I know what’s officially public and what isn’t.
The Power Query announcement is interesting because, as things stand at the moment, we’ll be able to use full Power Query, Power Pivot and Power View functionality for free in the Power BI Dashboard Designer, but in Excel the same functionality is restricted to users of the Professional Plus SKUs. This is crazy, and I hope Microsoft makes the Power add-ins available for every SKU of Excel 2016. Have you signed the petition for this yet?
Last week the Power Map team released a new video showcasing functionality from an upcoming release:
Although there are no details about what is shown in the video, it certainly looks like the ability to use custom shapes (the main missing feature in Power Map up to now) will be coming soon.
I wonder if I can justify buying one for demo purposes?
It’s a general rule of the internet that, whenever you have a cool idea, a few minutes spent on your favourite search engine reveals that someone else has had the idea before you. In my case, when I first saw the functionality in Power Query for working with binary files I wondered whether it was possible to read the contents of a file containing an image and render each pixel as a cell in a worksheet – and of course, it has already been done and done better than I could ever manage. However, it hasn’t been done in Power Query… until now.
First of all, I have to acknowledge the help of Matt Masson whose blog post on working with binary data in Power Query provided a number of useful examples. I also found this article on the bmp file format invaluable.
Second, what I’ve done only works with monochrome bmp files. I could have spent a few more hours coming up with the code to work with other file types but, frankly, I’m too lazy. I have to do real work too, you know.
So let’s see how this works. Here’s a picture of Fountains Abbey that I took on my phone while on holiday last summer:
I opened it in Paint and saved it as a monochrome bmp file:
Here’s the code for the Power Query query that opens the bmp file and renders the contents in Excel:
let //The picture to load SourceFilePath="C:\Users\Chris\Pictures\FountainsAbbey.bmp", //Or get the path from the output of a query called FileName //SourceFilePath=FileName, //Load the picture SourceFile=File.Contents(SourceFilePath), //First divide the file contents into two chunks: //the header of the file, always 62 bytes //and the rest, which contains the pixels //Define the format as a record OverallFormat=BinaryFormat.Record([ Header = BinaryFormat.Binary(62), Pixels = BinaryFormat.Binary() ]), //Load the data into that format Overall = OverallFormat(SourceFile), //Get the header data HeaderData = Overall[Header], //Extract the total file size and //width and height of the image HeaderFormat = BinaryFormat.Record([ Junk1 = BinaryFormat.Binary(2), FileSize = BinaryFormat.ByteOrder( BinaryFormat.SignedInteger32, ByteOrder.LittleEndian), Junk2 = BinaryFormat.Binary(12), Width = BinaryFormat.ByteOrder( BinaryFormat.SignedInteger32, ByteOrder.LittleEndian), Height = BinaryFormat.ByteOrder( BinaryFormat.SignedInteger32, ByteOrder.LittleEndian), Junk3 = BinaryFormat.Binary() ]), HeaderValues = HeaderFormat(HeaderData), FileSize = HeaderValues[FileSize], ImageWidth = HeaderValues[Width], ImageHeight = HeaderValues[Height], //Each pixel is represented as a bit //And each line is made up of groups of four bytes BytesPerLine = Number.RoundUp(ImageWidth/32)*4, //Read the pixel data into a list PixelListFormat = BinaryFormat.List( BinaryFormat.ByteOrder( BinaryFormat.Binary(BytesPerLine), ByteOrder.LittleEndian)), PixelList = PixelListFormat(Overall[Pixels]), //Convert each byte to a number PixelListNumbers = List.Transform(PixelList, each Binary.ToList(_)), //A function to convert a number into binary //and return a list containing the bits GetBinaryNumber = (ValueToConvert as number) as list => let BitList = List.Generate( ()=>[Counter=1, Value=ValueToConvert], each [Counter]<9, each [Counter=[Counter]+1, Value=Number.IntegerDivide([Value],2)], each Number.Mod([Value],2)), BitListReversed = List.Reverse(BitList) in BitListReversed, //A function to get all the bits for a single line //in the image GetAllBitsOnLine = (NumberList as list) => List.FirstN( List.Combine( List.Transform(NumberList, each GetBinaryNumber(_) ) ), ImageWidth), //Reverse the list - the file contains the pixels //from the bottom up PixelBits = List.Reverse( List.Transform(PixelListNumbers, each GetAllBitsOnLine(_))), //Output all the pixels in a table OutputTable = #table(null, PixelBits) in OutputTable
The output of this query is a table containing ones and zeroes and this must be loaded to the worksheet. The final thing to do is to make the table look like a photo by:
- Hiding the column headers on the table
- Using the ‘None’ table style so that there is no formatting on the table itself
- Hiding the values in the table by using the ;;; format (see here for more details)
- Zooming out as far as you can on the worksheet
- Resizing the row heights and column widths so the image doesn’t look too squashed
- Using Excel conditional formatting to make the cells containing 0 black and the cells containing 1 white:
Here’s the photo rendered as cells in the workbook:
And here it is again, zoomed in a bit so you can see the individual cells a bit better:
You can download the workbook (which I’ve modified so you can enter the filename of your bmp file in a cell in the worksheet, so you don’t have to edit the query – but you will have to turn Fast Combine on as a result) from here. Have fun!
Anyone who has tried to do any serious work with Power Pivot and Power Query will know about this problem: you use Power Query to load some tables into the Data Model in Excel 2013; you make some changes in the Power Pivot window; you then go back to Power Query, make some changes there and you get the dreaded error
We couldn’t refresh the table ‘xyz’ from the connection ‘Power Query – xyz’. Here’s the error message we got:
COM Error: Microsoft.Mashup.OleDbProvider; The query ‘xyz’ or one of its inputs was modified in Power Query after this connection was added. Please disable and re-enable loading to the Data Model for this query..
This post has a solution for the same problem in Excel 2010, but it doesn’t work for Excel 2013 unfortunately. There is a lot of helpful information out there on the web about this issue if you look around, though, and that’s why I thought it would be useful to bring it all together into one blog post and also pass on some hints and tips about how to recover from this error if you get it. This is the single biggest source of frustration among the Power Query users I speak to; a fix for it is being worked on, and I hope it gets released soon.
Why does this problem occur? Let’s take a simple repro.
- Import the data from a table in SQL Server using Power Query. Load it into the Excel Data Model.
- Open the PowerPivot window in Excel, then create measures/calculated fields, calculated columns, relationships with other tables as usual.
- Go back to the worksheet and build a PivotTable from data in this table, using whatever measures or calculated columns you have created.
- Go back to the PowerPivot window and rename one of the columns there. The column name change will be reflected in the PivotTable and everything will continue to work.
- Re-open the Power Query query editor, and then rename any of the columns in the table (not necessarily the one you changed in the previous step). Close the query editor window and when the query refreshes, bang! you see the error above. The table in the Excel Data Model is unaffected, however, and your PivotTable continues to work – it’s just that now you can’t refresh the data any more…
- Do what the error message suggests and change the Load To option on the Power Query query, unchecking the option to load to the Data Model. When you do this, on the very latest build of Power Query, you’ll see a “Possible Data Loss” warning dialog telling you that you’ll lose any customisations you made. Click Continue, and the query will be disabled. The destination table will be deleted from your Excel Data Model and your PivotTable, while it will still show data, will be frozen.
- Change the Load To option on the query to load the data into the Excel Data Model again. When you do this, and refresh the data, the table will be recreated in the Excel Data Model. However, your measures, calculated columns and relationships will all be gone. What’s more, although your PivotTable will now work again, any measures or calculated columns you were using in it will also have gone.
- Swear loudly at your computer and add all the measures, calculated columns and relationships to your Data Model all over again.
So what exactly happened here? The important step is step 4. As Miguel Llopis of the Power Query team explains here and here, when you make certain changes to a table in the Power Pivot window the connection from your Power Query query to the Excel Data Model goes into ‘read-only’ mode. This then stops Power Query from making any subsequent changes to the structure of the table.
What changes put the connection to the Excel Data Model in ‘read-only’ mode?
Here’s a list of changes (taken from Miguel’s posts that I linked to above) that you can make in the PowerPivot window that put the connection from your query to the Data Model into ‘read-only’ mode:
- Edit Table Properties
- Column-level changes: Rename, Data type change, Delete
- Table-level changes: Rename, Delete
- Import more tables using Power Pivot Import Wizard
- Upgrade existing workbook
How can you tell whether my connection is in ‘read-only’ mode?
To find out whether your connection is in ‘read-only’ mode, go to the Data tab in Excel and click on the Connections button. Then, in the Workbook Connections dialog you’ll see the connection from Power Query to the Data Model listed – it will be called something like ‘Power Query – Query1’ and the description will be ‘Connection to the Query1 query in the Data Model’. Select this connection and click on the Properties button. When the Connection Properties dialog opens, go to the Definition tab. If the connection is in read-only mode the properties will be greyed out, and you’ll see the message ‘Some properties cannot be changed because this connection was modified using the PowerPivot Add-In’. If you do see this message, you’re already in trouble!
How to avoid this problem
Avoiding this problem is pretty straightforward: if you’re using Power Query to load data into the Excel Data Model, don’t make any of the changes listed above in the PowerPivot window! Make them in Power Query instead.
How to recover from this problem
But what if your connection is already in ‘read-only’ mode? There is no magic solution, unfortunately, you are going to have to rebuild your model. However there are two things you can do to reduce the amount of pain you have to go through to recreate your model.
First, you can use the DISCOVER_CALC_DEPENDENCY DMV to list out all of your measure and calculated column definitions to a table in Excel. Here’s some more information about the DMV:
To use this, all you need to do is to create a DAX query table in the way Kasper shows at the end of this post, and use the query:
select * from $system.discover_calc_dependency
Secondly, before you disable and re-enable your Power Query query (as in step 6 above), install the OLAP PivotTable Extensions add-in (if you don’t already have it) and use its option to disable auto-refresh on all of your PivotTables, as described here:
Doing this prevents the PivotTables from auto-refreshing when the table is deleted from the Data Model when you disable the Power Query query. This means that they remember all of their references to your measures and calculated columns, so when you have recreated them in your Data Model (assuming that all of the names are still the same) and you re-enable auto-refresh the PivotTables will not have changed at all and will continue to work as before.
[After writing this post, I realised that Barbara Raney covered pretty much the same material in this post: http://www.girlswithpowertools.com/2014/06/power-query-refresh-fails/ . I probably read that post when it was published and then forgot about it. I usually don’t blog about things that other people have already blogged about, but since I’d already done the hard work and the tip on using OLAP PivotTable Extensions is new, I thought I’d post anyway. Apologies…]