Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Microsoft BI and SQL Server Courses For 2015

with 2 comments

The Technitrain course schedule for 2015 has now been finalised, so if you’re looking for top quality Microsoft BI and SQL Server classroom-based training in central London why not check out what we’ve got coming up?

Written by Chris Webb

January 21, 2015 at 12:04 pm

Posted in Technitrain

Rendering Images In An Excel Worksheet With Power Query Using Cells As Pixels

with 4 comments

It’s a general rule of the internet that, whenever you have a cool idea, a few minutes spent on your favourite search engine reveals that someone else has had the idea before you. In my case, when I first saw the functionality in Power Query for working with binary files I wondered whether it was possible to read the contents of a file containing an image and render each pixel as a cell in a worksheet – and of course, it has already been done and done better than I could ever manage. However, it hasn’t been done in Power Query… until now.

First of all, I have to acknowledge the help of Matt Masson whose blog post on working with binary data in Power Query provided a number of useful examples. I also found this article on the bmp file format invaluable.

Second, what I’ve done only works with monochrome bmp files. I could have spent a few more hours coming up with the code to work with other file types but, frankly, I’m too lazy. I have to do real work too, you know.

So let’s see how this works. Here’s a picture of Fountains Abbey that I took on my phone while on holiday last summer:


I opened it in Paint and saved it as a monochrome bmp file:


Here’s the code for the Power Query query that opens the bmp file and renders the contents in Excel:

   //The picture to load
   //Or get the path from the output of a query called FileName
   //Load the picture

   //First divide the file contents into two chunks:
   //the header of the file, always 62 bytes
   //and the rest, which contains the pixels

   //Define the format as a record
    Header = BinaryFormat.Binary(62),
    Pixels = BinaryFormat.Binary()
   //Load the data into that format
   Overall = OverallFormat(SourceFile),
   //Get the header data
   HeaderData = Overall[Header],

   //Extract the total file size and
   //width and height of the image
   HeaderFormat = BinaryFormat.Record([
    Junk1 = BinaryFormat.Binary(2),
    FileSize = BinaryFormat.ByteOrder(
    Junk2 = BinaryFormat.Binary(12),
    Width = BinaryFormat.ByteOrder(
    Height = BinaryFormat.ByteOrder(
    Junk3 = BinaryFormat.Binary()
   HeaderValues = HeaderFormat(HeaderData),
   FileSize = HeaderValues[FileSize],
   ImageWidth = HeaderValues[Width],
   ImageHeight = HeaderValues[Height],
   //Each pixel is represented as a bit
   //And each line is made up of groups of four bytes
   BytesPerLine = Number.RoundUp(ImageWidth/32)*4,
   //Read the pixel data into a list
   PixelListFormat = BinaryFormat.List(
   PixelList = PixelListFormat(Overall[Pixels]),
   //Convert each byte to a number
   PixelListNumbers = List.Transform(PixelList, each Binary.ToList(_)),

   //A function to convert a number into binary
   //and return a list containing the bits
   GetBinaryNumber = (ValueToConvert as number) as list =>
     BitList = List.Generate(
      ()=>[Counter=1, Value=ValueToConvert], 
      each [Counter]<9, 
      each [Counter=[Counter]+1, 
      each Number.Mod([Value],2)),
     BitListReversed = List.Reverse(BitList)

   //A function to get all the bits for a single line
   //in the image
   GetAllBitsOnLine = (NumberList as list) => 
      List.Transform(NumberList, each GetBinaryNumber(_)
    ), ImageWidth),

   //Reverse the list - the file contains the pixels
   //from the bottom up
   PixelBits = List.Reverse(
    each GetAllBitsOnLine(_))),

   //Output all the pixels in a table
   OutputTable = #table(null, PixelBits)


The output of this query is a table containing ones and zeroes and this must be loaded to the worksheet. The final thing to do is to make the table look like a photo by:

  • Hiding the column headers on the table
  • Using the ‘None’ table style so that there is no formatting on the table itself
  • Hiding the values in the table by using the ;;; format (see here for more details)
  • Zooming out as far as you can on the worksheet
  • Resizing the row heights and column widths so the image doesn’t look too squashed
  • Using Excel conditional formatting to make the cells containing 0 black and the cells containing 1 white:image


Here’s the photo rendered as cells in the workbook:


And here it is again, zoomed in a bit so you can see the individual cells a bit better:


You can download the workbook (which I’ve modified so you can enter the filename of your bmp file in a cell in the worksheet, so you don’t have to edit the query – but you will have to turn Fast Combine on as a result) from here. Have fun!

Written by Chris Webb

January 20, 2015 at 5:11 pm

Posted in Excel, Power Query

A Closer Look At Power Query/SSAS Integration

with 3 comments

In the November release of Power Query the most exciting new feature was the ability to connect to SSAS. I blogged about it at the time, but having used it for a month or so now I thought it was worth writing a more technical post showing how it works in more detail (since some things are not immediately obvious) as well as to see what the MDX it generates looks like.

This post was written using Power Query version 2.18.3874.242, released January 2015; some of the bugs and issues mentioned here will probably be fixed in later versions.

Connecting to SSAS

Power Query officially supports connecting to all versions of SSAS from 2008 onwards, although I’ve heard from a lot of people they have had problems getting the connection working. Certainly when I installed the version of Power Query with SSAS support in on my laptop, which has a full install of SQL Server 2014, it insisted I install the 2012 version of ADOMD.Net before it would work (I also needed to reboot). My guess is that if you’re having problems connecting you should try doing that too; ADOMD.Net 2012 is available to download in the SQL Server 2012 Feature Pack.

After clicking From Database/From SQL Server Analysis Services the following dialog appears, asking you to enter the name of the server you want to connect to.


If this is the first time you’re connecting to SSAS the following dialog will appear, asking you to confirm that you want to use Windows credentials to connect.


Unfortunately, if you’re connecting via http and need to enter a username and password you won’t be able to proceed any further. I expect this problem will be fixed soon.

Initial Selection

Once you’ve connected the Navigator pane appears on the right-hand side of the screen. Here you see all of the databases on the server you’ve connected to; expand a database and you see the cubes, and within each cube you see all of the measure groups, measures, dimensions and hierarchies.


The previous build of Power Query does not display any calculated measures that aren’t associated with measure groups (using the associated_measure_group property); this has been fixed in version 2.18.3874.242.

When you start to select measures and hierarchies the name of the cubes you have chosen items from will appear in the Selected items box. If you hover over the name of the cube the peek pane will appear and you’ll see a preview of the results of the query.


At this point you can either click the Load button to load the data either to the worksheet or the Excel Data Model, or click the Edit button to edit the query further.

You cannot specify your own MDX query to use for the query as yet.

The Query Editor

Once the Power Query Query Editor opens you’ll see the output of the query as it stands, and also on the Cube tab in the ribbon two new buttons: Add Items and Collapse Columns.


Here’s the MDX (captured from Profiler) showing the MDX generated for the query in the screenshot above:

{[Measures].[Internet Sales Amount],[Measures].[Internet Order Quantity]}
on 0,
[Date].[Calendar Year].[Calendar Year].allmembers
,{[Measures].[Internet Sales Amount],[Measures].[Internet Order Quantity]})
properties member_caption,member_unique_name 
on 1 
from [Adventure Works]


The MDX Subset() function is used here to ensure that the query doesn’t return more than 50 rows.

Adding Items

Clicking on the Add Items button allows you to add extra hierarchies and measures to the query. When you click the button the following dialog appears where you can choose what you want to add:


In this case I’ve added the Day Name hierarchy to the query, and this hierarchy appears as a new column on the right-hand edge of the query after the measures:


You can easily drag the column to wherever you want it though.

Here’s the MDX again:

{[Measures].[Internet Sales Amount],[Measures].[Internet Order Quantity]}
on 0,
[Date].[Calendar Year].[Calendar Year].allmembers,
[Date].[Day Name].[Day Name].allmembers)
,{[Measures].[Internet Sales Amount],[Measures].[Internet Order Quantity]})
properties member_caption,member_unique_name 
on 1 
from [Adventure Works]


Collapsing Columns

Selecting the Day Name column and then clicking the Collapse Columns button simply rolls back to the previous state of the query. However, there’s more to this button than meets the eye. If you filter the Day Name column (for example, by selecting Saturday and Sunday as in the screenshot below) and then click Collapse and Remove, the filter will still be applied to the query even though the Day Name column is no longer visible.


Here’s what the Query Editor shows after the filter and after the Day Name column has been collapsed:


Compare the measure values with those shown in the original query – it’s now showing values only for Saturdays and Sundays, although that’s not really clear from the UI. Here’s the MDX generated to prove it – note the use of the subselect to do the filtering:

{[Measures].[Internet Sales Amount],[Measures].[Internet Order Quantity]}
on 0,
[Date].[Calendar Year].[Calendar Year].allmembers
,{[Measures].[Internet Sales Amount],[Measures].[Internet Order Quantity]})
properties member_caption,member_unique_name 
on 1 
({[Date].[Day Name].&[7],[Date].[Day Name].&[1]})
on 0 
[Adventure Works])


From studying the MDX generated I can tell that certain other operations such as sorting and filtering the top n rows are folded back to SSAS.

It’s also important to realise that using the Remove option to remove a column from the query does not have the same effect as collapsing the column:


Using Remove just hides the column; the number of rows returned by the query remains the same.


User Hierarchies

In the examples above I’ve only used attribute hierarchies. User hierarchies aren’t much different – you can select either an individual level or the entire hierarchy (which is the same as selecting all of the levels of the hierarchy).



Parent-Child Hierarchies

Parent-child hierarchies work very much like user hierarchies, except that you will see some null values in columns to accommodate leaf members at different levels:


M Functions

There are a lot of M functions relating to cube functionality, although the documentation in the Library Specification document is fairly basic and all mention of them disappeared from the online help a month or so ago for some reason. Here’s the code for the query in the Collapsing Columns section above:

    Source = AnalysisServices.Databases("localhost"),
    #"Adventure Works DW 2008" = Source{[Name="Adventure Works DW 2008"]}[Data],
    #"Adventure Works1" = #"Adventure Works DW 2008"{[Id="Adventure Works"]}[Data],
    #"Adventure Works2" = #"Adventure Works1"{[Id="Adventure Works"]}[Data],
    #"Added Items" = Cube.Transform(#"Adventure Works2", {
             "[Date]", {"[Date].[Calendar Year].[Calendar Year]"}, {"Date.Calendar Year"}}, 
             {Cube.AddMeasureColumn, "Internet Sales Amount", 
             "[Measures].[Internet Sales Amount]"}, 
             {Cube.AddMeasureColumn, "Internet Order Quantity", 
             "[Measures].[Internet Order Quantity]"}}),
    #"Added Items1" = Cube.Transform(#"Added Items", {
              {Cube.AddAndExpandDimensionColumn, "[Date]", 
             {"[Date].[Day Name].[Day Name]"}, {"Date.Day Name"}}}),
    #"Filtered Rows" = Table.SelectRows(#"Added Items1", each (
             Cube.AttributeMemberId([Date.Day Name]) = "[Date].[Day Name].&[7]" 
             meta [DisplayName = "Saturday"] 
             Cube.AttributeMemberId([Date.Day Name]) = "[Date].[Day Name].&[1]" 
             meta [DisplayName = "Sunday"])),
    #"Collapsed and Removed Columns" = Cube.CollapseAndRemoveColumns(
             #"Filtered Rows", 
             {"Date.Day Name"})
    #"Collapsed and Removed Columns"

It’s comprehensible but not exactly simple – yet another example of how difficult it is to shoe-horn multidimensional concepts into a tool that expects to work with relational data (see also SSRS). I doubt I’ll be writing any M code that uses these functions manually.

Written by Chris Webb

January 13, 2015 at 11:21 pm

Multiselect, Filtering And Functions In Power Query

with 4 comments

If you’re a Power Query enthusiast you’re probably already comfortable with creating functions and passing values to them. However in some scenarios you don’t want to pass just a single value to a parameter, you want to pass multiple values – for example if you are filtering a table by multiple criteria. What’s the best way of handling this in Power Query?

Imagine that you wanted to import data from the DimDate table in the SQL Server Adventure Works DW database. It’s a pretty straightforward date dimension table as you can see:


Imagine also that you didn’t want to import all the rows from the table but just those for certain days of the week that the user selects (filtering on the EnglishDayNameOfWeek column).

The first problem is, then, how do you allow the user to make this selection in a friendly way? I’ve already blogged about how the function parameter dialog can be made to show ‘allowed’ selections (here and here) but this only allows selection of single values. One solution I’ve used is to create an Excel table – sourced from a Power Query query of course – and then let users select from there.

In this case, the following query can be used to get all the distinct day names:

    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Removed Other Columns" = Table.SelectColumns(dbo_DimDate,
																							{"DayNumberOfWeek", "EnglishDayNameOfWeek"}),
    #"Sorted Rows" = Table.Sort(#"Removed Other Columns",
                                    {{"DayNumberOfWeek", Order.Ascending}}),
    #"Removed Duplicates" = Table.Distinct(#"Sorted Rows"),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates",
    #"Added Custom" = Table.AddColumn(#"Removed Columns", "Selected", each "No")
    #"Added Custom"

Nothing much interesting to say about this apart from that it was all created in the UI, it shows the day names in the correct order, and it has an extra column called Selected that always contains the value “No”. The output table in Excel looks like this:


The Selected column is going to allow the end user to choose which days of the week they want to filter the main table by. Since “Yes” and “No” are going to be the only valid values in this column you can use Excel’s Data Validation functionality to show a dropdown box in all of the cells in this column that allows the user from selecting one of those two values and nothing else.



Once the user has selected “Yes” against all of the day names they want to filter by in the Excel table, the next step is to use this table as the source for another Power Query query. To be clear, we’ve used Power Query to load a table containing day names into an Excel table, where the user can select which days they want to filter by, and we then load this data back into Power Query. This second query (called SelectedDays in this example) then just needs to filter the table so it only returns the rows where Selected is “Yes” and then removes the Selected column once it has done that:

    Source = Excel.CurrentWorkbook(){[Name="DistinctDates"]}[Content],
    #"Filtered Rows" = Table.SelectRows(Source, each ([Selected] = "Yes")),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Selected"})
    #"Removed Columns"


This query doesn’t need to be loaded anywhere – but it will be referenced later.

With that done, you need to create a function to filter the DimDate table. Here’s the M code:

(SelectedDays as list) =>
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Filtered Rows" = Table.SelectRows(dbo_DimDate, 
                             each List.Contains(SelectedDays,[EnglishDayNameOfWeek]) ) 
    #"Filtered Rows"

The thing to notice here is the condition used in the Table.SelectRows() function, where List.Contains() is used to check whether the day name of the current row is present in the list passed in through the SelectedDays parameter.

The final step is to invoke this function and pass the column from the query containing the selected days to it. There is a bit of UI sugar when you invoke a function with a parameter of type list that I blogged about recently. In this case when you invoke the function you just have to pass the pass it the EnglishDayNameOfWeek column from the SelectedDays query.


Here’s what the code for the query that invokes the function looks like:

    Source = DimDate(SelectedDays[EnglishDayNameOfWeek])

And of course, when you run the query and output the results to a table, you get the DimDate table filtered by all of the days of the week you have selected:


To change the output the user just needs to change the selected days and then refresh this last query.

In case you’re wondering, this query does get folded back to SQL Server too. Here’s the SQL generated by Power Query:

select [$Ordered].[DateKey],
    select [_].[DateKey],
    from [dbo].[DimDate] as [_]
    where [_].[EnglishDayNameOfWeek] in ('Monday', 'Wednesday', 'Friday')
) as [$Ordered]
order by [$Ordered].[DateKey]

Notice that the Where clause contains an IN condition with all of the selected days.

You can download the example workbook for this post here.

Written by Chris Webb

January 8, 2015 at 9:53 am

Posted in Power Query

Bidirectional Relationships And Many-To-Many In The Power BI Designer

with 16 comments

There’s a lot of cool stuff in the new Power BI Designer desktop app, but for me the most important new bit of functionality is one that’s not immediately obvious: relationships between tables in the data model have had a significant upgrade. Let me illustrate…

Bidirectional Relationships

First up: relationships can now filter in two directions. Consider the following two tables in an Excel workbook, a dimension table called Fruit and a fact table called Sales:


When you first load these tables into the Power BI Designer no relationships are created between the tables. To create relationships you need to click on the Manage button on the Home tab so that the Manage Relationships dialog appears. You can then click the Autodetect button and the relationship between the two FruitID columns is created.


However, click on the Edit button and you’ll see something interesting. In the Edit Relationship dialog, under Advanced options, you’ll see that the Cross filter direction is set to Both (the other option is Single).


This means that, not only can you create a report like this with FruitName field on rows axis of a table in a Power View report along with a measure showing the sum of values from the Sales field:


But you can also now take the Date field from the Sales table and put it on rows in the report along with a measure showing the distinct count of values from the Fruit Name field from the Fruit table:


The relationship between the two tables is working in both directions, from the dimension table to the fact table and from the fact table to the dimension table, which is a big change from Power Pivot in Excel where a relationship can only work in one direction (from the dimension table to the fact table). You can still get the original Power Pivot relationship behaviour by setting the Cross filter direction property to Single.


If you thought that was impressive, there’s another implication of this change: many-to-many relationships now work automatically. No nasty DAX is necessary – which is lucky because, at the time of writing, there’s nowhere to use DAX in the Power BI Designer. Here’s the same data as above but with two more tables, so that there is now a classic many-to-many model with a dimension table called Group and a factless fact table called GroupToFruit associating each fruit with one or more groups and each group with one or more fruit.


Here are the relationships in the model, all of which were created using the Autodetect button and all of which have their Cross filter direction set to Both:


And here’s what you see in a report when you put GroupName on rows with a measure showing the sum of Sales:


The sales value for Berries is 35, the sum of the sales for Raspberries and Strawberries; the sales value for Red Fruit is the same because that group contains the same fruit; but the grand total is not the sum of the groups but the total sales for all fruit.


Anyone that has tried to build a reasonably complex model in Power Pivot or SSAS Tabular will understand how big a change this is. Up to now if you wanted to use many-to-many relationships you needed to add extra DAX code to each measure you created, and that added an unwelcome layer of complexity; now it just works. I haven’t thought it through properly yet but I bet that many other modelling scenarios can now be solved with this new functionality too. Time to do some thinking…

Written by Chris Webb

January 1, 2015 at 6:29 pm

Posted in Power BI

10th Blog Birthday

with 10 comments

Earlier this year I celebrated 1000 posts on this blog; now it’s time to celebrate passing another milestone: ten years since my first ever post. Thanks to everyone who has been with me since then!

It’s my habit to post a review of the past year on this date, and as always there’s a lot to think about. This has been the first year where the majority of my posts have not been on SSAS or MDX. Most of my consultancy and training is still on these topics but given the lack of new features in SSAS recently it’s become harder and harder to find anything new to say about it (although a few other bloggers have managed to, such as Richard Lee’s great posts on using PowerShell to automate various SSAS administrative tasks). On the other hand I’ve invested a lot of time learning Power Query and as a result I’ve found a lot to write about, and this is true even after having written a book on it. I really hope that SSAS gets some attention from Microsoft soon – I’ve come to accept that I won’t see anything new in MDX, and the same is probably true of Multidimensional, but Tabular and DAX should get a major upgrade in SQL Server (whenever that comes). Given the strong ties between SSAS Tabular, Power Pivot and now the Power BI Dashboard Designer I would guess that we’ll see new Tabular/DAX features appearing in the Power BI Designer in the coming months, and then later on in Excel and SSAS. When that happens I’ll be sure to write about them.

In the meantime, why the focus on Power Query? It’s not just to have something to blog about. If you’re a regular reader here you’ll know that I’m very enthusiastic about it and it’s worth me explaining why:

  • It solves a significant problem for a lot of people, that of cleaning and transforming data before loading into Excel. My feeling is that more people need Power Query for this than need Power Pivot for reporting.
  • More importantly, it’s a great product. It works well, it’s easy to use and I’m constantly being surprised at the types of problem it can solve. Indeed, where there’s an overlap between what it can do and what Power Pivot can do, I think users will prefer to work with Power Query: its step-by-step approach is much friendlier than a monolithic, impossible-to-debug DAX expression. Whenever I show off Power Query at user groups or to my customers it generates a lot of interest, and the user base is growing all the time.
  • I love the way that the Power Query dev team have released new features on a monthly basis. The amount that they have delivered over the last 18 months has put the rest of Power BI to shame, although I understand that because Power Query isn’t integrated into Excel in the way that Power View and Power Pivot are they have a lot more freedom to deliver. What’s more important though is that the Power Query dev team make an effort to talk to their users and develop the features that they actually want and need (the ability to set the prefix when expanding columns is a great example), rather than build whatever the analysts are hyping up this year. This gives me a lot of confidence in the future of the product.
  • Having seen the way that Power Query has been integrated into the Power BI dashboard designer, it could be the case that in the future the distinctions between Power Query, Power View and Power Pivot disappear and we think of them as parts of a single product.

One other big change for me this year was that I resigned from the SQLBits committee after seven years. There’s no behind-the-scenes scandal here, I just felt like it was time for a change. I work too hard as it is and I needed to free up some time to relax and be with my family; I was also aware that I wasn’t doing a great job on it any more. It was a very tough decision to make nonetheless. I had a great time with SQLBits while I was involved with it and I’ll be at SQLBits XIII in London next March as an attendee and hopefully a speaker. I know it will be another massive success.

Looking forward to next year, I hope the new direction for Power BI will be good for partners like me. There will certainly be continued interest in training for it, but the real test will be whether there’s a lot of demand for consultancy. I’ve done some Power Pivot and Power Query consultancy work this year, and demand is definitely increasing, but it’s still not a mature market by any means. Maybe the move away from Excel will change the nature of the BI projects that people attempt with Power BI, so that there are more formal, traditional implementations as well as the ad hoc self-service use that I’m seeing at the moment. The new Power BI APIs should also encourage more complex, IT department-led projects too. I don’t have a problem with the concept of self-service BI but I think it’s a mistake to believe that all BI projects can be completely self-service. I would like to think that there’s still a need for professional services from the likes of me in the world of Power BI; if there isn’t then I’m going to need to find another career.

Anyway, I’ve probably gone on for long enough now and I need to get back to enjoying what’s left of the holidays. Best wishes to all of you for 2015!

Written by Chris Webb

December 30, 2014 at 4:27 pm

Posted in Uncategorized

Viewing Error Messages For All Rows In Power Query

with 6 comments

One of the great features of Power Query is the way you can view any rows that contain error values when you load data. However, even if you can see the rows that have errors you can’t see the error messages easily – without writing a little bit of M code, which I’ll show you in this post.

Imagine you have the following table of data:


…and you load it into Power Query using the following query, which sets the data type for the Sales column to be Whole Number:

    Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(
                                     {{"Sales", Int64.Type}, 
                                      {"Product", type text}})
    #"Changed Type"


As you’d expect, the last two rows contain error values as a result of this type conversion:


You can also see the number of rows that contain errors when you load the query:


Clicking on the “2 errors” link in the screenshot above creates a new query that only contains the rows with errors:


You can click on the Error link in any cell that contains one to see the error message:


But what you really want is to see the error message for each row. To do this add a new custom column with the following definition:

try [Sales]


You will then see a new column called Custom containing a value of type Record. You can then click the Expand icon in the column header (highlighted) and then OK:


You’ll then see another column called Custom.Error with an Expand icon; click on it and then click OK again.



And at last you’ll have two columns that show the error messages for each row:


Written by Chris Webb

December 22, 2014 at 3:00 pm

Posted in Power Query


Get every new post delivered to your Inbox.

Join 3,715 other followers