Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Comparing Columns In Power Query

with 12 comments

A few weeks ago I wrote a post about comparing the contents of entire tables in Power Query, and a question came up in the comments about how you might go about comparing values in columns rather than whole tables. Of course this prompted me to investigate how different types of comparison might be done – and here’s the blog post with the results of the investigation.

Consider the following two single-column tables in an Excel worksheet:

image

Which items are present in one column and not in the other? Which are present in both? The easiest way to answer these questions is to take each table and turn it into a List object (using Table.ToList() ); once you’ve done that you’ll find there are loads of really useful functions for this type of thing. Here’s a query that compares the values in each column:

let

    Source = Excel.CurrentWorkbook(){[Name="Source"]}[Content],

    Target = Excel.CurrentWorkbook(){[Name="Target"]}[Content],

    SourceList = Table.ToList(Source),

    TargetList = Table.ToList(Target),

    InSourceNotTarget = List.Difference(SourceList, TargetList),

    InTargetNotSource = List.Difference(TargetList, SourceList),

    InTargetAndSource = List.Intersect({SourceList, TargetList}),

    CombineWithComma = Combiner.CombineTextByDelimiter(", "),

    ResultsTable = Table.FromRows(

             {

         {"In Source but not in Target", CombineWithComma(InSourceNotTarget)},

             {"In Target but not in Source", CombineWithComma(InTargetNotSource)},

             {"In both Target and Source", CombineWithComma(InTargetAndSource)}

             },

             {"Comparison Type", "ListResult"}

             )

 

in

    ResultsTable

 

Here’s the output:

image

Fairly self-explanatory, I think. List.Difference() finds the items that are in one list and not another: List.Intersect() finds items that are in both. In fact it’s probably more interesting to look at how I’ve generated the output. Table.FromRows() returns a manually constructed table. The CombineWithComma step uses Combine.CombineTextByDelimiter() to return a function that turns all of the items in a list into a single, comma-delimited piece of text, and I then use that function inside each row of the table I’m returning to get a readable version of what List.Difference() and List.Intersect() return.

Rather than looking at the distinct values in each column, though, you might want to do a row-by-row comparison. Here’s another query that does that:

let

    Source = Excel.CurrentWorkbook(){[Name="Source"]}[Content],

    Target = Excel.CurrentWorkbook(){[Name="Target"]}[Content],

    SourceList = Table.ToList(Source),

    TargetList = Table.ToList(Target),

    PositionList = List.Positions(SourceList),

    RowComparison = List.Transform(

                      PositionList, 

                      each 

                       if 

                        SourceList{_}=TargetList{_} 

                       then 

                        {_+1, "No Change",  SourceList{_} } 

                       else 

                        {_+1, "Change", "Source: " & SourceList{_} & 

                        ", Target: " & TargetList{_} }),

    ResultsTable = Table.FromRows(RowComparison, {"Row Number", "Changed?", "Comparison"})

 

in

    ResultsTable

Here’s the output:

image

Again I’m turning each table into a list, and then I’m using List.Positions() to generate a list of integer values from 0 to 9 representing the index of each item in the source list, then using List.Transform() to iterate over each item in this list and compare the values at the given index in the source and target list.

Frankly, an even easier way of doing this might have been to import both tables in separate queries, add an index column to both of them using the Insert Index Column button, then join the two tables together on the index column using the Merge button and then finally create some custom columns to do the comparison. This is certainly how any end-user would do it, but the resulting code is a bit less elegant I didn’t learn anything interesting about M from doing it that way. I’ve left the example in the demo workbook, which you can download here.

PS Even if you have voted for me already in the Power BI competition, please vote for me again (you can vote once every 24 hours)! Here’s the link:

CLICK HERE TO VOTE FOR ME IN THE POWER BI COMPETITION! EVEN IF YOU’VE VOTED ALREADY YOU CAN DO SO AGAIN!

Written by Chris Webb

January 27, 2014 at 10:02 pm

Posted in Power Query

12 Responses

Subscribe to comments with RSS.

  1. The more I see of the power of these PowerBI tools, the more disappointed I am that that the functionality is not accessible through VBA.

    dch1dch2

    January 27, 2014 at 11:31 pm

    • I know, it would be really useful to have this. I guess it’s on the “to-do” list, though probably quite low down.

      Chris Webb

      January 27, 2014 at 11:35 pm

  2. Hi Chris,
    thanks for these insights!

    BR
    Patrick

    Patrick

    January 29, 2014 at 1:55 pm

  3. […] 4. Comparing Columns In Power Query […]

  4. Hi Chris,
    Your example is amazing and I have tried to repro it with 2 multi-column tables. I’m not able to specify in my formula ToList, the 2 columns I want to compare (or maybe it’s in the source).
    Any idea to help me ?

    Frederic Le Guen

    March 3, 2014 at 3:27 pm

    • Can you post the code for your query?

      Chris Webb

      March 3, 2014 at 3:32 pm

      • Sure, here it is
        let
        Source = Excel.CurrentWorkbook(){[Name="Table_Employee"]}[Content],
        Target = Excel.CurrentWorkbook(){[Name="Table_BML"]}[Content],
        SourceList = Table.ToList(Source),
        TargetList = Table.ToList(Target),
        InSourceNotTarget = List.Difference(SourceList, TargetList),
        InTargetNotSource = List.Difference(TargetList, SourceList),
        InTargetAndSource = List.Intersect({SourceList, TargetList}),
        CombineWithComma = Combiner.CombineTextByDelimiter(“, “),
        ResultsTable = Table.FromRows(
        {
        {“In Source but not in Target”, CombineWithComma(InSourceNotTarget)},
        {“In Target but not in Source”, CombineWithComma(InTargetNotSource)},
        {“In both Target and Source”, CombineWithComma(InTargetAndSource)}
        },
        {“Comparison Type”, “ListResult”}
        )

        in
        ResultsTable

        I want to compare 2 columns with phone number. In the first table the column name is ‘Phone Number’ and in the second table it’s ‘Caller’

        By the way, what is your Wp Add-in name to log with Facebook before to post a comment. It’s pretty cool :)

        Frederic Le Guen

        March 3, 2014 at 4:14 pm

      • I think the problem is that you are using Table.ToList to convert your whole table to a list. What you need to do is Table.Column(Source, “Phone Number”) to get a list that contains only the values from the column in the table you want.

        Chris Webb

        March 4, 2014 at 1:25 pm

  5. No, it doesn’t work :(
    If you want, I can send you the file by mail (you will find mine in the site mvp of microsoft)

    Frederic Le Guen

    March 5, 2014 at 3:24 pm

  6. Not sure what Frederic ended up doing to compare his phone number columns within two tables, but I was able to get Table.Column to work this way:

    Source = Excel.CurrentWorkbook(){[Name="MySourceTable"]}[Content],
    Target = Excel.CurrentWorkbook(){[Name="MyTargetTable"]}[Content],
    SourceList = Table.Column(Source, “phone_number”),
    TargetList = Table.Column(Target, “phone_number”),

    ginameronek

    March 14, 2014 at 9:24 pm

    • I sorted out Frederic’s problem offline – he was getting errors because his phone numbers were (unsurprisingly) numeric, and this broke the code I was using to construct the comma-delimited list for the output.

      Chris Webb

      March 14, 2014 at 9:26 pm


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 3,302 other followers

%d bloggers like this: