Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

Power Query Book Published!

with 8 comments

Looking for some summer holiday (or winter holiday, depending on which hemisphere you live in) reading? If so, may I suggest my new Power Query book? “Power Query for Power BI and Excel” is available now from the Apress site, Amazon.com, Amazon.co.uk and all good bookstores.

Power Query for Power BI and Excel Cover Image

It’s an introductory level book. It covers all of the stuff you can do in the UI, it has a chapter on M, and it goes into a reasonable amount of detail on more advanced topics; it is not a 500-page exhaustive guide to the product. I’ve focused on readability and teaching the fundamentals of Power Query rather than every looking at every obscure M function, but at the same time if you’ve already used Power Query I think there’ll be plenty of material in there you’ll find interesting.

Now for the bad news: the book is out-of-date already, although not by much. One of the best things about Power Query is the monthly release cycle; unfortunately that makes writing a book on it a bit of a nightmare. I started off writing in January and had to deal with lots of added functionality and changes to the UI over the next few months; I had to retake pretty much all of the screenshots as a result. The published version of the book is based on the version of Power Query that was released in early June rather than the current version. Hopefully you can forgive this – the differences are minor – but it’s a good reason to buy the book as soon as you can! I want to do a second edition in a year’s time once (if?) the release cycle slows down.

I’ve been teased a bit for blogging and teaching so much about Power Query recently, so the final thing I want to say here is why an old corporate BI/SSAS guy like me is getting so excited about a self-service ETL tool. Well, the main reason is that Power Query is a great piece of software. It does what it does very well; it does useful things rather than what the marketing guys/analysts/journalists think is hot in BI; it is easy to use but at the same time is flexible enough for the advanced user to do really complex stuff; it is updated regularly based on feedback from its users. I only wish all Microsoft software was this good… Honestly, I wouldn’t be able to motivate myself to blog and write about Power Query if I didn’t think it was cool, and even though it hasn’t been hyped in the same way as other parts of the Power BI stack it is nonetheless the part that people get excited about when I show them Power BI. It’s not just me either – every day I see positive comments like Greg Low’s here. I think it is as important, if not more important, than Power Pivot and I think it will be a massive success.

Oh, and did I mention that I’m also teaching a Power Query course in London later this year….?

Written by Chris Webb

July 12, 2014 at 3:09 pm

Posted in Books, Power BI, Power Query

8 Responses

Subscribe to comments with RSS.

  1. Congrats on your new book!

    Have a challenge for you. Say there is .csv that continuously updates to 30 days of observations.

    http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.csv

    Can Power Query run after forming the initial query and only update new records?

    David Hager

    July 14, 2014 at 11:01 pm

    • Hi David,

      The short answer is no, Power Query always has to reload all of the data every time. However you can get around this in the way I detail here: http://cwebbbi.wordpress.com/2013/05/13/accumulating-data-in-an-excel-table-using-data-explorer-and-powerpivot/ although I don’t think this is a particularly elegant solution and it’s not one I’d feel comfortable relying on.

      Chris

      Chris Webb

      July 14, 2014 at 11:06 pm

      • Well, then I guess I will have to pull out my trusty VBA sword and get to work.

        David Hager

        July 14, 2014 at 11:14 pm

    • There is a fairly simple method to get PowerQuery to do this work. I whipped a file together to do something like this that involves a little setup, but 3 quick and easy to expand on queries and a simple update process.
      Setup 3 queries in a book Query1, Query2, Query3. Query1 will not load to the workbook, but make it just a reference to statically named file that gets updated. The query text will be like:

      Csv.Document(File.Contents(“C:\Data 02.txt”),null,”,”,null,1252)
      // notice no need for ‘let … in’ in such short code

      Query2 is going to compare the existing (held in query 3) to the new file (query 1) I am going to explain this one last because it is the most complex and will change based on your situation.

      Query3 is going to function as alternatively a placeholder of old data for basing query2 off of, and a repository of the completed data after running Query2 is complete (a copy of the end result of Query2). Rather than reference Query2, we reference the output of query 2 as an Excel Table. The query code is simply:

      Excel.CurrentWorkbook(){[Name="Query2"]}[Content]
      //Again no need for “Let … In”.
      //NOTE: this follows the convention that the Excel output tables are given the same name as the queries on which they are based.
      //This is an important technical distinction and procedural imperative for the example)

      So, back to Query2. This query takes Query2 as given rows that should not be duplicated. The table is transformed into a list of records for compatibility with the next function with Table.ToRecords. THen We use Table.RemoveMatchingRecords on Query1 (which presumably has the duplicates), this subtracts those duplicate rows leaving only the new data. Then we simply combine Query3 with that new data. The whole thing is as short as:

      Table.Combine( { Query3, Table.RemoveMatchingRows(Query1, Table.ToRecords(Query3)) } )
      //Again a one-liner with no need for “let … in” however the line could be decomposed if desired.

      I hope this helps to anyone interested. The key take-away is that the results of a query can be considered static for downstream queries by using Excel.CurrentWorkbook on the loaded results of a query. This is also sometimes desirable when a portion of a process takes a very long time to load but can be considered static at some intermediate step to separate the bottleneck from additional downstream processing that need not suffer from the bottleneck (especially useful during query development).

      As a result, I can also make the entire thing above self referencing via

      let

      Q1 = Csv.Document(File.Contents(“C:\Data 02.txt”),null,”,”,null,1252),

      Self = Excel.CurrentWorkbook(){[Name = "Query123"]}[Content],

      CombineQ1AndSelf = Table.Combine( { Self, Table.RemoveMatchingRows(Q1, Table.ToRecords(Self)) } )
      in
      CombineQ1AndSelf //assumes query & output table both named “Query123″

      Derik

      August 7, 2014 at 5:51 pm

  2. Congrats on getting your book out there, Chris! Keep up the good work :)

    scottsen

    July 17, 2014 at 1:12 am

  3. Sample files for the book?

    Hi Chris,
    Book arrived from Amazon a couple of weeks ago … just starting to read it now.
    With other books … Marco & Alberto, Rob Collie etc there was an attached disk with example workbooks or else it was possible to download the files. I cannot find an equivalent for the Power Query book. It would make reading and understanding a lot easier if you can replicate what is going on in the book.

    How can I get my hands on the samples .. for example there is reference to 01_01_SimpleSales.csv in Chapter 1.

    Help!

    Ted

    Ted Murphy

    July 29, 2014 at 9:45 am


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 3,113 other followers

%d bloggers like this: