Chris Webb's BI Blog

Analysis Services, MDX, PowerPivot, DAX and anything BI-related

PDW and SSAS

with 12 comments

One new feature of SQL Server PDW 2012 that hasn’t had the attention it deserves is the fact that it is now officially supported as a data source for Analysis Services, both Multidimensional (in ROLAP and MOLAP modes) and Tabular (in In-Memory and DirectQuery modes). If you are working with extremely large data volumes in SSAS then PDW might be something you want to do some research on. For SSAS Multidimensional in MOLAP mode or Tabular models in In-Memory mode, using PDW as a data source should make processing run extremely quickly. For SSAS Multidimensional in ROLAP mode or Tabular models in DirectQuery mode, it can give you interactive query access to data volumes that MOLAP/In-Memory simply couldn’t handle (remember though that DirectQuery only works with DAX queries, so Excel PivotTables don’t work with it, only Power View).

There are a few public sources of information on PDW/SSAS integration. One is the white paper on PDW that you can download from Henk van der Valk’s blog here:
http://henkvandervalk.com/introduction-to-sql-server-2012-parallel-data-warehouse

Here’s the relevant section:

New in SQL Server 2012 PDW, you can use PDW as a high performance relational data source for building multidimensional OR tabular models with SQL Server Analysis Services (SSAS). For example, you can:

· Use DirectQuery to perform real-time queries against a SQL Server PDW data source from a tabular model.

· Reduce ROLAP query times by using clustered columnstore indexes on SQL Server PDW tables.

· Use new features of SSAS. For example, use EnableRolapDistinctCountOnDataSource to run distinct count operations on PDW tables.

· Use PDW as a data warehouse for aggregating Hadoop data for use in multidimensional or tabular models.

The DirectQuery page in Books Online says similar things:
http://msdn.microsoft.com/en-us/library/hh230898.aspx

Some quotes:

In contrast, a tabular model in DirectQuery mode uses data that is stored in a SQL Server database, or in a SQL Server PDW data warehouse. At design time, you import all or a small sample of the data into the cache and build your model as usual. When you are ready to deploy the model, you change the operating mode to DirectQuery. After you change the operating mode, any queries against the model will use the specified relational data source (either SQL Server or SQL Server PDW), not the cached data.

DirectQuery can take advantage of provider-side query acceleration, such as that provided by xVelocity memory optimized column indexes. xVelocity columnstore indexes are provided in both SQL Server 2012 and SQL Server PDW, to support improved DirectQuery performance.

It’s a shame there isn’t more information out there though. This post from Michael Mukovskiy has some interesting findings on using PDW v1 as a data source for SSAS:
http://blog.oraylis.de/2013/07/pdw-cube-processing-experience/
…presumably PDW 2012 would give even better results.

Written by Chris Webb

January 31, 2014 at 1:19 pm

Posted in Analysis Services, PDW

12 Responses

Subscribe to comments with RSS.

  1. It would be nice having a PDW to test :-)

    Marco Russo

    January 31, 2014 at 2:07 pm

    • I have been to the PDW v2 course and the problem is that this is a product that has NDA written all over it. It is impossible to discuss shortcomings and benefits because it is not a standard product. This is not unique to MS though!

      thomasivarssonmalmo

      January 31, 2014 at 9:29 pm

      • Can you please be more specific? I have seen public discussions about PDW on several forums.

        Andrey

        February 1, 2014 at 4:57 pm

  2. Andrey. I was told that at the course one year ago and what people discuss on other forums is not interesting.

    thomasivarssonmalmo

    February 2, 2014 at 9:49 pm

  3. Reading Michael’s notes from his experience I presume that PDW v2 may give you much better results comparing to v1. Page compression and Columnstore indexes dramatically improved performance of whole appliance.

    One more to add: the improvement in ROLAP partition processing is in the first part when data is retrieved from your source (PDW in this case).

    Hubert Kobierzewski

    February 3, 2014 at 10:07 am

  4. Reblogged this on Can Atuf Kansu.

    The Newbie

    February 3, 2014 at 10:14 am

  5. For more information about PDW, have you seen the website http://www.upgradetopdw.com? There are two papers there.

    The first one gives an overview of PDW and describes the benefits of V2:

    http://download.microsoft.com/download/5/0/1/5015A62E-06BF-4DEC-B90A-37D52E279DE5/SQL_Server_2012_Parallel_Data_Warehouse_Breakthrough_Platform_White_Paper.pdf.

    The second one describes the process of migrating to PDW:

    http://download.microsoft.com/download/4/2/6/42616D71-3488-46E2-89F0-E516C10F6576/SQL_Server_to_SQL_Server_PDW_Migration_Guide.pdf

    Barbara Kess

    February 4, 2014 at 12:27 am

  6. […] and I have to lower my voice again, I have to agree with Chris Webb that there is almost no information about it out there. So enough reason to write about this truly […]

  7. […] PDW and SSAS […]

  8. ROLAP seems to be good idea for scaling SSAS. However, we see not correct behaviour in few cases,which causes performance issues. For example SSAS ‘likes’ to SUM all measures in the fact table even if you need only one and etc. Are there any tips and trics on how to implement ROLAP via MS SSAS?

    Dio

    April 16, 2014 at 8:43 am


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 3,302 other followers

%d bloggers like this: