Thoughts on how PowerPivot and SSAS could work together
After yesterday’s stream of consciousness on how PowerPivot could be used in SSRS, here’s a follow-up post on how PowerPivot and ‘traditional’ SSAS could be integrated. Hold on, you say, surely that’s a no-brainer? Surely all that would need to happen would be that Vertipaq would become a new storage mode inside SSAS, along with MOLAP, ROLAP and HOLAP, and everyone would be happy? Well, maybe. But here’s alternative idea that I bounced off some friends a while back and got good feedback on, which I thought I’d air here.
Before I go on, let me state my position on some things:
- I like PowerPivot, and the more I use PowerPivot the more I like it.
- I really like the power of the Vertipaq engine, and I want to be able to use it in a corporate BI environment.
- I really like DAX, and I want to be able to use it in a corporate BI environment.
- BUT SSAS as we have it today is a very mature, rich tool that I don’t want to lose. PowerPivot models will always be a little rough-and-ready; a good SSAS cube is a lot more ‘finished’ and user-friendly (I always liken building a cube to building a UI). SSAS dimension security is, for example, an absolute killer feature in many corporate BI solutions; PowerPivot won’t have anything like this until at least the next version, whenever that will be.
- I also love MDX and I don’t want to lose it. MDX Script assignments, calculated members on non-measures dimensions, all of the things that PowerPivot and DAX can’t do (and probably won’t ever do) are things that I use regularly and in my opinion are essential for many complex, enterprise BI implementations.
- I don’t want the dev team to abandon corporate SSAS, and neither do I want the dev team to waste time re-implementing things in PowerPivot that we already have in corporate SSAS. Already people are asking when they can have security and partitioning in PowerPivot. I want new stuff though!
So, like all users I want absolutely everything possible. How could it be done? Here’s my basic idea: let us be able to build regular SSAS cubes using PowerPivot models as data sources, with SSAS working in something similar to ROLAP mode so every request for data from the cube is translated into an MDX (or SQL – remember SSAS, and presumably PowerPivot, supports a basic version of SQL) query against the PowerPivot model.
In more detail, let’s imagine we have an instance of SSAS running in Vertipaq mode and an instance of SSAS running in normal mode. You’d be able to do the following:
- Fire up BIDS and create a new SSAS project.
- Create a data source, which was a PowerPivot database on your Vertipaq instance of SSAS.
- Create a new Data Source View, which showed all of the tables in your PowerPivot database already joined. Alternatively, here I can imagine connecting to other data sources like SQL Server, creating a Data Source View as normal and then taking the DSV and deploying it as a PowerPivot model onto the Vertipaq instance of SSAS. So in effect, the DSV designer becomes a development environment for PowerPivot models.
- Create a regular SSAS cube in the usual way, only using the PowerPivot tables in the DSV.
- Set the storage mode of your dimensions and partitions to the new ROLAP-like storage mode; each SSAS partition could then be based on a separate PowerPivot table. This would mean that when you queried the cube, the SSAS instance issued MDX or SQL queries against the Vertipaq instance of SSAS, just as it issues SQL queries in ROLAP mode today. I suppose though there would be an overhead to making an out-of-process call, so maybe it would be better if you only had one instance of SSAS that could host both Vertipaq and regular SSAS databases at the same time, so all these requests could stay in-process.
The first, obvious, point here is that with this approach we get the traditional, rich SSAS cubes that we know and love and the raw speed of Vertipaq. So one objective is achieved. But I think there would be a lot of other benefits:
- You’d get two cubes for the price of one: the PowerPivot cube and the SSAS cube. You could choose which one to query depending on your needs.
- The ability to turn DSVs into PowerPivot models also gives you a proper development environment for creating PowerPivot models, integrated with BIDS and Visual Studio (so you also get source control). The current Excel-based UI is all very well, but us developer types want a nice visual way of creating relationships between tables.
- You’re able to use all of the new data sources that PowerPivot can work with in traditional SSAS. Imagine being able to create a planning and budgeting solution where users wrote values into an Excel Services spreadsheet, which then fed into PowerPivot via the new Excel Services REST API, which then in turn fed into a SSAS planning and budgeting cube complete with custom rollups and all the complex financial calculations you can only do in MDX.
- If your users have already built an existing PowerPivot model that they like and want to turn into an ‘official’ BI solution, you can very easily take that model as the starting point for building your cube by importing it into a DSV.
- It would also make it relatively easy to upgrade existing SSAS projects to use PowerPivot storage – you’d just convert your existing DSV into a PowerPivot model.
- SSAS drillthrough would be much, much faster because you’d be drilling through to the PowerPivot model and not the underlying relational source.
- You’d also have the possibility of working in something like HOLAP mode. Vertipaq may be fast, but with really large data volumes some pre-calculated aggregations are always going to be useful.
- You could define calculated measures in DAX in the PowerPivot model, and then expose them as measures in the SSAS cube. Probably you’d need some special way of handling them so they didn’t get aggregated like regular measures, but in some cases you’d want to take a calculated measure and sum it up like a regular measure (kind of like SQL calculations defined in named calculations today); many more calculations, like year-to-dates, can be treated as semi-additive measures. Effectively this means you are performing some multidimensional calculations outside the Formula Engine, in the SSAS Storage Engine (which in this case is PowerPivot), in the same way I believe that measure expressions work at the moment.
- For such additive and semi-additive calculations, it also opens up the possibility of parallelism since these calculations can be done in parallel in each partition and the result summed at the end. It also means you get the option to use either DAX or MDX, and can choose the right language for the job.
- There’s no duplication of dev work needed. For users of PowerPivot who want features like security, partitioning or parent/child relationships, you tell them they have to upgrade to regular SSAS; PowerPivot becomes something like SSAS Express. For users of SSAS who want the speed of Vertipaq, you tell them they have to use a PowerPivot database as their data source. The two complement each other nicely, rather like twins… now where have I heard that analogy before?
- You also have a convincing story for BI professionals who are sceptical/hostile to PowerPivot to win them over: traditional, corporate SSAS does not go away but is able to build on the new features of PowerPivot.
So there we have it, another fantasy on the future of the MS BI stack sketched out. You may be wondering why I’ve taken the time to write these two posts – after all, I don’t work for Microsoft and I’m sure plenty of people on the dev team have their own ideas on what features they want to implement for Denali. Well, as the saying goes, if you don’t ask you don’t get! And with Kilimanjaro almost out of the door now’s the time to ask. If you agree with what I’ve said here, or you disagree, or you have a better idea, please leave a comment…