Friday, March 4, 2011

How designing proper partitions helps in improving MDX query performance?


You can improve and enhance mdx query performance by using proper partition strategies. If your measure group contains a single partition and if a partition is very large then it’s always a good practice to split a single partition into multiple smaller partitions. If your fact table contains huge data for multiple years then you can create multiple partitions as per year. Just for a example consider your fact table contains data for 4 years i.e. from year 2002 to 2006 and you have created 4 partitions i.e. one partition for each year. Let’s say a partition named “Partition_2002” contains data for year 2002 only, a partition named “Partition_2003” contains data for year 2003 only and so on up to year 2006. Now whenever you execute MDX query to fetch data for year 2003, analysis services will scan “Partition_2003” partition only. Let’s discuss this scenario with Adventureworks sample database, adventureworks sample contains “Reseller Sales” measure group which contains 4 partitions i.e. Reseller_Sales_2001, Reseller_Sales_2002, Reseller_Sales_2003 and Reseller_Sales_2004. Reseller_Sales_2001 partition contains data up to year 2001, Reseller_Sales_2002 partition contains data for year 2002, Reseller_Sales_2003 partition contains data for year 2003 and partition Reseller_Sales_2004 contains data for year 2004. Now let’s demonstrate this with some mdx example. Before executing mdx, clear analysis services cache. For 
clearing cache, execute following Clear cache xmla statement.

<ClearCache xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Adventure Works DW 2008</DatabaseID>
  </Object>
</ClearCache>

After clearing cache, Open SQL Profiler and connect to analysis services instance. Execute following MDX query for retrieving data of [Reseller Sales Amount] for year 2003. 

SELECT
{[Measures].[Reseller Sales Amount]} ON COLUMNS
FROM [Adventure Works]
WHERE
([Date].[Calendar].[Calendar Year].&[2003])
 
After execution of mdx query if you see Profiler, you will find following text in the TextData column;


This indicates that analysis services scanned only 2003 partition i.e. Reseller_Sales_2003 partition and ignored unnecessary scan of other partitions (i.e.'Reseller_Sales_2001', 'Reseller_Sales_2002' and 'Reseller_Sales_2004'). So this means, if you split a large partition into smaller multiple partitions then analysis services scans only required partition i.e. Partitions which contain required data in the subcube are queried only, thus avoiding the cost of reading unnecessary data which in turn helps in improving MDX query performance.

Even while doing Partitioning, setting Slice property on partitions is very important. Sometimes you will find that, though you have multiple partitions (let’s say one partition for each year) but when you execute mdx query, profiler show scans of other partitions also and such behavior seems if we have not set the Slice property either. Let’s discuss this with some sample demonstration, Clear the cache and execute following mdx query;

SELECT
{[Measures].[Internet Sales Amount]} ON COLUMNS
FROM [Adventure Works]
WHERE
([Date].[Calendar].[Calendar Year].&[2003])


After execution of mdx query if you see Profiler, you will find following text in the TextData column;



So though you have executed mdx query to fetch data for year 2003 only but analysis services scanned two extra partitions (i.e. partition 'Internet_Sales_2001' and 'Internet_Sales_2002').
Now set Slice property for each partition i.e. set slice property for a partition 'Internet_Sales_2001' as [Date].[Calendar].[Calendar Year].&[2001], set slice property for a partition 'Internet_Sales_2002' as [Date].[Calendar].[Calendar Year].&[2002],
Set slice property for a partition 'Internet_Sales_2003' as [Date].[Calendar].[Calendar Year].&[2003] and set slice property for a partition 'Internet_Sales_2004' as [Date].[Calendar].[Calendar Year].&[2004].


and after that do ProcessFull of “Internet Sales” measure group.
After processing,clear the cache.Open SQL profiler and execute same mdx for retrieving data of [Measures].[Internet Sales Amount] for year 2003,

SELECT
{[Measures].[Internet Sales Amount]} ON COLUMNS
FROM [Adventure Works]
WHERE
([Date].[Calendar].[Calendar Year].&[2003])

Now if you notice a Profiler TextData,you will find that analysis services is scaning only 'Internet_Sales_2003' partition.


So when you split large partition into smaller partitions,always set appropriate Slice property on partitions.

Even Sometimes when you execute mdx query,analysis services may retrieve more data than required to answer future queries more efficiently. this behaviour is called Prefetching and you can turned off this by setting connection string parameter "Disable Prefetch Facts=True; Cache Ratio=1





2 comments:

  1. This is the authority of a student that how he/she makes the best usage of the given data and if still finds any problem so essay essay writers can be the best source to be used.

    ReplyDelete
  2. Nice and good article. It is very useful for me to learn and understand easily. Thanks for sharing your valuable information and time. Please keep updating MicroStrategy Online Training Hyderabad

    ReplyDelete