Presentation is loading. Please wait.

Presentation is loading. Please wait.

Performance Tuning Cubes and Queries in Analysis Services 2008 Chris Webb

Similar presentations


Presentation on theme: "Performance Tuning Cubes and Queries in Analysis Services 2008 Chris Webb"— Presentation transcript:

1 Performance Tuning Cubes and Queries in Analysis Services 2008 Chris Webb chris@crossjoin.co.uk www.crossjoin.co.uk chris@crossjoin.co.uk www.crossjoin.co.uk

2 Who am I? Chris Webb (chris@crossjoin.co.uk)chris@crossjoin.co.uk Independent consultant specialising in Analysis Services and MDX: http://www.crossjoin.co.uk http://www.crossjoin.co.uk Blogger: http://cwebbbi.spaces.live.comhttp://cwebbbi.spaces.live.com Author: – MDX Solutions 2 nd Edition – Expert Cube Development With Analysis Services 2008

3 Agenda Designing for Performance How Analysis Services Answers Queries Partitions Aggregations Formula Engine bottlenecks Caching

4 Designing for Performance Some features give increased flexibility at a cost to performance – Eg parent/child hierarchies, referenced relationships, unnatural hierarchies, many-to-many, distinct count, semi-additive measures... Mistakes made in modelling the data warehouse are often ‘papered over’ using SSAS features – Eg calculations being done in MDX when they should have been done in the ETL Design decisions should always be made with query performance in mind

5 Attribute Relationships Attribute relationships describe the one-to- many relationships between attributes in your dimensions Will probably not be optimal after you’ve run the Dimension wizard – only knows about foreign key relationships Are very important for query performance

6 Parent/Child Hierarchies Parent/child hierarchies are very flexible, but can perform badly sometimes – Especially where the hierarchy contains thousands of members If you know the maximum depth of the hierarchy, use a ragged hierarchy instead – Although ragged hierarchies can also perform badly The ‘parent/child hierarchy naturaliser’ in BIDS Helper can do this automatically

7 SSAS Query Processing Formula Engine works out what data is needed for each query, and requests it from the Storage Engine handles retrieval of raw data from disk, and any aggregation required MDX Query In Cellset Out Query Subcube Requests Cache Disk

8 Tuning the Storage Engine Tuning the Storage Engine is all about: – Reducing the time taken to read data from disk – Reducing the time taken to aggregate that data to the required granularity As a result, disk and CPU have an important role to play in Storage Engine performance However, throwing hardware at the problem is rarely the whole answer

9 Partitions Measure groups can be divided up into multiple partitions – Enterprise Edition only You should partition your measure groups to reflect the slices that users are likely to use in their queries – Usually by the Time dimension Partitioning improves query performance in two ways: – It reduces overall IO because AS should only scan the partitions containing the data requested – It increases parallelism because AS can scan more than one partition simultaneously

10 Partitions You should be partitioning so that a single partition contains between 2 million rows and 30 million rows of data Although SSAS should auto-detect the slice of a partition, it is good practice to set the Slice property manually, even on MOLAP partitions In Profiler, Progress Report Begin/End events show which partitions are being read

11 Aggregations Aggregations are pre-calculated sets of summary values – Similar to what is returned by a GROUP BY query in SQL Building aggregations improves query performance by reducing the amount of calculation done at query time Aggregations start to show an effect on regular Sum measures where partition size is greater than a few million rows A Measure Group can have multiple Aggregation Designs A Partition can be associated with one Aggregation Design – aggregations are built on a per-partition basis

12 Aggregation Design Methodology 1.Make your dimension design as ‘clean’ as possible – delete or disable unnecessary attributes, set attribute relationships 2.Set the AggregationUsage property appropriately on each Cube dimension 3.Run the Aggregation Design Wizard 1.First using the ‘I Click Stop’ option 2.Then, if this creates too many aggregations, using ‘Performance Gain’ of 30%

13 Aggregation Design Methodology 4.Set up Query Logging and run Usage-Based Optimisation 5.If necessary, design aggregations manually using BIDS Helper or the Aggregation Design tab 6.Check aggregations are being used by looking for the Get Data From Aggregation event in Profiler

14 The Formula Engine If Storage Engine requests make up a negligible percentage of the time taken by the query, then the Formula Engine is the problem – Profiler and MDX Studio can be used to find out if this is the case – MDX Script Performance Analyser can help to work out which, if any, calculation on the cube is the problem Another giveaway sign is that when a query runs on a multi-processor server, only one CPU shows activity – The Formula Engine is single-threaded Rewriting MDX calculations can result in massive gains in performance

15 Writing Efficient MDX As with all code, using efficient algorithms is very important – It’s important to avoid doing expensive operations more than is necessary – Named sets can store the result of expensive set operations – Calculated members can cache the result of expensive numeric calculations In the implementation of these algorithms, it’s important to know which MDX functions perform best – MDX Studio can tell you which functions to avoid – Will vary by version and SP Calculations that execute in ‘bulk mode’ can perform many times better than those that execute in ‘cell-by-cell’ mode

16 Caching Caching can take place in the Storage Engine and the Formula Engine Caching is extremely important to query performance SE caching is very reliable, and caches can be shared between all users FE caching is much more temperamental...

17 Formula Engine Caching There are three different ‘scopes’ or lifetimes of a FE cache: – Query – for calculations defined in the WITH clause of a query, the FE values can only be cached for the lifetime of the query – Session – for calculations defined for a session, using the CREATE MEMBER statement executed on the client, FE values can only be cached for the lifetime of a session – Global – for calculations defined in the cube’s MDX Script, FE values can be cached usually until processing takes place Global scope is best from a performance point of view!

18 Restrictions on FE Caching Any calculations defined in the WITH clause of a query will force query scope for the FE cache As will the use of cell security As will the use of non-deterministic functions like NOW() With dimension security, calculation caches can only be shared between users who have the same permissions

19 Exploiting FE Caching Caching numeric values, so that expensive calculations are only performed once, can be very important for calculation performance Use of certain MDX functions and constructs can force Query scope for FE caching, and hurt performance

20 Cache Warming Data can be loaded into the SE cache by executing CREATE CACHE statements The FE cache can only be warmed by running MDX queries Cache warming involves automatically loading data into the cache to improve query performance for real users – SSIS packages can be used to do this

21 Thanks!

22 Links http://www.packtpub.com/article/query- performance-tuning-microsoft-analysis- services-part1 http://www.packtpub.com/article/query- performance-tuning-microsoft-analysis- services-part2 http://www.packtpub.com/article/query- performance-tuning-microsoft-analysis- services-part1 http://www.packtpub.com/article/query- performance-tuning-microsoft-analysis- services-part2


Download ppt "Performance Tuning Cubes and Queries in Analysis Services 2008 Chris Webb"

Similar presentations


Ads by Google