Presentation on theme: "Common Analysis Services Design Mistakes and How to Avoid Them"— Presentation transcript:
1 Common Analysis Services Design Mistakes and How to Avoid Them
- Chris Webb
2 Who Am I?
- Chris Webb
- email@example.com
- Independent Analysis Services and MDX consultant and trainer
- SQL Server MVP
- Blogger:
3 Agenda
- Why good cube design is a Good Thing
- Using built-in best practices in BIDS
- ETL in your DSV
- User-unfriendly names
- Unnecessary attributes
- Parent/child pain
- One cube or many?
- Over-reliance on MDX
- Unused and/or unprocessed aggregations
4 Why Good Design is Important!
- As if you needed reasons…?
- Good design = good performance = faster initial development = easy further development = simple maintenance
- This is not an exhaustive list, but a selection of design problems and mistakes I’ve seen on consultancy engagements
5 Best Practices in BIDS
- Don’t ignore the blue squiggly lines in BIDS!
- They sometimes make useful recommendations about what you’re doing
- Actively dismissing them, with comments, is a useful addition to documentation
- As always, official ‘best practices’ aren’t always best practices in all situations
6 Common Design Mistakes
- Three questions need to be asked:
  - What’s the problem?
  - What bad things will happen as a result?
  - What can I do to fix it (especially after I’ve gone into production)?
- This is not a name-and-shame session!
7 Problem: ETL in your DSV
- It’s very likely, when you are working in SSAS, that you need changes to the underlying relational structures and data
  - E.g. you need a new column in a table
- You then have two options:
  - Go back to the relational database and/or ETL and make the change
  - Hack something together in the DSV using named queries and named calculations
- The DSV is the easy option, but…
8 Consequences: ETL in your DSV
- It could slow down processing performance
  - No way to influence the SQL that SSAS generates
  - Expensive calculations/joins are better done once then persisted in the warehouse; you may need to process more than once
- It makes maintenance much harder
  - DSV UI is not great for writing SQL
  - Your DBA or warehouse developer certainly won’t be looking at it
9 Fix: ETL in your DSV
- Bite the bullet and either:
  - Do the necessary work in the underlying tables or ETL packages
  - Create a layer of views instead of using named queries and calculations
- Use the Replace Table With option to point the table in the DSV at your new view/table
- No impact on the rest of the cube!
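
The "layer of views" idea can be sketched as follows. This is only an illustration: SQLite (via Python's standard library) stands in for the relational warehouse, and the table `DimCustomer`, the view `vDimCustomer` and the `FullName` column are hypothetical names, not taken from the talk. On SQL Server the concatenation syntax would differ.

```python
import sqlite3

# In-memory SQLite database standing in for the data warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical dimension table.
    CREATE TABLE DimCustomer (
        CustomerKey INTEGER PRIMARY KEY,
        FirstName   TEXT,
        LastName    TEXT
    );
    INSERT INTO DimCustomer VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');

    -- The derived column lives in a view in the database,
    -- so the DSV needs no named calculation for it.
    CREATE VIEW vDimCustomer AS
    SELECT CustomerKey,
           FirstName,
           LastName,
           FirstName || ' ' || LastName AS FullName
    FROM DimCustomer;
""")

rows = conn.execute(
    "SELECT CustomerKey, FullName FROM vDimCustomer ORDER BY CustomerKey"
).fetchall()
print(rows)
```

The DSV table would then be pointed at the view rather than the base table, which keeps the SQL where a DBA or warehouse developer can see and tune it.
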
10 Problem: Unfriendly Names
- Cubes, dimensions and hierarchies need to have user-friendly names
- However, names are often user-unfriendly
  - Unchanged from what the wizard suggests, or
  - Use some kind of database naming convention
- Designing a cube is like designing a UI
- Who wants a dimension called something like “Dim Product”…?
11 Consequences: Unfriendly Names
- Unfriendly names put users off using the cube
  - These are the names that users will see in their reports, so they must be ‘report ready’
  - Users need to understand what they’re selecting
- They also encourage users to export data out of the cube to ‘fix’ the names
  - And so you end up with stale data, multiple versions of the truth, etc.
12 Fix: Unfriendly Names
- You can rename objects easily, but:
  - This can break calculations on the cube
  - It can also break existing queries and reports, which will need rewriting/rebuilding
  - IDs will not change, which makes working with XMLA confusing
- You should agree the naming of objects with end users before you build them!
13 Problem: Unnecessary Attributes
- Wizards often generate attributes on dimensions that users don’t want or need
- Classic example is an attribute built from a surrogate key column
  - Who wants to show a surrogate key in a report?
14 Consequences: Unnecessary Attributes
- The more attributes you have:
  - The more cluttered and less useable your UI
  - The slower your dimension processing
  - The harder it is to come up with an effective aggregation design
15 Fix: Unnecessary Attributes
- Delete any attributes that your users will never use
- Merge attributes based on key and name columns into a single attribute
- Set AttributeHierarchyEnabled to false for ‘property’ attributes like addresses
- Remember that deleting attributes that are used in reports or calculations can cause more problems
16 Problem: Parent Child Hierarchies
- Parent Child hierarchies are the only way to model hierarchies where you don’t know the number of levels in advance
- They are also very flexible, leading some people to use them more often than they should
17 Consequences: Parent Child
- Parent Child hierarchies can lead to slow query performance
  - No aggregations can be built at levels inside the hierarchy
  - Slow anyway
- They can also be a nightmare for:
  - Scoping advanced MDX calculations
  - Dimension security
18 Fix: Parent Child
- If you know, or can assume, the maximum depth of your hierarchy, there’s an alternative
- Normal user hierarchies can be made ‘Ragged’ with the HideMemberIf property
  - Hides members if their parent has no name, or the same name as them
  - Still has performance issues, but less than parent/child
- You can use the BIDS Helper “parent/child naturaliser” to convert the underlying relational table to a level-based structure
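
As a rough sketch of what "naturalising" a parent/child table means, the function below flattens (key, parent key, name) rows into a fixed number of level columns. It is not the BIDS Helper implementation; the function name `naturalise`, the sample data and the choice to pad short branches by repeating the member's own name (so that a HideMemberIf setting based on the parent's name could hide the padding) are all assumptions made for illustration.

```python
def naturalise(rows, max_depth):
    """Flatten parent/child rows into fixed-depth level lists.

    rows: iterable of (key, parent_key, name); parent_key is None for the root.
    Returns {key: [level_1_name, ..., level_max_depth_name]}.
    """
    by_key = {key: (parent, name) for key, parent, name in rows}
    result = {}
    for key in by_key:
        path = []
        current = key
        # Walk up from the member to the root, collecting names.
        while current is not None:
            parent, name = by_key[current]
            path.append(name)
            if len(path) > max_depth:
                # Also stops a cyclic parent chain from looping forever.
                raise ValueError(f"member {key!r} is deeper than {max_depth} levels")
            current = parent
        path.reverse()  # root first, member last
        # Pad short branches by repeating the member's own name.
        path += [path[-1]] * (max_depth - len(path))
        result[key] = path
    return result


# Hypothetical employee table: (key, parent_key, name).
employees = [
    (1, None, "CEO"),
    (2, 1, "VP Sales"),
    (3, 2, "Sales Rep"),
    (4, 1, "PA"),
]
levels = naturalise(employees, max_depth=3)
for key in sorted(levels):
    print(key, levels[key])
```

Each resulting list maps to the level columns of an ordinary dimension table, on which a normal user hierarchy can then be built.
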
19 Problem: One Cube or Many?
- When you have multiple fact tables do you create:
  - One cube with multiple measure groups?
  - Multiple cubes with one measure group?
- Each has its own pros and cons that need to be understood
20 Consequences: One Cube
- Monster cubes containing everything can be intimidating and confusing for users
- Also tricky to develop, maintain and test
  - Often changing one thing breaks another
  - Making changes may take the whole cube offline
- Securing individual measure groups is a pain
- If there are few common dimensions between measure groups and many calculations, query performance can suffer
21 Consequences: Multiple Cubes
- If you need to analyse data from many cubes in one query, options are very limited
- A single cube is the only way to go if you do need to do this
- Even if you don’t think you need to do it now, you probably will do in the future!
22 Fix: One Cube to Multiple
- If you have Enterprise Edition, Perspectives can help overcome usability issues
- Linked measure groups/dimensions can also be used to split out more cubes for security purposes
- If you have one cube, you probably don’t want to split it up though
23 Fix: Multiple Cubes to One
- Start again from scratch!
- LookUpCube() is really bad for performance
- Linked measure groups and dimensions have their own problems:
  - Duplicate MDX code
  - Structural changes require linked dimensions to be deleted and recreated
24 Problem: Over-reliance on MDX
- As with the DSV, it can be tempting to use MDX calculations instead of making structural changes to cubes and dimensions
- A simple example is to create a ‘grouping’ calculated member instead of creating a new attribute
- Other examples include pivoting measures into a dimension, or doing many-to-many (m2m) relationships in MDX
25 Consequences: Over-reliance on MDX
- MDX should always be your last resort:
  - Pure MDX calculations are always going to be the slowest option for query performance
  - They are also the least-easily maintainable part of a cube
  - The more complex calculations you have, the more difficult it is to make other calculations work
26 Fix: Over-reliance on MDX
- Redesigning your cube is a radical option but can pay big dividends in terms of performance
- It risks breaking existing reports and queries, but your users may be OK with this to get more speed
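
One small form of redesign is to replace a ‘grouping’ calculated member with a real attribute by persisting the grouping in the dimension table during ETL. The sketch below is illustrative only: SQLite stands in for the warehouse, and the `DimCustomer` table, the `AgeGroup` column and the age bands are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the data warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical customer dimension.
    CREATE TABLE DimCustomer (CustomerKey INTEGER PRIMARY KEY, Age INTEGER);
    INSERT INTO DimCustomer VALUES (1, 17), (2, 34), (3, 70);

    -- Persist the grouping once, as an ETL step, instead of computing it
    -- in an MDX calculated member at query time.
    ALTER TABLE DimCustomer ADD COLUMN AgeGroup TEXT;
    UPDATE DimCustomer
    SET AgeGroup = CASE
        WHEN Age < 18 THEN 'Under 18'
        WHEN Age < 65 THEN '18-64'
        ELSE '65 and over'
    END;
""")

groups = conn.execute(
    "SELECT CustomerKey, AgeGroup FROM DimCustomer ORDER BY CustomerKey"
).fetchall()
print(groups)
```

The new column can then back an ordinary attribute on the dimension, which can be included in aggregation designs in a way a calculated member cannot.
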
27 Problem: Unused Aggregations
- Aggregations are the most important SSAS feature for performance
- Most people know they need to build some and run the Aggregation Design Wizard…
- …but don’t know whether they’re being used or not
28 Consequences: Unused Aggregations
- Slow queries!
- If you haven’t built the right aggregations, then your queries won’t get any performance benefit
- You’ll waste time processing these aggregations, and waste disk space storing them
29 Fix: Unused Aggregations
- Design some aggregations!
- Rerun the Aggregation Design Wizard and set the Aggregation Usage property appropriately
- Perform Usage-Based Optimisation
- Design aggregations manually for queries that are still slow and could benefit from aggregations
30 Problem: Unprocessed Aggregations
- Even if you’ve designed aggregations that are useful for your queries, you need to ensure they’re processed
- Running a Process Update on a dimension will drop all Flexible aggregations
32 Fix: Unprocessed Aggregations
- Run a Process Default or a Process Index on your cube after you have run a Process Update on any dimensions
- Note that this will result in:
  - Longer processing times overall
  - More disk space used
- But it will at least mean that your queries run faster
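
The processing order described above could be scripted as an XMLA batch. The helper below only builds the XML text; it does not send it to a server. The function names and the object IDs in the usage example are hypothetical, and the sketch assumes that commands in a Batch with no Parallel element run in the order listed. `ProcessUpdate` and `ProcessDefault` are XMLA processing types; the default could be swapped for `ProcessIndexes` if that is the option you prefer.

```python
from xml.sax.saxutils import escape

# Namespace used by Analysis Services XMLA commands.
XMLA_NS = "http://schemas.microsoft.com/analysisservices/2003/engine"


def process_command(database_id, object_tag, object_id, process_type):
    """Return one <Process> element for a dimension or cube."""
    return (
        "  <Process>\n"
        "    <Object>\n"
        f"      <DatabaseID>{escape(database_id)}</DatabaseID>\n"
        f"      <{object_tag}>{escape(object_id)}</{object_tag}>\n"
        "    </Object>\n"
        f"    <Type>{process_type}</Type>\n"
        "  </Process>\n"
    )


def build_batch(database_id, dimension_ids, cube_id, cube_type="ProcessDefault"):
    """Process Update each dimension, then rebuild what was dropped on the cube."""
    parts = [f'<Batch xmlns="{XMLA_NS}">\n']
    for dimension_id in dimension_ids:
        parts.append(
            process_command(database_id, "DimensionID", dimension_id, "ProcessUpdate")
        )
    # The cube step comes last so dropped aggregations and indexes are rebuilt.
    parts.append(process_command(database_id, "CubeID", cube_id, cube_type))
    parts.append("</Batch>")
    return "".join(parts)


# Hypothetical object IDs; note these are IDs, which may differ from display names.
xmla = build_batch("Sales DB", ["Dim Product", "Dim Date"], "Sales")
print(xmla)
```

The generated text could be pasted into an XMLA query window or run from a scheduled job after the dimension updates.
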
34 Coming up…
- P/X001 The Developer Side of the Microsoft Business Intelligence stack, Sascha Lorenz
- P/L001 Understanding SARGability (to make your queries run faster), Rob Farley
- P/L002 Notes from the field: High Performance storage for SQL Server, Justin Langford
- P/L005 Service Broker: Message in a bottle, Klaus Aschenbrenner
- P/T007 Save the Pies for Lunch - Data Visualisation Techniques with SSRS 2008, Tim Kent
- #SQLBITS