Presentation on theme: "Bus Matrix… the foundation of your Data Warehouse"— Presentation transcript:
1 Bus Matrix… the foundation of your Data Warehouse
The Bus Matrix is the cornerstone of a successful Dimensional Data Warehouse strategy. It serves many purposes: from communicating requirements, capabilities, and expectations with the business users down to the prioritization and delegation of tasks across the development team. Join me in this session and learn what a Bus Matrix is, why it is the single most important document in your Data Warehouse project, and what can go wrong without it. We'll also cover several approaches for creating and maintaining the Bus Matrix document.
Bill Anton is an independent consultant whose primary focus is designing and developing Data Warehouses and Business Intelligence solutions using the Microsoft BI stack. When he's not working with clients to solve their data-related challenges, he can usually be found answering questions on the MSDN forums, attending PASS meetings, or writing blog entries over at
Bill Anton
Prime Data Intelligence
2 About Me
I Love Data! …also, Microsoft DW/BI (MCTS/MCITP, MCSA/MCSE)
Independent consultant, Prime Data Intelligence, LLC
Atlanta BI SQL Server Users Group
Twitter: @SQLbyoBI
Blog:
3 What we will cover today
Dimensional Modeling 101: What, Why, How; Common Challenges
Bus Matrix: What is it? How does it help? Examples
Links on resource page of blog
4 What is Dimensional Modeling?
Facts: additive amounts. E.g. sales amount, inventory quantity. Aggregated with SUM, AVERAGE, MAX, MIN, COUNT.
Dimensions: descriptive attributes. E.g. Date, Product, Location, Customer. Used as GROUP BY <attribute>, <attribute>, etc.
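The fact/dimension split above can be sketched in a few lines of Python. This is only an illustration: the rows, the attribute names, and the helper `sum_by` are made up for the example, and the helper simply mimics a SUM with a GROUP BY on one dimension attribute.

```python
from collections import defaultdict

# Hypothetical fact rows: each row carries dimension attributes
# (product, location) plus one additive amount (sales_amount).
sales_facts = [
    {"product": "Bike", "location": "Atlanta", "sales_amount": 1200.0},
    {"product": "Bike", "location": "Boston", "sales_amount": 800.0},
    {"product": "Helmet", "location": "Atlanta", "sales_amount": 50.0},
]

def sum_by(rows, attribute, measure):
    """Rough equivalent of SELECT attribute, SUM(measure) ... GROUP BY attribute."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[attribute]] += row[measure]
    return dict(totals)

# The same fact can be sliced by any of its dimension attributes.
sales_by_product = sum_by(sales_facts, "product", "sales_amount")
sales_by_location = sum_by(sales_facts, "location", "sales_amount")
```

The point of the sketch is that the measure never changes; only the attribute used for grouping does.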
5 What is Dimensional Modeling?
Each fact forms the center of a star
Ex. this customer bought this product on this date…
“Star Schema”
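A minimal sketch of one star, using Python's built-in sqlite3 module as a stand-in database. All table names, column names, and rows are hypothetical; the query joins the fact table at the center of the star to two of its dimensions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes (hypothetical names and values).
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, marital_status TEXT);
CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date     (date_key     INTEGER PRIMARY KEY, calendar_year INTEGER);
-- Fact table: one key per dimension plus an additive amount.
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    date_key     INTEGER REFERENCES dim_date(date_key),
    sales_amount REAL
);
INSERT INTO dim_customer VALUES (1, 'Single'), (2, 'Married');
INSERT INTO dim_product  VALUES (10, 'Bikes'), (11, 'Accessories');
INSERT INTO dim_date     VALUES (20130115, 2013), (20140115, 2014);
INSERT INTO fact_sales VALUES
    (1, 10, 20130115, 1500.0),
    (2, 10, 20130115, 900.0),
    (2, 11, 20140115, 40.0);
""")

# Bike sales by marital status: filter on one dimension, group by another.
rows = conn.execute("""
    SELECT c.marital_status, SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_customer c ON c.customer_key = f.customer_key
    JOIN dim_product  p ON p.product_key  = f.product_key
    WHERE p.category = 'Bikes'
    GROUP BY c.marital_status
    ORDER BY c.marital_status
""").fetchall()
```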
6 What is Dimensional Modeling?
Denormalization: “Repeating Values”
Opposite of “normalized” (e.g. 3rd Normal Form)
Optimized for reads (not writes)
7 Dimensional Modeling 101
Question: What are the most common types of Data Warehouse methodologies/architectures?
Kimball, Inmon, Data Vault
Kimball: star-schema, conformed dimensions
Inmon: 3NF data warehouse, Corporate Information Factory, Hub-n-Spoke
Data Vault: highly controversial…I love it. Is anyone familiar with Data Vault?
8 Dimensional Modeling 101
Question: For which of these DW methodologies should you include a dimensional model? Kimball, Inmon, Data Vault
All of them
With Kimball it is built in. With Inmon/DataVault you need to add a dimensional layer.
11 Inmon 3NF EDW + Data Mart(s)
Does anyone know what “EDW” stands for?
How is that different from “Kimball” method?
12 Data Vault + Data Mart(s)
Time-invariant system of record (S-O-R)
Copies of source systems
Business rules applied downstream
Hubs/links/satellites
Dan Linstedt
Data Vault Method: Since no business rule is applied to the data on the way into the EDW, the EDW becomes a ‘statement of fact’. This single version of facts now becomes a time-invariant system of record (S-O-R) where the facts as they were known to the business at any point in history are stored. In our example above, the EDW stores both the addresses, that borrower has.
13 Why Dimensional Modeling
Intuitive to Business Users: simpler than OLTP/3NF; rise of Self-Service (E.g. Power Pivot, Power View)
Iterative Development: “Agile”
Performance: optimized for analytical queries, e.g. sales amount by product in 2013 for top 10 all-time customers
And many more… See Teo Lachev’s “WHY SEMANTIC LAYER” newsletter:
With Kimball it is built in. With Inmon/DataVault you need to add a dimensional layer.
Self-Service BI implies more and more users will be looking for access to data (think: PowerPivot)
“Semantic Layer”
14 Intuitive to Business Users
Business users think in terms of Dimensions and Facts whether they know it or not.
Teach a Business User how to use a pivot table…
16 Do we sell more bikes to single or married females?
17 What was our most/least profitable product this year?
18 What was the Average Monthly Gross Margin Return on Inventory Investment (GMROII) by Product Category for the trailing 6 months?
It’s Complicated
We’ll come back to this…
19 Star-Schema Each fact forms the center of a star Ex. this customer, bought this product on this date….
20 1 “Star” per Fact table
Each fact forms the center of a star
Sales Process, Inventory Process
21 Facts are related through dimensions…
But there are usually dimensions in common
GMROII (Gross Margin Return on Inventory Investment)
Sales Process, Inventory Process
22 Facts are related through dimensions…
“Conformed Dimensions”
A conformed dimension is a set of data attributes that have been physically referenced by multiple fact tables using the same key value to refer to the same structure, attributes, domain values, definitions and concepts.
Dimensions are conformed when they are either exactly the same (including keys) or one is a perfect subset of the other.
Dimension tables are NOT conformed if the attributes are labeled differently or contain different values.
MUST BE IDENTICAL
By linking facts through conformed dimensions, cross-process analysis is possible
Wikipedia (http://en.wikipedia.org/wiki/Dimension_(data_warehouse)#Conformed_dimension)
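The conformance rule on this slide (identical, or one a perfect subset of the other) can be sketched as a simple check. This is an illustration under assumptions: each dimension is modeled as a hypothetical dict mapping a key to its attribute row, only row subsets are considered, and `is_conformed` is not taken from any library.

```python
def is_conformed(dim_a, dim_b):
    """True if both dimensions are identical or one is a perfect row subset of the other.

    Each dimension is a dict: key -> {attribute_name: value}.
    A differently labeled attribute or a different value breaks conformance.
    """
    smaller, larger = sorted((dim_a, dim_b), key=len)
    return all(key in larger and larger[key] == row for key, row in smaller.items())

# Hypothetical Product dimensions used by three business processes.
product_sales = {
    1: {"name": "Road Bike", "category": "Bikes"},
    2: {"name": "Helmet", "category": "Accessories"},
}
product_inventory = {1: {"name": "Road Bike", "category": "Bikes"}}       # perfect subset
product_finance = {1: {"product_name": "Road Bike", "category": "Bikes"}}  # relabeled attribute

sales_vs_inventory = is_conformed(product_sales, product_inventory)
sales_vs_finance = is_conformed(product_sales, product_finance)
```

Here the inventory copy conforms because every key and row matches the sales copy, while the finance copy does not because one attribute is labeled differently.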
24 Revisiting Average Monthly Gross Margin Return on Inventory Investment (GMROII)
By linking facts through conformed dimensions, cross-process analysis is possible
Sales + Inventory = GMROII
Average Monthly GMROII = Profit for total time period / Sum of each month-ending inventory cost
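A small worked sketch of the calculation, assuming the two quantities on this slide are the numerator (profit for the total time period) and the denominator (sum of each month-ending inventory cost) of one ratio. The rows and function names are hypothetical; the two lists stand in for a Sales fact and an Inventory fact that share a conformed Product Category dimension.

```python
# Hypothetical monthly rows from two fact tables, both keyed by the same
# (conformed) product category values.
sales_profit = [  # (category, month, gross_margin)
    ("Bikes", "2013-01", 300.0), ("Bikes", "2013-02", 500.0),
    ("Accessories", "2013-01", 40.0), ("Accessories", "2013-02", 60.0),
]
inventory_cost = [  # (category, month, month_end_inventory_cost)
    ("Bikes", "2013-01", 1000.0), ("Bikes", "2013-02", 1000.0),
    ("Accessories", "2013-01", 100.0), ("Accessories", "2013-02", 100.0),
]

def total_by_category(rows):
    """Sum the amount in each row per product category."""
    totals = {}
    for category, _month, amount in rows:
        totals[category] = totals.get(category, 0.0) + amount
    return totals

def average_monthly_gmroii(profit_rows, inventory_rows):
    """Assumed reading: total profit / sum of month-end inventory cost, per category."""
    profit = total_by_category(profit_rows)
    inventory = total_by_category(inventory_rows)
    # Skip categories with no (or zero) inventory cost to avoid dividing by zero.
    return {c: profit[c] / inventory[c] for c in profit if inventory.get(c)}

gmroii = average_monthly_gmroii(sales_profit, inventory_cost)
```

The division only works because both fact tables describe Product Category with the same values, which is the point about conformed dimensions.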
25 What was the Average Monthly Gross Margin Return on Inventory Investment (GMROII) by Product Category for the trailing 6 months?
This becomes easy…
* As long as users understand the dimensions in common
26 Where things start to get complex…
1 Star per Fact table
Multiple Fact tables per business process
Multiple business processes in an enterprise
Modeling business processes
27 Dimensional Model becomes a “Galaxy of Stars”
Finance, Production, Sales, Distribution, HR
Modeling business processes
28 ER Diagram: Adventure Works Sample DW
Common Method for Visually Documenting a Database
Not so bad…
Just a sample, not realistic
Business Users don’t understand
Simpler than Normalized data model
29 For bigger Data Warehouses…
Multiple fact tables per business process
Specialty fact tables for certain metrics
Inventory Control Process, Work Orders, Purchasing/Procurement
Model to the metrics
SQL Bits: Data Modeling for Analysis Services Cubes, Alex Whittles
This ^^ turns into this ^^
30 Variety of Problems to Overcome with Dimensional Modeling
Communication & Strategy: What’s the short term plan of attack? What’s the long term plan of attack?
Documentation: What’s in our Data Warehouse? Business Users can’t read ER diagrams. Business Users are typically only familiar with 1 or 2 business processes (E.g. Sales User vs Inventory User; Warehouse Supervisor vs CEO).
Conforming Dimensions is hard…REALLY hard. So are changes (E.g. Impact Analysis).
POA: Boil the ocean vs Iterative
Conforming Dimensions: political battle; takes time (e.g. MDM); business needs to understand the importance of it
31 What’s the Solution? What about a Bus Matrix?
Train business users to read ER Diagrams?
Simplify Data Model?
Ignore certain business processes?
Don’t use Conformed Dimensions?
Force business users to manually map data between processes?
What about a Bus Matrix?
How do events in one business process impact/correlate with other Business Processes?
Not a single answer, but a Bus Matrix can address most of the problems
32 What is a Bus Matrix?
2-dimensional visualization showing the intersection of facts and dimensions (in the most basic form)
Understandable by business users
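In its most basic form the matrix can be produced from a mapping of business processes to the dimensions they use. The sketch below is illustrative only: the process and dimension names are hypothetical, and `build_bus_matrix` and `render` are helper names invented for this example.

```python
# Hypothetical business processes (fact tables) and the dimensions each one uses.
processes = {
    "Internet Sales": {"Date", "Product", "Customer"},
    "Reseller Sales": {"Date", "Product", "Reseller"},
    "Inventory": {"Date", "Product"},
}

def build_bus_matrix(processes):
    """Return (dimension columns, {process: row of 'X' or '' per dimension})."""
    dimensions = sorted(set().union(*processes.values()))
    matrix = {
        process: ["X" if dim in used else "" for dim in dimensions]
        for process, used in processes.items()
    }
    return dimensions, matrix

def render(dimensions, matrix):
    """Render the matrix as plain text: one row per process, one column per dimension."""
    width = max(len(name) for name in matrix)
    lines = [" " * width + " | " + " | ".join(dimensions)]
    for process, marks in matrix.items():
        cells = [mark.center(len(dim)) for mark, dim in zip(marks, dimensions)]
        lines.append(process.ljust(width) + " | " + " | ".join(cells))
    return "\n".join(lines)

dimensions, matrix = build_bus_matrix(processes)
print(render(dimensions, matrix))
```

Columns marked for more than one process (Date and Product here) are the candidates for conformed dimensions.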
33 Variety of Use-Cases for a Bus Matrix
Documentation, Communication, Training: facilitate user adoption of BI tools; communicate expectations w/ business; new users unfamiliar with new business process
Team Development: Agile; prioritization of tasks; divide & conquer
Road-Mapping: prioritization of business processes in a Business Intelligence “Program”
How many people understand the difference between a BI project and BI program?
34 Documentation For Business
Easier than an ER diagram
In Business Terms
Users can now understand which dimensions can slice/dice the measure groups from a specific business process
Shows how different business processes are related
35 Documentation for IT
Adding context to relationships
Differentiating between role playing dimension and base dimension
Indicating type of Dimension, type of Fact, and grain
37 Team Development
Sprint 1: Internet Sales
Sprint 2: Reseller Sales
Break dimensions/facts into separate tasks (database, ETL, cube)
Prioritize Dimensions before Fact Tables
Next sprint, only create the missing dimensions
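The "only create the missing dimensions" idea can be sketched as a small planning helper. The backlog contents and the function name `plan_sprints` are hypothetical; the sketch assumes one fact table per sprint, in priority order, and that a dimension built in an earlier sprint can be reused as-is.

```python
def plan_sprints(backlog):
    """backlog: ordered list of (fact_name, dimensions_needed), one fact per sprint.

    Returns, per sprint, the dimensions that still have to be built before the fact.
    """
    built = set()
    plan = []
    for fact, needed in backlog:
        missing = sorted(set(needed) - built)  # dimensions not delivered by earlier sprints
        plan.append({"fact": fact, "new_dimensions": missing})
        built.update(needed)
    return plan

plan = plan_sprints([
    ("Internet Sales", ["Date", "Product", "Customer"]),
    ("Reseller Sales", ["Date", "Product", "Reseller"]),
])
```

In this example the second sprint only needs the Reseller dimension, because Date and Product were already built for Internet Sales.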
38 Road-Mapping
Quadrant: Business Value vs Implementation Complexity
Bus Matrix Version
Context for delivery dates
Source Systems contain data for multiple business processes
39 When To Create a Bus Matrix
During Requirements Gathering
Before You Start Development!
Updated Over Time: changes to business processes; new source systems (E.g. mergers/acquisitions)
40 How To Create a Bus Matrix
Manual via Excel
Automated via SSRS
41 Manual
Only option when starting out ;-)
Updates can be made quickly as requirements come in
Adds development overhead, but the ROI is well worth it
42 Automated
Reporting pack with drill-through to data dictionary information
Can be based on Cube or Relational Database (*FK required)
Incorporate query statistics to visualize common usage patterns
Use MDS to allow SMEs to manage business definitions
Based on example from Alex Whittles
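The "FK required" note suggests the matrix can be derived from foreign keys between fact and dimension tables. The sketch below shows the idea with Python's built-in sqlite3 module as a stand-in; it is not the SSRS reporting pack from the talk. The table names and the "Fact" naming prefix are assumptions, and on SQL Server the same information would come from the system catalog instead of SQLite's `PRAGMA foreign_key_list`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema: fact tables reference dimension tables via foreign keys.
CREATE TABLE DimDate (DateKey INTEGER PRIMARY KEY);
CREATE TABLE DimProduct (ProductKey INTEGER PRIMARY KEY);
CREATE TABLE DimCustomer (CustomerKey INTEGER PRIMARY KEY);
CREATE TABLE FactInternetSales (
    DateKey INTEGER REFERENCES DimDate(DateKey),
    ProductKey INTEGER REFERENCES DimProduct(ProductKey),
    CustomerKey INTEGER REFERENCES DimCustomer(CustomerKey),
    SalesAmount REAL
);
CREATE TABLE FactInventory (
    DateKey INTEGER REFERENCES DimDate(DateKey),
    ProductKey INTEGER REFERENCES DimProduct(ProductKey),
    UnitCost REAL
);
""")

def bus_matrix_from_foreign_keys(conn, fact_prefix="Fact"):
    """Map each fact table (by assumed name prefix) to the tables its foreign keys reference."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    matrix = {}
    for table in tables:
        if table.startswith(fact_prefix):
            # Column 2 of each PRAGMA foreign_key_list row is the referenced (parent) table.
            fks = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
            matrix[table] = sorted({fk[2] for fk in fks})
    return matrix

fk_matrix = bus_matrix_from_foreign_keys(conn)
```

Without declared foreign keys there is nothing to read, which is why the slide flags them as a requirement for the relational option.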