Presentation is loading. Please wait.

Presentation is loading. Please wait.

Implementing a Data Warehouse with SQL Server Jump Start

Similar presentations


Presentation on theme: "Implementing a Data Warehouse with SQL Server Jump Start"— Presentation transcript:

1 Implementing a Data Warehouse with SQL Server Jump Start
Richard Currey | Senior Technical Trainer–New Horizons United George Squillace | Senior Technical Trainer–New Horizons Great Lakes

2 Meet Richard Currey Senior Technical Trainer – New Horizons United
Focused on database and development technologies MCDBA, MCITP Dev / Admin / BI, MCSE Data Platform, BI MCSD, MCPD Web Developer, ASP .NET Developer, Windows Developer 28 Years Industry Experience Designed, developed and managed BI-centric projects at several Fortune 500 organizations Extensive consulting and project management background

3 Meet George SQUILLACE 28 Years Industry Experience
Senior Technical Trainer – New Horizons Great Lakes (20 Years) “SQL” is in his name! Focused on database technologies MCT Since 1997 28 Years Industry Experience SQL Server 2012: certified as MCSA, MCSE: Data Platform, & MSCE: Business Intelligence Certified in every version of SQL Server since SQL 2000 Certified in every version of Windows Server from NT through Windows Server 2008 Certified in Exchange Server 5.5 through Exchange Server

4 Course Modules Implementing a Data Warehouse with SQL Server
01 | Design and Implement Dimensions and Fact Tables 04 | Control Flow 02 | Data Flow - Extract Data 05 | Configure and Deploy SSIS 03 | Data Flow - Transform Data 06 | Manage Enterprise Data

5 Setting Expectations Target Audience
Data warehousing specialists who want to expand their knowledge of SQL Server Integration Services (SSIS) Database professionals who want to take exam and get certified in data warehouse implementations Suggested Prerequisites/Supporting Material SQL Server development experience and exposure to extract, transform, and load (ETL) processes Course 10777, Implementing a Data Warehouse with Microsoft SQL Server 2012 MS Press Book: Training Kit (Exam ): Implementing a Data Warehouse with Microsoft SQL Server 2012

6 01 | Design and Implement Dimensions and Fact Tables
Richard Currey | Senior Technical Trainer–New Horizons United George Squillace | Senior Technical Trainer–New Horizons Great Lakes

7 Module 1 Overview Schema Design: Star vs. Snowflake
Facts and Fact Tables Fact and Dimension Granularity Conformed and Non-Conformed Dimensions Time Dimensions

8 Topic: Schema Design: Star vs. Snowflake

9 Topic: Schema Design: Star versus Snowflake
Star Schema Snowflake Schema Processing and Performance Considerations

10 Star Schema A star schema has a single table for each dimension
DimSalesPerson SalesPersonKey SalesPersonName StoreName StoreCity StoreRegion DimProduct ProductKey ProductName ProductLine SupplierName DimCustomer CustomerKey CustomerName City Region FactOrders ShippingAgentKey TimeKey OrderNo LineItemNo Quantity Revenue Cost Profit DimDate DateKey Year Quarter Month Day DimShippingAgent ShippingAgentName A star schema has a single table for each dimension Each table supports all attributes for that dimension Typically a de-normalized solution

11 Implementing a Star Schema

12 Snowflake Schema More normalized solution
DimSalesPerson SalesPersonKey SalesPersonName StoreKey DimProduct ProductKey ProductName ProductLineKey SupplierKey DimCustomer CustomerKey CustomerName GeographyKey FactOrders ShippingAgentKey TimeKey OrderNo LineItemNo Quantity Revenue Cost Profit DimDate DateKey Year Quarter Month Day DimShippingAgent ShippingAgentName DimProductLine ProductLineName DimGeography City Region DimSupplier SupplierName DimStore StoreName More normalized solution Typically contains multiple tables per dimension Each table contains dimension key, value, and the foreign key value for the parent

13 Implementing a Snowflake Schema

14 Processing and Performance Considerations
Star schema requires de-normalization during the load process Can impact the ETL times Snowflake schema can increase dimension complexity Can impact Analysis Services solutions, negatively affecting cube performance

15 Topic: Facts and Fact Tables

16 Topic: Facts and Fact Tables
What Is a Fact? Grouping Facts What Is Granularity? Design Considerations

17 What Is a Fact? Facts are the key metrics used to measure business results: Sales Production Inventory Can be additive, semi-additive, or non-additive

18 Grouping Facts Facts are grouped into fact tables
Related facts should be in the same fact table Facts with different granularity should be in different tables

19 What Is Granularity? Granularity refers to the level of detail in which facts are recorded Facts can be at different levels of granularity

20 Design Considerations
Fact tables should have all keys relating to dimensions Primary key should be composite of all dimension keys Separate additive, semi-additive, and non-additive facts

21 Topic: Fact and Dimension Granularity

22 Topic: Fact and Dimension Granularity
How to Determine Fact Granularity Dimension Granularity

23 How to Determine Fact Granularity
Granularity is determined based on business needs Should be the lowest level of detail that needs to be examined If data from transactional systems has more detail than needed for analysis, ETL should aggregate the details

24 Dimension Granularity
Dimension granularity needs to be matched with fact granularity Each dimension has its own granularity Fact tables are keyed to the granularity of the dimensions

25 Topic: Conformed and Non-Conformed Dimensions

26 Topic: Conformed and Non-Conformed Dimensions
What Are Conformed and Non-Conformed Dimensions? Shared and Degenerate Dimensions What Is a Slowly Changing Dimension?

27 What Are Conformed and Non-Conformed Dimensions?
Shared by multiple fact tables Used when all business users have the same definitions for the dimension Non-conformed dimension Dimension table targeted to a single fact table Used when dimensions have different definitions for different business units

28 Creating Conformed and Non-Conformed Dimensions

29 Shared and Degenerate Dimensions
Shared dimension Used by multiple facts Dimension key is stored in the fact table Dimension value is stored in the dimension table with other attributes of that dimension Degenerate dimension Used by a single fact table Dimension value is stored directly in the fact table No corresponding dimension table

30 What Is a Slowly Changing Dimension?
When the historical attribute values are retained if the attributes are updated Used when the organization does not want to lose track of what actually happened Example: customer moves from Connecticut to Seattle Slowly changing dimension types: Type 1: Attribute history is not retained Type 2: Attribute change creates a new record Type 3: Original attribute value recorded and latest value recorded with an effective date

31 Implementing a Slowly Changing Dimension

32 Topic: Time Dimensions

33 Topic: Time Dimensions
Types of Time Dimensions Time Dimensions and Hierarchies

34 Types of Time Dimensions
Based on standard calendar breakdowns Year => Month => Day Year => Quarter => Week => Day Based on fiscal calendar Year => Fiscal Quarter => Fiscal Month => Fiscal Week => Day Time dimension needs to contain all hierarchy elements to the lowest granularity for the fact tables

35 Time Dimensions and Hierarchies
Establishes the “buckets” that the business uses Typically there are multiple hierarchies in the dimension Calendar Business Created using Microsoft Excel, scripts, or are auto-generated

36 Creating a Time Dimension

37


Download ppt "Implementing a Data Warehouse with SQL Server Jump Start"

Similar presentations


Ads by Google