Populating Data Warehouse Structures


2 Populating Data Warehouse Structures

3 Examining the Star Schema [Diagram: the Sales star schema, with labels for the fact table and the dimension tables]

4 Implementing the Star Schema. 1. Extract data from multiple sources. 2. Integrate, transform, and restructure data. 3. Load data into dimension tables and fact tables.
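The three steps can be sketched in plain Python as follows. This is only an illustration of the extract, transform, load sequence and not DTS itself; the function names, the in-memory "sources", and the column names are all assumptions made for the example.

```python
def extract(sources):
    """Step 1: pull raw rows from every source into one list."""
    rows = []
    for name, source_rows in sources.items():
        for row in source_rows:
            rows.append({**row, "source": name})  # remember where the row came from
    return rows


def transform(rows):
    """Step 2: integrate and restructure (trim text, normalize case, cast numbers)."""
    return [
        {
            "customer_id": r["customer_id"].strip().upper(),
            "product_id": r["product_id"].strip().upper(),
            "qty": int(r["qty"]),
        }
        for r in rows
    ]


def load(rows):
    """Step 3: fill the dimension tables, then the fact table that references them."""
    customer_dim, product_dim, fact = {}, {}, []
    for r in rows:
        # setdefault keeps the existing key, or assigns the next integer key.
        cust_key = customer_dim.setdefault(r["customer_id"], len(customer_dim) + 1)
        prod_key = product_dim.setdefault(r["product_id"], len(product_dim) + 1)
        fact.append({"cust_key": cust_key, "prod_key": prod_key, "qty": r["qty"]})
    return customer_dim, product_dim, fact
```

Calling `load(transform(extract(sources)))` with a dictionary of source name to row list runs the three steps in order.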

5 The Star Schema Data Load [Diagram: DTS moves data in three stages: extracting data from heterogeneous sources, transforming data, and loading the star schema. Labels: heterogeneous data sources (NorthwindOLTP, external files, internal files); staging area; Polaris data warehouse (InventoryStar, SalesStar, Financial)]

6 Verifying the Dimension Source Data. Verifying accuracy of source data: integrating data from multiple sources; applying business rules; checking structural requirements. Managing invalid data: rejecting invalid data; saving invalid data to a log. Correcting invalid data: transforming data; reassigning data values.
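A minimal sketch of these checks, assuming rows arrive as dictionaries. The set of valid regions, the column names, and the rule that a blank buyer code is reassigned to U999 are hypothetical choices for illustration; the deck does not define the actual business rules.

```python
VALID_REGIONS = {"I", "II", "III", "IV"}  # hypothetical business rule
CODE_FIXES = {"": "U999"}                 # hypothetical reassignment for blank codes


def verify_rows(rows):
    """Return (accepted_rows, rejected_log) after correcting what can be corrected."""
    accepted, rejected_log = [], []
    for row in rows:
        row = dict(row)  # work on a copy so the source data is untouched
        # Correcting invalid data: transform (trim whitespace).
        row["buyer_name"] = (row.get("buyer_name") or "").strip()
        # Correcting invalid data: reassign values.
        code = row.get("buyer_code") or ""
        row["buyer_code"] = CODE_FIXES.get(code, code)
        # Structural requirement: the name must be present.
        if not row["buyer_name"]:
            rejected_log.append((row, "missing buyer_name"))
            continue
        # Business rule: the region must be a known one.
        if row.get("reg_id") not in VALID_REGIONS:
            rejected_log.append((row, "unknown reg_id"))
            continue
        accepted.append(row)
    return accepted, rejected_log
```

Rejected rows are kept with a reason rather than dropped, which corresponds to saving invalid data to a log.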

7 Dimension Data Load Examples [Diagrams of DTS transformations on buyer data with columns buyer_name, buyer_first, buyer_last, buyer_code, and reg_id. One example splits buyer_name values (Barr, Adam; Chai, Sean; OMelia, Erin) into buyer_first (Adam, Sean, Erin) and buyer_last (Barr, Chai, OMelia). One example works on buyer_code values (A123, B..., U999). One example separates buyer_name rows (Barr, Adam; Chai, Sean; Smith, Jane; Paper, Anne) into two sets by reg_id (II, IV).]
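Two of these transformations can be sketched in Python. This is an illustration only, not a DTS transformation script; the function names are invented, and which buyer belongs to which reg_id is not recoverable from the slide, so any pairing used with `partition_by_region` is an assumption.

```python
from collections import defaultdict


def split_buyer_name(buyer_name):
    """Split 'Last, First' into (buyer_first, buyer_last)."""
    last, _, first = buyer_name.partition(",")
    return first.strip(), last.strip()


def partition_by_region(rows):
    """Group buyer names by reg_id, one list per region."""
    by_region = defaultdict(list)
    for row in rows:
        by_region[row["reg_id"]].append(row["buyer_name"])
    return dict(by_region)
```

For example, `split_buyer_name("Barr, Adam")` returns `("Adam", "Barr")`.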

8 Maintaining Integrity of the Dimension. Assigning a surrogate key to each record: defines the dimension's primary key; relates to the foreign key fields of the fact table. Loading one record per application key: maintains uniqueness in the dimension; depends on how you manage changing dimension data; maintains integrity of the fact table.
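A small sketch of both rules, assuming an in-memory dimension. The class name `DimensionLoader` and its attributes are hypothetical; in a real warehouse the surrogate key would typically come from the database (for example an identity column).

```python
import itertools


class DimensionLoader:
    """Assigns one surrogate key per application key and never loads a duplicate."""

    def __init__(self):
        self._next_key = itertools.count(1)  # surrogate keys: 1, 2, 3, ...
        self.key_by_app_key = {}             # application key -> surrogate key
        self.rows = {}                       # surrogate key -> dimension record

    def load(self, app_key, attributes):
        # One record per application key: reuse the key if it was already loaded.
        if app_key in self.key_by_app_key:
            return self.key_by_app_key[app_key]
        surrogate = next(self._next_key)
        self.key_by_app_key[app_key] = surrogate
        self.rows[surrogate] = {"app_key": app_key, **attributes}
        return surrogate
```

The returned surrogate key is what the fact table stores as its foreign key. Reusing the key on a repeat load is the simplest policy; a Type 2 change would deliberately add a second record instead.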

9 Managing Changing Dimension Data. Dimensions with changing column values: inserts of new data; updates of existing data. Slowly changing dimension design solutions: Type 1, overwrite the dimension record; Type 2, write another dimension record; Type 3, add attributes to the dimension record.

10 Type 1: Overwriting the Dimension Record. The existing record is changed.
Product Dimension columns: product key, product name, product size, product package, product dept, product cat, product subcat, ...
Before: 001, Rice Puffs, 10 oz., Bag, Grocery, Dry Goods, Snacks
After: 001, Rice Puffs, 12 oz., Bag, Grocery, Dry Goods, Snacks
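A Type 1 change is a plain UPDATE, sketched here with Python's built-in sqlite3 module rather than SQL Server. The table is cut down to three columns for brevity, and the function names are assumptions.

```python
import sqlite3


def make_product_dim():
    """Create a cut-down product dimension holding the 'before' record."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product_dim ("
        "product_key INTEGER PRIMARY KEY, product_name TEXT, product_size TEXT)"
    )
    conn.execute("INSERT INTO product_dim VALUES (1, 'Rice Puffs', '10 oz.')")
    return conn


def type1_overwrite(conn, product_key, new_size):
    """Type 1: overwrite the attribute in place; the old value is lost."""
    conn.execute(
        "UPDATE product_dim SET product_size = ? WHERE product_key = ?",
        (new_size, product_key),
    )
```

After `type1_overwrite(conn, 1, "12 oz.")` the table still holds a single record for key 1, and no trace of the 10 oz. size remains.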

11 Type 2: Writing Another Dimension Record. Adds a new record.
Product Dimension columns: product key, product name, product size, product package, product dept, product cat, product subcat, effective_date, ...
Before: 001, Rice Puffs, 10 oz., Bag, Grocery, Dry Goods, Snacks
After: 001, Rice Puffs, 10 oz., Bag, Grocery, Dry Goods, Snacks; plus a new record: Rice Puffs, 12 oz., Bag, Grocery, Dry Goods, Snacks
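A Type 2 change can be sketched as an INSERT that receives a fresh surrogate key, again using sqlite3 instead of SQL Server. The `product_id` application-key column, the dates, and the function names are hypothetical additions for the example.

```python
import sqlite3


def make_product_dim():
    """Create a cut-down product dimension with a generated surrogate key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE product_dim (
               product_key    INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
               product_id     TEXT NOT NULL,                      -- application key
               product_name   TEXT NOT NULL,
               product_size   TEXT NOT NULL,
               effective_date TEXT NOT NULL)"""
    )
    return conn


def type2_change(conn, product_id, product_name, product_size, effective_date):
    """Type 2: write another record; earlier versions stay untouched."""
    cur = conn.execute(
        "INSERT INTO product_dim "
        "(product_id, product_name, product_size, effective_date) "
        "VALUES (?, ?, ?, ?)",
        (product_id, product_name, product_size, effective_date),
    )
    return cur.lastrowid  # the new surrogate key


def current_version(conn, product_id):
    """Return the surrogate key and size of the most recent version."""
    return conn.execute(
        "SELECT product_key, product_size FROM product_dim "
        "WHERE product_id = ? ORDER BY effective_date DESC LIMIT 1",
        (product_id,),
    ).fetchone()
```

Because each version has its own surrogate key, fact rows loaded before the change keep pointing at the 10 oz. record, while new fact rows point at the 12 oz. record.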

12 Type 3: Adding Attributes in the Dimension Record. Additional information is stored in an existing record.
Product Dimension columns: product key, product name, product size, product package, product dept, product cat, product subcat, current product size date, previous product size, previous product size date, 2nd previous product size, 2nd previous product size date, ...
Before: 001, Rice Puffs, 10 oz., Bag, Grocery, Dry Goods, Snacks; history columns include an 11 oz. size and (null) values
After: 001, Rice Puffs, 12 oz., Bag, Grocery, Dry Goods, Snacks; earlier sizes are held in the previous-size columns
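A Type 3 change shifts the current value into the history columns of the same record. The sketch below uses a dictionary per record; the shortened column names, the dates, and the assumption that 11 oz. was the size before 10 oz. are illustrative only.

```python
def type3_change(record, new_size, change_date):
    """Type 3: keep one record, moving older sizes into the history attributes."""
    updated = dict(record)  # do not mutate the caller's record
    # The oldest tracked value falls off; previous becomes 2nd previous.
    updated["second_previous_size"] = record["previous_size"]
    updated["second_previous_size_date"] = record["previous_size_date"]
    # The current value becomes the previous value.
    updated["previous_size"] = record["product_size"]
    updated["previous_size_date"] = record["current_size_date"]
    # The new value becomes current.
    updated["product_size"] = new_size
    updated["current_size_date"] = change_date
    return updated
```

The record count and the surrogate key stay the same, but history is limited to as many changes as there are history columns.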

13 Verifying the Fact Table Source Data. Verifying accuracy of source data: integrating data from multiple sources; applying business rules; checking structural requirements. Managing invalid data: rejecting invalid data; saving invalid data to a log. Correcting invalid data: transforming data; reassigning data values.

14 Assigning Foreign Keys [Diagram: each source data row (customer id, product id, order date, quantity_sales, amount_sales) is matched against the dimension tables customer_dim, product_dim, and time_dim to produce a Sales fact data row (cust_key, prod_key, time_key, quantity_sales, amount_sales). Sample values visible on the slide: customer_dim record 201, ALFI, Alfreds; product_dim record Chai; source row ALFI, 123, 1/1/2000.]
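The lookup can be sketched with dictionaries that map application keys to surrogate keys. Only the customer mapping ALFI to 201 comes from the slide; the product and time keys, the quantities, and the function name are hypothetical.

```python
def assign_foreign_keys(source_row, customer_dim, product_dim, time_dim):
    """Replace application keys in a source row with dimension surrogate keys."""
    try:
        return {
            "cust_key": customer_dim[source_row["customer_id"]],
            "prod_key": product_dim[source_row["product_id"]],
            "time_key": time_dim[source_row["order_date"]],
            "quantity_sales": source_row["quantity_sales"],
            "amount_sales": source_row["amount_sales"],
        }
    except KeyError as missing:
        # A fact row without a matching dimension record would break integrity.
        raise ValueError(f"no dimension record for {missing}") from None


customer_dim = {"ALFI": 201}    # from the slide
product_dim = {"123": 25}       # hypothetical surrogate key
time_dim = {"1/1/2000": 134}    # hypothetical surrogate key
```

Raising on a missing key forces the load to deal with unmatched rows explicitly, for example by rejecting them or by loading the dimension first.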

15 Defining Measures. Loading measures from the source system. Calculating additional measures. [Diagram: source system data with columns customer_id (VINET, ALFI, HANAR, ...), product_id (9GZ, 1KJ, 0ZA, ...), price, and qty becomes fact table data with columns customer_key, product_key, qty, and total_sales.]
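A sketch of both kinds of measure: qty is loaded as is, while total_sales is calculated. The formula total_sales = price × qty is an assumption suggested by the column names, and the key mappings and function name are hypothetical.

```python
from decimal import Decimal


def build_fact_row(source_row, customer_keys, product_keys):
    """Load qty directly and derive total_sales from price and qty."""
    price = Decimal(source_row["price"])  # Decimal avoids float rounding on money
    qty = int(source_row["qty"])
    return {
        "customer_key": customer_keys[source_row["customer_id"]],
        "product_key": product_keys[source_row["product_id"]],
        "qty": qty,                   # measure loaded from the source system
        "total_sales": price * qty,   # additional calculated measure
    }
```

Storing the calculated measure in the fact table means queries can sum total_sales directly instead of recomputing it from price on every read.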

16 Maintaining Data Integrity. Adhering to the fact table grain: a fact table can have only one grain; you must load a fact table with data at the same level of detail as defined by the grain. Enforcing column constraints: NOT NULL constraints; FOREIGN KEY constraints.
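The two column constraints can be demonstrated with sqlite3, used here in place of SQL Server. The two-table schema is a cut-down assumption, and note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3


def make_sales_star():
    """Create a tiny star: one dimension and a fact table constrained against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.execute(
        "CREATE TABLE product_dim ("
        "product_key INTEGER PRIMARY KEY, product_name TEXT NOT NULL)"
    )
    conn.execute(
        """CREATE TABLE sales_fact (
               product_key    INTEGER NOT NULL REFERENCES product_dim(product_key),
               quantity_sales INTEGER NOT NULL)"""
    )
    conn.execute("INSERT INTO product_dim VALUES (1, 'Chai')")
    return conn


def try_insert(conn, product_key, quantity_sales):
    """Insert a fact row; return 'ok' or the constraint error message."""
    try:
        conn.execute(
            "INSERT INTO sales_fact VALUES (?, ?)", (product_key, quantity_sales)
        )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)
```

A row that references a product key missing from product_dim, or that carries a NULL measure, is refused by the database instead of silently corrupting the fact table.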

17 Implementing Staging Tables. Centralize and integrate source data. Break up complex data transformations. Facilitate error recovery. [Diagram: staging area containing sales_stage, inventory_stage, market_stage, and shipments_stage]

18 DTS Functionality. Accessing heterogeneous data sources. Importing, exporting, and transforming data. Creating reusable transformations and functions. Automating data loads. Managing metadata. Customizing and extending functionality.

19 Defining DTS Packages. A package: identifies data sources and destinations; defines tasks or actions; implements transformation logic; defines the order of operations.

20 Identifying Package Components. Connections: access data sources and destinations. Tasks: describe data transformations or functions. Steps: define the order of task operations or workflow. Global variables: store data that can be shared across tasks.

21 Creating Packages. Using the DTS Import/Export Wizard: perform ad hoc table and data transfers; develop a prototype package. Using DTS Package Designer: edit packages created with the DTS Import/Export Wizard; create packages with a wide range of functionality. Programming DTS applications: directly access the functionality of the DTS object model; requires Microsoft Visual Basic or Microsoft Visual C++.

22 Using DTS to Populate the Sales Star. Populating the Sales star dimensions. Populating the Sales star fact table.

23 Populating the Sales Star Dimensions [Diagram: DTS loads time_dim, customer_dim, and product_dim. Labeled sources: product tab-delimited files, NorthwindOLTP, and a SQL Server stored procedure.]

24 Populating the Sales Star Fact Table [Diagram: DTS loads the sales data file into sales_stage, then loads sales_fact from sales_stage together with time_dim, customer_dim, and product_dim.]

25 Designing Modular Packages. Creating modular packages: simplify complex workflows; create more readable packages; produce smaller packages that are easier to debug. Using outer packages: execute multiple packages within a single package; combine modular packages into logical workflows; reuse modular packages in different workflows; execute packages in parallel.
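The outer-package idea can be illustrated with a plain Python analogy in which each "package" is a function. This is not the DTS object model; the function names and the choice of a thread pool are assumptions made to show independent dimension loads running in parallel before the dependent fact load.

```python
from concurrent.futures import ThreadPoolExecutor


# Each modular "package" does one small, separately testable job.
def load_time_dim():
    return "time_dim loaded"


def load_customer_dim():
    return "customer_dim loaded"


def load_product_dim():
    return "product_dim loaded"


def load_sales_fact():
    return "sales_fact loaded"


def run_outer_package(dimension_packages, fact_package):
    """Run the independent dimension packages in parallel, then the fact package."""
    with ThreadPoolExecutor() as pool:
        # map() returns results in the order the packages were given.
        results = list(pool.map(lambda package: package(), dimension_packages))
    # The fact load starts only after every dimension load has finished.
    results.append(fact_package())
    return results
```

The same small functions can be recombined into a different outer workflow, which is the reuse benefit the slide describes.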

26 Using DTS to Populate the Sales Star

