Last Updated: 26th May 2003 Center of Excellence Data Warehousing Introduction to Data Modeling.


1 Last Updated: 26th May 2003 Center of Excellence Data Warehousing Introduction to Data Modeling

2 Objectives
At the end of this lesson, you will know:
• Data Modeling for Data Warehouse
• What dimensions and facts are
• Star Schema and Snowflake Schemas
• Coverage Tables
• Factless Tables
• What to look for in Modeling tools
• Some modeling tools

3 Data Modeling for Data Warehouse
How should the data in your data warehouse be structured?
Data modeling is the process that produces abstract data models for one or more database components of the data warehouse.
Modeling for a warehouse is different from modeling for an operational database:
• Dimensional Modeling, Star Schema Modeling or Fact/Dimension Modeling

4 Modeling Techniques
Entity-Relationship Modeling
• Traditional modeling technique
• Technique of choice for OLTP
• Suited for corporate data warehouse
Dimensional Modeling
• Analyzing business measures in the specific business context
• Helps visualize very abstract business questions
• End users can easily understand and navigate the data structure

5 Entity-Relationship Modeling - Basic Concepts
The ER modeling technique is a discipline used to illuminate the microscopic relationships among data elements.
The highest art form of ER modeling is to remove all redundancy in the data.
This has created databases that cannot be queried!

6 An Order Processing ER Model
[Diagram: ER model with the entities Order Header, Order Details, Customer Table, Item Table, Salesrep Table, City, Sales District, Sales Region, Sales Country, Product Brand and Product Category, linked by foreign keys (FK).]

7 Entity-Relationship Modeling - Basic Concepts
Entity
• Object that can be observed and classified by its properties and characteristics
• Business definition with a clear boundary
• Characterized by a noun
• Example:
  • Product
  • Employee

8 Entity-Relationship Modeling - Basic Concepts
Relationship
• Relationship between entities: structural interaction and association
• Described by a verb
• Cardinality:
  • 1-1
  • 1-M
  • M-M
• Example: Books belong to Printed Media

9 Entity-Relationship Modeling - Basic Concepts
Attributes
• Characteristics and properties of entities
• Example:
  • Book Id, Description and Book Category are attributes of the entity "Book"
• Attribute names should be unique and self-explanatory
• Primary Key, Foreign Key and Constraints are defined on Attributes

10 Entity-Relationship Modeling - Why Not?
End users cannot understand or remember an ER model.
There is no graphical user interface (GUI) that takes a general ER model and makes it usable by end users.
Software cannot usefully query a general ER model.
Use of the ER modeling technique defeats the basic allure of data warehousing, namely intuitive and high-performance retrieval of data.

11 Dimensional Modeling - Basic Concepts
Represents the data in a standard, intuitive framework that allows for high-performance access.
The schema is designed to process large, complex, ad hoc and data-intensive queries.
There is no concern for concurrency, locking or insert/update/delete performance.
Every dimensional model is composed of one table with a multipart key, called the fact table, and a set of smaller tables called dimension tables. This characteristic "star-like" structure is often called a star join.

12 Star Schema
[Diagram: a central SALES fact table with the measures AMOUNT and UNITS, surrounded by four dimensions: CITY (REGION, STATE, DISTRICT, CITY), PRODUCT (PRODUCT, BRAND, COLOR, CATEGORY, SIZE), PERIOD (DAY, MONTH, QUARTER, YEAR) and CUSTOMER (CUSTOMER, CATEGORY, CONTACT, ADDRESS).]
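The star schema on this slide can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the deck's own design: the table names (`sales_fact`, `city_dim`, and so on), the trimmed column lists and the sample rows are all assumptions, and the CUSTOMER dimension is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One fact table with a multipart key, plus small dimension tables.
# All names and columns here are illustrative assumptions.
conn.executescript("""
CREATE TABLE city_dim    (city_key    INTEGER PRIMARY KEY, city TEXT, region TEXT);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, product TEXT, brand TEXT);
CREATE TABLE period_dim  (period_key  INTEGER PRIMARY KEY, day TEXT, month TEXT, year INTEGER);
CREATE TABLE sales_fact (
    city_key    INTEGER REFERENCES city_dim(city_key),
    product_key INTEGER REFERENCES product_dim(product_key),
    period_key  INTEGER REFERENCES period_dim(period_key),
    amount REAL,     -- measure
    units  INTEGER,  -- measure
    PRIMARY KEY (city_key, product_key, period_key)
);
""")

# Made-up sample data.
conn.executemany("INSERT INTO city_dim VALUES (?, ?, ?)",
                 [(1, "Pune", "West"), (2, "Chennai", "South")])
conn.executemany("INSERT INTO product_dim VALUES (?, ?, ?)",
                 [(1, "Soap A", "BrandX"), (2, "Soap B", "BrandY")])
conn.executemany("INSERT INTO period_dim VALUES (?, ?, ?, ?)",
                 [(1, "2003-05-26", "May", 2003), (2, "2003-06-01", "Jun", 2003)])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)",
                 [(1, 1, 1, 100.0, 10), (1, 2, 1, 50.0, 5),
                  (2, 1, 1, 70.0, 7), (2, 1, 2, 30.0, 3)])

# A star join: the fact table joined to one dimension, measures summed.
region_totals = conn.execute("""
    SELECT c.region, SUM(f.amount), SUM(f.units)
    FROM sales_fact AS f
    JOIN city_dim AS c ON c.city_key = f.city_key
    GROUP BY c.region
    ORDER BY c.region
""").fetchall()
print(region_totals)
```

Each extra dimension used in a query adds exactly one join to the fact table, which is what keeps the structure easy for end users to navigate.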

13 Dimensional Modeling - Basic Concepts
Fact Tables
• The most useful facts in a fact table are numeric and additive
• Typically represent a business transaction or event that can be used in analyzing a business process
• By nature, fact tables are sparse
• Usually very large: billions of records

14 Dimensional Modeling - Basic Concepts
Dimension Tables
• Each dimension table has a single-part primary key that corresponds exactly to one of the components of the multipart key in the fact table
• Dimension tables most often contain descriptive textual information
• Determine the contextual background for facts
• Examples:
  • Time
  • Location/Region
  • Customers

15 Dimensional Modeling - Basic Concepts
Measures
• A numeric attribute of a fact
• Represents performance or behavior of the business relative to the dimensions
• The actual numbers are called variables
• Occupy very little space compared to Fact Tables
• Examples:
  • Quantity supplied
  • Transaction amount
  • Sales volume

16 Fact Table & Dimension Tables
Fact Tables: numerical measurements of the business are stored in fact tables.
Dimension Tables: dimensions are attributes about facts.

17 Conformed Dimensions
A conformed dimension is a dimension that means the same thing with every possible fact table that it can be joined with.
Conformed dimensions are most essential for:
• The Bus Architecture
• The integrated function of the Data Warehouse
Some common dimensions are:
• Customer
• Product
• Location
• Time
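One way to see why conformed dimensions matter is a small "drill-across" sketch: two fact tables share one product dimension, so results from both can be lined up on the same attribute. The table names, columns and sample rows below are hypothetical and serve only as an illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# product_dim is the conformed dimension: both fact tables join to it
# and its attributes mean the same thing in each case.
conn.executescript("""
CREATE TABLE product_dim    (product_key INTEGER PRIMARY KEY, product TEXT, brand TEXT);
CREATE TABLE sales_fact     (product_key INTEGER, units_sold INTEGER);
CREATE TABLE inventory_fact (product_key INTEGER, units_on_hand INTEGER);

INSERT INTO product_dim VALUES (1, 'Soap', 'BrandX'), (2, 'Shampoo', 'BrandX'), (3, 'Biscuit', 'BrandY');
INSERT INTO sales_fact VALUES (1, 10), (2, 5), (3, 7);
INSERT INTO inventory_fact VALUES (1, 100), (2, 40), (3, 60);
""")


def by_brand(fact_table, measure):
    """Sum one measure of one fact table, grouped by the shared brand attribute."""
    sql = (
        f"SELECT p.brand, SUM(f.{measure}) "
        f"FROM {fact_table} AS f "
        "JOIN product_dim AS p ON p.product_key = f.product_key "
        "GROUP BY p.brand"
    )
    return dict(conn.execute(sql).fetchall())


sold = by_brand("sales_fact", "units_sold")
on_hand = by_brand("inventory_fact", "units_on_hand")

# Because both results are keyed by the same conformed attribute,
# they can be combined row by row.
drill_across = {brand: (sold[brand], on_hand[brand]) for brand in sorted(sold)}
print(drill_across)
```

If the two fact tables used different product dimensions with different brand definitions, the final merge step would not be meaningful.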

18 Surrogate Keys
No table (fact or dimension) should use production keys; all tables should use surrogate keys generated by the Data Warehouse.
• Production keys sometimes get reused
• In case of mergers/acquisitions, surrogate keys protect you from different key formats
• Production systems may change to generalize their key definitions
• Using surrogate keys will be faster
• Surrogate keys handle Slowly Changing Dimensions well
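A minimal sketch of surrogate key assignment follows, assuming a simple in-memory lookup. The class name `SurrogateKeyGenerator` and the source system labels are hypothetical; a real warehouse would persist this mapping, typically in a key map table maintained by the load process.

```python
from itertools import count


class SurrogateKeyGenerator:
    """Hands out warehouse-owned integer keys for production keys."""

    def __init__(self):
        self._next_key = count(1)
        self._lookup = {}

    def key_for(self, source_system, production_key):
        # The production key is only unique within its source system,
        # so the pair identifies the record.
        natural = (source_system, production_key)
        if natural not in self._lookup:
            self._lookup[natural] = next(self._next_key)
        return self._lookup[natural]


keys = SurrogateKeyGenerator()
k1 = keys.key_for("erp", "CUST-001")
k2 = keys.key_for("acquired_co", 1)    # different key format after a merger
k3 = keys.key_for("erp", "CUST-001")   # same production key, same surrogate
print(k1, k2, k3)
```

The fact and dimension tables then store only the small integer keys, regardless of how each production system formats its own identifiers.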

19 Slowly Changing Dimensions
Certain kinds of dimension attribute changes need to be handled differently in the Data Warehouse.
Type I - Overwrite
• e.g. name correction, description changes
Type II - Partition History
• e.g. packing change, customer movement
• Create a new dimension record with a new surrogate key
Type III - Organizational changes
• e.g. sales force reorganization
• Show sales broken down by new and old organizations
• Need to create an old and a new field
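The three change types can be sketched against a single customer dimension in sqlite3. The column names, the `is_current` flag and the sample values are assumptions made for this illustration; the slide only specifies the overwrite, the new record with a new surrogate key, and the old/new field pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE customer_dim (
    customer_key  INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    customer_id   TEXT,                               -- production key
    name          TEXT,
    city          TEXT,
    sales_org     TEXT,
    old_sales_org TEXT,                               -- used by Type III
    is_current    INTEGER DEFAULT 1                   -- assumed helper flag for Type II
)""")
conn.execute(
    "INSERT INTO customer_dim (customer_id, name, city, sales_org) "
    "VALUES ('C1', 'Jon Smith', 'Pune', 'West')"
)

# Type I: overwrite in place (name correction). No history is kept.
conn.execute("UPDATE customer_dim SET name = 'John Smith' WHERE customer_id = 'C1'")

# Type II: the customer moves. Keep the old row and add a new row,
# which receives a new surrogate key.
current = conn.execute(
    "SELECT customer_id, name, sales_org FROM customer_dim "
    "WHERE customer_id = 'C1' AND is_current = 1"
).fetchone()
conn.execute("UPDATE customer_dim SET is_current = 0 WHERE customer_id = 'C1' AND is_current = 1")
conn.execute(
    "INSERT INTO customer_dim (customer_id, name, city, sales_org) "
    "VALUES (?, ?, 'Mumbai', ?)",
    current,
)

# Type III: reorganization. Keep the old value in its own field
# next to the new one, on the same row.
conn.execute(
    "UPDATE customer_dim SET old_sales_org = sales_org, sales_org = 'Central' "
    "WHERE customer_id = 'C1' AND is_current = 1"
)

rows = conn.execute(
    "SELECT customer_key, name, city, sales_org, old_sales_org, is_current "
    "FROM customer_dim ORDER BY customer_key"
).fetchall()
for row in rows:
    print(row)
```

Facts loaded before the move keep pointing at surrogate key 1 and facts loaded afterwards point at key 2, which is how Type II partitions history.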

20 Factless Fact Tables
Used for event tracking, e.g. attendance.
[Diagram: an attendance fact table containing only the keys Date_Key, Student_Key, Course_Key, Teacher_Key and Facility_Key, joined to the Date, Student, Course, Teacher and Facility dimensions.]
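A factless fact table carries no measures, so questions are answered by counting rows. The sketch below uses the keys from the slide; the table names, the single Course dimension and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course_dim (course_key INTEGER PRIMARY KEY, course_name TEXT);

-- Only keys, no measures: each row records that an attendance event happened.
CREATE TABLE attendance_fact (
    date_key     INTEGER,
    student_key  INTEGER,
    course_key   INTEGER,
    teacher_key  INTEGER,
    facility_key INTEGER
);

INSERT INTO course_dim VALUES (1, 'Data Modeling'), (2, 'ETL Basics');
INSERT INTO attendance_fact VALUES
    (20030526, 1, 1, 10, 100),
    (20030526, 2, 1, 10, 100),
    (20030526, 1, 2, 11, 101);
""")

# "How many students attended each course?" becomes a row count.
attendance = conn.execute("""
    SELECT c.course_name, COUNT(*)
    FROM attendance_fact AS f
    JOIN course_dim AS c ON c.course_key = f.course_key
    GROUP BY c.course_name
    ORDER BY c.course_name
""").fetchall()
print(attendance)
```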

21 Coverage Tables
Problem: which products on promotion did not sell?
[Diagram: a sales fact table with the keys Date_Key, Product_Key, Store_Key and Promotion_Key and the measures Dollars Sold and Units Sold, joined to the Date, Store, Product and Promotion dimensions.]

22 Coverage Tables
Solution: Coverage Tables
[Diagram: a Sales Promotion Coverage Table containing only the keys Date_Key, Product_Key, Store_Key and Promotion_Key, joined to the Date, Store, Product and Promotion dimensions.]
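The sales fact table only records what sold, so it cannot answer what was promoted but did not sell. A coverage table lists every product that was on promotion, and the unsold ones are the coverage rows with no matching sales row. The sketch below uses the keys from the slides; the table names and sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- What was on promotion (one row per promoted product, store and day).
CREATE TABLE promotion_coverage (
    date_key INTEGER, product_key INTEGER, store_key INTEGER, promotion_key INTEGER
);
-- What actually sold.
CREATE TABLE sales_fact (
    date_key INTEGER, product_key INTEGER, store_key INTEGER, promotion_key INTEGER,
    dollars_sold REAL, units_sold INTEGER
);

INSERT INTO promotion_coverage VALUES
    (20030526, 1, 1, 7),
    (20030526, 2, 1, 7),
    (20030526, 3, 1, 7);
INSERT INTO sales_fact VALUES
    (20030526, 1, 1, 7, 100.0, 10),
    (20030526, 3, 1, 7, 20.0, 2);
""")

# Promoted rows with no matching sales row: the left join leaves
# the sales columns NULL for them.
unsold = conn.execute("""
    SELECT c.product_key
    FROM promotion_coverage AS c
    LEFT JOIN sales_fact AS s
           ON s.date_key = c.date_key
          AND s.product_key = c.product_key
          AND s.store_key = c.store_key
          AND s.promotion_key = c.promotion_key
    WHERE s.product_key IS NULL
    ORDER BY c.product_key
""").fetchall()
print(unsold)
```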

23 Snowflake Schema
Dimension tables are normalized by decomposing them at the attribute level.
Each dimension has one key for each level of the dimension's hierarchy.
Good performance when queries involve aggregation.
Complicated maintenance and metadata, and an explosion in the number of tables.
Makes the user representation more complex and intricate.

24 Snowflake Schema - Example
[Diagram: a Fact Table and four Dim Tables.]
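As a rough sketch of a snowflaked dimension, the product hierarchy below is split into one table per level (product, brand, category), each with its own key. All table names, columns and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One table, and one key, per level of the product hierarchy.
CREATE TABLE category_dim (category_key INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE brand_dim (
    brand_key INTEGER PRIMARY KEY, brand_name TEXT,
    category_key INTEGER REFERENCES category_dim(category_key)
);
CREATE TABLE product_dim (
    product_key INTEGER PRIMARY KEY, product_name TEXT,
    brand_key INTEGER REFERENCES brand_dim(brand_key)
);
CREATE TABLE sales_fact (product_key INTEGER REFERENCES product_dim(product_key), amount REAL);

INSERT INTO category_dim VALUES (1, 'Personal Care'), (2, 'Food');
INSERT INTO brand_dim VALUES (1, 'BrandX', 1), (2, 'BrandY', 2);
INSERT INTO product_dim VALUES (1, 'Soap', 1), (2, 'Shampoo', 1), (3, 'Biscuit', 2);
INSERT INTO sales_fact VALUES (1, 100.0), (2, 50.0), (3, 25.0);
""")

# Reaching the category level takes three joins instead of the single
# join a flat star-schema product dimension would need.
category_totals = conn.execute("""
    SELECT cat.category_name, SUM(f.amount)
    FROM sales_fact AS f
    JOIN product_dim  AS p   ON p.product_key = f.product_key
    JOIN brand_dim    AS b   ON b.brand_key = p.brand_key
    JOIN category_dim AS cat ON cat.category_key = b.category_key
    GROUP BY cat.category_name
    ORDER BY cat.category_name
""").fetchall()
print(category_totals)
```

The longer join chain is the extra complexity for users and metadata that the previous slide warns about.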

25 Aggregates
Pre-stored summaries in the database.
Significant performance advantage.
Preferably, aggregates should not be stored in the fact tables.
Aggregates may take significant time to build.
Many tools can automatically navigate to the most aggregated table that can service a query.

26 Aggregate Navigators
Automatically redirect queries to the most summarized table.
Some tools, such as Business Objects, Discoverer, Microstrategy and Metacube, support this.
Native database support is already available.
[Diagram labels: Aggregate Navigator, DBMS, LAN]
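The core idea of an aggregate navigator can be sketched in a few lines: among the tables that contain every column a query needs, pick the smallest one. This is a toy model under stated assumptions, not how any of the named tools work internally; `TableInfo`, `choose_table` and the table catalog are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableInfo:
    name: str
    columns: frozenset
    row_count: int


def choose_table(required_columns, candidates):
    """Return the smallest table that can answer a query needing these columns."""
    needed = frozenset(required_columns)
    usable = [t for t in candidates if needed <= t.columns]
    if not usable:
        raise LookupError(f"no table provides columns {sorted(needed)}")
    return min(usable, key=lambda t: t.row_count)


# Assumed catalog: the base fact table plus two pre-built aggregates.
catalog = [
    TableInfo("sales_fact",
              frozenset({"day", "month", "year", "product", "brand", "amount"}),
              1_000_000),
    TableInfo("sales_by_month_brand",
              frozenset({"month", "year", "brand", "amount"}),
              5_000),
    TableInfo("sales_by_year",
              frozenset({"year", "amount"}),
              10),
]

yearly = choose_table({"year", "amount"}, catalog)
monthly = choose_table({"month", "brand", "amount"}, catalog)
detail = choose_table({"day", "product", "amount"}, catalog)
print(yearly.name, monthly.name, detail.name)
```

The user always writes the query against the base schema; the redirection to `sales_by_year` or `sales_by_month_brand` happens without their involvement.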

27 Examples of Data Modeling Tools
ERWIN
• Supports Data Warehouse design as a modeling technique
Powersoft WarehouseArchitect
• Module of Power Designer specifically for DW Modeling
Oracle Designer
• Can be extended for Warehouse modeling
Others, such as Infomodeler and Silverrun, are also used.

28 Questions

