ISAM 5931: Data Warehousing & Data Mining Group Project submitted by : Mudassar Hakim & Gaurav Wadhwani.


1 ISAM 5931: Data Warehousing & Data Mining Group Project submitted by : Mudassar Hakim & Gaurav Wadhwani

2 Objective
- To develop a small-scale data warehouse using MS SQL Server
- To learn and use MS SQL Server Analysis Services
- To grasp the concept of multidimensional cubes in business analysis
- To get good grades

3 Database and Tools
- A database for a retail business is used as the sample database
- MS Access is used as the RDBMS
- Most of the analysis is performed using Analysis Services

4 Star Schema
- To develop the data warehouse we use a star schema
- The star schema is the most common type of dimensional model
- It consists of several dimension tables and, typically, one fact table
- The fact table is surrounded by the dimension tables

5 Star Schema

6 Why Star Schema?
- The star schema is used because of its simplicity
- Its denormalized dimension tables do produce some redundancy and, with it, a risk of inconsistency
- Since our database is small, we sacrifice some efficiency for convenience

7 Concept of Dimensions & Facts
- Dimensions can be thought of as factors that influence a transaction or are influenced by it.
- For example, the sale of a product at a certain store can be influenced by the products available, customer need, and the time of sale.
- Dimension tables are shallow (relatively few rows) but have many attributes.
- The more attributes the dimension tables have, the better the quality of analysis can be.
- A fact is a record of a past transaction.
- Being part of history, it cannot be modified.
- Facts are analyzed on the basis of dimensions to track historical trends.
- Historical trends help us find a better path for the future.

8 Dimensions
- Product: (Product_ID, Brand_Name, Product_Name, Price, Category_ID, Category, Subcategory_ID, Subcategory, Category_Manager)
- Time: (Date_ID, Day)
- Customer: (Customer_ID, Customer_Name, City, State, Region, Country)
- Supplier: (Supplier_ID, Supplier_Class, Supplier_Name, Supplier_Contact)
- Cost: (Item_ID, Item_cost, Item_Inventory)

9 Dimensions

10 Dimension Hierarchy
- We use dimension hierarchies to browse measures at different levels of detail.
- In our case, hierarchies apply to several dimensions, including time, location, and product category.
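As a rough illustration of browsing a measure at different hierarchy levels, the sketch below rolls Dollar_Sales up the location hierarchy implied by the Customer dimension (Country, Region, State, City). The sample rows, the `HIERARCHY` ordering, and the `roll_up` helper are assumptions made for this example; in the project itself this kind of browsing would be done in Analysis Services.

```python
from collections import defaultdict

# Hypothetical sales rows, already joined to the Customer dimension.
sales = [
    {"City": "Houston", "State": "TX", "Region": "South", "Country": "USA", "Dollar_Sales": 120.0},
    {"City": "Dallas",  "State": "TX", "Region": "South", "Country": "USA", "Dollar_Sales": 80.0},
    {"City": "Atlanta", "State": "GA", "Region": "South", "Country": "USA", "Dollar_Sales": 50.0},
    {"City": "Seattle", "State": "WA", "Region": "West",  "Country": "USA", "Dollar_Sales": 200.0},
]

# Assumed ordering of the location hierarchy, coarsest level first.
HIERARCHY = ["Country", "Region", "State", "City"]


def roll_up(rows, level):
    """Total Dollar_Sales grouped by every hierarchy level down to `level`."""
    depth = HIERARCHY.index(level) + 1
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[name] for name in HIERARCHY[:depth])
        totals[key] += row["Dollar_Sales"]
    return dict(totals)


by_region = roll_up(sales, "Region")  # coarser view
by_state = roll_up(sales, "State")    # drill down one level
```

Moving from `by_region` to `by_state` corresponds to a drill-down; going the other way is a roll-up.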

11 Dimension Hierarchy

12 Sales Facts
- Since we are dealing with a retail business, the prime fact is sales.
- The fact table has a small number of attributes but is very deep (many rows) compared with the dimension tables.
- SalesFact: (Date_ID, Customer_ID, Product_ID, Employee_ID, Dollar_Sales, Sales_Units)

13 Fact Table

14 Relationships
- Dimension tables and the fact table are linked through foreign keys.
- The primary key of the fact table is a concatenated (composite) key composed of the foreign keys from the dimension tables.
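The key structure described here can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the project uses MS Access and SQL Server rather than SQLite, the column types and sample rows are invented, the Time dimension is named Time_Dim here, the Product table is trimmed to a few columns, and Employee_ID is left out because no Employee dimension is listed on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on

conn.executescript("""
CREATE TABLE Time_Dim (Date_ID INTEGER PRIMARY KEY, Day TEXT);
CREATE TABLE Customer (Customer_ID INTEGER PRIMARY KEY, Customer_Name TEXT,
                       City TEXT, State TEXT, Region TEXT, Country TEXT);
CREATE TABLE Product  (Product_ID INTEGER PRIMARY KEY, Brand_Name TEXT,
                       Product_Name TEXT, Price REAL);
CREATE TABLE SalesFact (
    Date_ID      INTEGER NOT NULL REFERENCES Time_Dim(Date_ID),
    Customer_ID  INTEGER NOT NULL REFERENCES Customer(Customer_ID),
    Product_ID   INTEGER NOT NULL REFERENCES Product(Product_ID),
    Dollar_Sales REAL,
    Sales_Units  INTEGER,
    PRIMARY KEY (Date_ID, Customer_ID, Product_ID)  -- concatenated key built from the FKs
);
""")

# Hypothetical sample rows: one row per dimension, then one fact that references them.
conn.execute("INSERT INTO Time_Dim VALUES (1, 'Monday')")
conn.execute("INSERT INTO Customer VALUES (1, 'Sample Customer', 'Houston', 'TX', 'South', 'USA')")
conn.execute("INSERT INTO Product VALUES (1, 'Sample Brand', 'Sample Product', 9.99)")
conn.execute("INSERT INTO SalesFact VALUES (1, 1, 1, 19.98, 2)")


def try_insert(row):
    """Return True if the fact row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO SalesFact VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


fk_rejected = not try_insert((1, 1, 99, 5.00, 1))        # Product_ID 99 has no dimension row
duplicate_rejected = not try_insert((1, 1, 1, 9.99, 1))  # same composite key as the first fact
```

Both rejected inserts raise `sqlite3.IntegrityError`: the first violates a foreign key, the second the composite primary key.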

15 Relationships

16 New Dimensions
- We have tried to introduce two more dimensions to the existing model:
  1. Cost
  2. Supplier
- Cost will help us determine the profit margin for different products.
- The Supplier dimension could be significant if we want to analyze product sales by supplier.
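One way the Cost dimension could feed a profit-margin calculation is sketched below. The slides do not define the formula or how Cost joins to Product, so this example assumes that Item_ID matches Product_ID and that margin is (Price - Item_cost) / Price; the sample values and the `profit_margin` helper are hypothetical.

```python
# Hypothetical rows keyed by Product_ID / Item_ID (assumed to be the same identifier).
products = {
    1: {"Product_Name": "Product A", "Price": 10.0},
    2: {"Product_Name": "Product B", "Price": 4.0},
}
costs = {
    1: {"Item_cost": 6.0, "Item_Inventory": 40},
    2: {"Item_cost": 3.0, "Item_Inventory": 15},
}


def profit_margin(price, cost):
    """Margin as a fraction of the selling price (assumed definition)."""
    if price <= 0:
        raise ValueError("price must be positive")
    return (price - cost) / price


# Margin per product, for products that have a matching cost row.
margins = {
    product_id: profit_margin(product["Price"], costs[product_id]["Item_cost"])
    for product_id, product in products.items()
    if product_id in costs
}
```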


