Supervisor: Prof. Abbdolahzadeh

Presentation transcript:

Amirkabir University
Data Warehouse Design Architectures
Morteza Zaker
Supervisor: Prof. Abbdolahzadeh

Presentation plan: Introduction; Data Warehouse Architecture; Concepts of dimensional model; History of Data Warehouse; Modeling issues; Conclusions

DW and OLAP: general concepts
Data warehouses (DW) contain historical data for supporting the decision-making process.
On-Line Analytical Processing (OLAP) systems facilitate manipulation of DW data.
DW and OLAP require a clear definition of facts, dimensions, and hierarchies.
DW logical-level design is based on the star/snowflake schema.

Data Warehouse Architecture
Data Flow Architectures: Single DDS, NDS+DDS, ODS+DDS
System Architecture
Federated Architectures
ETL Architectures

Source systems are OLTP systems, which capture and store business transactions online. They hold the data that is loaded into the DW.
An extract, transform, and load (ETL) process brings data from the various source systems into a stage area. A second ETL process integrates and transforms the staged data and loads it into the dimensional data store (DDS).
While data is being loaded into the DDS, the data quality ETL (DQ ETL) applies a set of rules to check it. Bad data is pushed into a DQ database for reporting and correction. Bad data can also be corrected automatically, or tolerated if it is still needed.
Users get the data through several front-end tools and applications. Some applications operate on a multidimensional format, so data in the DDS is also loaded into a multidimensional database (MDB). An MDB stores data like a cube: the position of each cell is defined by a number of variables called dimensions, and the dimension values show when and where a business event happened.
The ETL system is managed by a control system, based on the rules in the metadata. The metadata is a database that contains information about the data structure, data usage, data quality rules, and other information about the data. The audit system records what happened during the ETL process and logs system operation into the metadata database.
A data profiler examines the data to understand its characteristics, for example how many rows a table has and which values are NULL.
Diagram labels: Source system, ETL, Stage, DQ ETL, DDS, MDB, DQ correction, Metadata, Data Profiler, Control system + Audit; front-end outputs: reports, analytic reports, spreadsheet pivot tables, ad hoc query, data mining, other BI applications.
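The DQ ETL and data profiler steps described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the staged rows, the two quality rules, and every function name (`dq_etl`, `profile`, and so on) are hypothetical and do not come from the presentation.

```python
# Hypothetical staged rows; the column names are illustrative only.
STAGE_ROWS = [
    {"order_id": 1, "customer": "Ann", "amount": 120.0},
    {"order_id": 2, "customer": None, "amount": 35.5},
    {"order_id": 3, "customer": "Bob", "amount": -10.0},
    {"order_id": 4, "customer": " carol ", "amount": 60.0},
]

# Assumed data quality rules: each returns True when the row passes.
DQ_RULES = {
    "customer_present": lambda row: row["customer"] is not None,
    "amount_non_negative": lambda row: row["amount"] >= 0,
}


def dq_etl(stage_rows):
    """Load good rows into the DDS and push bad rows into the DQ database."""
    dds, dq_db, audit_log = [], [], []
    for row in stage_rows:
        failed = [name for name, rule in DQ_RULES.items() if not rule(row)]
        if failed:
            # Bad data is kept for reporting and later correction.
            dq_db.append({"row": row, "failed_rules": failed})
        else:
            # Simple transform step: normalize the customer name.
            dds.append(dict(row, customer=row["customer"].strip().title()))
        # The audit trail records what happened to every row.
        audit_log.append((row["order_id"], "rejected" if failed else "loaded"))
    return dds, dq_db, audit_log


def profile(rows):
    """Tiny data profiler: row count and number of NULLs per column."""
    columns = rows[0].keys()
    return {
        "row_count": len(rows),
        "null_counts": {c: sum(r[c] is None for r in rows) for c in columns},
    }


dds, dq_db, audit_log = dq_etl(STAGE_ROWS)
print(len(dds), "rows loaded,", len(dq_db), "rows sent to the DQ database")
print(profile(STAGE_ROWS))
```

In a real warehouse the rules would be read from the metadata database rather than hard-coded, which is what lets the control system manage the ETL without code changes.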

Single DDS
Data is extracted from several source systems and pushed into the stage area. The stage area can be a database or a file system. The stage is needed for reasons such as limited memory space.
A second ETL package picks up the data from the stage, integrates it, applies data quality rules, and puts the consolidated data into a DDS.
The control system and audit manage the ETL system while it runs and log the ETL process to a metadata file or database. The metadata contains the data structure and the data processing within the data warehouse.
The advantage of the single DDS architecture is that it is simple to design, because the data from the stage is loaded straight into the dimensional data store without going through any kind of normalized store. It suits systems that have just one source or need just one dimensional store.
The main disadvantage is that it is more difficult, in this architecture, to create a second DDS, because the DDS is the master data store.
Diagram labels: Source systems (s1, s2), ETL + DQ, Stage, DDS, MDB, Application, Metadata, Control system + Audit.

NDS + DDS
Data storage = Stage, NDS, and DDS
Core DW store = normalized and dimensional format
Data marts = 1 to N data marts in each DDS
ETL engine = 4 ETL packages
The NDS contains master tables and transaction tables. Master tables in the NDS become dimensions in the DDS; transaction tables in the NDS become facts in the DDS.
The NDS sits in front of the DDS and is the master data store. The master data contains all historical and structural data. The DDS holds the transactional data and may contain only a single year of data.
Diagram labels: Source systems (s1, s2), Stage, NDS-ETL + DQ, NDS, DDS-ETL, DDS, MDB, Application, Metadata, Control system + Audit.
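The mapping from NDS master and transaction tables to DDS dimensions and facts can be sketched with Python's built-in `sqlite3` module. The table and column names below (`nds_customer`, `dim_customer`, `fact_sales`, and so on) and the sample rows are hypothetical, chosen only to illustrate the DDS-ETL step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- NDS: normalized master table and transaction table
CREATE TABLE nds_customer (
    customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE nds_order (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES nds_customer(customer_id),
    order_date TEXT, amount REAL);
-- DDS: dimension table with its own surrogate key, plus a fact table
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY AUTOINCREMENT,
    customer_id INTEGER, name TEXT, city TEXT);
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    order_date TEXT, amount REAL);
""")
conn.executemany(
    "INSERT INTO nds_customer VALUES (?, ?, ?)",
    [(10, "Ann", "Tehran"), (20, "Bob", "Shiraz")],
)
conn.executemany(
    "INSERT INTO nds_order VALUES (?, ?, ?, ?)",
    [(1, 10, "2010-01-05", 120.0),
     (2, 10, "2010-01-09", 30.0),
     (3, 20, "2010-01-09", 60.0)],
)

# DDS-ETL, step 1: master table -> dimension
conn.execute(
    "INSERT INTO dim_customer (customer_id, name, city) "
    "SELECT customer_id, name, city FROM nds_customer"
)
# DDS-ETL, step 2: transaction table -> fact, looking up the surrogate key
conn.execute(
    "INSERT INTO fact_sales (customer_key, order_date, amount) "
    "SELECT d.customer_key, o.order_date, o.amount "
    "FROM nds_order o JOIN dim_customer d ON d.customer_id = o.customer_id"
)

# A typical star-schema query: total sales per customer
totals = conn.execute(
    "SELECT d.name, SUM(f.amount) FROM fact_sales f "
    "JOIN dim_customer d USING (customer_key) "
    "GROUP BY d.name ORDER BY d.name"
).fetchall()
print(totals)
```

The surrogate key on the dimension is a common dimensional-modeling choice rather than something the slide prescribes.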

ODS + DDS
Data storage = Stage, ODS, and DDS
Core DW store = normalized and dimensional format
Data marts = 1 to N data marts in each DDS
ETL engine = 4 ETL packages
The ODS contains master tables and transaction tables, but it is not the master data store. Master tables in the ODS become dimensions in the DDS; transaction tables in the ODS become facts in the DDS.
The ODS is a hybrid data store, so users can access data directly from the ODS.
The advantage of this architecture is that the third normal form store is slimmer than the NDS because it contains only current values. In this architecture we have a central place to integrate, maintain, and publish master data. The normalized relational store is updatable by the user application.
The main disadvantage is that it is more difficult, in this architecture, to create a second DDS. As in the single DDS architecture, the DDS is the master data store.
Diagram labels: Source systems (s1, s2), Stage, ODS-ETL + DQ, ODS, DDS-ETL, DDS, MDB, Application, Metadata, Control system + Audit.

Federated DW (FDW)
EII (Enterprise Information Integration) is a method of integrating data by accessing different source systems online and aggregating the outputs on the fly before bringing the end result to the user.
All three DWs must be standardized to the same structure.
Data marts in the same data warehouse are nonintegrated data marts. They can be dimensional, normalized, or neither.
The FDW ETL needs to match the update frequency of the source DWs.
The FDW ETL needs to integrate the data from the source DWs based on business rules. Duplicate records need to be merged.
The subject area here is much narrower than that of the source DWs.
Diagram labels: FDW, Application, ETL, EII, source data warehouses (DW1, DW2, DW3), data marts (DM1, DM2, DM3); variants shown: relational DW, third normal form DW, dimensional DW.
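A minimal sketch of the EII idea, assuming each source DW can be queried through a function and that duplicates are identified by a shared business key. The source functions, the `customer_id` key, and the rule "sum the sales of duplicate records" are all hypothetical, used here only to show on-the-fly integration and merging.

```python
# Hypothetical source warehouses, each exposed as a query function.
def query_dw1():
    return [{"customer_id": 10, "name": "Ann", "sales": 150.0}]


def query_dw2():
    return [
        {"customer_id": 10, "name": "Ann", "sales": 40.0},
        {"customer_id": 20, "name": "Bob", "sales": 60.0},
    ]


def query_dw3():
    return [{"customer_id": 30, "name": "Cyrus", "sales": 15.0}]


def federated_query(sources):
    """Query every source on the fly and merge duplicate records.

    Assumed business rule: records with the same customer_id describe the
    same customer, and their sales figures are added together.
    """
    merged = {}
    for fetch in sources:
        for record in fetch():
            key = record["customer_id"]
            if key in merged:
                merged[key]["sales"] += record["sales"]
            else:
                merged[key] = dict(record)  # copy, so sources stay untouched
    return sorted(merged.values(), key=lambda r: r["customer_id"])


for row in federated_query([query_dw1, query_dw2, query_dw3]):
    print(row)
```

Because nothing is stored between calls, each request pays the cost of querying all sources, which is the usual trade-off of EII against an ETL-fed FDW.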

System Architecture

ETL Architectures

Main issues that must be considered
There are two different types of database software:
1. Symmetric multiprocessing (SMP): a database system that runs on one or more machines with several identical processors sharing the same disk storage. The database is physically located in a single disk storage system. Examples of SMP database systems are SQL Server, Oracle, DB2, Informix, and Sybase.
2. Massively parallel processing (MPP): a database system that runs on more than one machine, where each machine has its own disk storage. The database is physically located in several disk storage systems that are interconnected with each other. Examples of MPP database systems are Teradata, Neoview, and Netezza.
An MPP database system is generally faster and more scalable than an SMP database system. In an MPP database system, a table is physically located in several nodes, each with its own storage.
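To illustrate how one table can live on several nodes in an MPP system, here is a small sketch that spreads rows over nodes by hashing a key and then aggregates per node. Hash distribution on a chosen column is an assumption made for the example; actual products use their own distribution methods, and all names below are hypothetical.

```python
import zlib


def node_for(key, node_count):
    """Pick a node for a row by hashing its distribution key."""
    return zlib.crc32(str(key).encode("utf-8")) % node_count


def distribute(rows, key, node_count):
    """Place each row on exactly one node; each node has its own storage."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[node_for(row[key], node_count)].append(row)
    return nodes


def parallel_sum(nodes, column):
    """Each node sums its own rows; the partial sums are then combined."""
    partials = [sum(row[column] for row in node) for node in nodes]
    return sum(partials)


rows = [{"id": i, "amount": float(i)} for i in range(1, 11)]
nodes = distribute(rows, "id", node_count=3)
print([len(node) for node in nodes])   # rows stored on each node
print(parallel_sum(nodes, "amount"))   # same total as a single-node sum
```

In a real MPP system the per-node sums run at the same time on separate machines, which is where the speed and scalability come from.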

Research challenges (1)
Spatial measure aggregations, considering their types:
Distributive: aggregates can be reused, e.g., spatial union
Algebraic: additional treatment is needed to reuse aggregates, e.g., center of n points
Holistic: a new calculation from the raw data is needed, e.g., equi-partition
Topological relationships between hierarchy levels
Types of hierarchies
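The three aggregation types can be contrasted with a short sketch. The example simplifies the spatial cases: a set union of grid cells stands in for spatial union, the center of n points is computed from per-partition summaries, and a median stands in for a holistic function such as equi-partition. The data and function names are hypothetical.

```python
from statistics import median

# Points already split into two partitions (e.g. two lower-level regions).
partitions = [
    [(0.0, 0.0), (4.0, 0.0)],
    [(4.0, 6.0)],
]
all_points = [p for part in partitions for p in part]


# Distributive: the union of partial unions equals the union of everything.
def covered_cells(points):
    return {(int(x), int(y)) for x, y in points}


def union_of_partials(parts):
    result = set()
    for part in parts:
        result |= covered_cells(part)
    return result


# Algebraic: partial centers cannot simply be averaged, but a small summary
# (sum of x, sum of y, count) per partition is enough to get the exact center.
def center_summary(points):
    return (sum(x for x, _ in points), sum(y for _, y in points), len(points))


def center_from_summaries(summaries):
    total_x = sum(s[0] for s in summaries)
    total_y = sum(s[1] for s in summaries)
    count = sum(s[2] for s in summaries)
    return (total_x / count, total_y / count)


# Holistic: no fixed-size summary works; go back to the raw data.
def median_x(parts):
    return median(x for part in parts for x, _ in part)


print(union_of_partials(partitions) == covered_cells(all_points))
print(center_from_summaries([center_summary(p) for p in partitions]))
print(median_x(partitions))
```

Averaging the two partial centers directly would give (3.0, 3.0), which is not the center of the three points; that gap is the "additional treatment" the algebraic case refers to.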