Designing a Data Warehouse


Designing a Data Warehouse: Issues in DW design

Data Warehouse
A read-only database for decision analysis that is:
Subject oriented
Integrated
Time variant
Nonvolatile
It consists of time-stamped operational and external data.

Data Warehouse vs Operational Databases
Operational databases: highly tuned; real-time data; detailed records; current values; access small amounts of data in a predictable manner.
Data warehouse: flexible access; consistent timing; summarized as appropriate; historical; accesses large amounts of data in unexpected ways.
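
The contrast between the two access patterns can be sketched with Python's built-in sqlite3 module. The orders table, its columns and its rows are assumptions made up for illustration; they do not come from the slides. The first query reads one current record by key, as an operational system would. The second scans the whole history and summarizes it, as a warehouse query typically does.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        " order_id INTEGER PRIMARY KEY,"
        " order_date TEXT, region TEXT, amount REAL)"
    )
    rows = [
        (1, "2004-01-15", "East", 100.0),
        (2, "2004-02-03", "West", 250.0),
        (3, "2005-01-20", "East", 75.0),
        (4, "2005-03-11", "West", 125.0),
    ]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    return conn


def operational_lookup(conn, order_id):
    """Operational style: a small, predictable read of one detailed record."""
    row = conn.execute(
        "SELECT amount FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()
    return row[0]


def yearly_totals_by_region(conn):
    """Warehouse style: scan historical rows and summarize them."""
    return conn.execute(
        "SELECT substr(order_date, 1, 4), region, SUM(amount)"
        " FROM orders"
        " GROUP BY substr(order_date, 1, 4), region"
        " ORDER BY 1, 2"
    ).fetchall()
```

In practice the two workloads would usually run against separate databases, which is the point of the comparison; a single table is used here only to keep the sketch short.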

Data Warehouse Purpose
Identify problems in time to avoid them
Locate opportunities you might otherwise miss

“DNA of Data Warehousing”, Techguide.com,

Data Warehouse: New Approach
An old idea with a new interest because of:
Cheap computing power
Special-purpose hardware
New data structures
Intelligent software

Warehousing Problems
Business issues: data quantity, data accuracy, maintenance, ownership, cost

Warehousing Problems
Business issues
Database issues: DBMS software, technology, complexity

Warehousing Problems
Business issues
Data issues
Analysis issues: user interface, intelligent processing

Three Approaches
Classical Enterprise Database: contains operational data from all areas of the organization.
Data Mart: extracted and managerial support data designed for departmental or end-user computing (EUC) applications.
Data Package: data required for a specific application.

Classical Warehouse

Mart

Package

Three Fundamental Processes
Data Acquisition
Data Storage
Data Access

Data Acquisition
Handles acquisition of data from legacy systems and outside sources. Data is identified, copied, formatted and prepared for loading into the warehouse.
Warehousing Wherewithal, Rob Mattison, CIO, April 1996.

Acquisition Steps
Catalog the data: develop an inventory of where it is and what it means.
Clean and prepare the data: extract from legacy files and reformat to make it usable.
Transport data from one location to another.
Warehousing Wherewithal, Rob Mattison, CIO, April 1996.
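
The three acquisition steps can be sketched as three small Python functions. This is only an illustration under stated assumptions: the legacy field names (CUST_NM, AMT), the cleaning rules and the use of an in-memory list as the "warehouse" are all hypothetical and are not taken from the cited article.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    source: str   # where the data lives (hypothetical legacy file name)
    field: str    # field name in the legacy system
    meaning: str  # what the field means


def catalog(legacy_sources):
    """Step 1: inventory where each field is and what it means."""
    return [
        CatalogEntry(source, field, meaning)
        for source, fields in legacy_sources.items()
        for field, meaning in fields.items()
    ]


def clean_and_prepare(raw_records):
    """Step 2: extract legacy records and reformat them into a usable shape."""
    cleaned = []
    for rec in raw_records:
        name = rec["CUST_NM"].strip().title()        # trim padding, fix casing
        amount = float(rec["AMT"].replace(",", ""))  # "1,200.50" -> 1200.5
        cleaned.append({"customer": name, "amount": amount})
    return cleaned


def transport(records, warehouse):
    """Step 3: move prepared records to the target location."""
    warehouse.extend(records)
    return len(records)
```

A call such as `transport(clean_and_prepare(raw), warehouse)` chains the last two steps, while the catalog built in step 1 documents which legacy fields feed them.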

Storage
The storage component holds the data so that the many different data mining, executive information and decision support systems can make use of it effectively.
Warehousing Wherewithal, Rob Mattison, CIO, April 1996.

The Storage Area
Managed by relational databases like those from Oracle Corp. or Informix Software Inc.
Specialized hardware: symmetric multiprocessor (SMP) or massively parallel processor (MPP) machines
Warehousing Wherewithal, Rob Mattison, CIO, April 1996.

Storage
The majority of warehouse storage today is being managed by relational databases running on Unix platforms. Oracle, Sybase Inc., IBM Corp. and Informix control 65 percent of the warehouse storage market, according to Meta Group Inc. (1996).
Warehousing Wherewithal, Rob Mattison, CIO, April 1996.

Access
Different end-user PCs and workstations draw data from the warehouse with the help of multidimensional analysis products, neural networks, data discovery tools or analysis tools. These powerful, "smart" software products are the real driving force behind the viability of data warehousing.
Warehousing Wherewithal, Rob Mattison, CIO, April 1996.

Access Tools
Intelligent agents and agencies
Query facilities and managed query environments
Statistical analysis
Data discovery (decision support, artificial intelligence and expert systems)
OLAP
Data visualization
Warehousing Wherewithal, Rob Mattison, CIO, April 1996.

Hardware Budget
A typical startup warehouse project allocates more than 60 percent of its budget for hardware and software to the creation of a powerful storage component, spending just 30 percent on data mining and user access technologies.
Warehousing Wherewithal, Rob Mattison, CIO, April 1996.

Systems Analysis Budget
Budgeting for systems analysis and development, however, follows a very different pattern. More than 50 percent of development dollars are spent on building acquisition capabilities, 30 percent fund the development of user solutions and 20 percent are dedicated to the creation of databases in the storage component.
Warehousing Wherewithal, Rob Mattison, CIO, April 1996.

Design Issues
Relational and multidimensional models:
Denormalized and indexed relational models are more flexible
Multidimensional models are simpler to use and more efficient

Star Schemas in an RDBMS
In most companies doing ROLAP, the DBAs have created countless indexes and summary tables in order to avoid I/O-intensive table scans against large fact tables. As the indexes and summary tables proliferate in order to optimize performance for the known queries and aggregations that the users perform, the build times and disk space needed to create them have grown enormously, often requiring more time than is allotted and more space than the original data!
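
The trade-off behind summary tables can be shown with a small sketch using Python's built-in sqlite3 module. The sales_fact and sales_by_month tables and their data are hypothetical. The summary table answers a known aggregation without scanning the fact table, but it has to be built in advance and stored alongside the original data, which is the cost the slide describes.

```python
import sqlite3


def build_fact_table(conn):
    """Load a tiny, hypothetical fact table at daily grain."""
    conn.execute(
        "CREATE TABLE sales_fact (day TEXT, product TEXT, units INTEGER)"
    )
    conn.executemany(
        "INSERT INTO sales_fact VALUES (?, ?, ?)",
        [
            ("2005-01-01", "widget", 3),
            ("2005-01-01", "gadget", 5),
            ("2005-01-02", "widget", 4),
            ("2005-02-01", "widget", 7),
        ],
    )


def build_monthly_summary(conn):
    """Precompute a monthly aggregate; costs build time and extra space."""
    conn.execute(
        "CREATE TABLE sales_by_month AS"
        " SELECT substr(day, 1, 7) AS month, product, SUM(units) AS units"
        " FROM sales_fact"
        " GROUP BY substr(day, 1, 7), product"
    )


def units_from_fact(conn, month, product):
    """Answer the question by scanning and aggregating the fact table."""
    return conn.execute(
        "SELECT SUM(units) FROM sales_fact"
        " WHERE substr(day, 1, 7) = ? AND product = ?",
        (month, product),
    ).fetchone()[0]


def units_from_summary(conn, month, product):
    """Answer the same question with a lookup in the summary table."""
    return conn.execute(
        "SELECT units FROM sales_by_month WHERE month = ? AND product = ?",
        (month, product),
    ).fetchone()[0]
```

Both functions return the same figure for a given month and product. Each additional known query pattern tends to call for another summary table or index, which is how the build times and storage described above accumulate.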

Building a Data Warehouse from a Normalized Database
The steps:
1. Develop a normalized entity-relationship business model of the data warehouse.
2. Translate this into a dimensional model. This step reflects the information and analytical characteristics of the data warehouse.
3. Translate this into the physical model. This reflects the changes necessary to reach the stated performance objectives.
Designing the Perfect Data Warehouse (the paper formerly known as: Data Modeling for Data Warehouses), Frank McGuff, http://members.aol.com/fmcguff/dwmodel/

The Business Model
Identify the data structure, attributes and constraints for the client’s data warehousing environment.
Stable
Optimized for update
Flexible

Business Model
As always in life, there are some disadvantages to 3NF:
Performance can be truly awful. Most of the work that is performed on denormalizing a data model is an attempt to reach performance objectives.
The structure can be overwhelmingly complex. We may wind up creating many small relations which the user might think of as a single relation or group of data.

Structural Dimensions
The first step is the development of the structural dimensions. This step corresponds very closely to what we normally do in a relational database. The star architecture that we will develop here depends upon taking the central intersection entities as the fact tables and building the foreign key => primary key relations as dimensions.
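
A minimal sketch of this idea, using Python's built-in sqlite3 module: an intersection entity (here a hypothetical sales table linking customers and products) becomes the fact table, and each foreign key => primary key relation becomes a dimension. All table names, columns and rows are assumptions for illustration only.

```python
import sqlite3

# The fact table holds the measures plus one foreign key per dimension.
STAR_DDL = """
CREATE TABLE customer_dim (
    customer_key INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE product_dim (
    product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE sales_fact (
    customer_key INTEGER REFERENCES customer_dim(customer_key),
    product_key INTEGER REFERENCES product_dim(product_key),
    quantity INTEGER,
    amount REAL);
"""


def build_star(conn):
    """Create the star schema and load a few hypothetical rows."""
    conn.executescript(STAR_DDL)
    conn.executemany(
        "INSERT INTO customer_dim VALUES (?, ?, ?)",
        [(1, "Acme", "East"), (2, "Globex", "West")],
    )
    conn.executemany(
        "INSERT INTO product_dim VALUES (?, ?, ?)",
        [(10, "widget", "hardware"), (20, "manual", "books")],
    )
    conn.executemany(
        "INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
        [(1, 10, 2, 20.0), (1, 20, 1, 15.0), (2, 10, 5, 50.0)],
    )


def amount_by_region_and_category(conn):
    """Join the fact table to its dimensions and summarize the measure."""
    return conn.execute(
        "SELECT c.region, p.category, SUM(f.amount)"
        " FROM sales_fact f"
        " JOIN customer_dim c ON c.customer_key = f.customer_key"
        " JOIN product_dim p ON p.product_key = f.product_key"
        " GROUP BY c.region, p.category"
        " ORDER BY c.region, p.category"
    ).fetchall()
```

Every analytical query in this layout follows the same shape: start at the fact table, join outward to whichever dimensions supply the grouping attributes, and aggregate the measures.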

Simple DW pattern.

Other Dimensions
Categorical dimensions: generated groups (additional key components)
Partitioning dimensions: subtypes (planned vs. actual)
Informational dimensions: generate different types of data (messy)
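
One possible reading of a partitioning dimension is a key that splits otherwise identical fact rows into subtypes such as planned and actual. The sketch below assumes that reading; the field names (month, scenario, amount) are hypothetical and do not come from the slides.

```python
from collections import defaultdict


def variance_by_month(facts):
    """Compare actual against planned amounts, partitioned by scenario.

    Each fact row is a dict with hypothetical keys: month, scenario
    ("planned" or "actual") and amount.
    """
    totals = defaultdict(lambda: {"planned": 0.0, "actual": 0.0})
    for row in facts:
        totals[row["month"]][row["scenario"]] += row["amount"]
    # Positive variance means actual exceeded plan for that month.
    return {
        month: parts["actual"] - parts["planned"]
        for month, parts in sorted(totals.items())
    }
```

Because both subtypes share the same fact layout, adding the scenario key lets one table serve plan-versus-actual comparisons without a second fact table.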