Data Warehouse.

Data Warehouse

Introduction
A data warehouse (DW) stores large volumes of data used by decision support systems (DSS). The DW is maintained separately from the organization’s operational databases.
Transaction database (OLTP system): write-optimized, holds recent data.
Decision support system (OLAP system): read-optimized, holds historic data.

Data warehouses are relatively static, with only infrequent updates. A DW is a stand-alone repository of information, integrated from several, possibly heterogeneous, operational databases. It is the enabling technique that facilitates improved business decision-making.
[Diagram labels: CUSTOMER DB and SALES DB as sources, the DW in the middle, MARKETING and FINANCIAL ANALYSIS as consumers.]

DEFINITION
A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process.
Subject-oriented: Data that gives information about a particular subject instead of about a company's ongoing operations. Example subjects: customer, sales, production.

Integrated: Data that is gathered into the data warehouse from a variety of sources and merged into a coherent whole.
[Diagram: the operational database holds savings account, loan account and current account separately; the data warehouse holds a single Account subject.]
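The integration step can be illustrated with a minimal sketch. It assumes three operational sources whose record layouts differ. Every field name below is hypothetical and chosen only for illustration.

```python
# Hypothetical operational records; each source uses its own field names.
savings = [{"acct_no": "S-1", "holder": "Ann", "bal": 1200.0}]
loans = [{"loan_id": "L-7", "customer": "Ann", "outstanding": 5000.0}]
current = [{"number": "C-3", "owner": "Bob", "balance": 300.0}]


def integrate(savings, loans, current):
    """Map three source layouts onto one common Account layout."""
    accounts = []
    for r in savings:
        accounts.append({"account_id": r["acct_no"], "customer": r["holder"],
                         "type": "savings", "amount": r["bal"]})
    for r in loans:
        accounts.append({"account_id": r["loan_id"], "customer": r["customer"],
                         "type": "loan", "amount": r["outstanding"]})
    for r in current:
        accounts.append({"account_id": r["number"], "customer": r["owner"],
                         "type": "current", "amount": r["balance"]})
    return accounts


accounts = integrate(savings, loans, current)
```

After integration every record has the same four fields, so a query over `accounts` does not need to know which source system a row came from.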

Time-variant: All data in the data warehouse is identified with a particular time period.
Operational system: view of the business today; key need not have a date.
Data warehouse: designated time frame (3 - 10 years); key includes a date.

Non-volatile: Data in the data warehouse are never overwritten or deleted. Once committed, the data are static, read-only, and retained for future reporting.
Operational system ("CRUD" actions): create, read, insert, update, replace, delete.
Data warehouse: load and read only; no data update.
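The time-variant and non-volatile properties can be sketched together, here using Python's built-in sqlite3 module. The table name, columns and sample values are assumptions made for illustration. The key includes a date, and the only operations exposed are load and read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The key includes the snapshot date, so each period gets its own row.
conn.execute("""
    CREATE TABLE account_balance (
        account_id    TEXT NOT NULL,
        snapshot_date TEXT NOT NULL,
        balance       REAL NOT NULL,
        PRIMARY KEY (account_id, snapshot_date)
    )
""")


def load(rows):
    """Append new snapshots; existing rows are never updated or deleted."""
    conn.executemany("INSERT INTO account_balance VALUES (?, ?, ?)", rows)
    conn.commit()


def read(account_id):
    """Return the full history for one account, oldest first."""
    return conn.execute(
        "SELECT snapshot_date, balance FROM account_balance "
        "WHERE account_id = ? ORDER BY snapshot_date",
        (account_id,),
    ).fetchall()


load([("S-1", "2024-01-31", 1200.0)])
load([("S-1", "2024-02-29", 1350.0)])
history = read("S-1")
```

Because the primary key is (account_id, snapshot_date), a second load for the same account and date is rejected with `sqlite3.IntegrityError` instead of overwriting the earlier snapshot.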

Data Warehouse Concepts: Data Warehouse Environment Architecture
The data warehouse contains integrated data from multiple legacy applications.
[Diagram labels: legacy applications (A/P, O/P, Pay, Mktg, HR, A/R); ODS with insert, update, replace, delete and read, holding all or part of the system-of-record data; integration criteria; D/W load and data warehouse load criteria; data warehouse with load and read, holding the best system-of-record data; data mart loads into data marts.]

NEED FOR DW
Difficulty in obtaining integrated data and information.
The information structure cannot provide full and dynamic analysis of the available information.
Inconsistent results are obtained from queries and reports arising from heterogeneous data stores.
Increased difficulty in delivering consistent, comprehensive information in a timely fashion.
Holding historical data in transaction systems for long periods could interfere with their performance, so the DW holds it instead.
Better query response time in the DW.

Data Warehouse Architecture

Single-Layer Architecture
There is no physical data warehouse or data mart between the operational data and the analytic tools. The middleware in this type of system should be considered a virtual data warehouse, which consists of a software layer and not a data-based layer. The single-layer model is lightweight, as it minimizes redundancies and thereby the amount of data stored. The analyses are based directly on the operational data.

[Diagram: Single-layer architecture. The databases (DB) of the operational system are accessed directly by the analytical tools.]

Two-layer architecture
The two-layer model consists of operational (and external) data in the source layer and a data warehouse layer on top of these. Between the source layer and the data warehouse layer is an ETL system. The analytical part of this architecture bases its analysis on the loaded data in the data warehouse or possibly data marts. The data warehouse layer furthermore adds the possibility of structuring data in a way that fits the multidimensional model of the analytical tools, which in turn makes them faster. Such an architecture is, however, more resource-consuming to build and maintain.

[Diagram: Two-layer architecture. Operational data and external data in the source layer pass through ETL into the data warehouse, which is queried by the analytical tools.]
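The ETL system that sits between the source layer and the data warehouse layer can be sketched as three small functions. This is a minimal illustration using sqlite3, not a production design. The source rows, the `sales_fact` table and the cleaning rules are all assumptions.

```python
import sqlite3

# Hypothetical operational rows: (order_id, customer, amount as text, order date)
operational_rows = [
    (1, " alice ", "10.50", "2024-03-01"),
    (2, "BOB", "4.00", "2024-03-01"),
    (3, "alice", "n/a", "2024-03-02"),  # unparseable amount, rejected below
]


def extract():
    """Extract: copy rows out of the source layer."""
    return list(operational_rows)


def transform(rows):
    """Transform: standardize names, parse amounts, drop rows that fail."""
    clean = []
    for order_id, customer, amount, day in rows:
        try:
            value = float(amount)
        except ValueError:
            continue
        clean.append((order_id, customer.strip().title(), value, day))
    return clean


def load(conn, rows):
    """Load: append the cleaned rows to the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, order_date TEXT)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract()))

# Analytical tools query the warehouse, never the operational rows.
totals = warehouse.execute(
    "SELECT customer, SUM(amount) FROM sales_fact "
    "GROUP BY customer ORDER BY customer"
).fetchall()
```

Here `totals` holds one row per cleaned customer name. The rejected order never reaches the warehouse, which is one way the ETL step keeps query results consistent.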

Three-layer architecture
The three-layer architecture consists of the source layer (containing multiple source systems), the reconciled layer and the data warehouse layer (containing both data warehouses and data marts). The reconciled layer sits between the source data and the data warehouse. It is populated with data from the source systems through an ETL process, and the data stored in it is published further through another ETL process. In the reconciled layer the data has been cleaned up once and integrated into a common standardized form from multiple different source systems. The ETL process that feeds the data warehouse then only gets already integrated data that has less need for transformation. This architecture is especially useful for very large, enterprise-wide systems.
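The two ETL steps around the reconciled layer can be sketched in plain Python. The source systems, field names and cleaning rules are hypothetical. The point is only that the second step receives already integrated data and has little left to transform.

```python
# Hypothetical source systems with different layouts and inconsistent spelling.
crm_rows = [{"cust_name": "ann lee", "country": "us"}]
billing_rows = [
    {"client": "Ann Lee", "invoice_total": "120.00"},
    {"client": "ANN LEE", "invoice_total": "30.00"},
]


def to_reconciled(crm, billing):
    """First ETL step: clean the sources and integrate them into one form."""
    countries = {r["cust_name"].title(): r["country"].upper() for r in crm}
    reconciled = []
    for r in billing:
        name = r["client"].title()
        reconciled.append({
            "customer": name,
            "country": countries.get(name, "UNKNOWN"),
            "amount": float(r["invoice_total"]),
        })
    return reconciled


def to_warehouse(reconciled):
    """Second ETL step: data is already standardized, so only aggregate."""
    totals = {}
    for r in reconciled:
        key = (r["customer"], r["country"])
        totals[key] = totals.get(key, 0.0) + r["amount"]
    return totals


reconciled_layer = to_reconciled(crm_rows, billing_rows)
warehouse_layer = to_warehouse(reconciled_layer)
```

All name cleaning happens once, in `to_reconciled`. Any additional data mart fed from `reconciled_layer` could reuse that work instead of repeating it.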

Three-layer architecture

Operational Data Store (ODS)
An ODS is an integrated, subject-oriented, volatile (including update), current-valued, enterprise-wide, detailed database structure designed to serve operational users as they do high-performance integrated processing. It serves as a staging area for loading data into the enterprise DW.

Design of a Data Warehouse
To design an effective data warehouse, one needs to understand and analyze business needs and construct a business analysis framework. Four different views regarding the design of a data warehouse must be considered: the top-down view, the data source view, the data warehouse view and the business query view.

Bottom Tier: The bottom tier is a warehouse database server, which is almost always a relational database system. Back-end tools and utilities are used to feed data into the bottom tier from operational databases or external sources. These tools and utilities perform data extraction, cleaning and transformation.

Middle Tier: The middle tier is an OLAP server, which is typically implemented using either:
1. A Relational OLAP (ROLAP) model, i.e., an extended relational DBMS that maps operations on multidimensional data to standard relational operations; or
2. A Multidimensional OLAP (MOLAP) model, i.e., a special-purpose server that directly implements multidimensional data and operations.
Top Tier: The top tier is a client, which contains query and reporting tools, analysis tools, and/or data mining tools (e.g., trend analysis, prediction, and so on).
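The ROLAP idea of mapping a multidimensional operation onto standard relational operations can be sketched with sqlite3. The `sales` table, its dimensions and the `roll_up` helper are assumptions for illustration. A roll-up along chosen dimensions becomes a GROUP BY query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (city TEXT, country TEXT, quarter TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", [
    ("Toronto", "Canada", "Q1", 100.0),
    ("Vancouver", "Canada", "Q1", 50.0),
    ("Chicago", "USA", "Q1", 70.0),
    ("Chicago", "USA", "Q2", 30.0),
])

# Only known dimension names may be spliced into the SQL text.
DIMENSIONS = {"city", "country", "quarter"}


def roll_up(dimensions):
    """Translate a roll-up over the given dimensions into a GROUP BY query."""
    if not dimensions or not set(dimensions) <= DIMENSIONS:
        raise ValueError("unknown or empty dimension list")
    cols = ", ".join(dimensions)
    sql = f"SELECT {cols}, SUM(amount) FROM sales GROUP BY {cols} ORDER BY {cols}"
    return conn.execute(sql).fetchall()


by_country = roll_up(["country"])
by_country_quarter = roll_up(["country", "quarter"])
```

The relational engine does all the aggregation at query time. No multidimensional structure is stored, which is what distinguishes this approach from MOLAP.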

Data Warehouse Models
Enterprise warehouse: An enterprise warehouse collects all of the information about subjects spanning the entire organization. It provides corporate-wide data integration, usually from one or more operational systems or external information providers, and is cross-functional in scope.
Data mart: A data mart contains a subset of corporate-wide data that is of value to a specific group of users. The scope is confined to specific, selected subjects. For example, a marketing data mart may confine its subjects to customer, item, and sales.

Depending on the source of data, data marts can be categorized into the following two classes:
Independent data marts are sourced from data captured from one or more operational systems or external information providers, or from data generated locally within a particular department or geographic area.
Dependent data marts are sourced directly from enterprise data warehouses.
Virtual warehouse: A virtual warehouse is a set of views over operational databases. For efficient query processing, only some of the possible summary views may be materialized.
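The difference between a plain view and a materialized summary can be sketched with sqlite3. The `orders` table and the view names are assumptions. SQLite has no built-in materialized views, so a table populated from the view stands in for one here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("east", 10.0), ("east", 5.0), ("west", 7.0)],
)

# Virtual: the view stores no data and is recomputed on every query.
conn.execute(
    "CREATE VIEW sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM orders GROUP BY region"
)

# Materialized stand-in: a snapshot of the view stored as a table.
conn.execute("CREATE TABLE sales_by_region_mat AS SELECT * FROM sales_by_region")

# A new operational row arrives after the snapshot was taken.
conn.execute("INSERT INTO orders (region, amount) VALUES ('west', 3.0)")

live = dict(conn.execute("SELECT region, total FROM sales_by_region"))
stored = dict(conn.execute("SELECT region, total FROM sales_by_region_mat"))
```

The view reflects the new order immediately but pays the aggregation cost on every query. The stored snapshot answers without recomputation but stays stale until it is refreshed.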

OLAP server architectures
Relational OLAP (ROLAP) servers: These are the intermediate servers that stand between a relational back-end server and client front-end tools. They use a relational or extended-relational DBMS to store and manage warehouse data, and OLAP middleware to support missing pieces. ROLAP servers have greater scalability than MOLAP servers.
Multidimensional OLAP (MOLAP) servers: These servers support multidimensional views of data through array-based multidimensional storage engines. They map multidimensional views directly to data cube array structures. For example, Essbase from Arbor is a MOLAP server. The advantage of using a data cube is that it allows fast indexing to precomputed summarized data.
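The array-based storage behind MOLAP can be sketched with nested Python lists. This is an illustrative toy, not how any particular MOLAP product is implemented. The dimension members and facts are made up.

```python
# Dimension members; a member's position in its list is its array index.
products = ["pen", "book"]
regions = ["east", "west"]
quarters = ["Q1", "Q2"]

# cube[p][r][q] holds the sales amount for one cell of the data cube.
cube = [[[0.0 for _ in quarters] for _ in regions] for _ in products]

facts = [
    ("pen", "east", "Q1", 10.0),
    ("pen", "west", "Q1", 4.0),
    ("book", "east", "Q2", 20.0),
    ("pen", "east", "Q2", 6.0),
]
for product, region, quarter, amount in facts:
    p, r, q = products.index(product), regions.index(region), quarters.index(quarter)
    cube[p][r][q] += amount

# Precomputed summary: total per (product, region) across all quarters.
by_product_region = [
    [sum(cube[p][r]) for r in range(len(regions))]
    for p in range(len(products))
]


def lookup(product, region):
    """Answer a summary query by direct array indexing, with no scan of facts."""
    return by_product_region[products.index(product)][regions.index(region)]
```

A call such as `lookup("pen", "east")` reads one precomputed cell. The cost is that every cell exists even when it is empty, which is one reason dense arrays scale less well than relational storage for sparse data.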

Hybrid OLAP (HOLAP) servers: The hybrid OLAP approach combines ROLAP and MOLAP technology, benefiting from the greater scalability of ROLAP and the faster computation of MOLAP. For example, a HOLAP server may allow large volumes of detail data to be stored in a relational database, while aggregations are kept in a separate MOLAP store. Microsoft SQL Server 7.0 OLAP Services supports a hybrid OLAP server.
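The HOLAP split can be sketched by keeping detail rows in a relational table and precomputed aggregations in a separate in-memory store. This is a minimal illustration with hypothetical names and does not reflect the design of any specific product.

```python
import sqlite3

# Relational side: large volumes of detail data.
detail = sqlite3.connect(":memory:")
detail.execute("CREATE TABLE sales (product TEXT, day TEXT, amount REAL)")
detail.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("pen", "2024-01-01", 2.0),
    ("pen", "2024-01-02", 3.0),
    ("book", "2024-01-01", 9.0),
])

# Aggregate side: totals computed once and kept outside the relational store.
aggregates = dict(
    detail.execute("SELECT product, SUM(amount) FROM sales GROUP BY product")
)


def total_for(product):
    """Summary queries are answered from the precomputed aggregate store."""
    return aggregates.get(product, 0.0)


def detail_for(product):
    """Drill-down queries fall through to the relational detail table."""
    return detail.execute(
        "SELECT day, amount FROM sales WHERE product = ? ORDER BY day",
        (product,),
    ).fetchall()


pen_total = total_for("pen")
pen_rows = detail_for("pen")
```

Summary requests never touch the detail table, and drill-down requests never inflate the aggregate store. That is the trade the hybrid approach makes between MOLAP speed and ROLAP scalability.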