Sachin Goel (68) Manav Mudgal (69) Piyush Samsukha (76) Rachit Singhal (82) Richa Somvanshi (85) Sahar ( )



Outline
- Data warehousing
- Warehouse architecture
- Its components
- Data flows
- Data marts
- Benefits of data warehousing
- Disadvantages of data warehousing
- Case study

What is data warehousing?
- A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management's decision-making process.
- Data warehousing combines data management and data analysis.
- A data webhouse is a distributed data warehouse that is implemented over the web with no central data repository.
- Goal: to integrate enterprise-wide corporate data into a single repository from which users can easily run queries.

What is data warehousing? (continued)
- Subject-oriented: the warehouse is organized around the major subjects of the enterprise rather than the major application areas. This reflects the need to store decision-support data rather than application-oriented data.
- Integrated: the source data comes together from different enterprise-wide application systems. The source data is often inconsistent, so the integrated data must be made consistent to present a unified view of the data to the users.
- Time-variant: the data in the warehouse is only accurate and valid at some point in time or over some time interval. Time-variance also shows in the extended time for which the data is held, in the implicit or explicit association of time with all data, and in the fact that the data represents a series of snapshots.
- Non-volatile: the data is not updated in real time but is refreshed from the operational systems on a regular basis. New data is always added as a supplement to the database rather than as a replacement. The database continually absorbs this new data, incrementally integrating it with the previous data.
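The time-variant and non-volatile properties can be illustrated with a small sketch. It is only an illustration under stated assumptions: the `Snapshot` and `SnapshotStore` names, the customer balance fields, and the dates are all hypothetical and do not come from the slides. Each periodic refresh appends rows stamped with a date, and no earlier row is ever overwritten.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Snapshot:
    """One warehouse row; every row carries an explicit time (time-variant)."""
    customer_id: int
    balance: float
    as_of: date


class SnapshotStore:
    """Hypothetical append-only store standing in for warehouse tables."""

    def __init__(self):
        self._rows = []

    def refresh(self, operational_rows, as_of):
        # Periodic refresh from operational systems: new data supplements
        # the existing rows and never replaces them (non-volatile).
        for customer_id, balance in operational_rows:
            self._rows.append(Snapshot(customer_id, balance, as_of))

    def history(self, customer_id):
        # The stored data forms a series of snapshots over time.
        rows = [r for r in self._rows if r.customer_id == customer_id]
        return sorted(rows, key=lambda r: r.as_of)


store = SnapshotStore()
store.refresh([(1, 100.0), (2, 50.0)], date(2024, 1, 31))
store.refresh([(1, 120.0), (2, 50.0)], date(2024, 2, 29))
balances = [s.balance for s in store.history(1)]
```

Here `balances` holds both the January and the February value for customer 1, because the second refresh added a snapshot instead of updating the first one.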

Operational data source1  The architecture Query Manager Warehouse Manager DBMS Operational data source 2 Meta-data High summarized data Detailed data Lightly summarized data Operational data store (ods) Operational data source n Archive/backup data Load Manager Data mining OLAP(online analytical processing) tools Reporting, query, application development, and EIS(executive information system) tools End-user access tools Typical architecture of a data warehouse Operational data store (ODS)

The main components
- Operational data sources: the data for the data warehouse is supplied from:
  - mainframe systems holding data in the traditional network and hierarchical formats;
  - relational DBMSs such as Oracle and Informix;
  - external data, in addition to these internal sources, obtained from commercial databases and from databases associated with suppliers and customers.
- Operational data store (ODS): a repository of current and integrated operational data used for analysis. It is often structured and supplied with data in the same way as the data warehouse, but may in fact simply act as a staging area for data to be moved into the warehouse.

The main components (continued)
- Load manager: also called the frontend component. It performs all the operations associated with the extraction and loading of data into the warehouse. These operations include simple transformations of the data to prepare it for entry into the warehouse.
- Warehouse manager: performs all the operations associated with the management of the data in the warehouse. These operations include:
  - analysis of data to ensure consistency;
  - transformation and merging of the source data from temporary storage into the data warehouse tables;
  - creation of indexes and views on the base tables;
  - generation of aggregations;
  - backing up and archiving of data.
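The division of work between the load manager and the warehouse manager can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not a description of any real warehouse product: the table names (`staging`, `sales_detail`, `sales_by_region`), the columns, and the rule that negative amounts are inconsistent are all assumptions made for the example.

```python
import sqlite3


def load_manager(conn, source_rows):
    """Extract rows and load them into temporary storage, applying only
    simple transformations (trim and upper-case the region, cast the amount)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging (sale_date TEXT, region TEXT, amount REAL)"
    )
    cleaned = [(d, region.strip().upper(), float(amount))
               for d, region, amount in source_rows]
    conn.executemany("INSERT INTO staging VALUES (?, ?, ?)", cleaned)


def warehouse_manager(conn):
    """Check consistency, merge staged data into the warehouse table,
    create an index, and generate an aggregation."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_detail (sale_date TEXT, region TEXT, amount REAL)"
    )
    # Consistency check (assumed rule): negative amounts are rejected.
    conn.execute(
        "INSERT INTO sales_detail "
        "SELECT sale_date, region, amount FROM staging WHERE amount >= 0"
    )
    conn.execute("DELETE FROM staging")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_sales_region ON sales_detail (region)")
    # Rebuild the summarized table from the detailed data.
    conn.execute("DROP TABLE IF EXISTS sales_by_region")
    conn.execute(
        "CREATE TABLE sales_by_region AS "
        "SELECT region, SUM(amount) AS total FROM sales_detail GROUP BY region"
    )


conn = sqlite3.connect(":memory:")
load_manager(conn, [
    ("2024-01-05", " north ", "10.5"),
    ("2024-01-06", "South", "4"),
    ("2024-01-07", "north", "-1"),   # fails the consistency check
])
warehouse_manager(conn)
totals = dict(conn.execute("SELECT region, total FROM sales_by_region"))
```

Backing up and archiving are left out of the sketch; in this arrangement they would be further steps of the warehouse manager.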

The main components (continued)
- Query manager: also called the backend component. It performs all the operations associated with the management of user queries, including directing queries to the appropriate tables and scheduling the execution of queries.
- Detailed, lightly summarized, and highly summarized data; archive/backup data; meta-data.
- End-user access tools: can be categorized into five main groups: data reporting and query tools, application development tools, executive information system (EIS) tools, online analytical processing (OLAP) tools, and data mining tools.
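One way to picture "directing queries to the appropriate tables" is a lookup that prefers the most summarized table able to answer a query. The sketch below is an assumption-laden illustration: the table names and the idea of routing by grouping columns are hypothetical, and query scheduling is not covered.

```python
# Hypothetical catalogue: which grouping columns each summary table retains.
SUMMARY_TABLES = {
    frozenset(): "sales_total",                                # highly summarized
    frozenset({"region"}): "sales_by_region",
    frozenset({"region", "month"}): "sales_by_region_month",   # lightly summarized
}
DETAIL_TABLE = "sales_detail"


def route_query(group_by):
    """Return the most summarized table that still contains every
    requested grouping column, falling back to the detailed data."""
    wanted = frozenset(group_by)
    candidates = [cols for cols in SUMMARY_TABLES if wanted <= cols]
    if not candidates:
        return DETAIL_TABLE
    # Fewer retained columns means a smaller, more summarized table.
    return SUMMARY_TABLES[min(candidates, key=len)]


by_region = route_query(["region"])
by_month = route_query(["month"])
by_product = route_query(["product"])
```

A query grouped by region is served from `sales_by_region`, one grouped by month needs `sales_by_region_month`, and one grouped by product has no matching summary and falls through to the detailed data.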

Data flows
- Inflow: the processes associated with the extraction, cleansing, and loading of the data from the source systems into the data warehouse.
- Upflow: the processes associated with adding value to the data in the warehouse through summarizing, packaging, and distribution of the data.
- Downflow: the processes associated with archiving and backing up the data in the warehouse.
- Outflow: the processes associated with making the data available to the end-users.
- Meta-flow: the processes associated with the management of the meta-data.

[Figure: Information flows of a data warehouse. The architecture diagram (operational data sources, operational data store (ODS), load manager, warehouse manager, query manager, DBMS, archive/backup data, end-user access tools) annotated with the five flows: inflow, upflow, downflow, outflow, and meta-flow.]

Data mart
- Data mart: a subset of a data warehouse that supports the requirements of a particular department or business function.
- The characteristics that differentiate data marts from data warehouses include:
  - a data mart focuses only on the requirements of users associated with one department or business function;
  - data marts do not normally contain detailed operational data, unlike data warehouses;
  - because data marts contain less data than data warehouses, they are more easily understood and navigated.
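As a rough sketch of these characteristics, a data mart can be derived from the warehouse by keeping one department's data and storing it in summarized rather than detailed form. Everything specific here is assumed for illustration: the `warehouse_sales` table, its columns, the sample rows, and the `build_data_mart` helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales "
    "(department TEXT, product TEXT, sale_month TEXT, amount REAL)"
)
conn.executemany("INSERT INTO warehouse_sales VALUES (?, ?, ?, ?)", [
    ("marketing", "widget", "2024-01", 10.0),
    ("marketing", "widget", "2024-01", 15.0),
    ("marketing", "gadget", "2024-02", 7.0),
    ("finance", "audit", "2024-01", 99.0),
])


def build_data_mart(conn, department):
    """Create a summarized table holding only one department's data."""
    if not department.isidentifier():
        raise ValueError("department must be a simple name")
    mart = f"mart_{department}"
    conn.execute(f"DROP TABLE IF EXISTS {mart}")
    conn.execute(f"CREATE TABLE {mart} (product TEXT, sale_month TEXT, total REAL)")
    # Only summarized rows are copied; the detailed rows stay in the warehouse.
    conn.execute(
        f"INSERT INTO {mart} "
        "SELECT product, sale_month, SUM(amount) FROM warehouse_sales "
        "WHERE department = ? GROUP BY product, sale_month",
        (department,),
    )
    return mart


mart = build_data_mart(conn, "marketing")
rows = conn.execute(
    f"SELECT product, sale_month, total FROM {mart} ORDER BY product"
).fetchall()
```

The resulting mart has two summarized rows for marketing and none of the finance data, which is what makes it smaller and easier to navigate than the warehouse table it came from.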

[Figure: Typical data warehouse and data mart architecture. The data warehouse architecture (operational data sources, operational data store (ODS), load manager, warehouse manager, query manager, DBMS, archive/backup data, end-user access tools) extended with data marts holding summarized data in a relational database and summarized data in a multi-dimensional database, arranged in a first, second, and third tier.]

Reasons for creating a data mart
- To give users access to the data they need to analyze most often.
- To provide data in a form that matches the collective view of the data held by a group of users in a department or business function.
- To improve end-user response time by reducing the volume of data to be accessed.
- To provide appropriately structured data as dictated by the requirements of end-user access tools.
- Data marts normally use less data, so tasks such as data cleansing, loading, transformation, and integration are far easier. Implementing and setting up a data mart is therefore simpler than establishing a corporate data warehouse.
- The cost of implementing data marts is normally less than that required to establish a data warehouse.
- The potential users of a data mart are more clearly defined and can be more easily targeted to obtain support for a data mart project than for a corporate data warehouse project.

The benefits of data warehousing
- The potential benefits of data warehousing are:
  - high returns on investment;
  - substantial competitive advantage;
  - increased productivity of corporate decision-makers.
- Data warehouses facilitate decision support system applications such as trend reports (e.g., the items with the most sales in a particular area within the last two years), exception reports, and reports that show actual performance versus goals.
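The trend report given as an example (the items with the most sales in a particular area within the last two years) could be expressed as a query along the following lines. This is a sketch under assumptions: the `sales` table, its columns, the sample rows, and the fixed reference date are invented for the example, and `sqlite3` stands in for the warehouse DBMS.

```python
import sqlite3


def top_items(conn, area, today, limit=3):
    """Items ranked by total sales in one area over the two years up to `today`."""
    return conn.execute(
        "SELECT item, SUM(amount) AS total FROM sales "
        "WHERE area = ? AND sale_date >= date(?, '-2 years') "
        "GROUP BY item ORDER BY total DESC LIMIT ?",
        (area, today, limit),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (area TEXT, item TEXT, sale_date TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", [
    ("north", "widget", "2024-03-01", 5),
    ("north", "widget", "2023-05-01", 4),
    ("north", "gadget", "2024-01-10", 6),
    ("north", "gadget", "2020-01-01", 100),  # older than two years, excluded
    ("south", "widget", "2024-02-01", 50),   # different area, excluded
])
report = top_items(conn, "north", "2024-06-30")
```

The time filter relies on every row carrying a date, which is the time-variant property of the warehouse put to practical use.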

Disadvantages of data warehousing
- Data warehouses are not the optimal environment for unstructured data.
- Because data must be extracted, transformed, and loaded into the warehouse, there is an element of latency in data warehouse data.
- Over their life, data warehouses can have high costs. Maintenance costs are high.
- Data warehouses can become outdated relatively quickly. There is a cost of delivering suboptimal information to the organization.
- There is often a fine line between data warehouses and operational systems. Duplicate, expensive functionality may be developed. Alternatively, functionality may be developed in the data warehouse that, in retrospect, should have been developed in the operational systems, and vice versa.

TOSHIBA Case study